Text-to-Speech
Fairseq
English
audio
patrickvonplaten committed on
Commit
17530a7
1 Parent(s): d197f02
Files changed (6)
  1. README.md +12 -0
  2. config.yaml +24 -0
  3. fbank_mfa_gcmvn_stats.npz +0 -0
  4. pytorch_model.pt +3 -0
  5. run_fast_speech_2.py +5 -0
  6. vocab.txt +71 -0
README.md ADDED
@@ -0,0 +1,12 @@
+ ## Example to download fastspeech2 from fairseq
+
+ Weights are downloaded from:
+
+ We still need to git clone this repo first before being able to download it.
+ Having `cd`'ed into the repo we can do the following:
+
+ ```python
+ from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf
+
+ model = load_model_ensemble_and_task_from_hf("patrickvonplaten/fairseq-fastspeech2")
+ ```
config.yaml ADDED
@@ -0,0 +1,24 @@
+ features:
+   energy_max: 3.2244551181793213
+   energy_min: -4.9544901847839355
+   eps: 1.0e-05
+   f_max: 8000
+   f_min: 0
+   hop_len_t: 0.011609977324263039
+   hop_length: 256
+   n_fft: 1024
+   n_mels: 80
+   n_stft: 513
+   pitch_max: 5.733940816898645
+   pitch_min: -4.660287183665281
+   sample_rate: 22050
+   type: spectrogram+melscale+log
+   win_len_t: 0.046439909297052155
+   win_length: 1024
+   window_fn: hann
+ global_cmvn:
+   stats_npz_path: fbank_mfa_gcmvn_stats.npz
+ transforms:
+   '*':
+   - global_cmvn
+ vocab_filename: vocab.txt
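
The time-based fields in `features` look like they are derived from the sample-based ones. As a quick sanity check, the sketch below recomputes `hop_len_t`, `win_len_t`, and `n_stft` from `sample_rate`, `hop_length`, `win_length`, and `n_fft`. The three relationships are assumptions about how the config was generated; the commit does not state them.

```python
import math

# Values copied from the `features` section of config.yaml.
sample_rate = 22050
hop_length = 256
win_length = 1024
n_fft = 1024

# Assumed relationships (not stated in the commit itself).
hop_len_t = hop_length / sample_rate  # frame shift in seconds
win_len_t = win_length / sample_rate  # window length in seconds
n_stft = n_fft // 2 + 1               # one-sided STFT frequency bins

# Compare against the values stored in config.yaml.
assert math.isclose(hop_len_t, 0.011609977324263039, rel_tol=1e-12)
assert math.isclose(win_len_t, 0.046439909297052155, rel_tol=1e-12)
assert n_stft == 513

# Number of spectrogram frames produced per second of audio.
frames_per_second = sample_rate / hop_length
print(f"{frames_per_second:.2f} frames per second")
```

If the assumptions hold, each spectrogram frame covers roughly 11.6 ms of audio, with a 46.4 ms analysis window.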
fbank_mfa_gcmvn_stats.npz ADDED
Binary file (1.14 kB)
 
pytorch_model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a48d454fe66939079d0ddb70f1c062ec669f521a7cfadc608968746e312986ab
+ size 494816801
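
The three added lines are a Git LFS pointer, not the checkpoint itself: the repository stores only the object hash and byte size, and the actual weights are fetched separately. A minimal sketch of reading such a pointer is shown below; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of Git LFS or fairseq.

```python
# Pointer text copied from the pytorch_model.pt diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a48d454fe66939079d0ddb70f1c062ec669f521a7cfadc608968746e312986ab
size 494816801
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER)
algo, _, digest = pointer["oid"].partition(":")
size_bytes = int(pointer["size"])

print(f"hash algorithm: {algo}")
print(f"checkpoint size: {size_bytes / 1e6:.1f} MB")
```

If the local `pytorch_model.pt` is only a few hundred bytes, it is most likely still the pointer rather than the downloaded checkpoint.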
run_fast_speech_2.py ADDED
@@ -0,0 +1,5 @@
+ #!/usr/bin/env python3
+ from fairseq.checkpoint_utils import load_model_ensemble_and_task
+
+ # model = load_model_ensemble_and_task(["./pytorch_model.pt"], arg_overrides={"config_yaml": "./config.yaml", "data": "./"})
+ model = load_model_ensemble_and_task(["./pytorch_model.pt"], arg_overrides={"data": "./"})
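
Note that the script binds the whole return value to `model`. In upstream fairseq, `load_model_ensemble_and_task` returns a tuple of the model list, the config, and the task, so a caller would usually unpack it. The sketch below shows one way to do that. `load_fastspeech2` is a hypothetical wrapper, the behaviour is assumed to match upstream fairseq, and it has not been verified against this particular checkpoint.

```python
import os


def load_fastspeech2(model_dir="."):
    """Load the checkpoint from `model_dir` and return (model, cfg, task).

    Hypothetical helper: assumes `model_dir` contains pytorch_model.pt,
    config.yaml, vocab.txt and the gcmvn stats file from this commit.
    """
    # Imported lazily so that defining this helper does not require fairseq.
    from fairseq.checkpoint_utils import load_model_ensemble_and_task

    checkpoint = os.path.join(model_dir, "pytorch_model.pt")
    # Same override as in run_fast_speech_2.py: point `data` at the repo dir.
    models, cfg, task = load_model_ensemble_and_task(
        [checkpoint],
        arg_overrides={"data": model_dir},
    )
    model = models[0]  # the "ensemble" holds a single model here
    model.eval()       # switch off dropout for inference
    return model, cfg, task
```

Calling `load_fastspeech2("./")` from inside the cloned repository would correspond to the last line of the script.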
vocab.txt ADDED
@@ -0,0 +1,71 @@
+ AH0 71007
+ N 63410
+ T 60842
+ S 40263
+ D 39886
+ R 35965
+ L 30358
+ sp 27584
+ IH0 27113
+ DH 26584
+ K 25851
+ IH1 25683
+ Z 25387
+ EH1 21690
+ AE1 21648
+ M 21537
+ W 18760
+ P 18458
+ ER0 18446
+ V 18169
+ IY0 17832
+ AH1 16995
+ F 15549
+ B 14227
+ HH 13468
+ IY1 12751
+ EY1 12141
+ AO1 11595
+ AA1 10589
+ AY1 9624
+ UW1 8865
+ SH 7449
+ OW1 7441
+ NG 6705
+ G 5472
+ ER1 4898
+ Y 4548
+ JH 4486
+ CH 4355
+ TH 3980
+ AW1 3607
+ UH1 2469
+ EH2 1881
+ spn 1774
+ AO0 1357
+ OW0 1328
+ EY2 1258
+ IH2 1251
+ AE2 1104
+ UW0 1077
+ AY2 1062
+ AA2 774
+ OY1 771
+ AO2 622
+ ZH 587
+ EH0 568
+ OW2 557
+ EY0 443
+ IY2 435
+ UW2 431
+ AY0 390
+ AE0 374
+ AH2 316
+ AW2 290
+ AA0 259
+ ER2 136
+ UH2 127
+ OY2 44
+ UH0 36
+ AW0 35
+ OY0 4
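
Each line of `vocab.txt` is a `<symbol> <count>` pair, sorted by descending count. The symbols look like ARPAbet phonemes with stress digits, plus `sp` and `spn`, which are probably silence and spoken-noise markers from the forced aligner (an assumption; the commit does not say). A minimal sketch of parsing this format, using a few lines copied from the file and a hypothetical `parse_vocab` helper:

```python
# A handful of lines copied from vocab.txt, in their original order.
SAMPLE = """\
AH0 71007
N 63410
T 60842
sp 27584
spn 1774
OY0 4
"""


def parse_vocab(text):
    """Parse '<symbol> <count>' lines into a list of (symbol, count) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        symbol, count = line.rsplit(" ", 1)
        entries.append((symbol, int(count)))
    return entries


entries = parse_vocab(SAMPLE)

# Zero-based position of each symbol within the file.
symbol_to_index = {symbol: i for i, (symbol, _) in enumerate(entries)}

# The file is ordered from most to least frequent symbol.
counts = [count for _, count in entries]
assert counts == sorted(counts, reverse=True)
```

fairseq's `Dictionary` normally reserves a few special symbols ahead of the file entries, so the indices the model sees will likely be offset from the zero-based positions computed here.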