chaanks committed on
Commit
3d31820
1 Parent(s): ed516fe

Update README.md

  1. README.md +91 -1
README.md CHANGED
@@ -1,3 +1,93 @@
  ---
- license: apache-2.0
  ---
  ---
+ language: "en"
+ inference: false
+ tags:
+ - Vocoder
+ - HiFIGAN
+ - speech-synthesis
+ - speechbrain
+ license: "apache-2.0"
+ datasets:
+ - LJSpeech
  ---
+
+
+ <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=large&v=2" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
+ <br/><br/>
+
+ # Vocoder with HiFIGAN Unit
+
+ ## <font color="red"> Work In Progress .... </font>
+
+ ## Install SpeechBrain
+
+ First, install transformers and SpeechBrain with the following command:
+
+ ```
+ pip install speechbrain transformers
+ ```
+
+ We encourage you to read our tutorials and learn more about
+ [SpeechBrain](https://speechbrain.github.io).
+
+
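+ ### Using the Vocoder
+
+ ```python
+ import torch
+ from speechbrain.pretrained import UnitHIFIGAN
+
+ # Load the pretrained unit vocoder from the Hugging Face Hub
+ hifi_gan_unit = UnitHIFIGAN.from_hparams(source="speechbrain/tts-hifigan-unit-hubert-l6-k100-ljspeech", savedir="tmpdir_vocoder")
+ # Random unit codes for illustration; randint's upper bound is exclusive,
+ # so 100 covers IDs 0..99 (assuming "k100" in the model name means 100 units)
+ codes = torch.randint(0, 100, (100,))
+ # Decode the unit sequence into a waveform
+ waveform = hifi_gan_unit.decode_unit(codes)
+ ```
+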
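Before decoding, it can help to check that the unit codes fall in the range the vocoder expects. The sketch below is illustrative only: `check_unit_codes` and `NUM_UNITS` are hypothetical names, not part of SpeechBrain, and the value 100 is an assumption read off the `k100` suffix in the model name.

```python
from typing import Sequence

# Assumption: "k100" in the model name means 100 discrete units (IDs 0..99).
NUM_UNITS = 100


def check_unit_codes(codes: Sequence[int], num_units: int = NUM_UNITS) -> None:
    """Raise ValueError if any code falls outside [0, num_units)."""
    bad = [c for c in codes if not 0 <= c < num_units]
    if bad:
        # Report only the first few offenders to keep the message short.
        raise ValueError(f"unit codes out of range [0, {num_units}): {bad[:5]}")


# Passes silently for in-range codes.
check_unit_codes([0, 42, 99])
```

With a tensor of codes, something like `check_unit_codes(codes.tolist())` could be called before `decode_unit`.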
+ ### Using the Vocoder with the S2UT
+ ```python
+ import torch
+ import torchaudio
+ from speechbrain.pretrained import EncoderDecoderS2UT
+ from speechbrain.pretrained import UnitHIFIGAN
+
+ # Initialize S2UT (Transformer) and Vocoder (HiFIGAN Unit)
+ s2ut = EncoderDecoderS2UT.from_hparams(source="speechbrain/s2st-transformer-fr-en-hubert-l6-k100-cvss", savedir="tmpdir_s2ut")
+ hifi_gan_unit = UnitHIFIGAN.from_hparams(source="speechbrain/tts-hifigan-unit-hubert-l6-k100-ljspeech", savedir="tmpdir_vocoder")
+
+ # Run the S2UT model (speech-to-unit translation)
+ codes = s2ut.translate_file("speechbrain/s2st-transformer-fr-en-hubert-l6-k100-cvss/example-fr.wav")
+ codes = torch.IntTensor(codes)
+
+ # Run the vocoder (units-to-waveform)
+ waveforms = hifi_gan_unit.decode_unit(codes)
+
+ # Save the waveform
+ torchaudio.save("example.wav", waveforms.squeeze(1), 16000)
+ ```
+
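The example saves the audio with a sample rate of 16000 Hz. As a quick sanity check on the output, the clip length can be estimated from the number of samples. This is a minimal sketch; `duration_seconds` is a hypothetical helper, not a SpeechBrain or torchaudio function.

```python
# Sample rate passed to torchaudio.save in the example.
SAMPLE_RATE = 16000


def duration_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Return the clip length in seconds for a given number of samples."""
    return num_samples / sample_rate


# 32000 samples at 16 kHz is two seconds of audio.
two_seconds = duration_seconds(32000)
```

For a waveform tensor, the sample count would typically be the size of its last dimension, for example `waveforms.shape[-1]`.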
+ ### Inference on GPU
+ To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
+
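One way to avoid hard-coding the device is to build `run_opts` from a GPU availability check and fall back to the CPU. The sketch below is illustrative: `pick_run_opts` is a hypothetical helper, and the `torch` import is optional so the snippet also runs where PyTorch is not installed.

```python
def pick_run_opts(cuda_available: bool) -> dict:
    """Build the run_opts dict for from_hparams from a CUDA availability flag."""
    return {"device": "cuda" if cuda_available else "cpu"}


try:
    import torch  # only used here to probe for a GPU

    cuda_available = torch.cuda.is_available()
except ImportError:
    cuda_available = False

run_opts = pick_run_opts(cuda_available)

# The dict can then be passed through when loading the model:
# hifi_gan_unit = UnitHIFIGAN.from_hparams(
#     source="speechbrain/tts-hifigan-unit-hubert-l6-k100-ljspeech",
#     savedir="tmpdir_vocoder",
#     run_opts=run_opts,
# )
```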
+
+ ### Limitations
+ The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.
+
+ #### Referencing SpeechBrain
+
+ ```
+ @misc{SB2021,
+ author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Ju-Chieh, Chou and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua},
+ title = {SpeechBrain},
+ year = {2021},
+ publisher = {GitHub},
+ journal = {GitHub repository},
+ howpublished = {\url{https://github.com/speechbrain/speechbrain}},
+ }
+ ```
+
+ #### About SpeechBrain
+ SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to be simple, extremely flexible, and user-friendly. Competitive or state-of-the-art performance is obtained in various domains.
+
+ Website: https://speechbrain.github.io/
+
+ GitHub: https://github.com/speechbrain/speechbrain