---
pipeline_tag: text-to-speech
---
https://huggingface.co/microsoft/speecht5_tts with ONNX weights to be compatible with Transformers.js.

## Usage (Transformers.js)

If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
```bash
npm i @xenova/transformers
```
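
Alternatively, per the Transformers.js documentation, the library can be loaded in the browser straight from a CDN with no build step. A minimal sketch (pinning an exact version in the URL is recommended; the unpinned URL below is for illustration):
```js
// In a <script type="module"> block, import Transformers.js from the
// jsDelivr CDN instead of installing it from NPM
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers';
```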

**Example:** Text-to-speech pipeline.
```js
import { pipeline } from '@xenova/transformers';

// Create a text-to-speech pipeline
const synthesizer = await pipeline('text-to-speech', 'Xenova/speecht5_tts', { quantized: false });

// Generate speech
const speaker_embeddings = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/speaker_embeddings.bin';
const result = await synthesizer('Hello, my dog is cute', { speaker_embeddings });
console.log(result);
// {
//   audio: Float32Array(26112) [-0.00005657337896991521, 0.00020583874720614403, ...],
//   sampling_rate: 16000
// }
```

Optionally, save the audio to a wav file (Node.js):
```js
import wavefile from 'wavefile';
import fs from 'fs';

// Write the samples as a mono, 32-bit float wav file
const wav = new wavefile.WaveFile();
wav.fromScratch(1, result.sampling_rate, '32f', result.audio);
fs.writeFileSync('result.wav', wav.toBuffer());
```

<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/on1ij9Y269ne9zlYN9mdb.wav"></audio>
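
In the browser, you could instead play the generated samples directly with the Web Audio API rather than writing a file. A minimal sketch, assuming `result` from the pipeline example above:
```js
// Copy the Float32Array samples into a mono AudioBuffer and play it
const audioContext = new AudioContext();
const audioBuffer = audioContext.createBuffer(1, result.audio.length, result.sampling_rate);
audioBuffer.copyToChannel(result.audio, 0);

const source = audioContext.createBufferSource();
source.buffer = audioBuffer;
source.connect(audioContext.destination);
source.start();
```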

**Example:** Load processor, tokenizer, and models separately.
```js
import { AutoTokenizer, AutoProcessor, SpeechT5ForTextToSpeech, SpeechT5HifiGan, Tensor } from '@xenova/transformers';

// Load the tokenizer and processor
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/speecht5_tts');
const processor = await AutoProcessor.from_pretrained('Xenova/speecht5_tts');

// Load the models
// NOTE: We use the unquantized versions as they are more accurate
const model = await SpeechT5ForTextToSpeech.from_pretrained('Xenova/speecht5_tts', { quantized: false });
const vocoder = await SpeechT5HifiGan.from_pretrained('Xenova/speecht5_hifigan', { quantized: false });

// Load speaker embeddings from URL
const speaker_embeddings_data = new Float32Array(
    await (await fetch('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/speaker_embeddings.bin')).arrayBuffer()
);
const speaker_embeddings = new Tensor(
    'float32',
    speaker_embeddings_data,
    [1, speaker_embeddings_data.length]
);

// Run tokenization
const { input_ids } = tokenizer('Hello, my dog is cute');

// Generate waveform
const { waveform } = await model.generate_speech(input_ids, speaker_embeddings, { vocoder });
console.log(waveform);
// Tensor {
//   dims: [ 26112 ],
//   type: 'float32',
//   size: 26112,
//   data: Float32Array(26112) [ -0.00043630177970044315, -0.00018082228780258447, ... ],
// }
```
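
The raw waveform can be saved the same way as in the pipeline example; `waveform.data` holds the `Float32Array` of samples, and the model produces 16 kHz audio as shown above:
```js
import wavefile from 'wavefile';
import fs from 'fs';

// Mono, 32-bit float samples at 16 kHz, as in the pipeline example
const wav = new wavefile.WaveFile();
wav.fromScratch(1, 16000, '32f', waveform.data);
fs.writeFileSync('result.wav', wav.toBuffer());
```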

---

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).