Update index.html (#1)
- Update index.html (32e614b40ff9d972cd1cee89e55a10072455a186)
- index.html +38 -6
index.html
CHANGED
@@ -8,12 +8,44 @@
 </head>
 <body>
 <div class="card">
-<h1>
-
-
-
-
-
+<h1>Hugging Face for Audio: Resources ✨</h1>
+<br>
+<br>
+<b>Audio transformers course</b>: https://huggingface.co/learn/audio-course/chapter0/introduction#course-structure. This covers the standard tasks (ASR, TTS, audio classification) with notes on using pre-trained models and fine-tuning. See also Unit 7 for a speaker diarization application.
+<br>
+<br>
+<h2>Using pre-trained models</h2>
+<ul>
+<li>With pipelines: https://www.reddit.com/r/MachineLearning/comments/16xshji/d_the_most_complete_audio_ml_toolkit/</li>
+<li>Transformers docs: https://huggingface.co/docs/transformers/model_doc/audio-spectrogram-transformer</li>
+</ul>
+<br>
+<br>
+<h2>Training</h2>
+<ul>
+<li>Datasets https://huggingface.co/blog/audio-datasets</li>
+<li>Fine-tune Whisper for ASR https://huggingface.co/blog/fine-tune-whisper</li>
+<li>Distil Whisper for ASR https://github.com/huggingface/distil-whisper/tree/main/training</li>
+<li>Fine-tune VITS for TTS https://twitter.com/yoachlacombe/status/1735348885369889264</li>
+<li>Fine-tune Wav2Vec2 for audio class https://github.com/huggingface/transformers/tree/main/examples/pytorch/audio-classification</li>
+</ul>
+<br>
+<br>
+<h2>Optimisation</h2>
+<ul>
+<li>Whisper JAX for ASR https://github.com/sanchit-gandhi/whisper-jax</li>
+<li>Distil Whisper for ASR https://github.com/huggingface/distil-whisper/tree/main</li>
+<li>Insanely Fast Whisper for ASR https://github.com/Vaibhavs10/insanely-fast-whisper</li>
+<li>Speculative decoding with Whisper for ASR https://huggingface.co/blog/whisper-speculative-decoding</li>
+<li>Bark for TTS https://huggingface.co/blog/optimizing-bark</li>
+</ul>
+<br>
+<br>
+<h2>Deployment</h2>
+<ul>
+<li>Endpoint https://huggingface.co/blog/run-musicgen-as-an-api</li>
+<li>Gradio client https://www.gradio.app/docs/client (e.g. for Whisper https://huggingface.co/spaces/hf-audio/whisper-large-v3)</li>
+</ul>
 </div>
 </body>
 </html>