hexgrad committed on
Commit
3095858
1 Parent(s): efb9d1b

Upload 2 files

Files changed (2)
  1. README.md +5 -2
  2. kokoro-v0_19.onnx +3 -0
README.md CHANGED
@@ -12,7 +12,7 @@ pipeline_tag: text-to-speech
 
 **Kokoro** is a frontier TTS model for its size of **82 million parameters** (text in/audio out).
 
-On 25 Dec 2024, Kokoro v0.19 weights were permissively released in full fp32 precision under an Apache 2.0 license. As of 31 Dec 2024, 10 unique Voicepacks have been released.
+On 25 Dec 2024, Kokoro v0.19 weights were permissively released in full fp32 precision under an Apache 2.0 license. As of 2 Jan 2025, 10 unique Voicepacks have been released, and a `.onnx` version of v0.19 is available.
 
 In the weeks leading up to its release, Kokoro v0.19 was the #1🥇 ranked model in [TTS Spaces Arena](https://huggingface.co/hexgrad/Kokoro-82M#evaluation). Kokoro had achieved higher Elo in this single-voice Arena setting over other models, using fewer parameters and less data:
 1. **Kokoro v0.19: 82M params, Apache, trained on <100 hours of audio**
@@ -63,7 +63,9 @@ from IPython.display import display, Audio
 display(Audio(data=audio, rate=24000, autoplay=True))
 print(out_ps)
 ```
-The inference code was quickly hacked together on Christmas Day. It is not clean code and leaves a lot of room for improvement. If you'd like to contribute, feel free to open a PR.
+If you have trouble with `espeak-ng`, see this [github issue](https://github.com/bootphon/phonemizer/issues/44#issuecomment-1540885186). [Mac users also see this](https://huggingface.co/hexgrad/Kokoro-82M/discussions/12#677435d3d8ace1de46071489), and [Windows users see this](https://huggingface.co/hexgrad/Kokoro-82M/discussions/12#67742594fdeebf74f001ecfc).
+
+For ONNX usage, see [#14](https://huggingface.co/hexgrad/Kokoro-82M/discussions/14).
 
 ### Model Facts
 
@@ -88,6 +90,7 @@ No affiliation can be assumed between parties on different lines.
 - 28 Dec 2024: `bf_emma`, `bf_isabella`, `bm_george`, `bm_lewis`
 - 30 Dec 2024: `af_nicole`
 - 31 Dec 2024: `af_sky`
+- 2 Jan 2025: ONNX v0.19 `ebef4245`
 
 ### Licenses
 - Apache 2.0 weights in this repository
kokoro-v0_19.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ebef42457f7efee9b60b4f1d5aec7692f2925923948a0d7a2a49d2c9edf57e49
+size 345554732
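
The added file is a Git LFS pointer, so the diff records only the SHA-256 digest and byte size of the real `kokoro-v0_19.onnx`. The first eight hex characters of that digest, `ebef4245`, are the short hash listed in the README changelog entry above. The following is a minimal sketch of how a downloaded copy could be checked against the pointer. The helper names `parse_lfs_pointer` and `matches_pointer` and the local path `kokoro-v0_19.onnx` are assumptions made for illustration; they are not part of this commit or of any Kokoro tooling.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ebef42457f7efee9b60b4f1d5aec7692f2925923948a0d7a2a49d2c9edf57e49
size 345554732
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a large file.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Hash in chunks so the ~345 MB model is never fully loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER)
    local = Path("kokoro-v0_19.onnx")  # assumed download location
    if local.exists():
        print("match" if matches_pointer(local, pointer) else "MISMATCH")
    else:
        print(f"{local} not found; expected {pointer['size']} bytes")
```

Checking the size before hashing is a design convenience only: a truncated download fails immediately, and a full-size file still has to pass the digest comparison.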