hexgrad committed on
Commit 57f6396 • 1 Parent(s): 8228a35

Upload 3 files

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +3 -2
  3. demo/af_sky.txt +11 -0
  4. demo/af_sky.wav +3 -0
.gitattributes CHANGED
@@ -35,3 +35,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
  TTS-Spaces-Arena-25-Dec-2024.png filter=lfs diff=lfs merge=lfs -text
  HEARME.wav filter=lfs diff=lfs merge=lfs -text
+ demo/af_sky.wav filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -12,7 +12,7 @@ pipeline_tag: text-to-speech

  **Kokoro** is a frontier TTS model for its size of **82 million parameters** (text in/audio out).

- On 25 Dec 2024, Kokoro v0.19 weights were permissively released in full fp32 precision under an Apache 2.0 license. As of 30 Dec 2024, 9 unique Voicepacks have been released.
+ On 25 Dec 2024, Kokoro v0.19 weights were permissively released in full fp32 precision under an Apache 2.0 license. As of 31 Dec 2024, 10 unique Voicepacks have been released.

  In the weeks leading up to its release, Kokoro v0.19 was the #1🥇 ranked model in [TTS Spaces Arena](https://huggingface.co/hexgrad/Kokoro-82M#evaluation). Kokoro had achieved higher Elo in this single-voice Arena setting over other models, using fewer parameters and less data:
  1. **Kokoro v0.19: 82M params, Apache, trained on <100 hours of audio**
@@ -45,7 +45,7 @@ VOICE_NAME = [
  'af', # Default voice is a 50-50 mix of Bella & Sarah
  'af_bella', 'af_sarah', 'am_adam', 'am_michael',
  'bf_emma', 'bf_isabella', 'bm_george', 'bm_lewis',
- 'af_nicole', # ASMR voice
+ 'af_nicole', 'af_sky',
  ][0]
  VOICEPACK = torch.load(f'voices/{VOICE_NAME}.pt', weights_only=True).to(device)
  print(f'Loaded voice: {VOICE_NAME}')
@@ -87,6 +87,7 @@ No affiliation can be assumed between parties on different lines.
  - 26 Dec 2024: `am_adam`, `am_michael`
  - 28 Dec 2024: `bf_emma`, `bf_isabella`, `bm_george`, `bm_lewis`
  - 30 Dec 2024: `af_nicole`
+ - 31 Dec 2024: `af_sky`

  ### Licenses
  - Apache 2.0 weights in this repository
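
The README snippet in this diff picks a voice by indexing the list with `[0]`, so switching to the new `af_sky` entry means changing a numeric index. A minimal sketch of selecting by name instead is shown here. The helpers `pick_voice` and `voicepack_path` are hypothetical and not part of the Kokoro repository; only the voice names and the `voices/{name}.pt` path pattern come from the README.

```python
# Voice names as listed in the README after this commit.
VOICE_NAMES = [
    'af',  # Default voice is a 50-50 mix of Bella & Sarah
    'af_bella', 'af_sarah', 'am_adam', 'am_michael',
    'bf_emma', 'bf_isabella', 'bm_george', 'bm_lewis',
    'af_nicole', 'af_sky',
]


def pick_voice(name: str = 'af') -> str:
    """Hypothetical helper: return `name` if it is a listed voice, else raise."""
    if name not in VOICE_NAMES:
        raise ValueError(f'Unknown voice {name!r}; expected one of {VOICE_NAMES}')
    return name


def voicepack_path(name: str) -> str:
    """Hypothetical helper: build the path pattern used by the README snippet."""
    return f'voices/{pick_voice(name)}.pt'
```

With this in place, the README's load line could take `voicepack_path('af_sky')` in place of the f-string, and a misspelled name would fail before `torch.load` is called.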
demo/af_sky.txt ADDED
@@ -0,0 +1,11 @@
+ Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4 system. He told me that he felt that by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI. He said he felt that my voice would be comforting to people.
+
+ After much consideration and for personal reasons, I declined the offer. Nine months later, my friends, family and the general public all noted how much the newest system named Sky sounded like me.
+
+ When I heard the released demo, I was shocked, angered and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine that my closest friends and news ou'tlits could not tell the difference. Mr. Altman even insinuated that the similarity was intentional, tweeting a single word — hur — a reference to the film in which I voiced a chat system, Samantha, who forms an intimate relationship with a human.
+
+ Two days before the ChatGPT 4 demo was released, Mr. Altman contacted my agent, asking me to reconsider. Before we could connect, the system was out there.
+
+ As a result of their actions, I was forced to hire legal counsel, who wrote two letters to Mr. Altman and OpenAI, setting out what they had done and asking them to detail the exact process by which they created the Sky voice. Consequently, OpenAI reluctantly agreed to take down the Sky voice.
+
+ In a time when we are all grappling with deepfakes and the protection of our own likeness, our own work, our own identities, I believe these are questions that deserve absolute clarity. I look forward to resolution in the form of transparency and the passage of appropriate legislation to help ensure that individual rights are protected.
demo/af_sky.wav ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ce36292bf868aa5f15931f3d81a9f46cc35ea76372e618a5e4453c9542e5ad7e
+ size 5486636
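
The three lines added for `demo/af_sky.wav` are a Git LFS pointer, not the audio itself: the `oid` is the SHA-256 of the real file and `size` is its length in bytes. As a rough sketch, a downloaded copy can be checked against the pointer as follows. The function names `parse_lfs_pointer` and `matches_pointer` are illustrative assumptions, not part of Git LFS or of this repository.

```python
import hashlib
from pathlib import Path

# Pointer contents exactly as added in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:ce36292bf868aa5f15931f3d81a9f46cc35ea76372e618a5e4453c9542e5ad7e
size 5486636
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer file into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(' ')
        fields[key] = value
    return fields


def matches_pointer(path, pointer_text: str) -> bool:
    """Return True if the file at `path` has the size and SHA-256 named by the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected = fields['oid'].partition(':')
    if algo != 'sha256':
        raise ValueError(f'Unsupported oid algorithm: {algo!r}')
    data_path = Path(path)
    # Cheap check first: a size mismatch makes hashing unnecessary.
    if data_path.stat().st_size != int(fields['size']):
        return False
    digest = hashlib.sha256()
    with data_path.open('rb') as f:
        # Hash in 1 MiB chunks so large audio files are not read into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b''):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

For example, `matches_pointer('demo/af_sky.wav', POINTER)` would be expected to return `True` for an intact download of the file this commit refers to, assuming the local path mirrors the repository layout.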