
Darío Muñoz Prudant PRO

prudant

https://medium.com/@prudant

puppetm4st3r

AI & ML interests

Tech enthusiast, avid AI learner, and perpetual seeker of new knowledge.

Recent Activity

Reacted to reach-vb's post with 👀 22 days ago


New activity 25 days ago

OpenGVLab/InternVL2-8B-MPO:awq quant

New activity about 1 month ago

meta-llama/Llama-3.2-11B-Vision:How to use visual grounding with this model ?


Articles

Launch of the Latin American NLP Community on Hugging Face! 🌟

Oct 18



prudant's activity

reacted to reach-vb's post with 👀 22 days ago


Smol TTS models are here! OuteTTS-0.1-350M - Zero-shot voice cloning, built on LLaMa architecture, CC-BY license! 🔥

> Pure language modeling approach to TTS
> Zero-shot voice cloning
> LLaMa architecture w/ Audio tokens (WavTokenizer)
> BONUS: Works on-device w/ llama.cpp ⚡

Three-step approach to TTS:

> Audio tokenization using WavTokenizer (75 tokens per second)
> CTC forced alignment for word-to-audio token mapping
> Structured prompt creation w/ transcription, duration, audio tokens

The model is extremely impressive for 350M parameters! Kudos to the OuteAI team on such a brilliant feat - I'd love to see this applied to larger data and smarter backbones like SmolLM 🤗

Check out the models here: https://huggingface.co/collections/OuteAI/outetts-6728aa71a53a076e4ba4817c
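
For a sense of scale, the 75 tokens per second rate quoted in the post implies how many audio tokens a clip occupies and which token range each aligned word would cover. The following is only a rough sketch of that arithmetic, assuming a fixed token rate; the function names and the word timings are hypothetical and are not part of the OuteTTS or WavTokenizer APIs.

```python
# Illustrative arithmetic only; this is not the OuteTTS or WavTokenizer API.
TOKENS_PER_SECOND = 75  # rate quoted in the post


def audio_token_count(duration_s: float) -> int:
    """Approximate number of audio tokens for a clip of the given length."""
    return round(duration_s * TOKENS_PER_SECOND)


def word_token_spans(word_times):
    """Map (word, start_s, end_s) triples to (word, first_token, end_token).

    end_token is exclusive. The timings stand in for the output of a
    forced aligner; the values used below are made up.
    """
    return [
        (word, round(start * TOKENS_PER_SECOND), round(end * TOKENS_PER_SECOND))
        for word, start, end in word_times
    ]


if __name__ == "__main__":
    # A 10-second clip at 75 tokens per second.
    print(audio_token_count(10.0))

    # Hypothetical word timings, as a forced aligner might report them.
    alignment = [("hello", 0.0, 0.4), ("world", 0.4, 1.0)]
    print(word_token_spans(alignment))
```

Under these assumptions a 10-second clip comes to 750 audio tokens, which gives a feel for how quickly longer utterances consume a language model's context window.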