Anurag

edwixx

AI & ML interests

TTS, ASR

Recent Activity

updated a model about 6 hours ago
edwixx/9d0d59ef-9c63-4ceb-be44-e71781f4b796
published a model about 6 hours ago
edwixx/9d0d59ef-9c63-4ceb-be44-e71781f4b796
updated a model about 6 hours ago
edwixx/a24a529d-5b2c-46cf-b7f3-53ee1b33dd9f

Organizations

Stable Diffusion API, Tune a video concepts library, ModelsLab, Chinese LLMs on Hugging Face

edwixx's activity

updated a model about 16 hours ago
published a model about 16 hours ago
reacted to lewtun's post with 🔥 3 days ago
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1