On Vacation 🏝️

Vaibhav Srivastav

reach-vb

https://vaibhavs10.github.io

AI & ML interests

AGI

Recent Activity

upvoted a changelog 6 days ago

Agent Traces on the Hub

upvoted an article 6 days ago

How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs

upvoted a changelog about 1 month ago

Introducing Buckets: S3-like storage on the Hub

View all activity

Organizations

repliedto takarajordan's post 7 months ago

Yes! As @victor mentioned https://huggingface.co/mcp?login should help you setup without the need of passing your token explicitly.

So as long as the client support OAuth, it should work auto-magically!

Let us know if you face any issues

posted an update 10 months ago

Post

6972

Excited to onboard FeatherlessAI on Hugging Face as an Inference Provider - they bring a fleet of 6,700+ LLMs on-demand on the Hugging Face Hub 🤯

Starting today, you'd be able to access all those LLMs (OpenAI compatible) on HF model pages and via OpenAI client libraries too! 💥

Go, play with it today: https://huggingface.co/blog/inference-providers-featherless

P.S. They're also bringing on more GPUs to support all your concurrent requests!

1 reply

reactedto jsulz's post with 🔥 11 months ago

Post

2676

Heyo @RichardErkhov the

xet-team at Hugging face was wondering if you wanted to join the fun and jump over to Xet storage. 🤗

We've been onboarding folks https://huggingface.co/blog/xet-on-the-hub know the backend can scale (Llama 4 and Qwen 3 are on Xet), is great for working with quants (see xet-team/quantization-dedup ), and we're pushing on inviting impactful orgs and users on the Hub. You fit the bill.

We'd love to onboard you, get some feedback, and create some excitement 🎉

The steps are pretty straightforward - join the waitlist at hf.co/join/xet and we'll take care of the rest.

The system is fully backward compatible, so you shouldn't notice a thing. BUT to get the best experience when uploading/downloading, make sure you have hf_xet installed alongside the latest huggingface_hub

What do you think?

4 replies

repliedto their post 11 months ago

we're still optimising the > 50GB path, so at least right now, I'd recommend keeping <50 GB shards but this might change soon and then we can work out a plan

repliedto their post 11 months ago

perfect! can you try and join the waitlist via hf.co/join/xet please!

posted an update 11 months ago

Post

4754

hey hey @mradermacher - VB from Hugging Face here, we'd love to onboard you over to our optimised xet backend! 💥

as you know we're in the process of upgrading our storage backend to xet (which helps us scale and offer blazingly fast upload/ download speeds too): https://huggingface.co/blog/xet-on-the-hub and now that we are certain that the backend can scale with even big models like Llama 4/ Qwen 3 - we;re moving to the next phase of inviting impactful orgs and users on the hub over as you are a big part of the open source ML community - we would love to onboard you next and create some excitement about it in the community too!

in terms of actual steps - it should be as simple as one of the org admins to join hf.co/join/xet - we'll take care of the rest.

p.s. you'd need to have a the latest hf_xet version of huggingface_hub lib but everything else should be the same: https://huggingface.co/docs/hub/storage-backends#using-xet-storage

p.p.s. this is fully backwards compatible so everything will work as it should! 🤗

16 replies

reactedto fdaudens's post with 👍 11 months ago

Post

1280

The rapid progress in small audio models is mind-blowing! 🤯 Just tested OuteTTS v0.2 - cloned my voice from a 10s clip with impressive accuracy and natural prosody.

At 500M parameters, it's efficient enough to run on basic hardware but powerful enough for professional use.

This could transform how we produce audio content for new - think instant translated interviews keeping original voices, or scaled audio article production!

Demo and Model on the Hub: OuteAI/OuteTTS-0.2-500M h/t @reach-vb

3 replies

reactedto clem's post with ❤️👀 about 1 year ago

Post

2317

Very interesting security section by @yjernite @lvwerra @reach-vb @dvilasuero & the team replicating R1. Broadly applicable to most open-source models & some to APIs (but APIs have a lot more additional risks because you're not in control of the underlying system):

https://huggingface.co/blog/open-r1/update-4#is-it-safe

1 reply

reactedto lbourdois's post with 🔥❤️ about 1 year ago

Post

3681

We introduce FAT5 (Flash Attention T5) ⚡

An implementation of T5 in PyTorch with UL2 objective optimized for GPGPU for both training and inference thanks to 13 different optimizations.
The main one is that we have designed a CUDA kernel to expand the Flash Attention by @tridao with RPE biases and supports other PE such as RoPE, ALiBi or FIRE.
The result kernel is 2 times faster than a SPDA implementation.
We also use Triton kernels to optimize certain parts of the architecture, such as the cross-entropy and RMSNorm layer.

The various kernels have been carefully built to be compatible with BF16 and torch.compile to go even faster and achieve efficient pretraining.

All other optimizations are described in a 📝 subsequent blog post available on @huggingface 🤗: CATIE-AQ/FAT5-report.

This methodology enabled us to efficiently pretrain as a proof of concept a FAT5 with 147M parameters in French in a reasonable time (1,461H for 419B tokens), with limited resources (1 A100 i.e. a computational budget of ~ €1,900) and a low carbon footprint (13.5kg eq CO2).

The model's weights are also available on Hugging Face: CATIE-AQ/FAT5-small.
Not very useful in practice, it's a PoC and not an instructed model (it's planned for later).

All the code is available on GitHub if you want to pretrain your own model in your own language or for a specific domain: https://github.com/catie-aq/flashT5 ⭐

Ending by indicating that was a joint project with @BorisAlbar at hf.co/CATIE-AQ.

reactedto julien-c's post with 🚀🔥 about 1 year ago

Post

4458

Important notice 🚨

For Inference Providers who have built support for our Billing API (currently: Fal, Novita, HF-Inference – with more coming soon), we've started enabling Pay as you go (=PAYG)

What this means is that you can use those Inference Providers beyond the free included credits, and they're charged to your HF account.

You can see it on this view: any provider that does not have a "Billing disabled" badge, is PAYG-compatible.

9 replies

reactedto PranjaliJoshi's post with ❤️👀 about 1 year ago

Post

768

🌍 Have you tried Cosmos world foundation models on Hugging Face? Because more updates are coming! 🚀

Cosmos world foundation models (WFMs) are generative pretrained models for synthetic data generation for training AI models for robot or autonomous vehicle development.

🛠️ If you are building generative VLMs or foundation models for physical AI like policy models- there are new updates coming at NVIDIA GTC.

GTC is NVIDIA’s biggest annual event (March 17-21) - it will have deep dives, training labs, and researcher-led sessions on Cosmos.

Plus, Jensen Huang’s keynote! 🎤

🎟️ 20% off GTC registration → Use code HUGGINGFACE20
🔗 https://www.nvidia.com/gtc/
📍 Happening in person at the San Jose Convention Center and online.
Explore all Cosmos sessions at GTC: https://nvda.ws/41yBkmY

Try the existing Cosmos WFMs:

🔗 Hugging Face models: nvidia/cosmos-6751e884dc10e013a0a0d8e6

🛠️ Post-training scripts: https://github.com/NVIDIA/Cosmos/blob/main/cosmos1/models/POST_TRAINING.md

1 reply

reactedto AdinaY's post with 🚀🔥😎 about 1 year ago

Post

4105

Exciting releases from the Chinese community this February🔥
👉 https://huggingface.co/collections/zh-ai-community/2025-february-67a35aaa68e97812def5b6ef

MLLM:
✨ Ovis2 by Alibaba
AIDC-AI/ovis2-67ab36c7e497429034874464
✨ Step Audio Chat by StepFun AI
stepfun-ai/step-audio-67b33accf45735bb21131b0b

Audio:
✨ Step Audio TTS by StepFunAI
stepfun-ai/Step-Audio-TTS-3B
✨ InspireMusic by Alibaba

FunAudioLLM
✨ Baichuan Audio by BaichuanAI
baichuan-inc/Baichuan-Audio-Instruct

Video:
✨ Wan2.1 by Alibaba_Wan
Wan-AI/Wan2.1-T2V-14B
✨ Stepvideo-T2V by StepFun AI
stepfun-ai/stepvideo-t2v
✨ SkyReels-V1 by Skywork
Skywork/skyreels-v1-67b34676ff65b4ec02d16307
✨ LLaDA-8B by RenminUniversity
GSAI-ML/LLaDA-8B-Instruct

MoE:
✨ Moonlight-16B by MoonshotAI (Kimi)
moonshotai/Moonlight-16B-A3B-Instruct

Reasoning:
✨ TinyR1-32B by Qihoo360
qihoo360/TinyR1-32B-Preview

Dataset:
✨ Chinese DeepSeek R1-Distill data -110k
Congliu/Chinese-DeepSeek-R1-Distill-data-110k

repliedto lysandre's post about 1 year ago

let's gooo!

reactedto lysandre's post with 🚀 about 1 year ago

Post

8444

SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases!

They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.

This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).

Starting with SmolVLM-2 & SigLIP2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.

Going forward, continue expecting software releases following semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements and bug fixes. Accompanying these software releases, we'll release tags offering brand new models as fast as possible, to make them accessible to all immediately.

1 reply

Vaibhav Srivastav

AI & ML interests

Recent Activity

Organizations

reach-vb's activity