AI & ML interests

Our team builds AI with open models and open source, collaborating privately with security and advanced access controls.

Recent Activity

alvarobartt
posted an update 5 days ago
Learn how to deploy Microsoft Research VibeVoice ASR on Microsoft Azure Foundry with Hugging Face to generate rich audio transcriptions with Who, When, and What! 💥

> 🕒 60-minute single-pass processing, no chunking or stitching
> 👤 Customized hotwords to guide recognition on domain-specific content
> 📝 Rich transcription: joint ASR + diarization + timestamping in one pass
> 🌍 50+ languages with automatic detection and code-switching support
> 🤗 Deployed on Microsoft Foundry via an OpenAI-compatible Chat Completions API

https://huggingface.co/docs/microsoft-azure/foundry/examples/deploy-vibevoice-asr
victor
posted an update about 1 month ago
Interesting article: use Claude Code to help open models write CUDA kernels (for example) by turning CC traces into Skills. They made a library out of it 👀

https://huggingface.co/blog/upskill
alvarobartt
posted an update about 1 month ago
💥 hf-mem v0.4.1 now also estimates KV cache memory requirements for any context length and batch size with the --experimental flag!

uvx hf-mem --model-id ... --experimental will automatically pull the required information from the Hugging Face Hub to include the KV cache estimation, when applicable.

💡 Alternatively, you can also set the --max-model-len, --batch-size and --kv-cache-dtype arguments (à la vLLM) manually if preferred.
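
For intuition about what such an estimate involves, here is a minimal back-of-the-envelope sketch. It is not hf-mem's implementation: the function name, its parameters, and the example config are hypothetical, and it assumes standard multi-head or grouped-query attention with a dense cache (no sliding window or compressed KV).

```python
def estimate_kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    max_model_len: int,
    batch_size: int = 1,
    dtype_bytes: int = 2,
) -> int:
    """Rough KV cache size in bytes for a dense attention cache (assumption)."""
    # Factor of 2: each layer stores one tensor for keys and one for values.
    per_token = 2 * num_layers * num_kv_heads * head_dim * dtype_bytes
    # The cache grows linearly with context length and batch size.
    return per_token * max_model_len * batch_size


# Hypothetical config: 32 layers, 8 KV heads, head_dim 128, 2-byte (fp16) cache.
size = estimate_kv_cache_bytes(32, 8, 128, max_model_len=8192, batch_size=1)
print(f"{size / 1024**3:.2f} GiB")  # prints "1.00 GiB"
```

Halving `dtype_bytes` (for example an 8-bit cache) halves the estimate, which is why the cache dtype is worth setting explicitly alongside context length and batch size.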
pcuenq
posted an update 2 months ago
👉 What happened in AI in 2025? 👈

We prepared the 2025 version of the HF AI Timeline Grid, highlighting open vs API-based model releases, and allowing you to browse and filter by access, modality, and release type!

Play with it here:
2025-ai-timeline/2025-ai-timeline

Here's my personal quarterly TL;DR:

1️⃣ Q1: Learning to Reason
DeepSeek not only releases a top-notch reasoning model, but also shows how to train one and compete with closed frontier models. OpenAI debuts Deep Research.

Significant milestones: DeepSeek R1 & R1-Zero, Qwen 2.5 VL, OpenAI Deep Research, Gemini 2.5 Pro (experimental)

2️⃣ Q2: Multimodality and Coding
More LLMs embrace multimodality by default, and there's a surge in coding agents. Strong vision, audio, and generative models emerge.

Significant milestones: Llama 4, Qwen 3, Imagen 4, OpenAI Codex, Google Jules, Claude 4

3️⃣ Q3: "Gold" rush, OpenAI opens up, the community goes bananas
Flagship models get gold in math olympiads and hard benchmarks. OpenAI releases strong open-source models, and Google releases the much-anticipated nano-banana for image generation and editing. Agentic workflows become commonplace.

Significant milestones: Gemini and OpenAI IMO Gold, gpt-oss, Gemini 2.5 Flash Image, Grok 4, Claude Sonnet 4.5

4️⃣ Q4: Mistral returns, leaderboard hill-climbing
Mistral is back with updated model families. All labs release impressive models to wrap up the year!

Significant milestones: Claude Opus 4.5, DeepSeek Math V2, FLUX 2, GPT 5.1, Kimi K2 Thinking, Nano Banana Pro, GLM 4.7, Gemini 3, Mistral 3, MiniMax M2.1 🤯

Credits
🙏 NHLOCAL for the source data https://github.com/NHLOCAL/AiTimeline

🫡 @reach-vb for the original idea, design and recipe

🙌 @ariG23498 and yours truly for compiling and verifying the 2025 edition

🥳 Here's to 2026, wishing it becomes the best year ever for open releases and on-device-first use-cases! 🥂
victor
posted an update 3 months ago
Nvidia is on a roll lately. Nemotron 3 Nano is my new fav local model, but here's the real flex: they published the entire evaluation setup. Configs, prompts, logs, all of it. This is how you do open models 🔥

https://huggingface.co/blog/nvidia/nemotron-3-nano-evaluation-recipe

pagezyhf
posted an update 4 months ago
🚀 Big news for AI builders!

We're thrilled to announce that the Qwen3-VL family of vision-language models is now available on Azure AI Foundry, thanks to our collaboration with Microsoft.

We bring open-source innovation to enterprise-grade AI infrastructure, making it easier than ever for enterprises to deploy and scale the latest and greatest models from Hugging Face securely within Azure.

🔍 Highlights:

- Deploy Qwen3-VL instantly via managed endpoints
- Built-in governance, telemetry, and lifecycle management
- True multimodal reasoning: vision, language, and code understanding
- State-of-the-art performance, outperforming closed-source models like Gemini 2.5 Pro and GPT-5
- Available in both *Instruct* and *Thinking* modes, across 24 model sizes

👉 Get started today: search for Qwen3-VL in the Hugging Face Collection on Azure AI Foundry.
multimodalart
posted an update 5 months ago
Want to iterate on a Hugging Face Space with an LLM?

Now you can easily convert any entire HF repo (Model, Dataset or Space) to a text file and feed it to a language model!

multimodalart/repo2txt
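
To illustrate the general idea of flattening a repo into one prompt-ready text file, here is a minimal sketch. It is not the code behind the Space: `repo_to_text`, its `max_bytes` cutoff, and the `=====` separator format are hypothetical choices, and it assumes the repo has already been downloaded to a local folder (for example with `huggingface_hub.snapshot_download`).

```python
from pathlib import Path


def repo_to_text(repo_dir: str, max_bytes: int = 100_000) -> str:
    """Concatenate the text files under repo_dir into a single string."""
    root = Path(repo_dir)
    parts = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        data = path.read_bytes()
        # Skip large files and files that look binary (contain a NUL byte).
        if len(data) > max_bytes or b"\0" in data:
            continue
        text = data.decode("utf-8", errors="replace")
        # Prefix each file with its repo-relative path so the model can cite it.
        parts.append(f"===== {path.relative_to(root).as_posix()} =====\n{text}")
    return "\n\n".join(parts)
```

Calling `repo_to_text("path/to/local/repo")` returns one string that can be pasted into a prompt or written to a `.txt` file.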
pagezyhf
posted an update 6 months ago
What's your biggest headache deploying Hugging Face models to the cloud, and how can we fix it for you?
pagezyhf
posted an update 6 months ago
🤝 Collaborating with AMD to ensure Hugging Face Transformers runs smoothly on AMD GPUs!

We run daily CI on AMD MI325 to track the health of the most important model architectures and we've just made our internal dashboard public.

By making this easily accessible, we hope to spark community contributions and improve support for everyone!
jeffboudier
posted an update 7 months ago
Quick 30s demo of the new Hub > Azure AI integration to deploy HF models in your own Azure account. Now with Py and CLI!

GG @alvarobartt @kramp @pagezyhf
pagezyhf
posted an update 7 months ago
We've improved the Deploy button on Hugging Face model pages for Microsoft Azure

1/ no more long waits before seeing model support status

2/ ready-to-use CLI and Python snippets

3/ redirection to Azure AI Foundry rather than Azure ML

✋ If you see any bugs or have feedback, open an issue on our repo:
https://github.com/huggingface/Microsoft-Azure
pagezyhf
posted an update 7 months ago
Deploy GPT OSS models with Hugging Face on Azure AI!

We're thrilled to bring OpenAI GPT OSS models to the Azure AI Model Catalog so Azure users can try them securely on the day of their release.

In our official launch blogpost, there's a section on how to deploy the model to your Azure AI Hub. Get started today!

https://huggingface.co/blog/welcome-openai-gpt-oss#azure
pagezyhf
posted an update 7 months ago
We now have the newest OpenAI models available on the Dell Enterprise Hub!

We built the Dell Enterprise Hub to provide access to the latest and greatest models from the Hugging Face community to our on-prem customers. We're happy to give secure access to this amazing contribution from OpenAI on the day of its launch!

https://dell.huggingface.co/
pagezyhf
posted an update 8 months ago
🟪 Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 is now available in Microsoft Azure for one-click deployment! 🚀

Check out their blogpost: https://qwenlm.github.io/blog/qwen3/

You can now find it in the Hugging Face Collection in Azure ML or Azure AI Foundry, along with 10k other Hugging Face models 🤗🤗
Qwen/Qwen3-235B-A22B-Instruct-2507-FP8

Bear with us for the non-quantized version.
pagezyhf
posted an update 8 months ago
🎉 New in Azure Model Catalog: NVIDIA Parakeet TDT 0.6B V2

We're excited to welcome Parakeet TDT 0.6B V2, a state-of-the-art English speech-to-text model, to the Azure Foundry Model Catalog.

What is it?

A powerful ASR model built on the FastConformer-TDT architecture, offering:
🕒 Word-level timestamps
✍️ Automatic punctuation & capitalization
🔊 Strong performance across noisy and real-world audio

It runs with NeMo, NVIDIA's optimized inference engine.

Want to give it a try? 🎧 You can test it with your own audio (up to 3 hours) on Hugging Face Spaces before deploying. If it fits your needs, deploy easily from the Hugging Face Hub or Azure ML Studio with secure, scalable infrastructure!

📘 Learn more by following this guide written by @alvarobartt

https://huggingface.co/docs/microsoft-azure/azure-ai/examples/deploy-nvidia-parakeet-asr
pagezyhf
posted an update 8 months ago
If you want to dive into how the HF team worked with @seungrokj at @AMD to optimize kernels on MI300, you should give our latest blog post a read!

It's great educational material for anyone curious about the world of low-level ML optimization.

https://huggingface.co/blog/mi300kernels