Blog, Articles, and discussions

NEW You can also read the Chinese version of this blog post

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

By January 23, 2025 • 90

Welcome fastai to the Hugging Face Hub

By May 6, 2022 • 2

Director of Machine Learning Insights [Series]

By April 27, 2022 • 1

Introducing Hugging Face for Education

By April 25, 2022 • 3

CO2 Emissions and the 🤗 Hub: Leading the Charge

By April 22, 2022 • 5

Don't repeat yourself - 🤗 Transformers Design Philosophy

By April 5, 2022 • 15

Announcing the 🤗 AI Research Residency Program

By March 22, 2022 • 1

Gradio joins Hugging Face!

By December 21, 2021 • 4

Course Launch Community Event

By October 26, 2021 • 1

Train a Sentence Embedding Model with 1B Training Pairs

By October 25, 2021 guest • 1

Fine tuning CLIP with Remote Sensing (Satellite) images and captions

By October 13, 2021 guest • 5

Summer at Hugging Face ☀️

By September 24, 2021

Understanding BigBird's Block Sparse Attention

By March 31, 2021 guest

Community Articles

KV Caching Explained: Optimizing Transformer Inference Efficiency

about 4 hours ago • 5

Harnessing the PDF RAG Search Tool in KaibanJS: Empowering AI Agents for Advanced Document Analysis

about 8 hours ago

🅰️ℹ️ 1️⃣0️⃣1️⃣ The Keys to Prompt Optimization

about 9 hours ago • 3

How biased is Whisper ? Evaluating Whisper Models for Robustness to Diverse English Accents

about 12 hours ago • 6

Selene 1 Mini: the best small language model-as-a-judge

about 13 hours ago • 10

20+ Free and Paid AI Digital Marketing Tools to Automate Repetitive Tasks

about 20 hours ago • 2

🚀 Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker!

about 24 hours ago • 13

Honesty, Open Source, and the Future of AI in Art: An Open Question

1 day ago • 3

Left-Wing leaning of LLMs

1 day ago • 2

SILMA Kashif: The Arabic RAG Model

1 day ago • 1

Exploring the Website RAG Search Tool in KaibanJS: Empowering AI Agents for Semantic Web Analysis

1 day ago

Fine-Tune Meta Llama 3.2-Vision-Instruct Multimodal LLM on Intel Accelerators

1 day ago • 8

Provence: efficient and robust context pruning for retrieval-augmented generation

1 day ago • 3

SILMA Kashif v1.0: A Specialized Model for RAG Tasks

1 day ago • 1

Is Attention Interpretable in Transformer-Based Large Language Models? Let’s Unpack the Hype

2 days ago • 3

DeepSeek R1: A Breakthrough in Open-Source AI Technology

2 days ago

Janus Pro: DeepSeek's Revolutionary Multimodal AI Model

2 days ago • 27

🌁#85: Curiosity, Open Source, and Timing: The Formula Behind DeepSeek’s Phenomenal Success

2 days ago • 6

Reverse-engineering Custom-GPT prompts

2 days ago • 5

Hunyuan video LoRA training study (Single image/style training)

2 days ago • 1
