Models: 3,907

Active filters: quantized

DevParker/VibeVoice7b-low-vram

Text-to-Speech • Updated 4 days ago • 15

NVFP4/Qwen3-30B-A3B-Instruct-2507-FP4

Text Generation • 16B • Updated Aug 1 • 2.06k • 6

nvidia/gpt-oss-120b-Eagle3

Text Generation • 0.2B • Updated 2 days ago • 768 • 23

SandLogicTechnologies/MedGemma-4B-IT-GGUF

4B • Updated Jul 29 • 2.93k • 4

nvidia/DeepSeek-R1-0528-FP4-v2

Text Generation • 394B • Updated 3 days ago • 13.3k • 3

NVFP4/Qwen3-Coder-30B-A3B-Instruct-FP4

Text Generation • 16B • Updated Aug 5 • 493 • 2

asmud/ds4sd-docling-models-onnx

Image-to-Text • Updated 3 days ago • 2

argmaxinc/whisperkit-coreml

Automatic Speech Recognition • Updated May 18 • 390k • 137

AetherArchitectural/GGUF-Quantization-Script

Text Generation • Updated 5 days ago • 69

Lewdiculous/DaturaCookie_7B-GGUF-IQ-Imatrix

7B • Updated Mar 23, 2024 • 141 • 5

MaziyarPanahi/Phi-3.5-mini-instruct-GGUF

Text Generation • 4B • Updated Aug 20, 2024 • 213k • 23

MaziyarPanahi/Llama-3.2-1B-Instruct-GGUF

Text Generation • 1B • Updated Sep 25, 2024 • 177k • 16

RedHatAI/Llama-3.3-70B-Instruct-FP8-dynamic

Text Generation • 71B • Updated 10 days ago • 40.8k • 10

MaziyarPanahi/Qwen2.5-Coder-0.5B-QwQ-draft-GGUF

Text Generation • 0.5B • Updated Jan 7 • 88 • 4

MaziyarPanahi/Phi-4-mini-instruct-GGUF

Text Generation • 4B • Updated Mar 1 • 177k • 11

MaziyarPanahi/gemma-3-1b-it-GGUF

Text Generation • 1.0B • Updated Mar 12 • 177k • 8

MaziyarPanahi/gemma-3-4b-it-GGUF

Text Generation • 4B • Updated Mar 12 • 175k • 10

MaziyarPanahi/DeepSeek-V3-0324-GGUF

Text Generation • 671B • Updated Mar 25 • 168k • 20

ABDALLALSWAITI/FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8

Text-to-Image • Updated Apr 19 • 6.27k • 44

MaziyarPanahi/Qwen3-8B-GGUF

Text Generation • 8B • Updated Apr 28 • 170k • 5

MaziyarPanahi/Qwen3-4B-GGUF

Text Generation • 4B • Updated Apr 28 • 176k • 5

jedisct1/MiMo-7B-RL-GGUF

8B • Updated Apr 30 • 371 • 24

boltuix/NeuroBERT

Text Classification • 0.0B • Updated Jun 30 • 10 • 11

nvidia/DeepSeek-R1-0528-FP4

Text Generation • Updated 14 days ago • 63.2k • 36

botirk/tiny-prompt-task-complexity-classifier

Text Classification • Updated Jun 12 • 3 • 2

NVFP4/DeepSeek-R1-0528-Qwen3-8B-FP4

5B • Updated Jul 23 • 91 • 1

mzbac/flux1.kontext.8bit.mlx

Image-to-Image • Updated Jul 6 • 2

magicunicorn/kokoro-npu-quantized

Text-to-Speech • Updated Jul 10 • 1

sugiv/cardvaultplus-500m-gguf

Image-to-Text • 0.4B • Updated Jul 22 • 137 • 1

NVFP4/Qwen3-30B-A3B-Thinking-2507-FP4

Text Generation • 16B • Updated Aug 1 • 474 • 1