Models: 126

Active filters: vLLM

QuantTrio/GLM-4.7-AWQ

Text Generation • 358B • Updated Dec 29, 2025 • 18.5k • 25

QuantTrio/GLM-4.7-Flash-AWQ

Text Generation • 31B • Updated 23 days ago • 132k • 5

QuantTrio/Qwen3-235B-A22B-Instruct-2507-GPTQ-Int4-Int8Mix

Text Generation • 248B • Updated Aug 20, 2025 • 451 • 3

QuantTrio/Qwen3-235B-A22B-Thinking-2507-GPTQ-Int4-Int8Mix

Text Generation • 253B • Updated Sep 5, 2025 • 20 • 3

QuantTrio/Qwen3-VL-235B-A22B-Instruct-AWQ

Text Generation • 236B • Updated Oct 8, 2025 • 6.84k • 13

QuantTrio/Qwen3-VL-235B-A22B-Thinking-AWQ

Text Generation • 236B • Updated Oct 8, 2025 • 1.21k • 7

QuantTrio/Qwen3-VL-30B-A3B-Instruct-AWQ

Text Generation • 31B • Updated Oct 8, 2025 • 475k • 37

QuantTrio/DeepSeek-V3.2-AWQ

Text Generation • 685B • Updated Dec 3, 2025 • 54.5k • 11

QuantTrio/Kimi-K2.5-E304

Image-Text-to-Text • 138B • Updated 11 days ago • 476 • 1

model-scope/glm-4-9b-chat-GPTQ-Int4

Text Generation • 9B • Updated Jul 17, 2024 • 103 • 6

model-scope/glm-4-9b-chat-GPTQ-Int8

Text Generation • 9B • Updated Jul 23, 2024 • 4 • 2

tclf90/qwen2.5-72b-instruct-gptq-int4

Text Generation • 73B • Updated May 12, 2025 • 24 • 2

tclf90/qwen2.5-72b-instruct-gptq-int3

Text Generation • 69B • Updated May 12, 2025 • 28

prithivMLmods/Nu2-Lupi-Qwen-14B

Text Generation • 15B • Updated Mar 27, 2025 • 1 • 2

mradermacher/Nu2-Lupi-Qwen-14B-GGUF

15B • Updated Jul 11, 2025 • 135 • 1

mradermacher/Nu2-Lupi-Qwen-14B-i1-GGUF

15B • Updated Jul 11, 2025 • 5.33k • 1

JunHowie/Qwen3-0.6B-GPTQ-Int4

Text Generation • 0.6B • Updated Sep 3, 2025 • 340 • 1

JunHowie/Qwen3-0.6B-GPTQ-Int8

Text Generation • 0.6B • Updated Sep 3, 2025 • 29

JunHowie/Qwen3-1.7B-GPTQ-Int4

Text Generation • 2B • Updated Sep 3, 2025 • 1.47k • 1

JunHowie/Qwen3-1.7B-GPTQ-Int8

Text Generation • 2B • Updated Sep 3, 2025 • 14

JunHowie/Qwen3-32B-GPTQ-Int4

Text Generation • 33B • Updated Sep 5, 2025 • 9.7k • 4

JunHowie/Qwen3-32B-GPTQ-Int8

Text Generation • 33B • Updated Sep 5, 2025 • 302 • 3

JunHowie/Qwen3-30B-A3B-GPTQ-Int4

Text Generation • 5B • Updated Sep 6, 2025 • 9 • 1

JunHowie/Qwen3-14B-GPTQ-Int8

Text Generation • 15B • Updated Sep 5, 2025 • 116 • 1

JunHowie/Qwen3-14B-GPTQ-Int4

Text Generation • 15B • Updated Sep 5, 2025 • 965 • 4

JunHowie/Qwen3-8B-GPTQ-Int8

Text Generation • 8B • Updated Sep 4, 2025 • 87

JunHowie/Qwen3-8B-GPTQ-Int4

Text Generation • 8B • Updated Sep 4, 2025 • 1.63k • 4

JunHowie/Qwen3-4B-GPTQ-Int4

Text Generation • 4B • Updated Sep 4, 2025 • 804 • 1

JunHowie/Qwen3-4B-GPTQ-Int8

Text Generation • 4B • Updated Sep 4, 2025 • 23

JunHowie/Qwen3-30B-A3B-GPTQ-Int8

Text Generation • 8B • Updated Sep 6, 2025 • 169