INT8 LLMs for vLLM
Updated Sep 26
Accurate INT8 quantized models by Neural Magic, ready for use with vLLM!
neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w8a8 • Text Generation • Updated Oct 10 • 5.99k • 13
neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 • Text Generation • Updated Oct 23 • 7.76k • 12
neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w8a8 • Text Generation • Updated Oct 10 • 487 • 2
neuralmagic/Phi-3-medium-128k-instruct-quantized.w8a8 • Text Generation • Updated Oct 9 • 477 • 2
neuralmagic/Phi-3-mini-128k-instruct-quantized.w8a8 • Text Generation • Updated Oct 9 • 530
neuralmagic/gemma-2-9b-it-quantized.w8a8 • Text Generation • Updated Oct 9 • 389 • 2
neuralmagic/Meta-Llama-3-70B-Instruct-quantized.w8a16 • Text Generation • Updated Jul 18 • 451 • 3
neuralmagic/Qwen2-72B-Instruct-quantized.w8a16 • Text Generation • Updated Jul 18 • 420 • 1
neuralmagic/Llama-2-7b-chat-quantized.w8a16 • Text Generation • Updated Jul 18 • 420
neuralmagic/Meta-Llama-3-8B-Instruct-quantized.w8a16 • Text Generation • Updated Jul 18 • 31k • 2
neuralmagic/Qwen2-0.5B-Instruct-quantized.w8a16 • Text Generation • Updated Jul 18 • 16
neuralmagic/Qwen2-1.5B-Instruct-quantized.w8a16 • Text Generation • Updated Jul 18 • 8
neuralmagic/Qwen2-7B-Instruct-quantized.w8a16 • Text Generation • Updated Jul 18 • 90
neuralmagic/Mistral-7B-Instruct-v0.3-quantized.w8a16 • Text Generation • Updated Jul 18 • 714
neuralmagic/Phi-3-mini-128k-instruct-quantized.w8a16 • Text Generation • Updated Oct 9 • 16
neuralmagic/Phi-3-medium-128k-instruct-quantized.w8a16 • Text Generation • Updated Oct 9 • 902 • 2
neuralmagic/Meta-Llama-3-8B-Instruct-quantized.w8a8 • Text Generation • Updated Oct 9 • 483 • 2
neuralmagic/Llama-2-7b-chat-quantized.w8a8 • Text Generation • Updated Oct 9 • 556 • 1
neuralmagic/Qwen2-0.5B-Instruct-quantized.w8a8 • Text Generation • Updated Oct 9 • 44
neuralmagic/Qwen2-1.5B-Instruct-quantized.w8a8 • Text Generation • Updated Oct 9 • 1.62k
neuralmagic/Qwen2-7B-Instruct-quantized.w8a8 • Text Generation • Updated Oct 9 • 640
neuralmagic/Qwen2-72B-Instruct-quantized.w8a8 • Text Generation • Updated Oct 9 • 427 • 1
neuralmagic/Meta-Llama-3-70B-Instruct-quantized.w8a8 • Text Generation • Updated Oct 9 • 26
neuralmagic/Mistral-7B-Instruct-v0.3-quantized.w8a8 • Text Generation • Updated Oct 9 • 386
neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a16 • Text Generation • Updated Oct 23 • 5.88k • 9
neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w8a16 • Text Generation • Updated Oct 9 • 1.11k • 3
neuralmagic/Meta-Llama-3.1-8B-quantized.w8a16 • Text Generation • Updated Oct 9 • 383 • 1
neuralmagic/Meta-Llama-3.1-8B-quantized.w8a8 • Text Generation • Updated Oct 23 • 590 • 1
neuralmagic/starcoder2-7b-quantized.w8a16 • Text Generation • Updated Oct 9 • 24
neuralmagic/starcoder2-15b-quantized.w8a16 • Text Generation • Updated Oct 9 • 383
neuralmagic/starcoder2-3b-quantized.w8a16 • Text Generation • Updated Oct 9 • 27
neuralmagic/starcoder2-15b-quantized.w8a8 • Text Generation • Updated Oct 9 • 12
neuralmagic/starcoder2-7b-quantized.w8a8 • Text Generation • Updated Oct 9 • 31
neuralmagic/starcoder2-3b-quantized.w8a8 • Text Generation • Updated Oct 9 • 15
neuralmagic/gemma-2-2b-it-quantized.w8a16 • Text Generation • Updated Oct 9 • 53 • 1
neuralmagic/Phi-3-small-128k-instruct-quantized.w8a16 • Text Generation • Updated Oct 9 • 417
neuralmagic/SmolLM-1.7B-Instruct-quantized.w8a16 • Text Generation • Updated Oct 9 • 20
neuralmagic/gemma-2-2b-quantized.w8a16 • Text Generation • Updated Oct 9 • 48
neuralmagic/gemma-2-9b-it-quantized.w8a16 • Text Generation • Updated Oct 9 • 771 • 1
neuralmagic/gemma-2-2b-it-quantized.w8a8 • Text Generation • Updated Oct 9 • 1.53k
neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w8a16 • Text Generation • Updated Oct 9 • 506 • 2
neuralmagic/SmolLM-360M-Instruct-quantized.w8a8 • Text Generation • Updated Oct 9 • 12
neuralmagic/SmolLM-135M-Instruct-quantized.w8a8 • Text Generation • Updated Oct 9 • 717
neuralmagic/Llama-3.2-3B-Instruct-quantized.w8a8 • Text Generation • Updated Oct 16 • 6.61k • 1
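
The collection is described as ready for use with vLLM, so a rough sketch of loading one of the listed checkpoints may be useful. It assumes vLLM is installed and a suitable GPU is available. It also assumes the `.wXaY` suffix in each repository name means X-bit weights and Y-bit activations, which is a reading of the naming pattern rather than something this page states. The helper names `quant_scheme` and `generate` are hypothetical and exist only for this illustration.

```python
import re


def quant_scheme(model_id: str) -> tuple[int, int]:
    """Parse a trailing '.wXaY' suffix into (weight_bits, activation_bits).

    Assumption: the suffix encodes weight and activation bit widths.
    """
    match = re.search(r"\.w(\d+)a(\d+)$", model_id)
    if match is None:
        raise ValueError(f"no .wXaY suffix in {model_id!r}")
    return int(match.group(1)), int(match.group(2))


def generate(model_id: str, prompts: list[str], max_tokens: int = 64) -> list[str]:
    """Run greedy generation with vLLM's offline LLM class."""
    # Imported lazily so quant_scheme stays usable without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_id)  # downloads the checkpoint from the Hub if needed
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate(prompts, params)
    # Each request output holds one or more completions; keep the first.
    return [out.outputs[0].text for out in outputs]
```

For example, `quant_scheme("neuralmagic/Qwen2-7B-Instruct-quantized.w8a16")` returns `(8, 16)`, and `generate("neuralmagic/Qwen2-0.5B-Instruct-quantized.w8a8", ["Hello"])` would return one completion string, hardware and installed vLLM version permitting.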