canada-quant/DeepSeek-V4-Flash-W4A16-FP8
Canada Quant Labs
Tags: Text Generation, Safetensors, English, Chinese, vllm, deepseek_v4, deepseek, compressed-tensors, w4a16, gptq, fp8, mixture-of-experts
License: mit
Branch: main, 152 GB
2 contributors, History: 26 commits
Latest commit: 087ae8a (verified) by pastapaul, 3 days ago
Document 2026-05-25 findings: same FP8 compressor shipping issue as MTP sibling + new architecture-drift KeyError on current vLLM
| File | Size | Badges | Last commit | Updated |
| --- | --- | --- | --- | --- |
| .gitattributes | 1.52 kB | Safe | initial commit | 23 days ago |
| README.md | 20.4 kB | Safe | Document 2026-05-25 findings: same FP8 compressor shipping issue as MTP sibling + new architecture-drift KeyError on current vLLM | 3 days ago |
| config.json | 12.2 kB | | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 23 days ago |
| generation_config.json | 174 Bytes | Safe | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 23 days ago |
| model-00001-of-00004.safetensors | 50 GB | Safe, xet | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 23 days ago |
| model-00002-of-00004.safetensors | 50 GB | Safe, xet | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 23 days ago |
| model-00003-of-00004.safetensors | 50 GB | Safe, xet | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 23 days ago |
| model-00004-of-00004.safetensors | 2.48 GB | Safe, xet | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 23 days ago |
| model.safetensors.index.json | 8.51 MB | | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 23 days ago |
| recipe.yaml | 1.97 kB | Safe | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 23 days ago |
| tokenizer.json | 10.1 MB | Safe | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 23 days ago |
| tokenizer_config.json | 397 Bytes | Safe | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 23 days ago |
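
The checkpoint is split across four safetensors shards with a separate model.safetensors.index.json. As a rough sketch, and assuming this index follows the usual Hugging Face sharded-checkpoint layout (a top-level "weight_map" object mapping each tensor name to the shard file that holds it), the snippet below counts how many tensors each shard contains. The helper names and the tensor names in the demo are hypothetical and are not taken from this repository.

```python
import json
from collections import Counter
from pathlib import Path


def load_index(path: str) -> dict:
    """Read a sharded-checkpoint index file (assumed to be plain JSON)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def tensors_per_shard(index: dict) -> Counter:
    """Count how many tensors the index assigns to each shard file.

    Assumes the conventional layout: index["weight_map"] maps
    tensor name -> shard filename.
    """
    return Counter(index["weight_map"].values())


if __name__ == "__main__":
    # Tiny synthetic index; the tensor names are made up for illustration.
    demo_index = {
        "metadata": {},
        "weight_map": {
            "layer0.weight": "model-00001-of-00004.safetensors",
            "layer1.weight": "model-00001-of-00004.safetensors",
            "head.weight": "model-00004-of-00004.safetensors",
        },
    }
    for shard, count in sorted(tensors_per_shard(demo_index).items()):
        print(f"{shard}: {count} tensors")
```

With a local copy of the repository, `tensors_per_shard(load_index("model.safetensors.index.json"))` would give the same per-shard breakdown for the real index, provided the layout assumption holds.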