canada-quant/DeepSeek-V4-Flash-W4A16-FP8

Text Generation

compressed-tensors

mixture-of-experts

2026-05-26 dual-spark bench: GSM8K 94-98%, AIME 4/5, concurrency wall 12→125 tok/s

#7 opened about 8 hours ago by

Is there a version compatible with Ampere (SM 8.6)?

#6 opened about 23 hours ago by

One-shot bootstrap for 2 Sparks not working

#4 opened 12 days ago by

DGX Spark error: torch.AcceleratorError: CUDA error: an illegal memory access was encountered ......

#3 opened 12 days ago by

Need vllm-w4a16-dsv4:exp Thanks

#2 opened 20 days ago by

Can I run this model on 2x H20 141GB?

#1 opened 20 days ago by