152 GB

Document 2026-05-25 findings: same FP8 compressor shipping issue as MTP sibling + new architecture-drift KeyError on current vLLM

087ae8a verified 3 days ago

File                              Size           Updated      Last commit
.gitattributes                    1.52 kB        23 days ago  initial commit
README.md                         20.4 kB        3 days ago   Document 2026-05-25 findings: same FP8 compressor shipping issue as MTP sibling + new architecture-drift KeyError on current vLLM
config.json                       12.2 kB        23 days ago  Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)
generation_config.json            174 Bytes      23 days ago  Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)
model-00001-of-00004.safetensors  50 GB (xet)    23 days ago  Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)
model-00002-of-00004.safetensors  50 GB (xet)    23 days ago  Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)
model-00003-of-00004.safetensors  50 GB (xet)    23 days ago  Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)
model-00004-of-00004.safetensors  2.48 GB (xet)  23 days ago  Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)
model.safetensors.index.json      8.51 MB        23 days ago  Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)
recipe.yaml                       1.97 kB        23 days ago  Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)
tokenizer.json                    10.1 MB        23 days ago  Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)
tokenizer_config.json             397 Bytes      23 days ago  Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)