LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws Paper • 2605.23901 • Published 6 days ago • 10
BitCPM-CANN Collection Full-pipeline ternary quantized model trained on CANN. • 12 items • Updated 3 days ago • 24
Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs Paper • 2605.20315 • Published 9 days ago • 28
Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos Paper • 2605.18233 • Published 10 days ago • 90
Post-Trained MoE Can Skip Half Experts via Self-Distillation Paper • 2605.18643 • Published 10 days ago • 30
Large Language Models Explore by Latent Distilling Paper • 2604.24927 • Published about 1 month ago • 74
StateSMix: Online Lossless Compression via Mamba State Space Models and Sparse N-gram Context Mixing Paper • 2605.02904 • Published Apr 5 • 8