mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-OptiQ-4bit Text Generation • 32B • Updated 1 day ago • 48
KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks Paper • 2606.03458 • Published 4 days ago • 51