nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 • Text Generation • 124B • Updated about 6 hours ago • 13.1k • 189
ParoQuant • Collection • Pairwise Rotation Quantization for Efficient Reasoning LLM Inference • 16 items • Updated about 17 hours ago • 11