nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 Text Generation • 50B • Updated 15 days ago • 17k • 188
nvidia/Llama-3.1-Nemotron-8B-UltraLong-4M-Instruct Text Generation • 8B • Updated Apr 17 • 11.5k • 119
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B Text Generation • 8B • Updated Feb 24 • 1.65M • 709
Running • 3.15k • The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B Text Generation • 2B • Updated Feb 24 • 613k • 1.32k