Article • Accelerating Qwen3-8B Agent on Intel® Core™ Ultra with Depth-Pruned Draft Models • 20 days ago • 17
Article • Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs • Apr 29 • 40
Running • 3.32k • The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 418
Article • Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference • Jan 16 • 75