Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 Feb 20 • 503
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 124
Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand Dec 4, 2025 • 68
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 309
Inference Optimized Checkpoints (with Model Optimizer) Collection A collection of generative models quantized and optimized for inference with Model Optimizer. • 61 items • Updated 4 days ago • 140
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 513
Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 Sep 11, 2025 • 186