Accelerating Qwen3-8B Agent on Intel® Core™ Ultra with Depth-Pruned Draft Models 24 days ago • 17
There is no such thing as a tokenizer-free lunch By catherinearnett • 28 days ago • 83
Rocket Money x Hugging Face: Scaling Volatile ML Models in Production Sep 19, 2023 • 1
🌎 What kind of environmental impacts are AI companies disclosing? (And can we compare them?) 🌎 By sasha and 1 other • Sep 17 • 12
Introducing the Palmyra-mini family: Powerful, lightweight, and ready to reason! By Writer and 1 other • Sep 11 • 58
"Anemll-style" Root-Mean-Square (RMS) Normalization on the Apple Neural Engine: A Simple Hack By anemll • Sep 16 • 13
mmBERT: a modern multilingual encoder (Collection). mmBERT is trained on 3T tokens from over 1800 languages, showing SoTA scores on benchmarks and exceptional low-resource performance • 16 items • Updated Sep 9 • 45
Sensitivity Aware Mixed Precision Quantization V1 By badaoui and 1 other • Jun 13 • 24