The Chinese community is shipping 🚢

DeepSeek V3 (685B MoE) has been quietly released on the hub!
Base: deepseek-ai/DeepSeek-V3-Base
Instruct: deepseek-ai/DeepSeek-V3
Can't wait to see what's next!
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Paper • 2412.13663 • Published 23 days ago
Welcome back, small language model enthusiasts and GPU-poor OSS enjoyers, let's connect. I just created an organization whose main goal is to have fun with smaller models that can be tuned on consumer-range GPUs. Feel free to join and let's have some fun, much love ;3
https://huggingface.co/SmolTuners