Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 3 items • Updated 4 days ago • 289
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models • Paper • 2501.11873 • Published 10 days ago • 61
cognitivecomputations/Wizard-Vicuna-30B-Uncensored • Text Generation • Updated May 20, 2024 • 1.81k • 151
lmstudio-community/DeepSeek-R1-Distill-Qwen-7B-GGUF • Text Generation • Updated 11 days ago • 153k • 33
MiniMax-01: Scaling Foundation Models with Lightning Attention • Paper • 2501.08313 • Published 16 days ago • 271