Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 4 days ago • 118
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published 9 days ago • 62
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published 8 days ago • 54
Sana Collection ⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 15 items • Updated 5 days ago • 52
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 16 days ago • 191
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 224
AMD-OLMo Collection AMD-OLMo is a series of 1-billion-parameter language models based on OLMo, trained by AMD on AMD Instinct™ MI250 GPUs. • 4 items • Updated Oct 31 • 17
Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities Paper • 2410.11190 • Published Oct 15 • 20
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs Paper • 2410.01999 • Published Oct 2 • 10