InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper β’ 2412.09596 β’ Published 1 day ago β’ 64
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper β’ 2412.06559 β’ Published 5 days ago β’ 53
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper β’ 2412.05271 β’ Published 8 days ago β’ 95
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper β’ 2412.04467 β’ Published 9 days ago β’ 96
view article Article Running Your Custom LoRA Fine-Tuned MusicGen Large Locally By theeseus-ai β’ 8 days ago β’ 1
view article Article πΊπ¦ββ¬ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs By wolfram β’ 9 days ago β’ 68