LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs • Paper • 2501.06186 • Published 13 days ago • 59
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers • Paper • 2501.02393 • Published 19 days ago • 8
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM • Paper • 2501.01904 • Published 20 days ago • 31
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing • Paper • 2412.14711 • Published Dec 19, 2024 • 16