Collections including paper arxiv:2504.06261

- Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations
  Paper • 2508.09789 • Published • 5
- MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
  Paper • 2508.13186 • Published • 17
- ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents
  Paper • 2508.04038 • Published • 1
- Prompt Orchestration Markup Language
  Paper • 2508.13948 • Published • 49

- Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
  Paper • 2504.06261 • Published • 111
- InteractVLM: 3D Interaction Reasoning from 2D Foundational Models
  Paper • 2504.05303 • Published • 5
- TTRL: Test-Time Reinforcement Learning
  Paper • 2504.16084 • Published • 120
- Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation
  Paper • 2505.13215 • Published • 29

- Nuclear Norm Regularization for Deep Learning
  Paper • 2405.14544 • Published • 1
- Token embeddings violate the manifold hypothesis
  Paper • 2504.01002 • Published • 1
- Approximate Nullspace Augmented Finetuning for Robust Vision Transformers
  Paper • 2403.10476 • Published • 1
- ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning
  Paper • 2504.00254 • Published • 1

- rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
  Paper • 2501.04519 • Published • 284
- Evolving Deeper LLM Thinking
  Paper • 2501.09891 • Published • 116
- START: Self-taught Reasoner with Tools
  Paper • 2503.04625 • Published • 114
- SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
  Paper • 2503.18892 • Published • 32

- Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
  Paper • 2503.24290 • Published • 63
- I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
  Paper • 2503.18878 • Published • 121
- START: Self-taught Reasoner with Tools
  Paper • 2503.04625 • Published • 114
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  Paper • 2503.14476 • Published • 140

- Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
  Paper • 2504.06261 • Published • 111
- RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference
  Paper • 2505.02922 • Published • 28
- InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
  Paper • 2506.15745 • Published • 13
- Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction
  Paper • 2508.02558 • Published • 10

- MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
  Paper • 2501.02955 • Published • 45
- 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
  Paper • 2501.00958 • Published • 107
- MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
  Paper • 2501.12380 • Published • 86
- VideoWorld: Exploring Knowledge Learning from Unlabeled Videos
  Paper • 2501.09781 • Published • 29