Return of the Encoder: Maximizing Parameter Efficiency for SLMs Paper • 2501.16273 • Published 4 days ago • 5
ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer Paper • 2501.15570 • Published 5 days ago • 18
Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity Paper • 2501.16295 • Published 4 days ago • 6
iFormer: Integrating ConvNet and Transformer for Mobile Application Paper • 2501.15369 • Published 6 days ago • 9
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models Paper • 2501.13629 • Published 8 days ago • 40
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding Paper • 2501.13200 • Published 9 days ago • 62
IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems Paper • 2501.11067 • Published 12 days ago • 12
Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament Paper • 2501.13007 • Published 9 days ago • 18
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published 9 days ago • 76
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 9 days ago • 279
FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces Paper • 2501.12909 • Published 9 days ago • 62
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks Paper • 2501.11733 • Published 11 days ago • 26