Optimizing Large Language Model Training Using FP4 Quantization • Paper • 2501.17116 • Published 2 days ago • 21
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • Paper • 2501.17161 • Published 2 days ago • 46
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models • Paper • 2501.13629 • Published 7 days ago • 40
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding • Paper • 2501.13200 • Published 8 days ago • 60
UI-TARS: Pioneering Automated GUI Interaction with Native Agents • Paper • 2501.12326 • Published 9 days ago • 47
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • 2501.12948 • Published 8 days ago • 270
FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces • Paper • 2501.12909 • Published 8 days ago • 62
PaSa: An LLM Agent for Comprehensive Academic Paper Search • Paper • 2501.10120 • Published 13 days ago • 40
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models • Paper • 2501.09686 • Published 14 days ago • 36
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking • Paper • 2501.09751 • Published 14 days ago • 47
AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages • Paper • 2501.08284 • Published 16 days ago • 6
MiniMax-01: Scaling Foundation Models with Lightning Attention • Paper • 2501.08313 • Published 16 days ago • 271
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities • Paper • 2501.08983 • Published 15 days ago • 20
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents • Paper • 2501.08828 • Published 15 days ago • 30
Towards Best Practices for Open Datasets for LLM Training • Paper • 2501.08365 • Published 16 days ago • 51