AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning Paper • 2509.08755 • Published 12 days ago • 56
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory Paper • 2508.09736 • Published Aug 13 • 54
Hybrid Linear Attention Research Collection All 1.3B & 340M hybrid linear-attention experiments. • 62 items • Updated 11 days ago • 12
view article Article Vibe coding for data science: how to label a dataset with Kimi K2 By dvilasuero • Jul 22 • 21
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities Paper • 2507.13158 • Published Jul 17 • 24
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation Paper • 2507.02608 • Published Jul 3 • 21
FaSTA^*: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing Paper • 2506.20911 • Published Jun 26 • 41
view article Article Gemma 3n fully available in the open-source ecosystem! By ariG23498 and 7 others • Jun 26 • 117
A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA Paper • 2312.03732 • Published Nov 28, 2023 • 10
MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models Paper • 2506.14435 • Published Jun 17 • 7
view article Article Tiny Agents in Python: a MCP-powered agent in ~70 lines of code By celinah and 3 others • May 23 • 163
Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers Paper • 2506.03065 • Published Jun 3 • 27
view article Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL By toslali-ibm and 5 others • Jun 3 • 88
view article Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • Jun 3 • 252
One RL to See Them All: Visual Triple Unified Reinforcement Learning Paper • 2505.18129 • Published May 23 • 59