LLMs&Agents - a RandomHakkaDude Collection

RandomHakkaDude 's Collections

LLMs&Agents

updated May 30

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published Feb 13 • 149
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

Paper • 2502.18890 • Published Feb 26 • 30
MPO: Boosting LLM Agents with Meta Plan Optimization

Paper • 2503.02682 • Published Mar 4 • 27
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26 • 88
DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research

Paper • 2505.19253 • Published May 25 • 29
Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs

Paper • 2505.19075 • Published May 25 • 21
Text2Grad: Reinforcement Learning from Natural Language Feedback

Paper • 2505.22338 • Published May 28 • 8
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published May 26 • 104
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108
Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms

Paper • 2505.20322 • Published May 23 • 14
VideoGameBench: Can Vision-Language Models complete popular video games?

Paper • 2505.18134 • Published May 23 • 6
Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

Paper • 2505.20286 • Published May 26 • 7
ARM: Adaptive Reasoning Model

Paper • 2505.20258 • Published May 26 • 45
Flex-Judge: Think Once, Judge Anywhere

Paper • 2505.18601 • Published May 24 • 28
Lifelong Safety Alignment for Language Models

Paper • 2505.20259 • Published May 26 • 24
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI

Paper • 2505.19443 • Published May 26 • 15
InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction

Paper • 2505.10887 • Published May 16 • 10