MUSCLE: A Model Update Strategy for Compatible LLM Evolution • Paper • 2407.09435 • Published Jul 12 • 20
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training • Paper • 2407.09121 • Published Jul 12 • 5
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities • Paper • 2407.14482 • Published Jul 19 • 25
Internal Consistency and Self-Feedback in Large Language Models: A Survey • Paper • 2407.14507 • Published Jul 19 • 46
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference • Paper • 2407.14057 • Published Jul 19 • 44
Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle • Paper • 2407.13833 • Published Jul 18 • 11
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents • Paper • 2407.18901 • Published Jul 26 • 32
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents • Paper • 2407.16741 • Published Jul 23 • 68
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs • Paper • 2408.07055 • Published Aug 13 • 65
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents • Paper • 2408.07060 • Published Aug 13 • 40
LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs • Paper • 2408.13467 • Published Aug 24 • 24
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA • Paper • 2409.02897 • Published Sep 4 • 44
WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild • Paper • 2409.03753 • Published Sep 5 • 18
From MOOC to MAIC: Reshaping Online Teaching and Learning through LLM-driven Agents • Paper • 2409.03512 • Published Sep 5 • 26
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? • Paper • 2409.15277 • Published Sep 23 • 34
Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders • Paper • 2410.06462 • Published Oct 9 • 7