AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents Paper • 2410.24024 • Published 28 days ago • 48
LongReward: Improving Long-context Large Language Models with AI Feedback Paper • 2410.21252 • Published Oct 28 • 16
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Paper • 2408.07055 • Published Aug 13 • 65
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA Paper • 2406.17419 • Published Jun 25 • 16
APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets Paper • 2406.18518 • Published Jun 26 • 23
Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? Paper • 2406.13121 • Published Jun 19 • 2
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent Paper • 2404.03648 • Published Apr 4 • 24
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools Paper • 2406.12793 • Published Jun 18 • 31