Article MCP for Research: How to Connect AI to Research Tools By dylanebert • 17 days ago • 43
Article Old Maps, New Terrain: Updating Labour Taxonomies for the AI Era By frimelle and 1 other • 15 days ago • 15
Article AWorld Multi-Agent System Hits #1 on GAIA Leaderboard By chengle • 28 days ago • 26
Article Welcome GPT OSS, the new open-source model family from OpenAI! By reach-vb and 11 others • about 1 month ago • 486
Article BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks By terryyz and 8 others • Jun 18, 2024 • 52
Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face By abidlabs and 4 others • Jul 29 • 168
Article Back to The Future: Evaluating AI Agents on Predicting Future Events By vinid and 6 others • Jul 17 • 39
Eurus Collection Advancing LLM Reasoning Generalists with Preference Trees • 11 items • Updated 28 days ago • 25
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents Paper • 2506.11763 • Published Jun 13 • 70
Article Asynchronous Robot Inference: Decoupling Action Prediction and Execution By fracapuano and 7 others • Jul 10 • 41
Article ScreenEnv: Deploy your full stack Desktop Agent By A-Mahla and 1 other • Jul 10 • 64
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders By thomwolf and 1 other • Jul 9 • 665
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 645
Rope to Nope and Back Again: A New Hybrid Attention Strategy Paper • 2501.18795 • Published Jan 30 • 7