Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks? Paper • 2411.05000 • Published 16 days ago
GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models Paper • 2408.11817 • Published Aug 21
Open LLM Leaderboard 2: Track, rank and evaluate open LLMs and chatbots