CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published 28 days ago • 48
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 13 days ago • 152
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published Oct 2, 2024 • 22
Article Synthetic dataset generation techniques: Self-Instruct By davanstrien • May 15, 2024 • 14
Llama 3.x Models Collection Our highest-performance models, built with Llama 3, 3.1, and 3.2 • 10 items • Updated Oct 31, 2024 • 3
Llamafied Yi Collection Yi base models converted to Llama architecture. • 4 items • Updated Nov 14, 2023 • 9
Open LLM Leaderboard best models ❤️🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard: • 64 items • Updated about 12 hours ago • 517