BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9 • 35
view article Article BigCodeArena: Judging code generations end to end with code executions By bigcode • Oct 7 • 17
Dynamic Rewarding with Prompt Optimization Enables Tuning-free Self-Alignment of Language Models Paper • 2411.08733 • Published Nov 13, 2024 • 1
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective Paper • 2506.14965 • Published Jun 17 • 49
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective Paper • 2506.14965 • Published Jun 17 • 49
Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs Paper • 2502.19411 • Published Feb 26 • 2
Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs Paper • 2502.19411 • Published Feb 26 • 2
Running 26 26 Decentralized Arena Leaderboard 🥇 View and compare LLM evaluations across various domains
LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models Paper • 2404.05221 • Published Apr 8, 2024 • 1
LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models Paper • 2404.05221 • Published Apr 8, 2024 • 1