RevisEval: Improving LLM-as-a-Judge via Response-Adapted References Paper • 2410.05193 • Published 30 days ago • 12
Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning Paper • 2409.12001 • Published Sep 18 • 3
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18 • 36