Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer Paper • 2311.06720 • Published Nov 12, 2023 • 7
Safurai 001: New Qualitative Approach for Code LLM Evaluation Paper • 2309.11385 • Published Sep 20, 2023 • 2
Assessment of Pre-Trained Models Across Languages and Grammars Paper • 2309.11165 • Published Sep 20, 2023 • 1
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2, 2024 • 118