MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering Paper • 2410.07095 • Published Oct 9, 2024 • 6
A Function Interpretation Benchmark for Evaluating Interpretability Methods Paper • 2309.03886 • Published Sep 7, 2023 • 1
Multimodal Neurons in Pretrained Text-Only Transformers Paper • 2308.01544 • Published Aug 3, 2023 • 15