DataDecide Collection A suite of models, data, and evals over 25 corpora, 14 sizes, and 3 seeds to measure how accurately small experiments predict rankings at large scale. • 358 items • Updated Dec 23, 2025 • 22
DataDecide: How to Predict Best Pretraining Data with Small Experiments Paper • 2504.11393 • Published Apr 15, 2025 • 18
Paloma: A Benchmark for Evaluating Language Model Fit Paper • 2312.10523 • Published Dec 16, 2023 • 13
Paloma Collection Dataset and baseline models for Paloma, a benchmark of language model fit to 546 textual domains • 8 items • Updated Dec 23, 2025 • 16