ProX General Models
Collection
base models trained on ProX curated data.
•
16 items
•
Updated
FW-ProX-1.7B is a small language model. It was and trained on the FineWeb-pro for 50B tokens.
ProX models are evaluated over 10 language model benchmarks in zero-shot setting.
ArC-c | ARC-e | CSQA | HellaS | MMLU | OBQA | PiQA | SIQA | WinoG | SciQ | AVG | |
---|---|---|---|---|---|---|---|---|---|---|---|
raw | 28.5 | 52.6 | 33.9 | 53.2 | 29.8 | 32.6 | 72.9 | 40.2 | 53.0 | 77.1 | 47.4 |
ours | 34.4 | 63.9 | 32.6 | 53.0 | 33.1 | 34.4 | 73.1 | 39.3 | 52.7 | 81.5 | 49.8 |
@article{zhou2024programming,
title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
journal={arXiv preprint arXiv:2409.17115},
year={2024}
}