Mistral-7B-ProXMath

ArXiv | Data: OpenWebMath-Pro | Code

Mistral-7B-ProXMath is a math-adapted version of Mistral-7B-v0.1, continually pre-trained for 10B tokens on OpenWebMath-Pro, a version of OpenWebMath refined with ProX.
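A minimal sketch of how the checkpoint might be loaded for text completion with the Hugging Face `transformers` library. The repository name `gair-prox/Mistral-7B-ProXMath` is taken from this card. The prompt template, the `bfloat16` dtype, and the helper names `build_prompt` and `complete` are illustrative assumptions, not something the card prescribes. Because this is a continually pre-trained base model rather than an instruction-tuned one, the sketch phrases the input as text to be continued.

```python
MODEL_ID = "gair-prox/Mistral-7B-ProXMath"  # repository name as shown on this model card


def build_prompt(question: str) -> str:
    """Format a question as plain completion text.

    Hypothetical template chosen for illustration; the card does not specify one.
    """
    return f"Question: {question.strip()}\nAnswer:"


def complete(question: str, max_new_tokens: int = 128) -> str:
    """Load the checkpoint and greedily decode a continuation of the prompt."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # bfloat16 is an assumption made to reduce memory use.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens and return only the generated continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `complete("What is 12 * 7?")` downloads the weights on first use, so it needs enough disk space and memory for a 7B-parameter model.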

Evaluations

ProX models are evaluated on 9 common math reasoning benchmarks. The average column is the unweighted mean of the nine benchmark scores.

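| Model | asdiv | gsm8k | mathqa | mawps | minerva_math | mmlu_stem | sat_math | svamp | tabmwp | average |
|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-7B-v0.1 | 68.5 | 40.6 | 32.3 | 87.0 | 11.4 | 50.0 | 56.2 | 65.4 | 52.9 | 51.6 |
| Mistral-7B-ProXMath | 72.9 | 51.0 | 53.0 | 89.2 | 22.4 | 54.2 | 75.0 | 64.9 | 49.8 | 59.2 |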
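The reported averages can be reproduced from the per-benchmark scores. The sketch below recomputes them as unweighted means and lists the per-benchmark change relative to the base model. All numbers are copied from the table; the names `SCORES`, `average`, and `deltas` are illustrative only.

```python
BENCHMARKS = [
    "asdiv", "gsm8k", "mathqa", "mawps", "minerva_math",
    "mmlu_stem", "sat_math", "svamp", "tabmwp",
]

# Per-benchmark scores copied from the evaluation table, in BENCHMARKS order.
SCORES = {
    "Mistral-7B-v0.1": [68.5, 40.6, 32.3, 87.0, 11.4, 50.0, 56.2, 65.4, 52.9],
    "Mistral-7B-ProXMath": [72.9, 51.0, 53.0, 89.2, 22.4, 54.2, 75.0, 64.9, 49.8],
}


def average(scores: list[float]) -> float:
    """Unweighted mean of the benchmark scores, rounded to one decimal place."""
    return round(sum(scores) / len(scores), 1)


def deltas(base: list[float], new: list[float]) -> dict[str, float]:
    """Per-benchmark change (new minus base), rounded to one decimal place."""
    return {name: round(n - b, 1) for name, b, n in zip(BENCHMARKS, base, new)}


if __name__ == "__main__":
    for model, scores in SCORES.items():
        print(f"{model}: average = {average(scores)}")
    changes = deltas(SCORES["Mistral-7B-v0.1"], SCORES["Mistral-7B-ProXMath"])
    for name, change in changes.items():
        print(f"{name}: {change:+.1f}")
```

The recomputed means match the table (51.6 and 59.2). The largest gains are on mathqa (+20.7) and sat_math (+18.8), while svamp (-0.5) and tabmwp (-3.1) are slightly lower than the base model.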

Citation

@article{zhou2024programming,
  title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
  author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
  journal={arXiv preprint arXiv:2409.17115},
  year={2024}
}