SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
Paper: arXiv 2412.08347
A collection of models that use SmolLM2 as the pretrained base, post-trained with AllenAI's Tulu 3 pipeline.