smol llama
A small decoder-only model with 81M total parameters, kept small in part by tying the input and output embeddings. This is the first version of the model.
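To illustrate what tying the input/output embeddings saves, the sketch below builds two toy Llama configs with `transformers`, one tied and one untied, and compares parameter counts. The hyperparameter values are assumptions for illustration only, not this checkpoint's actual config.

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Illustrative hyperparameters only -- assumed values, not this checkpoint's actual config.
base = dict(
    vocab_size=32000,
    hidden_size=768,
    intermediate_size=2048,
    num_hidden_layers=6,
    num_attention_heads=24,
)

tied = LlamaForCausalLM(LlamaConfig(**base, tie_word_embeddings=True))
untied = LlamaForCausalLM(LlamaConfig(**base, tie_word_embeddings=False))

print(f"tied embeddings:   {tied.num_parameters():,} parameters")
print(f"untied embeddings: {untied.num_parameters():,} parameters")
# The gap is vocab_size * hidden_size: tying reuses the input embedding
# matrix as the LM head instead of allocating a separate output projection.
```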
This checkpoint is the 'raw' pre-trained model and has not been tuned for any specific task. In most cases it should be fine-tuned before use; a minimal loading sketch follows below.
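A minimal load-and-generate sketch with `transformers` is shown here; the repo id is an assumption, so substitute the checkpoint's actual hub name. As a raw pre-trained model, expect plain text continuations rather than instruction following.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id is an assumption -- replace with the actual hub name of this checkpoint.
repo_id = "BEE-spoke-data/smol_llama-81M-tied"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

prompt = "Once upon a time, there was a"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```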
Detailed results can be found here.
| Metric | Value |
|---|---|
| Avg. | 24.52 |
| ARC (25-shot) | 22.18 |
| HellaSwag (10-shot) | 29.33 |
| MMLU (5-shot) | 24.06 |
| TruthfulQA (0-shot) | 43.97 |
| Winogrande (5-shot) | 49.25 |
| GSM8K (5-shot) | 0.23 |
| DROP (3-shot) | 2.64 |