
We prune Llama-3.1-8B-Instruct down to 1.4B parameters and fine-tune it with the LLM-Neo method, which combines LoRA and knowledge distillation (KD) in one framework. The training data consists of 1 million lines sampled from BAAI/Infinity-Instruct.
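The card describes LLM-Neo as combining LoRA with knowledge distillation. As an illustration only (not the authors' code), the distillation side can be sketched as a temperature-scaled KL divergence between teacher and student token distributions:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of raw logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    # Forward KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in standard knowledge distillation. This is a generic
    # KD loss sketch; the exact LLM-Neo objective may differ.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# When student and teacher agree, the distillation loss is zero.
print(kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

In practice this term is added to the usual cross-entropy loss while only the LoRA adapter weights receive gradients, which is what lets LoRA and KD share one training loop.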

Benchmarks

In this section, we report results for Llama3.1-Neo-1B-100w on standard automatic benchmarks. All evaluations use the lm-evaluation-harness library.
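The card states the evaluations were run with lm-evaluation-harness. A sketch of a zero-shot invocation reproducing this setup (task selection, dtype, and batch size are assumptions, not taken from the card):

```shell
# Zero-shot evaluation with lm-evaluation-harness (pip install lm-eval).
# Task names and batch size are illustrative, not the authors' exact config.
lm_eval --model hf \
    --model_args pretrained=yang31210999/Llama3.1-Neo-1B-100w,dtype=float32 \
    --tasks arc_challenge,arc_easy,piqa,winogrande,mmlu,ceval-valid \
    --num_fewshot 0 \
    --batch_size 8
```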

Evaluation results

| Category | Benchmark | Version | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| ARC | ARC-Challenge | 1 | 0 | acc | 0.1920 | ± 0.0115 |
| ARC | ARC-Easy | 1 | 0 | acc | 0.3834 | ± 0.0100 |
| CEVAL | CEVAL (valid) | N/A | 0 | acc | 0.2370 | ± 0.0117 |
| CEVAL | CEVAL (Accountant) | 1 | 0 | acc | 0.2449 | ± 0.0621 |
| CEVAL | CEVAL (Advanced Mathematics) | 1 | 0 | acc | 0.3158 | ± 0.1096 |
| MMLU | MMLU | N/A | 0 | acc | 0.2439 | ± 0.0036 |
| MMLU | MMLU (Abstract Algebra) | 0 | 0 | acc | 0.2500 | ± 0.0435 |
| PIQA | PIQA | 1 | 0 | acc | 0.5843 | ± 0.0115 |
| PIQA | PIQA (Normalized) | 1 | 0 | acc_norm | 0.5822 | ± 0.0115 |
| Winogrande | Winogrande | 1 | 0 | acc | 0.5249 | ± 0.0140 |
