---
license: llama3
datasets:
- BAAI/Infinity-Instruct
base_model:
- meta-llama/Meta-Llama-3.1-8B-Instruct
---
We prune Llama-3.1-8B-Instruct down to 1.4B parameters and fine-tune it with the LLM-Neo method, which combines LoRA and knowledge distillation (KD) in a single procedure. The training data consists of 1 million lines sampled from BAAI/Infinity-Instruct.
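The sketch below illustrates the general idea of combining LoRA with KD; it is not our exact training code, and the student checkpoint path, LoRA hyperparameters, and loss weights are placeholders.

```python
# Illustrative LoRA + KD sketch: teacher is the original 8B model, student is a pruned checkpoint.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

teacher = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct", torch_dtype=torch.bfloat16
).eval()
# Placeholder path for the pruned 1.4B student checkpoint.
student = AutoModelForCausalLM.from_pretrained("path/to/pruned-1.4B-student")

# Attach LoRA adapters to the student; only the low-rank matrices are trained.
lora_cfg = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
student = get_peft_model(student, lora_cfg)

def neo_step(batch, alpha=0.9, temperature=1.0):
    """One training step: KL divergence to the teacher's distribution plus cross-entropy on labels."""
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits
    out = student(**batch, labels=batch["input_ids"])
    kd = F.kl_div(
        F.log_softmax(out.logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * kd + (1 - alpha) * out.loss
```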
## Benchmarks
In this section, we report results for Llama3.1-Neo-1B-100w on standard automatic benchmarks. All evaluations use the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) library.
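The numbers below can be reproduced with a command along these lines (illustrative; the model path is a placeholder and the batch size is an assumption):

```python
# Zero-shot evaluation with the lm-evaluation-harness Python API (v0.4+).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=path/to/Llama3.1-Neo-1B-100w,dtype=bfloat16",
    tasks=["arc_challenge", "arc_easy", "ceval-valid", "mmlu", "piqa", "winogrande"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])
```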
### Evaluation results
<table>
<tr>
<td><strong>Category</strong></td>
<td><strong>Benchmark</strong></td>
<td><strong>Version</strong></td>
<td><strong>n-shot</strong></td>
<td><strong>Metric</strong></td>
<td><strong>Value</strong></td>
<td><strong>Stderr</strong></td>
</tr>
<tr>
<td rowspan="2" >ARC
</td>
<td>ARC-Challenge</td>
<td>1</td>
<td>0</td>
<td>acc</td>
<td>0.1920</td>
<td>± 0.0115</td>
</tr>
<tr>
<td>ARC-Easy</td>
<td>1</td>
<td>0</td>
<td>acc</td>
<td>0.3834</td>
<td>± 0.0100</td>
</tr>
<tr>
<td rowspan="3" >CEVAL</td>
<td>CEVAL (valid)</td>
<td>N/A</td>
<td>0</td>
<td>acc</td>
<td>0.2370</td>
<td>± 0.0117</td>
</tr>
<tr>
<td>CEVAL (Accountant)</td>
<td>1</td>
<td>0</td>
<td>acc</td>
<td>0.2449</td>
<td>± 0.0621</td>
</tr>
<tr>
<td>CEVAL (Advanced Mathematics)</td>
<td>1</td>
<td>0</td>
<td>acc</td>
<td>0.3158</td>
<td>± 0.1096</td>
</tr>
<tr>
<td rowspan="2" >MMLU</td>
<td>MMLU</td>
<td>N/A</td>
<td>0</td>
<td>acc</td>
<td>0.2439</td>
<td>± 0.0036</td>
</tr>
<tr>
<td>MMLU (Abstract Algebra)</td>
<td>0</td>
<td>0</td>
<td>acc</td>
<td>0.2500</td>
<td>± 0.0435</td>
</tr>
<tr>
<td rowspan="2" >PIQA</td>
<td>PIQA</td>
<td>1</td>
<td>0</td>
<td>acc</td>
<td>0.5843</td>
<td>± 0.0115</td>
</tr>
<tr>
<td>PIQA (Normalized)</td>
<td>1</td>
<td>0</td>
<td>acc_norm</td>
<td>0.5822</td>
<td>± 0.0115</td>
</tr>
<tr>
<td>Winogrande</td>
<td>Winogrande</td>
<td>1</td>
<td>0</td>
<td>acc</td>
<td>0.5249</td>
<td>± 0.0140</td>
</tr>
</table>