AI Model Name: Llama 3 8B "Built with Meta Llama 3" https://llama.meta.com/llama3/license/

Full walkthrough to reproduce these results here: https://github.com/catid/AQLM/blob/main/catid_readme.md

Baseline evaluation results:

```
hf (pretrained=meta-llama/Meta-Llama-3-8B-Instruct), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 16
|    Tasks    |Version|Filter|n-shot| Metric |Value |   |Stderr|
|-------------|------:|------|-----:|--------|-----:|---|-----:|
|winogrande   |      1|none  |     0|acc     |0.7198|±  |0.0126|
|piqa         |      1|none  |     0|acc     |0.7873|±  |0.0095|
|             |       |none  |     0|acc_norm|0.7867|±  |0.0096|
|hellaswag    |      1|none  |     0|acc     |0.5767|±  |0.0049|
|             |       |none  |     0|acc_norm|0.7585|±  |0.0043|
|arc_easy     |      1|none  |     0|acc     |0.8140|±  |0.0080|
|             |       |none  |     0|acc_norm|0.7971|±  |0.0083|
|arc_challenge|      1|none  |     0|acc     |0.5290|±  |0.0146|
|             |       |none  |     0|acc_norm|0.5674|±  |0.0145|
```

This repo's evaluation results (AQLM with global fine-tuning):

```
hf (pretrained=catid/cat-llama-3-8b-instruct-aqlm), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 16
|    Tasks    |Version|Filter|n-shot| Metric |Value |   |Stderr|
|-------------|------:|------|-----:|--------|-----:|---|-----:|
|winogrande   |      1|none  |     0|acc     |0.7119|±  |0.0127|
|piqa         |      1|none  |     0|acc     |0.7807|±  |0.0097|
|             |       |none  |     0|acc_norm|0.7824|±  |0.0096|
|hellaswag    |      1|none  |     0|acc     |0.5716|±  |0.0049|
|             |       |none  |     0|acc_norm|0.7539|±  |0.0043|
|arc_easy     |      1|none  |     0|acc     |0.8152|±  |0.0080|
|             |       |none  |     0|acc_norm|0.7866|±  |0.0084|
|arc_challenge|      1|none  |     0|acc     |0.5043|±  |0.0146|
|             |       |none  |     0|acc_norm|0.5555|±  |0.0145|
```
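
As a quick sanity check, the per-task accuracy change between the two runs can be computed with a few lines of Python (a minimal sketch; the `acc` values are copied from the tables above, `acc_norm` omitted for brevity):

```python
# Compare baseline vs. AQLM-quantized accuracies (acc metric, from the tables above)
baseline = {
    "winogrande": 0.7198, "piqa": 0.7873, "hellaswag": 0.5767,
    "arc_easy": 0.8140, "arc_challenge": 0.5290,
}
aqlm = {
    "winogrande": 0.7119, "piqa": 0.7807, "hellaswag": 0.5716,
    "arc_easy": 0.8152, "arc_challenge": 0.5043,
}

# Per-task accuracy change introduced by quantization (negative = drop)
deltas = {task: aqlm[task] - baseline[task] for task in baseline}
mean_delta = sum(deltas.values()) / len(deltas)

for task, d in sorted(deltas.items()):
    print(f"{task:14s}{d:+.4f}")
print(f"{'mean':14s}{mean_delta:+.4f}")
```

On these five tasks the quantized model loses roughly 0.9 accuracy points on average, with the largest drop on arc_challenge; most deltas are within the reported stderr.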

To reproduce evaluation results:

```bash
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness

conda create -n lmeval python=3.10 -y && conda activate lmeval
pip install -e .
pip install accelerate "aqlm[gpu,cpu]"

accelerate launch lm_eval --model hf \
    --model_args pretrained=catid/cat-llama-3-8b-instruct-aqlm \
    --tasks winogrande,piqa,hellaswag,arc_easy,arc_challenge \
    --batch_size 16
```

You can run this model as a `transformers` model using https://github.com/oobabooga/text-generation-webui
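
Since AQLM support is integrated into `transformers`, the quantized model can also be loaded directly in Python. A minimal sketch, assuming a CUDA GPU and the `accelerate` and `aqlm[gpu,cpu]` packages installed as above (the prompt is just an illustration):

```python
# Sketch: load the AQLM-quantized checkpoint directly with transformers.
# Assumes a CUDA GPU; the aqlm package supplies the quantized kernels.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "catid/cat-llama-3-8b-instruct-aqlm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto",   # place layers on the available GPU(s)
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```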