leonardlin committed
Commit 35add1d
Parent(s): fc5c300
Update README.md
README.md CHANGED
@@ -10,6 +10,14 @@ model-index:
 
 shisa-v2 Base Model ablation
 
+This model uses a LR of 8e-6, which slightly improves performance vs the original 2e-5.
+It also uses NEFTune, although the expected impact may be negligible for this dataset.
+
+(this appears to validate the Llama 3 8B LR ablations for predicting an improved LR hyperparameter)
+
+While the last model matched gpt-3.5-turbo, I think it's fair to say that this model "beats" it.
+
+
 Using a [fork](https://github.com/shisa-ai/shaberi) of [Lightblue's Shaberi benchmark framework](https://github.com/lightblue-tech/japanese_llm_eval):
 
 | Model | Average | ELYZA-tasks-100 | MT-Bench | Rakuda | Tengu-Bench |
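For context on the NEFTune mention above: NEFTune adds uniform noise to the token embeddings during fine-tuning, scaled by alpha / sqrt(seq_len * hidden_dim). The sketch below illustrates that scaling rule in isolation; the `alpha=5.0` default and the `neftune_noise` helper name are assumptions for illustration, not the exact configuration used in this training run.

```python
import torch

def neftune_noise(embeddings: torch.Tensor, alpha: float = 5.0) -> torch.Tensor:
    """Add NEFTune-style uniform noise to token embeddings during training.

    Noise magnitude follows the NEFTune scaling rule:
        scale = alpha / sqrt(seq_len * hidden_dim)
    alpha=5.0 is a commonly cited default, assumed here for illustration.
    """
    _, seq_len, hidden_dim = embeddings.shape
    scale = alpha / (seq_len * hidden_dim) ** 0.5
    # Uniform noise in [-scale, scale], same shape/device/dtype as the input
    noise = torch.empty_like(embeddings).uniform_(-scale, scale)
    return embeddings + noise

# Example: apply noise to a dummy embedding tensor (batch=2, seq=128, dim=4096)
emb = torch.zeros(2, 128, 4096)
noisy = neftune_noise(emb, alpha=5.0)
print(noisy.shape)  # torch.Size([2, 128, 4096])
```

In practice this is applied only in the forward pass during training (trainers such as TRL expose it as a `neftune_noise_alpha` option), and is disabled at inference time.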