Update README.md
README.md
CHANGED
@@ -38,6 +38,21 @@ Actually scratch all of that, since there was [a new actually multilingual model
 | FIN-bench (score) | Ahma-SlimInstruct-V1-7B | [Alpacazord-Viking-7B](https://huggingface.co/mpasila/Alpacazord-Viking-7B) | [Finnish-Alpaca-Small-7B](https://huggingface.co/mpasila/Finnish-Alpaca-Small-7B) | [Finnish-Alpaca-Tiny-V2-7B](https://huggingface.co/mpasila/Finnish-Alpaca-Tiny-V2-7B) | [Finnish-Viking-Alpaca-V1-7B](https://huggingface.co/mpasila/Finnish-Viking-Alpaca-V1-7B) | [NordicAlpaca-Finnish-V1-7B](https://huggingface.co/mpasila/NordicAlpaca-Finnish-V1-7B) | [llama-7b-finnish-instruct-v0.1](https://huggingface.co/Finnish-NLP/llama-7b-finnish-instruct-v0.1) | [llama-7b-finnish-instruct-v0.2](https://huggingface.co/Finnish-NLP/llama-7b-finnish-instruct-v0.2) | [llama-7b-finnish](https://huggingface.co/Finnish-NLP/llama-7b-finnish) | [Viking-7B (1000B)](https://huggingface.co/LumiOpen/Viking-7B) | [gpt-7b-nordic-prerelease](https://huggingface.co/HPLT/gpt-7b-nordic-prerelease) |
 |-------|------|------|------|------|------|------|------|------|------|------|-------|
+| Analogies | TBA | 0.5000 | 0.5923 | **0.6385** | 0.6308 | 0.5615 | 0.5000 | 0.5385 | 0.2692 | 0.5077 | 0.5846 |
+| Arithmetic | TBA | 0.3678 | 0.2789 | **0.4815** | 0.3375 | 0.3393 | 0.4233 | 0.3299 | 0.0867 | 0.3136 | 0.2085 |
+| Cause and effect | TBA | 0.6013 | 0.6013 | 0.5490 | 0.5752 | 0.6013 | 0.5948 | **0.6078** | 0.5752 | 0.5752 | 0.5882 |
+| Emotions | TBA | 0.2938 | 0.3312 | 0.2250 | 0.2812 | 0.2938 | 0.2313 | **0.4750** | 0.3688 | 0.2313 | 0.2375 |
+| Empirical judgments | TBA | 0.3333 | 0.3333 | 0.2525 | 0.2828 | 0.3333 | 0.3535 | **0.4141** | 0.3434 | 0.3434 | 0.3434 |
+| General knowledge | TBA | 0.3429 | 0.2857 | 0.3429 | 0.4000 | 0.2857 | 0.3857 | **0.4429** | 0.1429 | 0.3143 | 0.2857 |
+| Alignment harmless | TBA | 0.3621 | 0.3793 | 0.3793 | 0.3621 | 0.3448 | **0.3966** | 0.3793 | 0.3793 | 0.3793 | 0.3621 |
+| Alignment helpful | TBA | **0.3559** | **0.3559** | 0.3390 | **0.3559** | 0.3220 | 0.3220 | 0.3220 | 0.3051 | 0.3390 | 0.3390 |
+| Alignment honest | TBA | **0.4068** | 0.3559 | 0.3729 | 0.3729 | 0.3729 | 0.3898 | 0.3898 | **0.4068** | 0.3898 | 0.3729 |
+| Alignment other | TBA | 0.5581 | 0.5349 | 0.5349 | 0.5581 | 0.5581 | **0.5814** | 0.5581 | **0.5814** | 0.5581 | **0.5814** |
+| Intent recognition | TBA | 0.2587 | 0.1546 | 0.2153 | 0.1879 | 0.1777 | 0.2211 | **0.2717** | 0.1850 | 0.1864 | 0.1806 |
+| Misconceptions | TBA | 0.5299 | **0.5448** | 0.5224 | 0.5373 | 0.5373 | 0.5149 | 0.5373 | 0.5373 | **0.5448** | 0.5373 |
+| Paraphrase | TBA | 0.5050 | 0.5300 | 0.4750 | 0.5150 | 0.4750 | **0.5400** | 0.5000 | 0.5000 | 0.4800 | 0.5100 |
+| Sentence ambiguity | TBA | 0.5000 | 0.4333 | 0.4833 | 0.5000 | 0.4333 | 0.4500 | **0.5333** | **0.5333** | 0.4667 | **0.5333** |
+| Similarities abstraction | TBA | **0.7368** | 0.6974 | 0.6974 | **0.7368** | 0.7237 | 0.5789 | 0.5921 | 0.4474 | 0.6579 | 0.6053 |
 | Average | TBA | 0.4123 | 0.3586 | **0.4654** | 0.3943 | 0.3891 | 0.4365 | 0.3993 | 0.2350 | 0.3721 | 0.3169 |

 [Source](https://docs.google.com/spreadsheets/d/1rqJb9dQVihg-Z1_Ras1L_-wuzPg9xNzpdmM2x5HueeY/edit?usp=sharing)