Update README.md

README.md
Results show that EuroLLM-1.7B is superior to TinyLlama-v1.1 and similar to Gemma-2B.
#### Arc Challenge

| Model | Average | English | German | Spanish | French | Italian | Portuguese | Chinese | Russian | Dutch | Arabic | Swedish | Hindi | Hungarian | Romanian | Ukrainian | Danish | Catalan |
|--------------------|---------|---------|--------|---------|--------|---------|------------|---------|---------|-------|--------|---------|-------|-----------|----------|-----------|--------|---------|
| EuroLLM-1.7B | 0.3496 | 0.4061 | 0.3464 | 0.3684 | 0.3627 | 0.3738 | 0.3855 | 0.3521 | 0.3208 | 0.3507 | 0.3045 | 0.3605 | 0.2928 | 0.3271 | 0.3488 | 0.3516 | 0.3513 | 0.3396 |
| TinyLlama-v1.1 | 0.2650 | 0.3712 | 0.2524 | 0.2795 | 0.2883 | 0.2652 | 0.2906 | 0.2410 | 0.2669 | 0.2404 | 0.2310 | 0.2687 | 0.2354 | 0.2449 | 0.2476 | 0.2524 | 0.2494 | 0.2796 |
| Gemma-2B | 0.3617 | 0.4846 | 0.3755 | 0.3940 | 0.4080 | 0.3687 | 0.3872 | 0.3726 | 0.3456 | 0.3328 | 0.3122 | 0.3519 | 0.2851 | 0.3039 | 0.3590 | 0.3601 | 0.3565 | 0.3516 |
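The Average column appears to be the unweighted mean of the per-language accuracies; a quick check against the EuroLLM-1.7B row above:

```python
# Per-language Arc Challenge accuracies for EuroLLM-1.7B, copied from the table.
arc_eurollm = [0.4061, 0.3464, 0.3684, 0.3627, 0.3738, 0.3855, 0.3521,
               0.3208, 0.3507, 0.3045, 0.3605, 0.2928, 0.3271, 0.3488,
               0.3516, 0.3513, 0.3396]
# The unweighted mean reproduces the reported Average.
print(round(sum(arc_eurollm) / len(arc_eurollm), 4))  # 0.3496
```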
#### Hellaswag

| Model | Average | English | German | Spanish | French | Italian | Portuguese | Russian | Dutch | Arabic | Swedish | Hindi | Hungarian | Romanian | Ukrainian | Danish | Catalan |
|--------------------|---------|---------|--------|---------|--------|---------|------------|---------|--------|--------|---------|-------|-----------|----------|-----------|--------|---------|
| EuroLLM-1.7B | 0.4760 | 0.6057 | 0.4793 | 0.5337 | 0.5298 | 0.5085 | 0.5224 | 0.4654 | 0.4949 | 0.4104 | 0.4800 | 0.3655 | 0.4097 | 0.4606 | 0.4360 | 0.4702 | 0.4445 |
| TinyLlama-v1.1 | 0.3674 | 0.6248 | 0.3650 | 0.4137 | 0.4010 | 0.3780 | 0.3892 | 0.3494 | 0.3588 | 0.2880 | 0.3561 | 0.2841 | 0.3073 | 0.3267 | 0.3349 | 0.3408 | 0.3613 |
| Gemma-2B | 0.4666 | 0.7165 | 0.4756 | 0.5414 | 0.5180 | 0.4841 | 0.5081 | 0.4664 | 0.4655 | 0.3868 | 0.4383 | 0.3413 | 0.3710 | 0.4316 | 0.4291 | 0.4471 | 0.4448 |
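Both benchmarks are multiple-choice tasks, and such accuracies are typically computed by ranking the answer choices by model likelihood rather than by free-form generation. Below is a minimal sketch of that scoring scheme, assuming the Hugging Face `transformers` checkpoint `utter-project/EuroLLM-1.7B` and a made-up example item; it illustrates the general technique, not the exact harness or prompts behind these tables:

```python
# Sketch of multiple-choice scoring with a causal LM: append each answer
# choice to the context and pick the choice with the highest
# length-normalized log-likelihood. Illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "utter-project/EuroLLM-1.7B"  # assumed checkpoint for this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

def choice_score(context: str, choice: str) -> float:
    """Mean log-probability of the choice tokens, conditioned on the context."""
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(context + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position i of the logits predicts token i + 1, hence the shift below.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    choice_ids = full_ids[0, ctx_len:]  # tokens contributed by the choice
    # NB: real harnesses align tokenization boundaries more carefully.
    token_lp = logprobs[ctx_len - 1:].gather(1, choice_ids.unsqueeze(1))
    return (token_lp.sum() / max(choice_ids.numel(), 1)).item()

# Hypothetical item in the Arc Challenge style (not taken from the dataset).
question = "Question: Which gas do plants take in during photosynthesis?\nAnswer:"
choices = [" carbon dioxide", " oxygen", " nitrogen", " argon"]
print(max(choices, key=lambda c: choice_score(question, c)))
```

Per-language scores are typically obtained on translated versions of these benchmarks, and details such as few-shot count and length normalization shift absolute numbers, so a sketch like this will not reproduce the table exactly.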
## Bias, Risks, and Limitations

EuroLLM-1.7B has not been aligned to human preferences, so the model may generate problematic outputs (e.g., hallucinations, harmful content, or false statements).