jjhsnail0822 committed
Commit d0d57ef
1 Parent(s): 09cd401
Update README.md
README.md
CHANGED
@@ -20,7 +20,7 @@ Jinhong Jeong, Ungsang Yoon
 
 ## Model Architecture
 
-The vocabulary size was expanded from original 32000 to 40000 to add Korean tokens efficiently. The model has sequence length of 2048. Everything else is the same as the original model.
+The vocabulary size was expanded from original 32000 to 40000 to add Korean tokens efficiently. We used the [EEVE](https://arxiv.org/abs/2402.14714) technique for training. The model has sequence length of 2048. Everything else is the same as the original model.
 
 ## Training Datasets
 
@@ -28,8 +28,15 @@ We used CulturaX, Common Crawl CC-MAIN-2024-10, AI Hub Data, Korean Wikis, Corpo
 
 ## Model Benchmark
 
+This model is ranked #1 in Ko-MMLU on the [Open Ko-LLM Leaderboard](https://huggingface.co/spaces/upstage/open-ko-llm-leaderboard) among pretrained Korean models of size 2B or smaller as of July 5, 2024.
+
 | Task | Value |
 | --- | --- |
+| Ko-ARC | 31.74 |
+| Ko-HellaSwag | 44.44 |
+| Ko-MMLU | 28.06 |
+| Ko-TruthfulQA | 41.63 |
+| Ko-CommonGen V2 | 32.7 |
 | kmmlu_direct | 29.05 |
 | kobest | 59.13 |
 
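
The Model Architecture text in this diff says the vocabulary grew from 32000 to 40000 tokens. As a rough illustration of what that implies for the embedding table, here is a minimal pure-Python sketch that appends 8000 rows to a toy table. The helper name `expand_embeddings`, the toy width `HIDDEN`, and the mean-based initialization of new rows are all assumptions made for this example; the README does not say how the new rows were initialized, and this is not the EEVE training procedure itself.

```python
import random

OLD_VOCAB = 32000   # original vocabulary size (from the README)
NEW_VOCAB = 40000   # expanded vocabulary size (from the README)
HIDDEN = 8          # toy embedding width, hypothetical; real models are far wider


def expand_embeddings(table, new_size):
    """Return a copy of `table` grown to `new_size` rows.

    New rows are set to the column-wise mean of the existing rows.
    This initialization is an assumption for illustration only.
    """
    if new_size < len(table):
        raise ValueError("new_size must not be smaller than the current table")
    width = len(table[0])
    # Column-wise mean over all existing token embeddings.
    mean_row = [sum(row[j] for row in table) / len(table) for j in range(width)]
    extra = [list(mean_row) for _ in range(new_size - len(table))]
    # Existing rows are copied unchanged; new rows are appended at the end.
    return [list(row) for row in table] + extra


rng = random.Random(0)
old_table = [[rng.uniform(-1.0, 1.0) for _ in range(HIDDEN)] for _ in range(OLD_VOCAB)]
new_table = expand_embeddings(old_table, NEW_VOCAB)
added_tokens = len(new_table) - len(old_table)
```

The existing 32000 rows keep their values, so token IDs from the original tokenizer still map to the same vectors; only the appended rows would need to be learned during further training.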