jjhsnail0822 committed on
Commit
d0d57ef
1 Parent(s): 09cd401

Update README.md

Files changed (1)
  1. README.md +8 -1
README.md CHANGED
@@ -20,7 +20,7 @@ Jinhong Jeong, Ungsang Yoon
 
 ## Model Architecture
 
-The vocabulary size was expanded from original 32000 to 40000 to add Korean tokens efficiently. The model has sequence length of 2048. Everything else is the same as the original model.
+The vocabulary size was expanded from original 32000 to 40000 to add Korean tokens efficiently. We used the [EEVE](https://arxiv.org/abs/2402.14714) technique for training. The model has sequence length of 2048. Everything else is the same as the original model.
 
 ## Training Datasets
 
@@ -28,8 +28,15 @@ We used CulturaX, Common Crawl CC-MAIN-2024-10, AI Hub Data, Korean Wikis, Corpo
 
 ## Model Benchmark
 
+This model is ranked #1 in Ko-MMLU on the [Open Ko-LLM Leaderboard](https://huggingface.co/spaces/upstage/open-ko-llm-leaderboard) among pretrained Korean models of size 2B or smaller as of July 5, 2024.
+
 | Task | Value |
 | --- | --- |
+| Ko-ARC | 31.74 |
+| Ko-HellaSwag | 44.44 |
+| Ko-MMLU | 28.06 |
+| Ko-TruthfulQA | 41.63 |
+| Ko-CommonGen V2 | 32.7 |
 | kmmlu_direct | 29.05 |
 | kobest | 59.13 |
 
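The Model Architecture text in this diff describes growing the vocabulary from 32000 to 40000 tokens, which means the embedding table gains 8000 rows. The sketch below illustrates only that bookkeeping. It is not the authors' code and not the EEVE procedure: the mean-of-existing-rows initialization is an assumed, commonly used heuristic, and `expand_embeddings` and `HIDDEN_DIM` are hypothetical names and values chosen for the demo.

```python
import random

OLD_VOCAB_SIZE = 32000  # from the README
NEW_VOCAB_SIZE = 40000  # from the README
HIDDEN_DIM = 8          # hypothetical toy width, chosen only to keep the demo fast


def expand_embeddings(old_rows, new_vocab_size):
    """Return a copy of old_rows padded to new_vocab_size rows.

    New rows are set to the column-wise mean of the existing rows.
    This initialization is an assumption, not the authors' confirmed method.
    """
    if new_vocab_size < len(old_rows):
        raise ValueError("new vocabulary must not be smaller than the old one")
    dim = len(old_rows[0])
    # Column-wise mean over all existing token embeddings.
    mean_row = [sum(row[j] for row in old_rows) / len(old_rows) for j in range(dim)]
    # One independent copy of the mean row per added token.
    added = [list(mean_row) for _ in range(new_vocab_size - len(old_rows))]
    return [list(row) for row in old_rows] + added


# Stand-in for a pretrained embedding table (random values, not real weights).
rng = random.Random(0)
old_embeddings = [
    [rng.gauss(0.0, 0.02) for _ in range(HIDDEN_DIM)]
    for _ in range(OLD_VOCAB_SIZE)
]

new_embeddings = expand_embeddings(old_embeddings, NEW_VOCAB_SIZE)
print(len(new_embeddings) - len(old_embeddings))  # 8000 added rows
```

With a real checkpoint, the resize would more likely go through the model library itself (for example, Hugging Face Transformers provides `resize_token_embeddings` on its model classes) rather than hand-rolled lists.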