---
license: cc-by-4.0
---
# **KoQuality-Polyglot-5.8b**


KoQuality-Polyglot-5.8b is a fine-tuned version of [EleutherAI/polyglot-ko-5.8b](https://huggingface.co/EleutherAI/polyglot-ko-5.8b) on the [KoQuality dataset](https://huggingface.co/datasets/DILAB-HYU/KoQuality), which was curated with our proposed method (len_group=5, k=100, n=0.01, method=ppl_sampling).
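The card does not spell out the selection procedure behind those parameters, but the names suggest bucketing instructions into 5 length groups and sampling an n=0.01 fraction guided by perplexity. The toy sketch below illustrates one plausible reading of that idea; the `ppl_sample` helper and the median-distance criterion are assumptions for illustration, not the authors' code, and the k=100 clustering step is omitted.

```python
from typing import List

def ppl_sample(instructions: List[str],
               ppl: List[float],
               len_group: int = 5,
               n: float = 0.01) -> List[str]:
    # Bucket instructions into `len_group` groups by text length, then keep
    # the fraction `n` of each group whose perplexity is closest to the
    # group's median perplexity (one guess at "representative" samples).
    pairs = sorted(zip(instructions, ppl), key=lambda p: len(p[0]))
    group_size = max(1, len(pairs) // len_group)
    selected = []
    for start in range(0, len(pairs), group_size):
        group = pairs[start:start + group_size]
        ppls = sorted(p for _, p in group)
        median = ppls[len(ppls) // 2]
        # Rank by distance to the median perplexity; keep the top n fraction.
        group.sort(key=lambda p: abs(p[1] - median))
        keep = max(1, int(len(group) * n))
        selected.extend(text for text, _ in group[:keep])
    return selected
```

With len_group=5 and n=0.01, roughly 1% of each length bucket survives, so the curated set stays small while spanning the length distribution.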


## Overall average accuracy on the KoBEST datasets

We use the [KoBEST benchmark](https://huggingface.co/datasets/skt/kobest_v1) datasets (BoolQ, COPA, HellaSwag, SentiNeg, WiC) to compare the accuracy of our best model against other models. Our model achieves the highest average accuracy across the KoBEST datasets.
<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/t5x4PphoNb-tW3iCzXXHT.png" style="max-width: 500px; width: 300%"/>




| Model | 0-shot | 1-shot | 2-shot | 5-shot | 10-shot |
| --- | --- | --- | --- | --- | --- |
| polyglot-ko-5.8b | 0.4734 | 0.5929 | 0.6120 | 0.6388 | 0.6295 |
| koalpaca-polyglot-5.8b | 0.4731 | 0.5284 | 0.5721 | 0.6054 | 0.6042 |
| kullm-polyglot-5.8b | 0.4415 | 0.6030 | 0.5849 | 0.6252 | 0.6451 |
| koquality-polyglot-5.8b | 0.4530 | 0.6050 | 0.6351 | 0.6420 | 0.6457 |
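The per-shot gains over the base model can be read off the table directly; a quick check with the table's values:

```python
# Average KoBEST accuracy per shot count, copied from the table above.
shots = [0, 1, 2, 5, 10]
polyglot_base = [0.4734, 0.5929, 0.6120, 0.6388, 0.6295]
koquality = [0.4530, 0.6050, 0.6351, 0.6420, 0.6457]

# Per-shot delta of koquality over the base model, rounded to 4 decimals.
deltas = [round(k - b, 4) for k, b in zip(koquality, polyglot_base)]
print(dict(zip(shots, deltas)))
```

This shows koquality-polyglot-5.8b trails the base model at 0-shot but leads at every few-shot setting, with the largest gain at 2-shot.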

## Evaluation results
### COPA (F1)
<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/QAie0x99S8-KEKvK0I_uZ.png" style="max-width: 500px; width: 200%"/>

### BoolQ (F1)
<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/CtEWEQ5BBS05V9cDWA7kp.png" style="max-width: 500px; width: 200%"/>

### HellaSwag (F1)
<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/cHws6qWkDlTfs5GVcQvtN.png" style="max-width: 500px; width: 200%"/>


### SentiNeg (F1)
<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/VEG15XXOIbzJyQAusLa4B.png" style="max-width: 500px; width: 200%"/>


### WiC (F1)
<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/hV-uADJiydkVQOyYysej9.png" style="max-width: 500px; width: 200%"/>



## Training hyperparameters
- learning_rate: 5e-5
- train_batch_size: 4
- seed: 42
- distributed_type: multi-GPU (A100 80G)
- num_devices: 4
- gradient_accumulation_steps: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2.0
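With these settings, the effective global batch size follows from the standard formula for multi-GPU training with gradient accumulation (the formula itself is not stated on the card; this is just a quick check of the listed values):

```python
train_batch_size = 4              # per-device micro-batch size
num_devices = 4                   # A100 80G GPUs
gradient_accumulation_steps = 16  # micro-batches per optimizer step

# Samples consumed per optimizer step across all devices.
effective_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(effective_batch_size)  # 256
```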

## Framework versions
- Transformers 4.30.2
- Pytorch 2.0.1+cu117
- Datasets 2.11.0
- deepspeed 0.9.5

## Citation

```
@inproceedings{2023koquality,
  title     = {KoQuality: Curation of High-quality Instruction Data for Korean Language Models},
  author    = {Na, Yohan and Kim, Dahye and Chae, Dong-Kyu},
  booktitle = {Proceedings of the 35th Annual Conference on Human and Cognitive Language Technology (HCLT 2023)},
  year      = {2023},
}
```

