dahye1 committed on
Commit
494b09e
1 Parent(s): f9558ff

Update README.md

Files changed (1)
  1. README.md +15 -15
README.md CHANGED
@@ -10,35 +10,35 @@ KoQuality-Polyglot-5.8b is a fine-tuned version of [EleutherAI/polyglot-ko-5.8b]
  ## Overall Average accuracy score of the KoBEST datasets

  We use [KoBEST benchmark](https://huggingface.co/datasets/skt/kobest_v1) datasets(BoolQ, COPA, HellaSwag, SentiNeg, WiC) to compare the performance of our best model and other models accuracy. Our model outperforms other models in the average accuracy score of the KoBEST datasets.
- <img src=https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/q4cCUCzRJa3m2f7oxI_FY.png style="max-width: 500px; width: 300%"/>
+ <img src=https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/t5x4PphoNb-tW3iCzXXHT.png style="max-width: 500px; width: 300%"/>

- | Model | 0-shot | 1-shot | 2-shot | 5-shot | 10-shot
- | --- | --- | --- | --- | --- | --- |
- | polyglot-ko-5.8b | 0.5587 | 0.5977 | 0.6138 | 0.6431 | 0.6457
- | koalpcaca-polyglot-5.8b | 0.5085 | 0.5561 | 0.5768 | 0.6097 | 0.6059
- | kullm-polyglot-5.8b | 0.5409 | 0.6072 | 0.5945 | 0.6345 | 0.6530
- | koquality-polyglot-5.8b | 0.5472 | 0.5979 | 0.6260 | 0.6486 | 0.6535

- ## Evaluation results
- ### COPA (F1)

- <img src=https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/7EKl1OAgKgPBFcSlGzBiW.png style="max-width: 600px; width: 400%"/>

  | Model | 0-shot | 1-shot | 2-shot | 5-shot | 10-shot
  | --- | --- | --- | --- | --- | --- |
- | polyglot-ko-5.8b | 0.5587 | 0.5977 | 0.6138 | 0.6431 | 0.6457
- | koalpcaca-polyglot-5.8b | 0.5085 | 0.5561 | 0.5768 | 0.6097 | 0.6059
- | kullm-polyglot-5.8b | 0.5409 | 0.6072 | 0.5945 | 0.6345 | 0.6530
- | koquality-polyglot-5.8b | 0.5472 | 0.5979 | 0.6260 | 0.6486 | 0.6535
+ | polyglot-ko-5.8b | 0.4734 | 0.5929 | 0.6120 | 0.6388 | 0.6295
+ | koalpcaca-polyglot-5.8b | 0.4731 | 0.5284 | 0.5721 | 0.6054 | 0.6042
+ | kullm-polyglot-5.8b | 0.4415 | 0.6030 | 0.5849 | 0.6252 | 0.6451
+ | koquality-polyglot-5.8b | 0.4530 | 0.6050 | 0.6351 | 0.6420 | 0.6457
+
+ ## Evaluation results
+ ### COPA (F1)
+ <img src=https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/QAie0x99S8-KEKvK0I_uZ.png style="max-width: 500px; width: 200%"/>

+ ### BoolQ (F1)
+ <img src=https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/CtEWEQ5BBS05V9cDWA7kp.png style="max-width: 500px; width: 200%"/>

  ### HellaSwag (F1)
+ <img src=https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/cHws6qWkDlTfs5GVcQvtN.png style="max-width: 500px; width: 200%"/>

- ### BoolQ (F1)

  ### SentiNeg (F1)
+ <img src=https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/VEG15XXOIbzJyQAusLa4B.png style="max-width: 500px; width: 200%"/>
+

  ### WiC (F1)
+ <img src=https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/hV-uADJiydkVQOyYysej9.png style="max-width: 500px; width: 200%"/>



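The README text says the model outperforms the others on the KoBEST average, and the `+` rows of the overall-average table in the diff give the numbers to check that per shot setting. The following is a minimal sketch, not part of the commit: the scores are copied by hand from those `+` rows, and the names `SHOTS`, `SCORES` and `best_per_shot` are hypothetical helpers introduced only for illustration.

```python
# Overall KoBEST average accuracy, copied by hand from the "+" rows of the diff.
# Model names are spelled exactly as they appear in the README table.
SHOTS = ["0-shot", "1-shot", "2-shot", "5-shot", "10-shot"]
SCORES = {
    "polyglot-ko-5.8b": [0.4734, 0.5929, 0.6120, 0.6388, 0.6295],
    "koalpcaca-polyglot-5.8b": [0.4731, 0.5284, 0.5721, 0.6054, 0.6042],
    "kullm-polyglot-5.8b": [0.4415, 0.6030, 0.5849, 0.6252, 0.6451],
    "koquality-polyglot-5.8b": [0.4530, 0.6050, 0.6351, 0.6420, 0.6457],
}


def best_per_shot(scores, shots):
    """Return {shot setting: name of the model with the highest score}."""
    return {
        shot: max(scores, key=lambda name: scores[name][i])
        for i, shot in enumerate(shots)
    }


best = best_per_shot(SCORES, SHOTS)
for shot in SHOTS:
    print(f"{shot}: {best[shot]}")
```

With the updated numbers, `koquality-polyglot-5.8b` has the highest average at 1, 2, 5 and 10 shots, while the base `polyglot-ko-5.8b` is highest at 0-shot.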