---
license: cc-by-4.0
---

# KoQuality-Polyglot-5.8b

KoQuality-Polyglot-5.8b is an auto-regressive language model obtained by instruction-tuning the Polyglot-5.8b model on the KoQuality dataset.
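
A minimal inference sketch with the Hugging Face `transformers` library. The repository id and generation settings below are assumptions for illustration, not taken from this card:

```python
# Minimal inference sketch (the repo id is a placeholder; replace with the actual path).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DILAB-HYU/KoQuality-Polyglot-5.8b"  # assumption: adjust to the real repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: fp16 keeps the 5.8B model on a single GPU
    device_map="auto",
)

prompt = "한국의 수도는 어디인가요?"  # "What is the capital of Korea?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```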

## Average accuracy score on the KoBEST datasets

Our best model is trained on the KoQuality dataset, which is curated with our proposed method (len_group=5, k=100, method=ppl_sampling). We use the KoBEST benchmark datasets (KoBEST_boolq, KoBEST_copa, KoBEST_hellaswag, KoBEST_sentineg, KoBEST_wic) to compare the accuracy of our model against other models. Our model achieves the highest average accuracy score across the KoBEST datasets.

| Model | 0-shot | 1-shot | 2-shot | 5-shot | 10-shot |
|---|---|---|---|---|---|
| koquality-polyglot-5.8b | 0.5472 | 0.5979 | 0.6260 | 0.6486 | 0.6535 |
| polyglot-ko-5.8b | 0.5587 | 0.5977 | 0.6138 | 0.6431 | 0.6457 |
| koalpaca-polyglot-5.8b | 0.5085 | 0.5561 | 0.5768 | 0.6097 | 0.6059 |
| kullm-polyglot-5.8b | 0.5409 | 0.6072 | 0.5945 | 0.6345 | 0.6530 |
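
For reference, a hedged sketch of how scores like these could be reproduced with EleutherAI's lm-evaluation-harness, which ships the KoBEST tasks. The repo id, harness version, and result-key format are assumptions:

```python
# Sketch: KoBEST few-shot accuracy via EleutherAI's lm-evaluation-harness.
# Assumes a recent harness version exposing lm_eval.simple_evaluate and the
# kobest_* tasks; the model repo id is a placeholder.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=DILAB-HYU/KoQuality-Polyglot-5.8b,dtype=float16",
    tasks=["kobest_boolq", "kobest_copa", "kobest_hellaswag",
           "kobest_sentineg", "kobest_wic"],
    num_fewshot=5,  # the table reports 0/1/2/5/10-shot settings
)

# Average the per-task accuracies, as in the table above
# (the "acc,none" key assumes the 0.4.x result schema).
accs = [v["acc,none"] for v in results["results"].values()]
print(sum(accs) / len(accs))
```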

## Training hyperparameters

- learning_rate: 5e-5
- train_batch_size: 4
- seed: 42
- distributed_type: multi-GPU (A100 80G)
- num_devices: 4
- gradient_accumulation_steps: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2.0
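
As a rough illustration, the hyperparameters above map onto a Hugging Face `TrainingArguments` configuration as sketched below. This assumes a standard `Trainer`-based fine-tuning setup and is not the exact training script; the output directory and precision flag are assumptions:

```python
# Sketch: the listed hyperparameters expressed as transformers.TrainingArguments.
# Dataset and model wiring are omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="koquality-polyglot-5.8b",  # assumption: placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=4,       # train_batch_size: 4
    gradient_accumulation_steps=16,      # effective batch = 4 devices * 4 * 16 = 256
    num_train_epochs=2.0,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                           # assumption: mixed precision on A100 80G
    # Trainer's default optimizer uses betas=(0.9, 0.999) and epsilon=1e-8,
    # matching the values listed above.
)
```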

## Citation

```bibtex
@inproceedings{2023koquality,
  title     = {KoQuality: Curation of High-quality Instruction Data for Korean Language Models},
  author    = {Na, Yohan and Kim, Dahye and Chae, Dong-Kyu},
  booktitle = {Proceedings of the 35th Annual Conference on Human and Cognitive Language Technology (HCLT 2023)},
  pages     = {},
  year      = {2023},
}
```