	---
	language:
	- ko
	license: apache-2.0
	tags:
	- generated_from_trainer
	- polyglot-ko
	- gpt-neox
	- KoQuality
	datasets:
	- DILAB-HYU/KoQuality
	pipeline_tag: text-generation
	base_model: EleutherAI/polyglot-ko-5.8b
	model-index:
	- name: KoQuality-Polyglot-5.8b
	results: []
	---
	# KoQuality-Polyglot-5.8b

	KoQuality-Polyglot-5.8b is a fine-tuned version of the [EleutherAI/polyglot-ko-5.8b](https://huggingface.co/EleutherAI/polyglot-ko-5.8b) model, trained on the [KoQuality dataset](https://huggingface.co/datasets/DILAB-HYU/KoQuality). Excluding models trained on chain-of-thought (CoT) datasets, KoQuality-Polyglot-5.8b achieves strong performance among models of the same size, even though it was trained on a relatively small dataset.
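
Since the model is tagged for text generation, a minimal loading sketch with the Transformers library might look like the following. The Hub repo ID `DILAB-HYU/KoQuality-Polyglot-5.8b` is an assumption based on the dataset's organization name, the helper `generate` is hypothetical, and this README does not specify a prompt template, so the prompt is passed through unchanged.

```python
# Assumed Hub repo ID; adjust it if the model is hosted under another name.
DEFAULT_REPO_ID = "DILAB-HYU/KoQuality-Polyglot-5.8b"


def generate(prompt, repo_id=DEFAULT_REPO_ID, max_new_tokens=64):
    """Greedy-decode a continuation of `prompt` (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Half precision on GPU to save memory, full precision on CPU.
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    dtype = torch.float16 if use_cuda else torch.float32

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=dtype)
    model.to(device)
    model.eval()

    # GPT-NeoX models do not accept token_type_ids, so skip them.
    inputs = tokenizer(
        prompt, return_tensors="pt", return_token_type_ids=False
    ).to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` downloads the full model weights on first use, so it is best run on a machine with a suitable GPU and enough disk space.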

	## Open Ko-LLM Leaderboard
	<img src="https://cdn-uploads.huggingface.co/production/uploads/6152b4b9ecf3ca6ab820e325/iYzR_mdvkcjnVquho0Y9R.png" width="1000px" title="Open Ko-LLM Leaderboard">

	Our approach uses a high-quality instruction dataset to improve the model's ability to follow instructions while preserving the performance of the pre-trained language model (PLM). Compared to alternative models, we achieve this with minimal training, using only 1% of the dataset, which amounts to 4,006 instructions.

	## Overall average accuracy on the KoBEST datasets

	We use the [KoBEST benchmark](https://huggingface.co/datasets/skt/kobest_v1) datasets (BoolQ, COPA, HellaSwag, SentiNeg, WiC) to compare the accuracy of our best model with that of other models. In the table below, our model has the highest average accuracy in the 1-, 2-, 5-, and 10-shot settings, while the base polyglot-ko-5.8b scores highest in the 0-shot setting.
	<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/t5x4PphoNb-tW3iCzXXHT.png" width= "500px">



	| Model | 0-shot | 1-shot | 2-shot | 5-shot | 10-shot |
	| --- | --- | --- | --- | --- | --- |
	| polyglot-ko-5.8b | 0.4734 | 0.5929 | 0.6120 | 0.6388 | 0.6295 |
	| koalpaca-polyglot-5.8b | 0.4731 | 0.5284 | 0.5721 | 0.6054 | 0.6042 |
	| kullm-polyglot-5.8b | 0.4415 | 0.6030 | 0.5849 | 0.6252 | 0.6451 |
	| koquality-polyglot-5.8b | 0.4530 | 0.6050 | 0.6351 | 0.6420 | 0.6457 |
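
To make the comparison in the table easier to check, the short sketch below finds the best model in each few-shot setting and the gap between KoQuality and the base model. The numbers are copied from the table; the helper names `best_per_shot` and `delta_vs_base` are hypothetical and only serve this illustration.

```python
# Average KoBEST accuracy per model, copied from the table above.
SHOTS = [0, 1, 2, 5, 10]
SCORES = {
    "polyglot-ko-5.8b": [0.4734, 0.5929, 0.6120, 0.6388, 0.6295],
    "koalpaca-polyglot-5.8b": [0.4731, 0.5284, 0.5721, 0.6054, 0.6042],
    "kullm-polyglot-5.8b": [0.4415, 0.6030, 0.5849, 0.6252, 0.6451],
    "koquality-polyglot-5.8b": [0.4530, 0.6050, 0.6351, 0.6420, 0.6457],
}


def best_per_shot(scores, shots):
    """Map each shot count to the name of the highest-scoring model."""
    return {
        k: max(scores, key=lambda name: scores[name][i])
        for i, k in enumerate(shots)
    }


def delta_vs_base(scores, model, base="polyglot-ko-5.8b"):
    """Per-setting accuracy difference between `model` and `base`."""
    return [round(m - b, 4) for m, b in zip(scores[model], scores[base])]


if __name__ == "__main__":
    for k, name in best_per_shot(SCORES, SHOTS).items():
        print(f"{k}-shot best: {name}")
    print(delta_vs_base(SCORES, "koquality-polyglot-5.8b"))
```

The per-setting deltas show where fine-tuning on KoQuality helps relative to the base model and where it does not.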

	## Evaluation results
	### COPA (F1)
	<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/QAie0x99S8-KEKvK0I_uZ.png" width= "500px">

	### BoolQ (F1)
	<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/CtEWEQ5BBS05V9cDWA7kp.png" width= "500px">

	### HellaSwag (F1)
	<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/cHws6qWkDlTfs5GVcQvtN.png" width= "500px">

	### SentiNeg (F1)
	<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/VEG15XXOIbzJyQAusLa4B.png" width= "500px">

	### WiC (F1)
	<img src="https://cdn-uploads.huggingface.co/production/uploads/650fecfd247f564485f8fbcf/hV-uADJiydkVQOyYysej9.png" width= "500px">


	## Training hyperparameters
	- learning_rate: 5e-5
	- train_batch_size: 4
	- seed: 42
	- distributed_type: multi-GPU (A100 80G) + No offloading
	- num_devices: 4
	- gradient_accumulation_steps: 16
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 2.0
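
As a rough illustration of what these settings imply, the sketch below collects them under the matching `transformers.TrainingArguments` keyword names and estimates the effective batch size and number of optimizer steps. It assumes that `train_batch_size` is a per-device value and that all 4006 instructions mentioned earlier are used for training; neither is stated explicitly in this README, and the exact step count depends on how the trainer handles the last partial batch.

```python
import math

# Values copied from the hyperparameter list above. The keys mirror
# transformers.TrainingArguments keyword names.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 4,  # assumption: train_batch_size is per device
    "seed": 42,
    "gradient_accumulation_steps": 16,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 2.0,
}
NUM_DEVICES = 4
NUM_EXAMPLES = 4006  # assumption: every curated instruction is a training example


def effective_batch_size(kwargs, num_devices):
    """Examples consumed per optimizer step across all devices."""
    return (
        kwargs["per_device_train_batch_size"]
        * num_devices
        * kwargs["gradient_accumulation_steps"]
    )


def estimated_steps(num_examples, batch_size, epochs):
    """Approximate (steps per epoch, total optimizer steps)."""
    per_epoch = math.ceil(num_examples / batch_size)
    return per_epoch, int(per_epoch * epochs)


batch = effective_batch_size(training_kwargs, NUM_DEVICES)
per_epoch, total = estimated_steps(
    NUM_EXAMPLES, batch, training_kwargs["num_train_epochs"]
)
print(f"effective batch size: {batch}")
print(f"about {per_epoch} steps per epoch, {total} steps in total")
```

Under these assumptions the whole fine-tuning run takes only a few dozen optimizer steps, which is consistent with the "minimal training" described above.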

	## Framework versions
	- Transformers 4.30.2
	- PyTorch 2.0.1+cu117
	- Datasets 2.11.0
	- DeepSpeed 0.9.5

	## Citation

	```
	@misc{2023koqaulity,
	title = {KoQuality: Curation of High-quality Instruction Data for Korean Language Models},
	author = {Na, Yohan and Kim, Dahye and Chae, Dong-Kyu},
	journal={Proceedings of the 35th Annual Conference on Human and Cognitive Language Technology (HCLT 2023)},
	pages={306-311},
	year = {2023},
	}
	```

	More details can be found here: [github.com/nayohan/KoQuality](https://github.com/nayohan/KoQuality)
	<br>