myeongho-jeong committed
Commit • 6712245
1 Parent(s): 7383911
Update README.md

README.md CHANGED
@@ -30,6 +30,43 @@ If you're passionate about the field of Large Language Models and wish to exchan
This model is a fine-tuned version of [yanolja/EEVE-Korean-10.8B-v1.0](https://huggingface.co/yanolja/EEVE-Korean-10.8B-v1.0), which is a Korean vocabulary-extended version of [upstage/SOLAR-10.7B-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-v1.0). Specifically, we employed Direct Preference Optimization (DPO) based on [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).
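Since DPO is central to how this model was produced, a minimal sketch of its objective may help. The function below is a generic, single-pair illustration of the DPO loss with made-up log-probabilities and an assumed β of 0.1; it is not code from LLaMA-Factory or this repository:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of the chosen/rejected
    response under the policy being trained or the frozen reference model.
    """
    # Implicit reward of each response, measured relative to the reference model
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # -log(sigmoid(margin)): small when the chosen response outranks the rejected one
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Illustrative numbers only: the policy already prefers the chosen response,
# so the loss is below log(2) (the value at a zero margin).
loss = dpo_loss(-12.0, -20.0, -14.0, -18.0)
```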

## Prompt Template
```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: {prompt}
Assistant:
```

## How to Use it
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("yanolja/EEVE-Korean-Instruct-10.8B-v1.0")
tokenizer = AutoTokenizer.from_pretrained("yanolja/EEVE-Korean-Instruct-10.8B-v1.0")

prompt_template = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nHuman: {prompt}\nAssistant:\n"
# "What is the capital of Korea? Choose from the options below.
#  (A) Gyeongseong (B) Busan (C) Pyongyang (D) Seoul (E) Jeju"
text = '한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.\n\n(A) 경성\n(B) 부산\n(C) 평양\n(D) 서울\n(E) 제주'
model_inputs = tokenizer(prompt_template.format(prompt=text), return_tensors='pt')

outputs = model.generate(**model_inputs, max_new_tokens=256)
output_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(output_text)
```
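For multi-turn use, the single-turn template can be extended by repeating the Human/Assistant turns. The helper below is a sketch under that assumption (the card itself only documents the single-turn format); for a single turn it reproduces exactly the `prompt_template` string shown above:

```python
SYSTEM = ("A chat between a curious user and an artificial intelligence assistant. "
          "The assistant gives helpful, detailed, and polite answers to the user's questions.")

def build_prompt(turns):
    """Render (human, assistant) turns in the Human/Assistant format above.

    `turns` is a list of (user_message, assistant_reply) pairs; pass None as
    the final reply so the string ends with "Assistant:" for the model to complete.
    Multi-turn formatting is an assumption, not documented by the model card.
    """
    lines = [SYSTEM]
    for user_msg, assistant_msg in turns:
        lines.append(f"Human: {user_msg}")
        lines.append("Assistant:" if assistant_msg is None
                     else f"Assistant: {assistant_msg}")
    return "\n".join(lines) + "\n"

# "Hello?" / "Hello! How can I help you?" / "What is the population of Seoul?"
prompt = build_prompt([("안녕하세요?", "안녕하세요! 무엇을 도와드릴까요?"),
                       ("서울의 인구는 몇 명인가요?", None)])
```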

### Example Output
```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: 한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.

(A) 경성
(B) 부산
(C) 평양
(D) 서울
(E) 제주
Assistant:
(D) 서울이 한국의 수도입니다. 서울은 나라의 북동부에 위치해 있으며, 정치, 경제, 문화의 중심지입니다. 약 1,000만 명이 넘는 인구를 가진 세계에서 가장 큰 도시 중 하나입니다. 서울은 높은 빌딩, 현대적인 인프라, 활기찬 문화 장면으로 유명합니다. 또한, 많은 역사적 명소와 박물관이 있어 방문객들에게 풍부한 문화 체험을 제공합니다.
```

*(English translation of the answer: "(D) Seoul is the capital of Korea. It is located in the northeastern part of the country and is the center of politics, economy, and culture. With a population of over 10 million, it is one of the largest cities in the world. Seoul is famous for its tall buildings, modern infrastructure, and vibrant cultural scene. It also has many historical sites and museums, offering visitors a rich cultural experience.")*

### Training Data
- Korean-translated version of [Open-Orca/SlimOrca-Dedup](https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup)
- Korean-translated version of [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned)
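The second dataset supplies the preference pairs that DPO consumes. A record in such datasets typically has the (prompt, chosen, rejected) shape sketched below; the field names and the Korean example are illustrative assumptions, not the dataset's exact schema or contents:

```python
# Illustrative preference-pair record for DPO training. Field names follow
# the common prompt/chosen/rejected convention; they are not necessarily the
# exact schema of the datasets listed above.
record = {
    # "What is the tallest mountain in South Korea?"
    "prompt": "대한민국에서 가장 높은 산은 무엇인가요?",
    # Preferred answer: "The tallest mountain in South Korea is Hallasan."
    "chosen": "대한민국에서 가장 높은 산은 한라산입니다.",
    # Dispreferred answer: "I'm not sure."
    "rejected": "잘 모르겠습니다.",
}
```

DPO uses such pairs directly: it raises the policy's likelihood of the chosen response relative to the rejected one, without training a separate reward model.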