---
library_name: transformers
license: llama3
language:
- ko
- en
pipeline_tag: text-generation
---
# davidkim205/ko-gemma-2-9b-it
davidkim205/ko-gemma-2-9b-it is one of several models being researched to improve the performance of Korean language models. (It will be released soon.)
## Model Details
- Model Developers: davidkim (Changyeon Kim)
- Repository: -
- Base model: google/gemma-2-9b-it
- SFT dataset: qa_ability_1851.jsonl
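The SFT data is listed as a single JSONL file. As a minimal sketch of how such a file could be inspected once it is available, the snippet below uses the `datasets` library; the file's schema is not documented here, so the `question`/`answer` field names mentioned in the comment are only illustrative assumptions.

```python
# Minimal sketch: inspecting a JSONL SFT file with the Hugging Face datasets library.
# NOTE: qa_ability_1851.jsonl is not published with this card and its schema is not
# documented; field names such as "question"/"answer" are hypothetical.
from datasets import load_dataset

dataset = load_dataset("json", data_files="qa_ability_1851.jsonl", split="train")
print(dataset.column_names)  # shows the actual field names in the file
print(dataset[0])            # first record, e.g. something like {"question": ..., "answer": ...}
```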
## Usage
### Chat Template
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "davidkim205/ko-gemma-2-9b-it"

# Load the model with 4-bit quantization to reduce GPU memory usage.
quantization_config = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config)

chat = [
    {"role": "system", "content": "You are an AI that explains questions in detail."},
    {"role": "user", "content": "How should I study deep learning?"},
]

# Build the prompt with the model's chat template and generate a response.
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=1024)
print(tokenizer.decode(outputs[0]))
```
Output:

```
`low_cpu_mem_usage` was None, now set to True since model is quantized.
Loading checkpoint shards: 100%|████████████████████████████████████████| 4/4 [00:04<00:00,  1.04s/it]
/home/david/anaconda3/envs/eval/lib/python3.10/site-packages/bitsandbytes/nn/modules.py:426: UserWarning: Input type into Linear4bit is torch.float16, but bnb_4bit_compute_dtype=torch.float32 (default). This will lead to slow inference or training speed.
  warnings.warn(
<bos>You are an AI that explains questions in detail.<start_of_turn>user
How should I study deep learning?<end_of_turn>
<start_of_turn>model
Studying deep learning can be an exciting and rewarding journey! But it can also feel daunting to know where to start. Here is a step-by-step guide for studying deep learning.

**Step 1: Build the fundamentals**

* **Math**: You need basic knowledge of the linear algebra, calculus, probability, and statistics that underpin deep learning. Taking math courses on online platforms such as Khan Academy and Coursera is recommended.
* **Programming**: Python is the most widely used programming language in deep learning. Learn Python's basic syntax, data structures, functions, and so on. You can learn Python on platforms such as Codecademy and Google's Python Class.
* **Basic machine learning**: Before moving on to deep learning, it is important to learn basic machine learning concepts.
    * Understand algorithms such as classification, regression, and clustering, and practice them with the Scikit-learn library.

**Step 2: Learn deep learning concepts**

* **Online courses**: Take deep learning courses offered on platforms such as Coursera, edX, and Udacity. Andrew Ng's Deep Learning Specialization is a good choice for building a solid grasp of the basics.
* **Books**: Reading books is also a good way to deepen your understanding of deep learning.
    * "Deep Learning" (Ian Goodfellow, Yoshua Bengio, Aaron Courville) is an in-depth book aimed at experts in the field.
    * "Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow" (Aurélien Géron) is suitable for those who want to learn deep learning through hands-on practice.
* **Blogs and articles**: Read blogs and articles to keep up with the latest trends and research in deep learning.

**Step 3: Practice and projects**

* **Datasets**: You can find a variety of datasets to practice on at platforms such as Kaggle and the UCI Machine Learning Repository.
* **Libraries**: Build and train models using deep learning libraries such as TensorFlow, PyTorch, and Keras.
* **Projects**: It is important to carry out projects that apply deep learning to real problems.
    * Projects such as image classification, natural language processing, and predictive modeling will improve your deep learning skills.

**Additional tips**

* **Join a community**: Participate in deep learning communities to interact with others and ask questions.
* **Be consistent**: Deep learning is a complex field, so studying and practicing steadily is important.
<end_of_turn><eos>
```
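The UserWarning in the log above appears because `bnb_4bit_compute_dtype` defaults to `torch.float32`. If your GPU supports half precision, you can pass a compute dtype to `BitsAndBytesConfig` to avoid the slow path; the snippet below is a minimal sketch of that option (using `torch.bfloat16` is an assumption, not a setting prescribed by this card).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "davidkim205/ko-gemma-2-9b-it"

# 4-bit quantization with a half-precision compute dtype to avoid the
# "bnb_4bit_compute_dtype=torch.float32" slow-path warning.
# bfloat16 is an assumption; use torch.float16 on GPUs without bfloat16 support.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=quantization_config)
```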
## Benchmark
### kollm_evaluation
https://github.com/davidkim205/kollm_evaluation
| Tasks | Version | Filter | n-shot | Metric | Value |  | Stderr |
|---|---|---|---|---|---|---|---|
| kobest | N/A | none | 0 | acc | 0.5150 | ± | 0.0073 |
|  |  | none | 0 | f1 | 0.4494 | ± | N/A |
| - kobest_boolq | 1 | none | 0 | acc | 0.6154 | ± | 0.0130 |
|  |  | none | 0 | f1 | 0.5595 | ± | N/A |
| - kobest_copa | 1 | none | 0 | acc | 0.4710 | ± | 0.0158 |
|  |  | none | 0 | f1 | 0.4700 | ± | N/A |
| - kobest_hellaswag | 1 | none | 0 | acc | 0.3880 | ± | 0.0218 |
|  |  | none | 0 | f1 | 0.3832 | ± | N/A |
|  |  | none | 0 | acc_norm | 0.4780 | ± | 0.0224 |
| - kobest_sentineg | 1 | none | 0 | acc | 0.5189 | ± | 0.0251 |
|  |  | none | 0 | f1 | 0.4773 | ± | N/A |
| - kobest_wic | 1 | none | 0 | acc | 0.4873 | ± | 0.0141 |
|  |  | none | 0 | f1 | 0.3276 | ± | N/A |
| ko_truthfulqa | 2 | none | 0 | acc | 0.3390 | ± | 0.0166 |
| ko_mmlu | 1 | none | 0 | acc | 0.1469 | ± | 0.0019 |
|  |  | none | 0 | acc_norm | 0.1469 | ± | 0.0019 |
| ko_hellaswag | 1 | none | 0 | acc | 0.2955 | ± | 0.0046 |
|  |  | none | 0 | acc_norm | 0.3535 | ± | 0.0048 |
| ko_common_gen | 1 | none | 0 | acc | 0.5825 | ± | 0.0126 |
|  |  | none | 0 | acc_norm | 0.5825 | ± | 0.0126 |
| ko_arc_easy | 1 | none | 0 | acc | 0.2329 | ± | 0.0124 |
|  |  | none | 0 | acc_norm | 0.2867 | ± | 0.0132 |
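The scores above come from kollm_evaluation, an lm-evaluation-harness-style harness. The sketch below shows the general shape of such a zero-shot run using the standard `lm_eval` Python API; how kollm_evaluation registers its Korean tasks and what its command-line interface looks like are assumptions here, so consult the repository for the actual usage.

```python
# Rough sketch of a zero-shot evaluation in the style of lm-evaluation-harness.
# Assumption: the Korean tasks (e.g. "kobest") are registered so that the standard
# lm_eval API can resolve them; see the kollm_evaluation repository for real usage.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=davidkim205/ko-gemma-2-9b-it",
    tasks=["kobest"],
    num_fewshot=0,
)
print(results["results"])  # per-task acc / f1 with standard errors
```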
### Evaluation with keval
keval is an evaluation model trained on the prompts and datasets used in Korean language model benchmarks. Among the various approaches that score models with a ChatGPT-style judge, it is intended to compensate for the shortcomings of the existing lm-evaluation-harness.
https://huggingface.co/davidkim205/keval-7b
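A judge model such as keval-7b is typically used by feeding it a benchmark question together with a candidate answer and parsing the score it generates. The sketch below only illustrates that general pattern with plain transformers; the actual judging prompt format expected by keval-7b is not documented here, so the prompt text is a placeholder assumption.

```python
# Illustrative LLM-as-judge pattern; the prompt below is a placeholder, not the
# actual template keval-7b was trained with.
from transformers import AutoModelForCausalLM, AutoTokenizer

judge_id = "davidkim205/keval-7b"
tokenizer = AutoTokenizer.from_pretrained(judge_id)
judge = AutoModelForCausalLM.from_pretrained(judge_id, device_map="auto")

question = "How should I study deep learning?"
answer = "Start with linear algebra and Python, then take an online deep learning course."

# Hypothetical judging prompt: ask for a 0-10 score for the candidate answer.
prompt = f"Question: {question}\nAnswer: {answer}\nScore the answer from 0 to 10:"

inputs = tokenizer(prompt, return_tensors="pt").to(judge.device)
outputs = judge.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```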
| model | ned | exe_time | evalscore | count |
|---|---|---|---|---|
| claude-3-opus-20240229 | nan | nan | 8.79 | 42 |
| gpt-4-turbo-2024-04-09 | nan | nan | 8.71 | 42 |
| Qwen2-72B-Instruct | nan | 29850.5 | 7.85 | 42 |
| WizardLM-2-8x22B | nan | 133831 | 7.57 | 42 |
| ko-gemma-2-9b-it | nan | 30789.5 | 7.52 | 42 |
| HyperClovaX | nan | nan | 7.44 | 42 |
| gemma-2-9b-it | nan | 23531.7 | 7.4 | 42 |
| glm-4-9b-chat | nan | 24825.6 | 7.31 | 42 |
| Ko-Llama-3-8B-Instruct | nan | 10697.5 | 6.81 | 42 |
| Qwen2-7B-Instruct | nan | 11856.3 | 6.02 | 42 |
| Not-WizardLM-2-7B | nan | 12955.7 | 5.26 | 42 |
| gemma-1.1-7b-it | nan | 6950.5 | 4.99 | 42 |
| Mistral-7B-Instruct-v0.3 | nan | 19631.4 | 4.89 | 42 |
| Phi-3-small-128k-instruct | nan | 26747.5 | 3.52 | 42 |