gemma2_colloquial_korean_translator
Model Description
This model, fine-tuned from the gemma-2-9b language model, translates English text into natural, fluent colloquial Korean. It improves translation accuracy and naturalness by reflecting the expressions and vocabulary used in everyday conversation. Training uses PEFT (Parameter-Efficient Fine-Tuning), specifically QLoRA (Quantized Low-Rank Adaptation), for efficiency.
Key Features
- Base Model: google/gemma-2-9b-it (Gemma 2 9B Instruct)
- Task: English colloquial → Korean translation
- Training Technique: QLoRA (Quantized Low-Rank Adaptation)
- Quantization: 4-bit quantization (nf4)
- LoRA Configuration:
  - rank (r): 6
  - alpha: 8
  - dropout: 0.05
  - target modules: `q_proj`, `o_proj`, `k_proj`, `v_proj`, `gate_proj`, `up_proj`, `down_proj`
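The configuration above can be sketched with the `peft` and `transformers` libraries. This is a minimal sketch, not the exact training script; argument names follow the current `BitsAndBytesConfig` and `LoraConfig` APIs:

```python
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization, as listed under Quantization above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
)

# LoRA settings matching the configuration above.
lora_config = LoraConfig(
    r=6,
    lora_alpha=8,
    lora_dropout=0.05,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```

The quantization config is passed to the base model at load time, and the LoRA config is applied on top of it (e.g. via `peft.get_peft_model`), so only the low-rank adapters are trained.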
Training Data
The model was trained on a dataset consisting of English colloquial expressions and their corresponding Korean translations. The data was provided in JSON format.
Specifically, we used the 'Korean-English Translation Parallel Corpus for Daily Life and Colloquial Expressions' from AI Hub. The dataset contains 500,000 English-Korean sentence pairs, which significantly improves the model's handling of everyday expressions and colloquial language.
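A single training pair can be formatted into the Gemma chat template used at inference time. The field names below (`en`, `ko`) are assumptions for illustration; the actual corpus schema may differ:

```python
import json

# Hypothetical example of one pair from the JSON data
# (field names are assumptions, not the corpus's actual schema).
record = json.loads('{"en": "What\'s up?", "ko": "잘 지냈어?"}')

# Format the pair into the Gemma chat template used at inference time,
# with the Korean reference as the model turn.
prompt = (
    "<bos><start_of_turn>user\n"
    "Please translate the following English colloquial expression into Korean.:\n"
    f"{record['en']}<end_of_turn>\n"
    "<start_of_turn>model\n"
    f"{record['ko']}<end_of_turn>"
)
```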
Training Settings
- Epochs: 3
- Batch Size: 4
- Gradient Accumulation Steps: 4
- Learning Rate: 2e-4
- Weight Decay: 0.01
- Optimizer: AdamW (8-bit)
- Max Sequence Length: 512 tokens
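The settings above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch, not the exact training script; the output directory is an assumption, and the max sequence length is typically passed to the trainer or tokenizer rather than to `TrainingArguments`:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",             # assumption: not stated in the card
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    weight_decay=0.01,
    optim="adamw_bnb_8bit",           # 8-bit AdamW via bitsandbytes
)
# Max sequence length (512 tokens) is applied when tokenizing,
# e.g. via a trainer's max_seq_length argument or tokenizer truncation.
```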
Usage
You can use this model to translate English colloquial expressions into Korean. Here's an example:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "Soonchan/gemma2_colloquial_korean_translator"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

def translate(text):
    prompt = f"""<bos><start_of_turn>user
Please translate the following English colloquial expression into Korean.:
{text}<end_of_turn>
<start_of_turn>model
"""
    # The prompt already contains <bos>, so keep the tokenizer
    # from prepending a second one.
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
    outputs = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Usage example
english_text = "What's up?"
korean_translation = translate(english_text)
print(korean_translation)
```
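Note that `generate` returns the prompt tokens followed by the completion, so the decoded string still contains the instruction. A minimal helper to strip it, assuming the decoded text begins with the decoded prompt (this holds when both are decoded with the same special-token setting):

```python
def strip_prompt(decoded: str, prompt_text: str) -> str:
    """Remove the echoed prompt from decoded generation output."""
    if decoded.startswith(prompt_text):
        return decoded[len(prompt_text):].strip()
    # Fall back to returning the text unchanged if the prefix differs.
    return decoded.strip()
```

Alternatively, slice the output tensor first (`outputs[0][inputs["input_ids"].shape[1]:]`) and decode only the newly generated tokens.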
Limitations
- This model is specialized for colloquial expressions and may not be suitable for translating formal documents or technical content.
- The model's output should always be reviewed, as it may generate inappropriate or inaccurate translations depending on the context.
License
This model follows the license of the original Gemma model. Please check the relevant license before use.
References
- Gemma: Open Models Based on Gemini Technology and Research
- PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware
- QLoRA: Efficient Finetuning of Quantized LLMs
Framework Versions
- PEFT 0.12.0