license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- chat
base_model: Qwen/Qwen2-7B-Instruct
Model Summary
Qwen2-7B-Instruct-Better-Translation is a fine-tuned version of Qwen2-7B-Instruct optimized for English-to-Chinese translation. The model was fine-tuned using Direct Preference Optimization (DPO) on a custom dataset that prefers fluent, idiomatic translations (chosen) over literal translations (rejected).
- Developers: sevenone
- License: Qwen2 License
- Base Model: Qwen2-7B-Instruct
- Model Size: 7B
- Context Length: 131,072 tokens (inherited from Qwen2-7B-Instruct)
1. Introduction
Qwen2-7B-Instruct-Better-Translation is designed to produce natural, idiomatic English-to-Chinese translations rather than literal, word-for-word ones. It was fine-tuned on a preference dataset in which the chosen translations were idiomatic and the rejected translations were more literal. The model is intended for users who need accurate, fluent translations of complex or nuanced English text.
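To make the idea of a preference pair concrete, here is a minimal sketch of what one record in such a dataset might look like. The field names (prompt, chosen, rejected) follow a layout commonly used for preference data and are an assumption; the actual schema of the training dataset is not described in this card, and the example sentence and translations are invented for illustration only.

```python
# One hypothetical preference record. The sentence and both translations
# are made up for illustration; they are not taken from the training data.
preference_record = {
    # English source wrapped in the same instruction used in the usage snippet
    "prompt": "Translate the following sentence to Chinese: "
              "'It is raining cats and dogs.'",
    # Idiomatic translation: conveys the meaning ("a downpour")
    "chosen": "外面下着倾盆大雨。",
    # Literal translation: renders the idiom word for word
    "rejected": "天上正在下猫和狗。",
}


def is_valid_record(record):
    """Check that a record has the three non-empty text fields
    and that the two candidate translations actually differ."""
    required = ("prompt", "chosen", "rejected")
    if not all(isinstance(record.get(key), str) and record[key] for key in required):
        return False
    return record["chosen"] != record["rejected"]


print(is_valid_record(preference_record))
```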
2. Training Details
The model was fine-tuned using Direct Preference Optimization (DPO), a method that trains the model directly on pairs of preferred and dispreferred outputs so that it assigns higher likelihood to the preferred one. The training dataset consisted of English source sentences, each paired with translations labeled "chosen" (idiomatic) and "rejected" (literal).
- Training Framework: Hugging Face Transformers
- Optimizer: AdamW
- Training Method: LoRA with Direct Preference Optimization (DPO)
- Training Data: Custom preference dataset for English-to-Chinese translation
- Preference Type: Favoring idiomatic translations (chosen) over literal translations (rejected)
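As a rough illustration of the objective behind DPO, the sketch below computes the standard per-example DPO loss, -log σ(β · [(log π(chosen) − log π_ref(chosen)) − (log π(rejected) − log π_ref(rejected))]), from four sequence log-probabilities. It is a pure-Python sketch, not the training code used for this model; the function name dpo_loss is hypothetical, and beta=0.1 is an assumed illustrative value, since this card does not state the hyperparameters.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss from summed sequence log-probabilities.

    beta=0.1 is an assumption for illustration, not a value from this card.
    """
    # Implicit rewards: how much more likely each translation is under the
    # fine-tuned policy than under the frozen reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin)
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: margin is 0, loss is log(2).
print(dpo_loss(-12.0, -15.0, -12.0, -15.0))
# Policy has shifted probability toward the chosen (idiomatic) translation,
# so the loss is lower than in the previous case.
print(dpo_loss(-10.0, -18.0, -12.0, -15.0))
```

The loss falls as the policy raises the likelihood of the chosen translation relative to the rejected one, measured against the reference model.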
3. Requirements
To use this model, please ensure you have installed transformers>=4.37.0 to avoid any compatibility issues.
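One way to check this requirement before loading the model is sketched below using only the standard library. The helper names parse_release and transformers_is_recent_enough are hypothetical, and the parser is deliberately simple: it compares only the leading numeric release components and ignores pre-release suffixes such as "rc1" or ".dev0".

```python
from importlib.metadata import PackageNotFoundError, version

MIN_TRANSFORMERS = (4, 37, 0)


def parse_release(version_string):
    """Turn '4.37.0' or '4.41.0.dev0' into a tuple of leading integers,
    padded to at least three components."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def transformers_is_recent_enough(installed):
    """Return True if the given version string is at least 4.37.0."""
    return parse_release(installed) >= MIN_TRANSFORMERS


try:
    installed = version("transformers")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("transformers is not installed")
elif not transformers_is_recent_enough(installed):
    print(f"transformers {installed} is older than 4.37.0; consider upgrading")
else:
    print(f"transformers {installed} satisfies the requirement")
```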
4. Usage
You can load and use the model to translate English to Chinese as shown in the following code snippet:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_id = "sevenone/Qwen2-7B-Instruct-Better-Translation"
# device_map="auto" places the weights on the available device(s)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

prompt = "Translate the following sentence to Chinese: 'Artificial intelligence is transforming industries worldwide.'"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]

# Apply the chat template for better generation
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
# Move the inputs to the same device as the model instead of hard-coding "cuda"
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate translation (passes the attention mask along with the input IDs)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
)
# Drop the prompt tokens so that only the newly generated tokens are decoded
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
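When translating several sentences, the message construction from the snippet above could be wrapped in a small helper. The sketch below is one possible way to do that; build_translation_messages is a hypothetical name, the second example sentence is invented, and the prompt wording simply mirrors the example above rather than a format required by the model.

```python
def build_translation_messages(sentence, system_prompt="You are a helpful assistant."):
    """Build the chat messages for one English sentence, using the same
    prompt wording as the usage example."""
    prompt = f"Translate the following sentence to Chinese: '{sentence}'"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]


sentences = [
    "Artificial intelligence is transforming industries worldwide.",
    "The early bird catches the worm.",
]

# One message list per sentence; each can be passed to
# tokenizer.apply_chat_template(...) exactly as in the snippet above.
batch = [build_translation_messages(sentence) for sentence in sentences]

for messages in batch:
    print(messages[1]["content"])
```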
5. Citation
If sevenone/qwen2-7b-instruct-better-translation is helpful in your work, please cite it as:
@misc{sevenone_2024,
  author = {Sevenone},
  title = {Qwen2-7B-Instruct-Better-Translation},
  year = 2024,
  url = {https://huggingface.co/sevenone/qwen2-7b-instruct-better-translation},
  publisher = {Hugging Face}
}