metadata
license: apache-2.0
language:
  - en
pipeline_tag: text-generation
tags:
  - chat
base_model: Qwen/Qwen2-7B-Instruct

Model Summary

Qwen2-7B-Instruct-Better-Translation is a fine-tuned language model based on Qwen2-7B-Instruct and optimized for English-to-Chinese translation. It was fine-tuned with Direct Preference Optimization (DPO) on a custom preference dataset that favors fluent, idiomatic translations (chosen) over literal translations (rejected).

Developers: sevenone

  • License: Apache 2.0
  • Base Model: Qwen2-7B-Instruct
  • Model Size: 7B
  • Context Length: 131,072 tokens (inherits from Qwen2-7B-Instruct)

1. Introduction

Qwen2-7B-Instruct-Better-Translation is designed to produce natural, idiomatic English-to-Chinese translations rather than literal, word-for-word ones. Fine-tuning used a preference dataset in which the chosen translations were idiomatic and the rejected translations were more literal. The model is intended for users who need fluent translations of complex or nuanced English text.

2. Training Details

The model was fine-tuned using Direct Preference Optimization (DPO), a method that trains the model directly on pairs of preferred and dispreferred outputs, without a separately trained reward model. The training dataset consisted of English source sentences, each paired with a "chosen" (idiomatic) translation and a "rejected" (literal) translation.
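
As a rough illustration of what DPO optimizes, the per-example loss can be sketched in plain Python. This is the standard DPO objective, not the author's training code. The function name `dpo_loss` and the value `beta=0.1` are assumptions, since this model card does not state any hyperparameters. The inputs are the summed log-probabilities of each full translation under the model being trained (the policy) and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (beta=0.1 is an assumed value, not from the model card)."""
    # How much more likely each translation became relative to the reference model
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)), computed in a numerically stable way
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss falls below log(2) once the policy favors the chosen translation
# more strongly than the reference model does.
print(dpo_loss(-8.0, -14.0, -10.0, -12.0))
```

The loss equals log(2) when the policy and the reference model agree, and it shrinks as the policy shifts probability toward the idiomatic translation and away from the literal one.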

  • Training Framework: Hugging Face Transformers
  • Optimizer: AdamW
  • Training Method: LoRA with Direct Preference Optimization (DPO)
  • Training Data: Custom preference dataset for English-to-Chinese translation
  • Preference Type: Favoring idiomatic translations (chosen) over literal translations (rejected)
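
A single record in such a preference dataset might look like the sketch below. The `prompt`/`chosen`/`rejected` field names follow a common convention for DPO data (for example in the TRL library), but the actual schema used here is not documented, so treat it as an assumption. The example sentence and the helper `is_valid_record` are hypothetical and are not taken from the author's dataset.

```python
# Hypothetical preference record; not taken from the actual training data.
example_record = {
    "prompt": "Translate the following sentence to Chinese: 'It is raining cats and dogs.'",
    "chosen": "外面下着倾盆大雨。",    # idiomatic: "it is pouring outside"
    "rejected": "天上正在下猫和狗。",  # literal: "cats and dogs are falling from the sky"
}

REQUIRED_FIELDS = ("prompt", "chosen", "rejected")


def is_valid_record(record):
    """Check that all fields are non-empty strings and the pair expresses a preference."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            return False
    # Identical chosen/rejected texts carry no preference signal
    return record["chosen"] != record["rejected"]


print(is_valid_record(example_record))
```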

3. Requirements

To use this model, install transformers>=4.37.0. Earlier releases do not include the Qwen2 architecture and fail to load the model.
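
One way to check the installed version before loading the model is sketched below, using only the standard library. The helper names `parse_release` and `meets_minimum` are hypothetical, and the parsing is deliberately simplified: it compares only the leading numeric components, so a pre-release such as `4.37.0rc1` is treated like `4.37.0`.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_TRANSFORMERS = (4, 37, 0)


def parse_release(version_string):
    """Turn '4.41.2' or '4.38.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '4.37' so tuple comparison behaves as expected
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version_string, minimum=MIN_TRANSFORMERS):
    """Return True if the version is at least the required minimum."""
    return parse_release(version_string) >= minimum


try:
    installed = version("transformers")
    status = "ok" if meets_minimum(installed) else "too old, please upgrade"
    print(f"transformers {installed}: {status}")
except PackageNotFoundError:
    print("transformers is not installed")
```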

4. Usage

You can load and use the model to translate English to Chinese as shown in the following code snippet:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "sevenone/Qwen2-7B-Instruct-Better-Translation"
device = "cuda" if torch.cuda.is_available() else "cpu"  # use the GPU if available

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)

prompt = "Translate the following sentence to Chinese: 'Artificial intelligence is transforming industries worldwide.'"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Apply the chat template for better generation
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

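# Generate the translation; unpacking model_inputs also passes the attention mask
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so that only the newly generated text is decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]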

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
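
For translating several sentences, the steps in the snippet above can be wrapped in small helpers. This is a minimal sketch rather than an official API: the function names `build_messages` and `translate` are hypothetical, and the prompt wording is copied from the example above. `translate` expects the `model` and `tokenizer` objects loaded as shown in that snippet.

```python
def build_messages(sentence, system_prompt="You are a helpful assistant."):
    """Build the chat messages for one English sentence."""
    prompt = f"Translate the following sentence to Chinese: '{sentence}'"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]


def translate(sentence, model, tokenizer, max_new_tokens=512):
    """Translate one sentence with an already loaded model and tokenizer."""
    text = tokenizer.apply_chat_template(
        build_messages(sentence),
        tokenize=False,
        add_generation_prompt=True,
    )
    # Place the inputs on the same device as the model
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the tokens generated after the prompt
    new_tokens = output_ids[0][inputs.input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With the model loaded, a call such as `translate("Time flies.", model, tokenizer)` would return the Chinese translation as a string.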

5. Citation

If you find sevenone/qwen2-7b-instruct-better-translation helpful in your work, please cite it as:

@misc{sevenone_2024,
    author       = {Sevenone},
    title        = {Qwen2-7B-Instruct-Better-Translation},
    year         = 2024,
    url          = {https://huggingface.co/sevenone/qwen2-7b-instruct-better-translation},
    publisher    = {Hugging Face}
}