---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- chat
base_model: Qwen/Qwen2-7B
---

# Model Summary

Qwen2-7B-Instruct-Better-Translation is a fine-tuned language model based on Qwen2-7B-Instruct, optimized for English-to-Chinese translation. It was fine-tuned using Direct Preference Optimization (DPO) on a custom preference dataset that favors fluent, idiomatic translations (chosen) over literal, word-for-word translations (rejected).

Developers: sevenone

- License: Apache 2.0 (following the base model)
- Base Model: Qwen2-7B-Instruct
- Model Size: 7B
- Context Length: 131,072 tokens (inherited from Qwen2-7B-Instruct)
# 1. Introduction

Qwen2-7B-Instruct-Better-Translation is designed to produce natural, idiomatic English-to-Chinese translations rather than literal, word-for-word ones. During fine-tuning, the chosen responses in the preference dataset were idiomatic translations, while the rejected responses were more literal renderings. The model is aimed at users who need accurate, fluent translations of complex or nuanced English text.

# 2. Training Details

The model was fine-tuned with Direct Preference Optimization (DPO), a method that trains the model to prefer certain outputs over others based on labeled preference pairs. The training dataset consists of English source sentences, each paired with a translation labeled "chosen" (idiomatic) and one labeled "rejected" (literal).

- Training Framework: Hugging Face Transformers
- Optimizer: AdamW
- Training Method: LoRA with Direct Preference Optimization (DPO)
- Training Data: Custom preference dataset for English-to-Chinese translation
- Preference Type: Idiomatic translations (chosen) favored over literal translations (rejected)

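A preference pair of the kind described above can be pictured as a prompt plus a chosen and a rejected completion. The record below is purely illustrative — the field names and the example sentence are assumptions, not the actual dataset schema:

```python
import json

# Hypothetical DPO preference record for English-to-Chinese translation.
# "chosen" is the idiomatic rendering; "rejected" is the word-for-word one.
record = {
    "prompt": "Translate the following sentence to Chinese: 'It's raining cats and dogs.'",
    "chosen": "外面正下着倾盆大雨。",
    "rejected": "天上正在下猫和狗。",
}
print(json.dumps(record, ensure_ascii=False, indent=2))
```

DPO then pushes the policy's likelihood of the chosen completion up relative to the rejected one, which is what steers the model away from literal phrasings.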
# 3. Requirements

To use this model, please ensure you have installed `transformers>=4.37.0` to avoid any compatibility issues.

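One way to check the installed version is to compare the release components numerically rather than as strings (a minimal sketch; `packaging.version` is the more robust choice when available):

```python
from importlib.metadata import PackageNotFoundError, version

def _release_tuple(v: str) -> tuple:
    # Keep only the leading numeric release segments, e.g. "4.37.0" -> (4, 37, 0).
    return tuple(int(part) for part in v.split(".")[:3])

def meets_minimum(installed: str, minimum: str = "4.37.0") -> bool:
    # Numeric comparison avoids the string-comparison trap where "4.9.0" > "4.37.0".
    return _release_tuple(installed) >= _release_tuple(minimum)

try:
    installed = version("transformers")
    print(f"transformers {installed}:", "OK" if meets_minimum(installed) else "upgrade needed")
except PackageNotFoundError:
    print("transformers is not installed; run: pip install 'transformers>=4.37.0'")
```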
# 4. Usage

You can load the model and translate English to Chinese as shown in the following code snippet:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "sevenone/Qwen2-7B-Instruct-Better-Translation"
device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU if no GPU

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)

prompt = "Translate the following sentence to Chinese: 'Artificial intelligence is transforming industries worldwide.'"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Apply the chat template for better generation
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

# Generate the translation
generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# Strip the prompt tokens, keeping only the newly generated ones
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

# 5. Citation

If sevenone/qwen2-7b-instruct-better-translation is helpful in your work, please cite it as:

```bibtex
@misc{sevenone_2024,
  author    = {sevenone},
  title     = {Qwen2-7B-Instruct-Better-Translation},
  year      = {2024},
  url       = {https://huggingface.co/sevenone/qwen2-7b-instruct-better-translation},
  publisher = {Hugging Face}
}
```