Krystalan committed
Commit
3e0fbf8
1 Parent(s): b89415f

Update README.md

Files changed (1)
  1. README.md +87 -6
README.md CHANGED
@@ -12,10 +12,11 @@ tags:
 pipeline_tag: text-generation
 ---
 
 # DRT-o1
 
 <p align="center">
- 🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-7B">DRT-o1-7B</a>&nbsp&nbsp | &nbsp&nbsp🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-14B">DRT-o1-14B</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://arxiv.org/abs/2412.17498">Paper</a>
 
 </p>
 
@@ -23,22 +24,85 @@ This repository contains the resources for our paper ["DRT-o1: Optimized Deep Re
 
 
 ### Updates:
 - *2024.12.24*: We released [our paper](https://arxiv.org/abs/2412.17498). Check it out!
 - *2024.12.23*: We released our model checkpoints. 🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-7B">DRT-o1-7B</a> and 🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-14B">DRT-o1-14B</a>.
 
 
 ## Introduction
 
 In this work, we introduce DRT-o1, an attempt to bring the success of long thought reasoning to neural machine translation (MT). To this end,
 - 🌟 We mine English sentences with similes or metaphors from existing literature books, which are suitable for translation via long thought.
 - 🌟 We propose a multi-agent framework with three agents (i.e., a translator, an advisor and an evaluator) to synthesize the MT samples with long thought. There are 22,264 synthesized samples in total.
- - 🌟 We train DRT-o1-7B and DRT-o1-14B using Qwen2.5-7B-Instruct and Qwen2.5-14B-Instruct as backbones.
 
 
- ## Quickstart
 
- ### ⛷️ Huggingface Transformers
 
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -76,7 +140,7 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 print(response)
 ```
 
- ### ⛷️ vLLM
 
 Deploying LLMs:
 ```bash
@@ -101,7 +165,7 @@ chat_response = client.chat.completions.create(
 {"role": "system", "content": "You are a philosopher skilled in deep thinking, accustomed to exploring complex problems with profound insight."},
 {"role": "user", "content": "Please translate the following text from English to Chinese:\nThe mother, with her feet propped up on a stool, seemed to be trying to get to the bottom of that answer, whose feminine profundity had struck her all of a heap."},
 ],
- temperature=0.7,
 top_p=0.8,
 max_tokens=2048,
 extra_body={
@@ -111,6 +175,23 @@ chat_response = client.chat.completions.create(
 print("Chat response:", chat_response)
 ```
 
 
 ## License
 This work is licensed under cc-by-nc-sa-4.0
 
 pipeline_tag: text-generation
 ---
 
+
 # DRT-o1
 
 <p align="center">
+ 🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-7B">DRT-o1-7B</a>&nbsp&nbsp | &nbsp&nbsp🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-8B">DRT-o1-8B</a>&nbsp&nbsp | &nbsp&nbsp🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-14B">DRT-o1-14B</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://arxiv.org/abs/2412.17498">Paper</a>
 
 </p>
 
 
 
 ### Updates:
+ - *2024.12.31*: We updated [our paper](https://arxiv.org/abs/2412.17498) with more details and analyses. Check it out!
+ - *2024.12.31*: We released the test set of our work; please refer to `data/test.jsonl`.
+ - *2024.12.30*: We released a new model checkpoint using Llama-3.1-8B-Instruct as the backbone, i.e., 🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-8B">DRT-o1-8B</a>.
 - *2024.12.24*: We released [our paper](https://arxiv.org/abs/2412.17498). Check it out!
 - *2024.12.23*: We released our model checkpoints. 🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-7B">DRT-o1-7B</a> and 🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-14B">DRT-o1-14B</a>.
 
 
+ If you find this work useful, please consider citing our paper:
+ ```
+ @article{wang2024drt,
+   title={DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought},
+   author={Wang, Jiaan and Meng, Fandong and Liang, Yunlong and Zhou, Jie},
+   journal={arXiv preprint arXiv:2412.17498},
+   year={2024}
+ }
+ ```
+
+ ## Quick Links
+ - [Introduction](#introduction)
+ - [Models](#models)
+ - [Model Access](#model-access)
+ - [Model Performance](#model-performance)
+ - [Model Prompts](#model-prompts)
+ - [Quickstart](#quickstart)
+ - [Translation Cases](#translation-cases)
+ - [Data](#data)
+ - [License](#license)
+
 ## Introduction
 
+
 In this work, we introduce DRT-o1, an attempt to bring the success of long thought reasoning to neural machine translation (MT). To this end,
 - 🌟 We mine English sentences with similes or metaphors from existing literature books, which are suitable for translation via long thought.
 - 🌟 We propose a multi-agent framework with three agents (i.e., a translator, an advisor and an evaluator) to synthesize the MT samples with long thought. There are 22,264 synthesized samples in total (a schematic sketch is given below).
+ - 🌟 We train DRT-o1-8B, DRT-o1-7B, and DRT-o1-14B using Llama-3.1-8B-Instruct, Qwen2.5-7B-Instruct, and Qwen2.5-14B-Instruct as backbones, respectively.
 
+ > Our goal is not to achieve competitive performance with OpenAI’s O1 in neural machine translation (MT). Instead, we explore technical routes to bring the success of long thought to MT. To this end, we introduce DRT-o1, *a byproduct of our exploration*, and we hope it could facilitate the corresponding research in this direction.
 
 
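+ The following is a schematic, illustrative sketch of one possible iterative translator-advisor-evaluator loop, not the authors' actual implementation: the function names, the scoring scale, the threshold, and the round limit are assumptions made only for illustration.
+ ```python
+ from typing import Callable, List, Tuple
+
+ def synthesize_long_thought_sample(
+     source: str,
+     translate: Callable[[str, str], str],   # (source, feedback) -> draft translation
+     advise: Callable[[str, str], str],      # (source, draft) -> feedback
+     evaluate: Callable[[str, str], float],  # (source, draft) -> quality score
+     threshold: float = 8.0,                 # assumed score threshold on a 0-10 scale
+     max_rounds: int = 5,                    # assumed maximum number of refinement rounds
+ ) -> Tuple[List[Tuple[str, str, float]], str]:
+     """Draft -> feedback -> score loop; the collected trace plays the role of the long thought."""
+     trace: List[Tuple[str, str, float]] = []
+     feedback = ""
+     draft = translate(source, feedback)
+     for _ in range(max_rounds):
+         feedback = advise(source, draft)
+         score = evaluate(source, draft)
+         trace.append((draft, feedback, score))
+         if score >= threshold:
+             break
+         draft = translate(source, feedback)
+     return trace, draft
+ ```
+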
+ ## Models
 
+ ### Model Access
+
+ | Model | Backbone | Model Access |
+ | :--: | :--: | :--: |
+ | DRT-o1-7B | 🤗 <a href="https://huggingface.co/Qwen/Qwen2.5-7B-Instruct">Qwen2.5-7B-Instruct</a> | 🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-7B">DRT-o1-7B</a> |
+ | DRT-o1-8B | 🤗 <a href="https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct">Llama-3.1-8B-Instruct</a> | 🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-8B">DRT-o1-8B</a> |
+ | DRT-o1-14B | 🤗 <a href="https://huggingface.co/Qwen/Qwen2.5-14B-Instruct">Qwen2.5-14B-Instruct</a> | 🤗 <a href="https://huggingface.co/Krystalan/DRT-o1-14B">DRT-o1-14B</a> |
+
+ ### Model Performance
+
+ | Model | GRF | CometKiwi | GRB | BLEU | CometScore |
+ | :--: | :--: | :--: | :--: | :--: | :--: |
+ | Llama-3.1-8B-Instruct | 79.25 | 70.14 | 73.30 | 18.55 | 74.58 |
+ | Qwen2.5-7B-Instruct | 81.53 | 70.36 | 77.92 | 27.02 | 76.78 |
+ | Qwen2.5-14B-Instruct | 84.74 | 72.01 | 80.85 | 30.23 | 78.84 |
+ | Marco-o1-7B | 82.41 | 71.62 | 77.50 | 29.48 | 77.41 |
+ | QwQ-32B-preview | 86.31 | 71.48 | 83.08 | 27.46 | 78.68 |
+ | DRT-o1-8B | 84.49 | 70.85 | 80.80 | 32.67 | 78.81 |
+ | DRT-o1-7B | 85.57 | 71.78 | 82.38 | 35.54 | 80.19 |
+ | DRT-o1-14B | **87.19** | **72.11** | **83.20** | **36.46** | **80.64** |
+
+
+ ### Model Prompts
+
+ During model inference, please use the following prompts:
+ - System prompt: `You are a philosopher skilled in deep thinking, accustomed to exploring complex problems with profound insight.`
+ - User prompt: `Please translate the following text from English to Chinese:\n[An English text]`
+
+ DRT-o1 models will first generate the thought and then provide the final translation, with the following format:
+ ```
+ <thought>
+ [Reasoning process]
+ </thought>
+ <output>
+ [Final translation]
+ </output>
+ ```
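+
+ As a minimal illustrative sketch (the helper name and the regular expression are assumptions, not part of the released code), the final translation can be separated from the reasoning as follows:
+ ```python
+ import re
+
+ def extract_translation(response: str) -> str:
+     """Return the text inside <output>...</output>, falling back to the raw response."""
+     match = re.search(r"<output>(.*?)</output>", response, flags=re.DOTALL)
+     return match.group(1).strip() if match else response.strip()
+
+ # Example on a response that follows the format above
+ demo = "<thought>\n[Reasoning process]\n</thought>\n<output>\n[Final translation]\n</output>"
+ print(extract_translation(demo))  # -> [Final translation]
+ ```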
+
+ ### Quickstart
+ - ⛷️ Huggingface Transformers:
 
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 print(response)
 ```
 
+ - ⛷️ vLLM:
 
 Deploying LLMs:
 ```bash
 
 {"role": "system", "content": "You are a philosopher skilled in deep thinking, accustomed to exploring complex problems with profound insight."},
 {"role": "user", "content": "Please translate the following text from English to Chinese:\nThe mother, with her feet propped up on a stool, seemed to be trying to get to the bottom of that answer, whose feminine profundity had struck her all of a heap."},
 ],
+ temperature=0.1,
 top_p=0.8,
 max_tokens=2048,
 extra_body={
 
 print("Chat response:", chat_response)
 ```
 
+ ## Translation Cases
+
+ | Source Sentence | Qwen2.5-14B-Instruct | QwQ-32B-preview | DRT-o1-14B |
+ | :--: | :--: | :--: | :--: |
+ |Women were weeping and children crying, and all were going as fast as seemingly lay in their power, looking behind now and then as if pursued by some deadly enemy. | 妇女在哭泣,孩子们在哀号,所有人都似乎在尽全力快速前行,不时回头张望,仿佛背后有某个致命的敌人在追赶。 | 妇女们哭泣着,孩子们哭喊着,每个人都在尽力加快脚步,不时回头张望,仿佛背后有致命的敌人在追击。 | 妇女悲泣,孩童哀号,众人疾奔,不时回首,仿佛有致命之敌在后紧追。 |
+ |Wait through this season of the rain Till suns of autumn dry the plain, Then shall thy giant foe, and all His host and realm, before thee fall. | 等待这雨季过去,直到秋日的阳光晒干原野,那时你的巨敌及其部众和王国都将倒在你面前。 | 忍过雨季漫长夜,待秋阳升空,晒干大地湿润色,那时,巨敌垮台,他的军队和王国,全在你脚下。 | 历尽这雨季,待秋日阳光驱散平原上的最后一丝湿润,那时,你那巨敌,连同他的军队与王国,都将匍匐在你脚下。 |
+ |And if their words had taken another turn...if he himself had been less fastidious about intruding on another man's secrets...it was cruel to think how thin a film had shut out rescue from all this guilt and misery. | 而如果他们的言辞转向了另一个方向……如果他自己对侵扰他人秘密不那么介意……想到正是那层薄薄的隔阂将救援拒之门外,使得这一切的罪恶和痛苦无法避免,真是残酷。 | 而且,假如他们的谈话走向不同……假如他自己不对干涉他人私事如此谨慎……那时他会意识到,只要一层薄薄的屏障就阻隔了救援,使得所有这些罪恶和痛苦无法解脱,这真是残忍。 | 若他们的言辞稍作转变……若他本人不那么介意侵入他人的心扉……想到这层薄薄的隔阂竟将所有内疚与痛苦隔离于救援之外,实为残忍。 |
+ |This cold officer upon a monument, who dropped epithets unconcernedly down, would be finer as a dead man, he thought. | 他认为,这个站在纪念碑上的冷漠官员,若死了会更好,他不带任何感情地抛下了一些称呼。 | 这个冷冰冰的官员站在纪念碑上,毫不在意地抛下一些称号,他想,如果作为一个死人会更出色。 | 纪念碑上的冷淡官员,漫不经心地吟咏那些修饰语,他心想,若化为亡者,或许更显尊贵。 |
+
+
+ ## Data
+
+ We release the test set of our work; please refer to `data/test.jsonl`, where `en` indicates the English source sentence and `zh` denotes the corresponding Chinese translation.
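+
+ As an illustrative sketch (assuming one JSON object per line with exactly these two fields), the test set can be loaded like this:
+ ```python
+ import json
+
+ # Each line of data/test.jsonl is a JSON object with "en" and "zh" fields.
+ with open("data/test.jsonl", encoding="utf-8") as f:
+     pairs = [json.loads(line) for line in f if line.strip()]
+
+ print(len(pairs), "sentence pairs")
+ print(pairs[0]["en"])  # English source sentence
+ print(pairs[0]["zh"])  # reference Chinese translation
+ ```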
+
+ We will release the long-thought MT data as well as the data collection code soon!
+
 
 ## License
 This work is licensed under cc-by-nc-sa-4.0
+