Update README.md
metrics:
- comet
pipeline_tag: translation
---

# Model Card for TowerInstruct-Mistral-7B-v0.2

## Model Details

TowerInstruct-Mistral-7B-v0.2 is a language model that results from fine-tuning a Mistral version of TowerBase on the TowerBlocks supervised fine-tuning dataset. The model is trained to handle several translation-related tasks, such as general machine translation (e.g., sentence- and paragraph/document-level translation, terminology-aware translation, context-aware translation), automatic post-editing, named-entity recognition, grammatical error correction, and paraphrase generation.

This model has performance comparable to [TowerInstruct-13B-v0.1](https://huggingface.co/Unbabel/TowerInstruct-13B-v0.1), while being half the size. Check out our [paper in COLM 2024](https://openreview.net/pdf?id=EHPns3hVkj).

- **Developed by:** Unbabel, Instituto Superior Técnico, CentraleSupélec University of Paris-Saclay
- **Model type:** A 7B parameter model fine-tuned on a mix of publicly available, synthetic datasets on translation-related tasks, as well as conversational datasets and code instructions.
Here's how you can run the model using the `pipeline()` function from 🤗 Transformers:

```python
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="Unbabel/TowerInstruct-Mistral-7B-v0.2", torch_dtype=torch.bfloat16, device_map="auto")
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {"role": "user", "content": "Translate the following text from Portuguese into English.\nPortuguese: Um grupo de investigadores lançou um novo modelo para tarefas relacionadas com tradução.\nEnglish:"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=False)
print(outputs[0]["generated_text"])
```
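Note that `text-generation` pipelines return the prompt concatenated with the completion by default (`return_full_text=True`), so stripping the prompt prefix recovers just the translation. A minimal sketch of that post-processing step; the `extract_completion` helper is ours for illustration, not part of the model card:

```python
def extract_completion(generated_text: str, prompt: str) -> str:
    # Hypothetical helper (not from the model card): with the default
    # return_full_text=True, the pipeline output starts with the prompt,
    # so drop that prefix to keep only the model's answer.
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.strip()

# Example with a mocked pipeline output:
prompt = "<|im_start|>user\nTranslate ...<|im_end|>\n<|im_start|>assistant\n"
generated = prompt + "A group of researchers launched a new model."
print(extract_completion(generated, prompt))  # → A group of researchers launched a new model.
```

Alternatively, passing `return_full_text=False` to the pipeline call makes it return only the completion.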
We are currently working on improving quality and consistency on document-level translation.

## Bias, Risks, and Limitations

TowerInstruct-Mistral-7B-v0.2 has not been aligned to human preferences, so the model may generate problematic outputs (e.g., hallucinations, harmful content, or false statements).

## Prompt Format

TowerInstruct-Mistral-7B-v0.2 was trained using the ChatML prompt templates without any system prompts. An example follows below:
```
<|im_start|>user
{USER PROMPT}<|im_end|>
<|im_start|>assistant
{MODEL RESPONSE}<|im_end|>
```
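To make the mapping from a message list to this format concrete, here is a minimal sketch of how ChatML turns are rendered; the `format_chatml` helper is ours for illustration only — in practice `tokenizer.apply_chat_template` does this for you:

```python
def format_chatml(messages):
    # Illustrative helper (not from the model card): render a list of
    # {"role", "content"} dicts in the ChatML format shown above,
    # with no system prompt, then open an assistant turn so the model
    # generates the response.
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    return prompt + "<|im_start|>assistant\n"

print(format_chatml([{"role": "user", "content": "Translate: olá"}]))
```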
Link to [TowerBlocks](https://huggingface.co/datasets/Unbabel/TowerBlocks-v0.1).

## Citation

```bibtex
@inproceedings{alves2024tower,
  title={Tower: An Open Multilingual Large Language Model for Translation-Related Tasks},
  author={Duarte Miguel Alves and Jos{\'e} Pombal and Nuno M Guerreiro and Pedro Henrique Martins and Jo{\~a}o Alves and Amin Farajian and Ben Peters and Ricardo Rei and Patrick Fernandes and Sweta Agrawal and Pierre Colombo and Jos{\'e} G. C. de Souza and Andre Martins},
  booktitle={First Conference on Language Modeling},
  year={2024},
  url={https://openreview.net/forum?id=EHPns3hVkj}
}
```