michaelfeil committed
Commit 45a5b94
Parent(s): 2e2cbb8
Update README.md

README.md CHANGED
@@ -103,9 +103,59 @@ language:
- zu
license: mit
tags:
- ctranslate2
---

# Fast-Inference with Ctranslate2

Speed up inference by 2x-8x using int8 inference in C++.

Quantized version of facebook/m2m100_1.2B.

Install with `pip install "hf_hub_ctranslate2>=1.0.3" "ctranslate2>=3.13.0"`

```python
from transformers import AutoTokenizer
from hf_hub_ctranslate2 import MultiLingualTranslatorCT2fromHfHub

# Load the CTranslate2 model from the Hub and run int8 inference on CPU.
model = MultiLingualTranslatorCT2fromHfHub(
    model_name_or_path="michaelfeil/ct2fast-m2m100_PARAMS",
    device="cpu",
    compute_type="int8",
    tokenizer=AutoTokenizer.from_pretrained("facebook/m2m100_418M"),
)

# Translate a batch; source and target languages are given per sentence.
outputs = model.generate(
    ["How do you call a fast Flamingo?", "Wie geht es dir?"],
    src_lang=["en", "de"],
    tgt_lang=["de", "fr"],
)
```

Use `compute_type="int8_float16"` for `device="cuda"` and `compute_type="int8"` for `device="cpu"`.
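
For GPU inference, the same constructor shown above can be pointed at CUDA with 8-bit weights and float16 compute. A minimal sketch reusing the names from the CPU example on this card (the `m2m100_PARAMS` placeholder and the 418M tokenizer are taken as-is from the card, not verified against a specific checkpoint):

```python
from transformers import AutoTokenizer
from hf_hub_ctranslate2 import MultiLingualTranslatorCT2fromHfHub

# Same call as the CPU example, switched to GPU with int8 weights
# and float16 activations, as recommended for device="cuda".
model = MultiLingualTranslatorCT2fromHfHub(
    model_name_or_path="michaelfeil/ct2fast-m2m100_PARAMS",
    device="cuda",
    compute_type="int8_float16",
    tokenizer=AutoTokenizer.from_pretrained("facebook/m2m100_418M"),
)

outputs = model.generate(
    ["How do you call a fast Flamingo?"],
    src_lang=["en"],
    tgt_lang=["de"],
)
print(outputs)
```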

Converted to CTranslate2 on 2023-05-13 with:

```bash
export ORG="facebook"
export NAME="m2m100_PARAMS"
ct2-transformers-converter --model "$ORG/$NAME" --copy_files .gitattributes README.md generation_config.json sentencepiece.bpe.model special_tokens_map.json tokenizer_config.json vocab.json --quantization float16
```

Alternative: use CTranslate2 and transformers directly, without `hf_hub_ctranslate2`:

```python
import ctranslate2
import transformers

# Load the converted model directory and the matching Hugging Face tokenizer.
translator = ctranslate2.Translator("m2m100_PARAMS")
tokenizer = transformers.AutoTokenizer.from_pretrained("facebook/m2m100_PARAMS")
tokenizer.src_lang = "en"

# Tokenize the source sentence and force the target language as the first token.
source = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello world!"))
target_prefix = [tokenizer.lang_code_to_token["de"]]
results = translator.translate_batch([source], target_prefix=[target_prefix])
target = results[0].hypotheses[0][1:]  # drop the target language token

print(tokenizer.decode(tokenizer.convert_tokens_to_ids(target)))
```

# M2M100 12B

M2M100 is a multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation.
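
As a point of reference for the CTranslate2 examples above, translation with the original transformers checkpoints works by setting the tokenizer's source language and forcing the target language token at generation time. A minimal sketch, assuming the upstream facebook/m2m100_418M checkpoint rather than this quantized repository:

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Original (non-quantized) checkpoint from the Hugging Face Hub.
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

# The source language is set on the tokenizer ...
tokenizer.src_lang = "de"
encoded = tokenizer("Wie geht es dir?", return_tensors="pt")

# ... and the target language is forced as the first generated token.
generated = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("en"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```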