license: mit
language:
- multilingual
- af
- am
- ar
- ast
- az
- ba
- be
- bg
- bn
- br
- bs
- ca
- ceb
- cs
- cy
- da
- de
- el
- en
- es
- et
- fa
- ff
- fi
- fr
- fy
- ga
- gd
- gl
- gu
- ha
- he
- hi
- hr
- ht
- hu
- hy
- id
- ig
- ilo
- is
- it
- ja
- jv
- ka
- kk
- km
- kn
- ko
- lb
- lg
- ln
- lo
- lt
- lv
- mg
- mk
- ml
- mn
- mr
- ms
- my
- ne
- nl
- 'no'
- ns
- oc
- or
- pa
- pl
- ps
- pt
- ro
- ru
- sd
- si
- sk
- sl
- so
- sq
- sr
- ss
- su
- sv
- sw
- ta
- th
- tl
- tn
- tr
- uk
- ur
- uz
- vi
- wo
- xh
- yi
- yo
- zh
- zu
tags:
- Translation
- CTranslate2
pipeline_tag: translation
# Quantized M2M100 for Fast Translation with CTranslate2
This model is a quantized version of the [M2M100 418M model](https://huggingface.co/facebook/m2m100_418M) from Facebook AI, converted for fast inference with CTranslate2. It translates directly between any pair of its 100 languages and runs faster than the original model; the exact speedup depends on hardware and batch size.
## Key Features
- **Quantization:** The weights are quantized to 8-bit integers, which reduces model size and speeds up inference.
- **CTranslate2:** Inference runs on CTranslate2, a C++ inference engine for Transformer models.
- **Multi-Language Support:** Translates between 100 languages (listed below).
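
To make the quantization point concrete, here is a minimal pure-Python sketch of symmetric 8-bit quantization of one row of weights. It is an illustration only: the helper names `quantize_int8` and `dequantize_int8` are hypothetical, and CTranslate2's actual kernels and scale handling may differ.

```python
def quantize_int8(row):
    """Map a row of floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(x) for x in row)
    # Assumption for this sketch: an all-zero row keeps a neutral scale of 1.0.
    scale = 127.0 / max_abs if max_abs > 0 else 1.0
    return [round(x * scale) for x in row], scale


def dequantize_int8(q_row, scale):
    """Recover approximate float values from the quantized integers."""
    return [q / scale for q in q_row]


weights = [0.4, -1.0, 0.25, 0.0]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
print(q)
print(restored)
```

Each weight is stored in one byte instead of four (for float32), at the cost of a small rounding error per value.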
## Installation
```bash
pip install transformers ctranslate2 sentencepiece
git lfs install
git clone https://huggingface.co/Rohith04/ct2fast_m2m100_418M
```
## Usage
```py
import ctranslate2
import transformers

# Load the converted model from the cloned repository directory.
translator = ctranslate2.Translator("ct2fast_m2m100_418M")
tokenizer = transformers.AutoTokenizer.from_pretrained("facebook/m2m100_418M")

# Tokenize the source text; src_lang selects the source language token.
tokenizer.src_lang = "en"
source = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello world!"))

# Pass the target language token as a decoding prefix.
target_prefix = [tokenizer.lang_code_to_token["de"]]
results = translator.translate_batch([source], target_prefix=[target_prefix])

# Drop the leading language token before detokenizing.
target = results[0].hypotheses[0][1:]
print(tokenizer.decode(tokenizer.convert_tokens_to_ids(target)))
```
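
For more than one sentence, the same steps can be wrapped in a small helper. The sketch below is one possible arrangement, not part of the model or of CTranslate2: `translate_texts` is a hypothetical name, and it assumes `translator` and `tokenizer` are the objects created in the snippet above.

```python
def translate_texts(translator, tokenizer, texts, src_lang, tgt_lang):
    """Translate a list of strings from src_lang to tgt_lang in one batch."""
    tokenizer.src_lang = src_lang
    # Convert each input string to the subword tokens CTranslate2 expects.
    sources = [tokenizer.convert_ids_to_tokens(tokenizer.encode(t)) for t in texts]
    # Every sentence in the batch gets the same target language prefix.
    prefix = [tokenizer.lang_code_to_token[tgt_lang]]
    results = translator.translate_batch(
        sources, target_prefix=[prefix] * len(sources)
    )
    outputs = []
    for result in results:
        # Skip the leading language token of the best hypothesis.
        tokens = result.hypotheses[0][1:]
        outputs.append(tokenizer.decode(tokenizer.convert_tokens_to_ids(tokens)))
    return outputs
```

A call would look like `translate_texts(translator, tokenizer, ["Hello world!", "How are you?"], "en", "fr")`. Passing all sentences in a single `translate_batch` call lets CTranslate2 batch them instead of translating one at a time.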
## Languages covered
Afrikaans (af), Amharic (am), Arabic (ar), Asturian (ast), Azerbaijani (az), Bashkir (ba), Belarusian (be), Bulgarian (bg), Bengali (bn), Breton (br), Bosnian (bs), Catalan; Valencian (ca), Cebuano (ceb), Czech (cs), Welsh (cy), Danish (da), German (de), Greek (el), English (en), Spanish (es), Estonian (et), Persian (fa), Fulah (ff), Finnish (fi), French (fr), Western Frisian (fy), Irish (ga), Gaelic; Scottish Gaelic (gd), Galician (gl), Gujarati (gu), Hausa (ha), Hebrew (he), Hindi (hi), Croatian (hr), Haitian; Haitian Creole (ht), Hungarian (hu), Armenian (hy), Indonesian (id), Igbo (ig), Iloko (ilo), Icelandic (is), Italian (it), Japanese (ja), Javanese (jv), Georgian (ka), Kazakh (kk), Central Khmer (km), Kannada (kn), Korean (ko), Luxembourgish; Letzeburgesch (lb), Ganda (lg), Lingala (ln), Lao (lo), Lithuanian (lt), Latvian (lv), Malagasy (mg), Macedonian (mk), Malayalam (ml), Mongolian (mn), Marathi (mr), Malay (ms), Burmese (my), Nepali (ne), Dutch; Flemish (nl), Norwegian (no), Northern Sotho (ns), Occitan (post 1500) (oc), Oriya (or), Panjabi; Punjabi (pa), Polish (pl), Pushto; Pashto (ps), Portuguese (pt), Romanian; Moldavian; Moldovan (ro), Russian (ru), Sindhi (sd), Sinhala; Sinhalese (si), Slovak (sk), Slovenian (sl), Somali (so), Albanian (sq), Serbian (sr), Swati (ss), Sundanese (su), Swedish (sv), Swahili (sw), Tamil (ta), Thai (th), Tagalog (tl), Tswana (tn), Turkish (tr), Ukrainian (uk), Urdu (ur), Uzbek (uz), Vietnamese (vi), Wolof (wo), Xhosa (xh), Yiddish (yi), Yoruba (yo), Chinese (zh), Zulu (zu)
## Resources
- Original model: https://huggingface.co/facebook/m2m100_418M
- CTranslate2: https://github.com/OpenNMT/CTranslate2