metadata

language:
  - ace
  - acm
  - acq
  - aeb
  - af
  - ajp
  - ak
  - als
  - am
  - apc
  - ar
  - ars
  - ary
  - arz
  - as
  - ast
  - awa
  - ayr
  - azb
  - azj
  - ba
  - bm
  - ban
  - be
  - bem
  - bn
  - bho
  - bjn
  - bo
  - bs
  - bug
  - bg
  - ca
  - ceb
  - cs
  - cjk
  - ckb
  - crh
  - cy
  - da
  - de
  - dik
  - dyu
  - dz
  - el
  - en
  - eo
  - et
  - eu
  - ee
  - fo
  - fj
  - fi
  - fon
  - fr
  - fur
  - fuv
  - gaz
  - gd
  - ga
  - gl
  - gn
  - gu
  - ht
  - ha
  - he
  - hi
  - hne
  - hr
  - hu
  - hy
  - ig
  - ilo
  - id
  - is
  - it
  - jv
  - ja
  - kab
  - kac
  - kam
  - kn
  - ks
  - ka
  - kk
  - kbp
  - kea
  - khk
  - km
  - ki
  - rw
  - ky
  - kmb
  - kmr
  - knc
  - kg
  - ko
  - lo
  - lij
  - li
  - ln
  - lt
  - lmo
  - ltg
  - lb
  - lua
  - lg
  - luo
  - lus
  - lvs
  - mag
  - mai
  - ml
  - mar
  - min
  - mk
  - mt
  - mni
  - mos
  - mi
  - my
  - nl
  - nn
  - nb
  - npi
  - nso
  - nus
  - ny
  - oc
  - ory
  - pag
  - pa
  - pap
  - pbt
  - pes
  - plt
  - pl
  - pt
  - prs
  - quy
  - ro
  - rn
  - ru
  - sg
  - sa
  - sat
  - scn
  - shn
  - si
  - sk
  - sl
  - sm
  - sn
  - sd
  - so
  - st
  - es
  - sc
  - sr
  - ss
  - su
  - sv
  - swh
  - szl
  - ta
  - taq
  - tt
  - te
  - tg
  - tl
  - th
  - ti
  - tpi
  - tn
  - ts
  - tk
  - tum
  - tr
  - tw
  - tzm
  - ug
  - uk
  - umb
  - ur
  - uzn
  - vec
  - vi
  - war
  - wo
  - xh
  - ydd
  - yo
  - yue
  - zh
  - zsm
  - zu
language_details: >-
  ace_Arab, ace_Latn, acm_Arab, acq_Arab, aeb_Arab, afr_Latn, ajp_Arab,
  aka_Latn, amh_Ethi, apc_Arab, arb_Arab, ars_Arab, ary_Arab, arz_Arab,
  asm_Beng, ast_Latn, awa_Deva, ayr_Latn, azb_Arab, azj_Latn, bak_Cyrl,
  bam_Latn, ban_Latn,bel_Cyrl, bem_Latn, ben_Beng, bho_Deva, bjn_Arab, bjn_Latn,
  bod_Tibt, bos_Latn, bug_Latn, bul_Cyrl, cat_Latn, ceb_Latn, ces_Latn,
  cjk_Latn, ckb_Arab, crh_Latn, cym_Latn, dan_Latn, deu_Latn, dik_Latn,
  dyu_Latn, dzo_Tibt, ell_Grek, eng_Latn, epo_Latn, est_Latn, eus_Latn,
  ewe_Latn, fao_Latn, pes_Arab, fij_Latn, fin_Latn, fon_Latn, fra_Latn,
  fur_Latn, fuv_Latn, gla_Latn, gle_Latn, glg_Latn, grn_Latn, guj_Gujr,
  hat_Latn, hau_Latn, heb_Hebr, hin_Deva, hne_Deva, hrv_Latn, hun_Latn,
  hye_Armn, ibo_Latn, ilo_Latn, ind_Latn, isl_Latn, ita_Latn, jav_Latn,
  jpn_Jpan, kab_Latn, kac_Latn, kam_Latn, kan_Knda, kas_Arab, kas_Deva,
  kat_Geor, knc_Arab, knc_Latn, kaz_Cyrl, kbp_Latn, kea_Latn, khm_Khmr,
  kik_Latn, kin_Latn, kir_Cyrl, kmb_Latn, kon_Latn, kor_Hang, kmr_Latn,
  lao_Laoo, lvs_Latn, lij_Latn, lim_Latn, lin_Latn, lit_Latn, lmo_Latn,
  ltg_Latn, ltz_Latn, lua_Latn, lug_Latn, luo_Latn, lus_Latn, mag_Deva,
  mai_Deva, mal_Mlym, mar_Deva, min_Latn, mkd_Cyrl, plt_Latn, mlt_Latn,
  mni_Beng, khk_Cyrl, mos_Latn, mri_Latn, zsm_Latn, mya_Mymr, nld_Latn,
  nno_Latn, nob_Latn, npi_Deva, nso_Latn, nus_Latn, nya_Latn, oci_Latn,
  gaz_Latn, ory_Orya, pag_Latn, pan_Guru, pap_Latn, pol_Latn, por_Latn,
  prs_Arab, pbt_Arab, quy_Latn, ron_Latn, run_Latn, rus_Cyrl, sag_Latn,
  san_Deva, sat_Beng, scn_Latn, shn_Mymr, sin_Sinh, slk_Latn, slv_Latn,
  smo_Latn, sna_Latn, snd_Arab, som_Latn, sot_Latn, spa_Latn, als_Latn,
  srd_Latn, srp_Cyrl, ssw_Latn, sun_Latn, swe_Latn, swh_Latn, szl_Latn,
  tam_Taml, tat_Cyrl, tel_Telu, tgk_Cyrl, tgl_Latn, tha_Thai, tir_Ethi,
  taq_Latn, taq_Tfng, tpi_Latn, tsn_Latn, tso_Latn, tuk_Latn, tum_Latn,
  tur_Latn, twi_Latn, tzm_Tfng, uig_Arab, ukr_Cyrl, umb_Latn, urd_Arab,
  uzn_Latn, vec_Latn, vie_Latn, war_Latn, wol_Latn, xho_Latn, ydd_Hebr,
  yor_Latn, yue_Hant, zho_Hans, zho_Hant, zul_Latn
tags:
  - nllb
  - nllb-moe
  - translation
license: cc-by-nc-4.0
datasets:
  - flores-200
metrics:
  - bleu
  - spbleu
  - chrf++
inference: false

NLLB-MoE

This is the model card of NLLB-MoE variant.

Information about training algorithms, parameters, fairness constraints or other applied approaches, and features. The exact training algorithm, data and the strategies to handle data imbalances for high and low resource languages that were used to train NLLB-200 is described in the paper.
Paper or other resource for more information NLLB Team et al, No Language Left Behind: Scaling Human-Centered Machine Translation, Arxiv, 2022
License: CC-BY-NC
Where to send questions or comments about the model: https://github.com/facebookresearch/fairseq/issues

The NLLB model was presented in No Language Left Behind: Scaling Human-Centered Machine Translation by Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula, Loic Barrault, Gabriel Mejia Gonzalez, Prangthip Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, Pierre Andrews, Necip Fazil Ayan, Shruti Bhosale, Sergey Edunov, Angela Fan, Cynthia Gao, Vedanuj Goswami, Francisco Guzmán, Philipp Koehn, Alexandre Mourachko, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, and Jeff Wang.

Generating with NLLB-MoE

The avalable checkpoints requires around 350GB of storage. Make sure to use accelerate if you do not have enough RAM on your machine.

While generating the target text set the forced_bos_token_id to the target language id. The following example shows how to translate English to French using the facebook/nllb-200-distilled-600M model.

Note that we're using the BCP-47 code for French fra_Latn. See here for the list of all BCP-47 in the Flores 200 dataset.

>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-moe-54b")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-moe-54b")

>>> article = "UN Chief says there is no military solution in Syria"
>>> inputs = tokenizer(article, return_tensors="pt")

>>> translated_tokens = model.generate(
...     **inputs, forced_bos_token_id=tokenizer.lang_code_to_id["fra_Latn"], max_length=30
... )
>>> tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
Le chef de l'ONU dit qu'il n'y a pas de solution militaire en Syrie