Update README.md
README.md
@@ -19,8 +19,8 @@ tags:
 mBART-50 is a multilingual Sequence-to-Sequence model pre-trained using the "Multilingual Denoising Pretraining" objective. It was introduced in the [Multilingual Translation with Extensible Multilingual Pretraining and Finetuning](https://arxiv.org/abs/2008.00401) paper.


-mBART-50 is a multilingual Sequence-to-Sequence model. It was
-Instead of fine-tuning on one direction, a pre-trained model is fine-tuned
+mBART-50 is a multilingual Sequence-to-Sequence model. It was introduced to show that multilingual translation models can be created through multilingual fine-tuning.
+Instead of fine-tuning on one direction, a pre-trained model is fine-tuned on many directions simultaneously. mBART-50 is created by extending the original mBART model with 25 additional languages, supporting multilingual machine translation across 50 languages. The pre-training objective is explained below.
 **Multilingual Denoising Pretraining**: The model incorporates N languages by concatenating data:
 `D = {D1, ..., DN }` where each Di is a collection of monolingual documents in language `i`. The source documents are noised using two schemes,
 first randomly shuffling the original sentences' order, and second a novel in-filling scheme,
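
Once fine-tuned this way, a single checkpoint can translate between many directions. The sketch below uses the Hugging Face `transformers` API; the checkpoint id `facebook/mbart-large-50-many-to-many-mmt` and the English-to-French direction are assumptions for illustration, since the diff does not name the exact repository.

```python
# Minimal sketch, assuming a many-to-many mBART-50 checkpoint.
# "facebook/mbart-large-50-many-to-many-mmt" is an assumed id, not taken from this diff.
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

checkpoint = "facebook/mbart-large-50-many-to-many-mmt"  # assumed checkpoint id
tokenizer = MBart50TokenizerFast.from_pretrained(checkpoint)
model = MBartForConditionalGeneration.from_pretrained(checkpoint)

# English -> French: set the source language, then force the decoder to start
# with the target-language id token.
tokenizer.src_lang = "en_XX"
inputs = tokenizer("The head of the UN says there is no military solution in Syria.", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fr_XX"),
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```
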
@@ -37,7 +37,7 @@ The decoder input is the original text with one position offset. A language id s

 ### Dataset Summary

-OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side. The corpus covers 100 languages (including English).
+OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side. The corpus covers 100 languages (including English). Languages were selected based on the volume of parallel data available in OPUS.


 ### Languages
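
For reference, a single OPUS-100 language pair can be loaded with the `datasets` library. This is only a sketch: the Hub id `Helsinki-NLP/opus-100` and the `en-fr` configuration are assumptions, so check the Hub for the exact identifiers.

```python
# Sketch only: the dataset id and config name are assumptions, not taken from this diff.
from datasets import load_dataset

opus = load_dataset("Helsinki-NLP/opus-100", "en-fr", split="train")
print(opus[0])  # expected form: {"translation": {"en": "...", "fr": "..."}}
```
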