Update README.md
README.md
CHANGED
@@ -74,7 +74,7 @@ And cite our work:
 
 ## Model hosted here
 
-This is a many-to-many model for
+This is a many-to-many model for translation into and out of Creole languages, fine-tuned on top of `facebook/mbart-large-50-many-to-many-mmt`, with only public data.
 
 Usage:
 
@@ -82,13 +82,13 @@ Usage:
 from transformers import MBartForConditionalGeneration, AutoModelForSeq2SeqLM
 from transformers import MbartTokenizer, AutoTokenizer
 
-tokenizer = AutoTokenizer.from_pretrained("
+tokenizer = AutoTokenizer.from_pretrained("jhu-clsp/kreyol-mt-pubtrain", do_lower_case=False, use_fast=False, keep_accents=True)
 
-# Or use tokenizer = MbartTokenizer.from_pretrained("
+# Or use tokenizer = MbartTokenizer.from_pretrained("jhu-clsp/kreyol-mt-pubtrain", use_fast=False)
 
-model = AutoModelForSeq2SeqLM.from_pretrained("
+model = AutoModelForSeq2SeqLM.from_pretrained("jhu-clsp/kreyol-mt-pubtrain")
 
-# Or use model = MBartForConditionalGeneration.from_pretrained("
+# Or use model = MBartForConditionalGeneration.from_pretrained("jhu-clsp/kreyol-mt-pubtrain")
 
 # First tokenize the input and outputs. The format below is how the model was trained so the input should be "Sentence </s> SRCCODE". Similarly, the output should be "TGTCODE Sentence </s>".
 # Example: For Saint Lucian Patois to English translation, we need to use language indicator tags: <2acf> and <2eng> where acf represents Saint Lucian Patois and eng represents English.
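Note for reviewers: the usage comments describe the training format as "Sentence </s> SRCCODE" for the input and "TGTCODE Sentence </s>" for the output, with tags such as `<2acf>` and `<2eng>`. A minimal sketch of building those strings is below. The helper names `lang_tag`, `format_source`, and `format_target` are hypothetical and not part of the README, and the sentences are placeholders. Separately, the import is spelled `MbartTokenizer` in the README; as far as I know the class exported by `transformers` is `MBartTokenizer`, so that alternative line may raise an `ImportError` as written.

```python
# Hypothetical helpers (not from the README) that assemble the tagged strings
# described in the usage comments. No model or tokenizer is loaded here.

EOS = "</s>"  # end-of-sentence marker named in the README comments


def lang_tag(code: str) -> str:
    """Turn a language code into an indicator tag, e.g. 'acf' -> '<2acf>'."""
    return f"<2{code}>"


def format_source(sentence: str, src_code: str) -> str:
    """Build the input string in the form 'Sentence </s> SRCCODE'."""
    return f"{sentence.strip()} {EOS} {lang_tag(src_code)}"


def format_target(sentence: str, tgt_code: str) -> str:
    """Build the output string in the form 'TGTCODE Sentence </s>'."""
    return f"{lang_tag(tgt_code)} {sentence.strip()} {EOS}"


if __name__ == "__main__":
    # Placeholder sentences; acf = Saint Lucian Patois, eng = English.
    print(format_source("Source sentence here.", "acf"))
    print(format_target("Target sentence here.", "eng"))
```

The resulting strings are what would be handed to the tokenizer loaded in the usage snippet; keeping the formatting in one place makes it harder to swap the tag positions by accident.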