puettmann/Quadrifoglio-mt-en-it

Leonard Püttmann committed 15 days ago

Commit 7c19dd7 · verified · 1 Parent(s): c954467

Update README.md

Files changed (1)

README.md +31 -0

@@ -40,6 +40,37 @@ response = generate_response(text_to_translate)
 print(response)
 ```
+Because this model was trained on sentence pairs, it works best when longer texts are split into individual sentences before translation, ideally with spaCy:
+```python
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+import spacy
+# First, install spaCy and the English language model if you haven't already
+# !pip install spacy
+# !python -m spacy download en_core_web_sm
+nlp = spacy.load("en_core_web_sm")
+tokenizer = AutoTokenizer.from_pretrained("LeonardPuettmann/mt0-Quadrifoglio-mt-en-it")
+model = AutoModelForSeq2SeqLM.from_pretrained("LeonardPuettmann/mt0-Quadrifoglio-mt-en-it")
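+def generate_response(input_text):
+    # NOTE: the prefix reads "Italian to English" although this is the en-it model and the
+    # input below is English; check it against the prompt in the first usage example.
+    input_ids = tokenizer("translate Italian to English: " + input_text, return_tensors="pt").input_ids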
+    output = model.generate(input_ids, max_new_tokens=256)
+    return tokenizer.decode(output[0], skip_special_tokens=True)
+text = "How are you doing? Today is a beautiful day. I hope you are doing fine."
+doc = nlp(text)
+sentences = [sent.text for sent in doc.sents]
+# Translate each sentence separately, then join the results
+sentence_translations = [generate_response(sentence) for sentence in sentences]
+full_translation = " ".join(sentence_translations)
+print(full_translation)
+```
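The split-translate-join pattern above can also be wrapped in a small helper that does not depend on a particular splitter or model. The following is a minimal sketch, not part of the model's API: `split_sentences` and `translate_by_sentence` are hypothetical names, and the regex splitter is only an assumed stand-in for spaCy's `doc.sents` (it will mis-split abbreviations such as "Dr.").

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def translate_by_sentence(
    text: str,
    translate_fn: Callable[[str], str],
    split_fn: Callable[[str], List[str]] = split_sentences,
) -> str:
    """Split text into sentences, translate each one, and join the results with spaces."""
    return " ".join(translate_fn(sentence) for sentence in split_fn(text))
```

With the objects from the README example in scope, this could be called as `translate_by_sentence(text, generate_response, lambda t: [s.text for s in nlp(t).sents])` to keep spaCy as the splitter; passing the translation function as an argument also makes the helper easy to try out with a stub before loading the model.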
 ## Evaluation
 Done on the Opus 100 test set.