---
language:
- ru
- zh
- en
tags:
- translation
- text2text-generation
- t5
license: apache-2.0
datasets:
- ccmatrix
metrics:
- sacrebleu
widget:
- example_title: translate zh-ru
  text: >
    translate to ru: 开发的目的是为用户提供个人同步翻译。
- example_title: translate ru-en
  text: >
    translate to en: Цель разработки — предоставить пользователям личного синхронного переводчика.
- example_title: translate en-ru
  text: >
    translate to ru: The purpose of the development is to provide users with a personal synchronized interpreter.
- example_title: translate en-zh
  text: >
    translate to zh: The purpose of the development is to provide users with a personal synchronized interpreter.
- example_title: translate zh-en
  text: >
    translate to en: 开发的目的是为用户提供个人同步解释器。
- example_title: translate ru-zh
  text: >
    translate to zh: Цель разработки — предоставить пользователям личного синхронного переводчика.
model-index:
- name: utrobinmv/t5_translate_en_ru_zh_base_200
  results:
  - task:
      type: translation
      name: Translation en-ru
    dataset:
      name: ntrex_en-ru
      type: ntrex
      config: ntrex en-ru
      split: test
    metrics:
    - type: sacrebleu
      value: 28.575940911021487
      name: bleu
      verified: false
    - type: chrf
      value: 54.27996346886896
      name: chrf
      verified: false
    - type: ter
      value: 62.494863914873584
      name: ter
      verified: false
    - type: meteor
      value: 0.5174833677740809
      name: meteor
      verified: false
    - type: rouge
      value: 0.1908317951570274
      name: ROUGE-1
      verified: false
    - type: rouge
      value: 0.065555552204933
      name: ROUGE-2
      verified: false
    - type: rouge
      value: 0.1895542893295215
      name: ROUGE-L
      verified: false
    - type: rouge
      value: 0.1893813749889601
      name: ROUGE-LSUM
      verified: false
    - type: bertscore
      value: 0.8554933660030365
      name: bertscore_f1
      verified: false
    - type: bertscore
      value: 0.8578473615646363
      name: bertscore_precision
      verified: false
    - type: bertscore
      value: 0.8534188346862793
      name: bertscore_recall
      verified: false
    source:
      name: NTREX dataset Benchmark
      url: https://huggingface.co/spaces/utrobinmv/TREX_benchmark_en_ru_zh
- name: utrobinmv/t5_translate_en_ru_zh_base_200
  results:
  - task:
      type: translation
      name: Translation ru-en
    dataset:
      name: ntrex_ru-en
      type: ntrex
      config: ntrex ru-en
      split: test
    metrics:
    - type: sacrebleu
      value: 28.575940911021487
      name: bleu
      verified: false
    - type: chrf
      value: 54.27996346886896
      name: chrf
      verified: false
    - type: ter
      value: 62.494863914873584
      name: ter
      verified: false
    - type: meteor
      value: 0.5174833677740809
      name: meteor
      verified: false
    - type: rouge
      value: 0.1908317951570274
      name: ROUGE-1
      verified: false
    - type: rouge
      value: 0.065555552204933
      name: ROUGE-2
      verified: false
    - type: rouge
      value: 0.1895542893295215
      name: ROUGE-L
      verified: false
    - type: rouge
      value: 0.1893813749889601
      name: ROUGE-LSUM
      verified: false
    - type: bertscore
      value: 0.8554933660030365
      name: bertscore_f1
      verified: false
    - type: bertscore
      value: 0.8578473615646363
      name: bertscore_precision
      verified: false
    - type: bertscore
      value: 0.8534188346862793
      name: bertscore_recall
      verified: false
    source:
      name: NTREX dataset Benchmark
      url: https://huggingface.co/spaces/utrobinmv/TREX_benchmark_en_ru_zh
---
# T5 English, Russian and Chinese multilingual machine translation
This model is a standard T5 transformer trained in multitask mode and fine-tuned for machine translation across the pairs ru-zh, zh-ru, en-zh, zh-en, en-ru, and ru-en.
The model translates directly between any pair of Russian, Chinese, and English. To select the target language, prepend the prefix `translate to <lang>:` to the input text. The source language does not need to be specified, and the source text may mix languages.
Example: translating Russian to Chinese
```python
from transformers import T5ForConditionalGeneration, T5Tokenizer
model_name = 'utrobinmv/t5_translate_en_ru_zh_small_1024'
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)
prefix = 'translate to zh: '
src_text = prefix + "Цель разработки — предоставить пользователям личного синхронного переводчика."
# Translate Russian to Chinese
inputs = tokenizer(src_text, return_tensors="pt")  # holds input_ids and attention_mask
generated_tokens = model.generate(**inputs)
result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
print(result)
#开发的目的是为用户提供个人同步翻译。
```
Example: translating Chinese to Russian
```python
from transformers import T5ForConditionalGeneration, T5Tokenizer
model_name = 'utrobinmv/t5_translate_en_ru_zh_small_1024'
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)
prefix = 'translate to ru: '
src_text = prefix + "开发的目的是为用户提供个人同步翻译。"
# Translate Chinese to Russian
inputs = tokenizer(src_text, return_tensors="pt")  # holds input_ids and attention_mask
generated_tokens = model.generate(**inputs)
result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
print(result)
#Цель разработки - предоставить пользователям персональный синхронный перевод.
```
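The prefix convention can also be wrapped in a small helper so that several sentences are translated in one call. The sketch below is illustrative only: `SUPPORTED_TARGETS`, `build_prompt`, and `translate_batch` are hypothetical names, not part of the model or of `transformers`, and the code assumes `model` and `tokenizer` were loaded as in the examples above.

```python
# Hypothetical helper names; only the 'translate to <lang>: ' prefix comes from the model card.
SUPPORTED_TARGETS = ("ru", "zh", "en")


def build_prompt(text, target_lang):
    """Prepend the target-language prefix the model expects."""
    if target_lang not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported target language: {target_lang!r}")
    return f"translate to {target_lang}: {text}"


def translate_batch(texts, target_lang, model, tokenizer, **generate_kwargs):
    """Translate a list of source texts into target_lang in one generate() call."""
    prompts = [build_prompt(text, target_lang) for text in texts]
    # padding=True pads the shorter prompts so the batch forms a rectangular tensor
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)
    generated_tokens = model.generate(**inputs, **generate_kwargs)
    return tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
```

With the objects from the earlier examples, `translate_batch(["开发的目的是为用户提供个人同步翻译。"], "en", model, tokenizer)` would return a list with one translated string. Because the source language is never passed, the same call should also accept inputs in different source languages within one batch.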
## Languages covered
Russian (ru_RU), Chinese (zh_CN), English (en_US)