iryneko571/CCMatrix-v1-Ja_Zh-fused
How to use iryneko571/mt5-small-translation-ja_zh with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "translation" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline
pipe = pipeline("translation", model="iryneko571/mt5-small-translation-ja_zh")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("iryneko571/mt5-small-translation-ja_zh")
model = AutoModelForSeq2SeqLM.from_pretrained("iryneko571/mt5-small-translation-ja_zh")

No local environment setup is needed: you can test the model in Colab.
https://colab.research.google.com/drive/1PA30HPgRooCTV-H9Wr_DZXHqC42PrgTO?usp=sharing
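Right now the translation quality is at "artificial monkey" level. The problem is not a lack of vocabulary; the model simply cannot learn any more.
This model has trouble learning more because of its 300M size and my poor techniques.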
from transformers import pipeline

model_name = "iryneko571/mt5-small-translation-ja_zh"
# One-line form with an explicit tokenizer:
#pipe = pipeline("translation",model=model_name,tokenizer=model_name,repetition_penalty=1.4,batch_size=1,max_length=256)
pipe = pipeline(
    "translation",
    model=model_name,
    repetition_penalty=1.4,
    batch_size=1,
    max_length=256,
)

def translate_batch(batch, language='<-ja2zh->'):
    """Translate a list of strings; `language` is the direction tag prepended to each one."""
    # Prefix every sentence with the direction tag (builds a new list, so `batch` is not modified)
    tagged = [f'{language} {text}' for text in batch]
    translated = pipe(tagged)
    # Each pipeline result is a dict; keep only the translated text
    return [item['translation_text'] for item in translated]

inputs = []  # fill in the strings to translate
print(translate_batch(inputs))
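
The warning near the top of this card says the "translation" pipeline type is not supported in transformers v5 and that the model must then be loaded directly. Below is a minimal sketch of what the pipeline example above might look like in that case. The function names (`add_direction_tag`, `load_model`, `translate_batch_direct`) are hypothetical helpers, not part of this repository. The sketch assumes the model expects the same `<-ja2zh->` tag and works with the same `repetition_penalty=1.4` and `max_length=256` settings as in the pipeline example.

```python
# Sketch only: batch translation via AutoTokenizer / AutoModelForSeq2SeqLM
# instead of pipeline("translation"). Helper names are illustrative assumptions.

MODEL_NAME = "iryneko571/mt5-small-translation-ja_zh"


def add_direction_tag(texts, language="<-ja2zh->"):
    """Return a new list with the direction tag prepended to each string."""
    return [f"{language} {text}" for text in texts]


def load_model(model_name=MODEL_NAME):
    """Load tokenizer and model (downloads the weights on first use)."""
    # Imported here so add_direction_tag can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return tokenizer, model


def translate_batch_direct(texts, tokenizer, model, language="<-ja2zh->"):
    """Translate a list of strings with model.generate instead of a pipeline."""
    if not texts:
        return []  # nothing to translate, skip tokenization and generation
    encoded = tokenizer(
        add_direction_tag(texts, language),
        return_tensors="pt",
        padding=True,      # pad to the longest sentence in the batch
        truncation=True,   # cut inputs longer than max_length
        max_length=256,
    )
    output_ids = model.generate(
        **encoded,
        max_length=256,           # assumed: same limit as the pipeline example
        repetition_penalty=1.4,   # assumed: same penalty as the pipeline example
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

To try it, call `tokenizer, model = load_model()` once and then pass a list of Japanese strings to `translate_batch_direct(texts, tokenizer, model)`. Unlike the pipeline with `batch_size=1`, this sketch sends the whole list through `generate` in one padded batch, so very long lists may need to be split into smaller chunks first.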