	---
	language:
	- ja
	license: apache-2.0
	tags:
	- summarization
	- generated_from_trainer
	- mt5
	metrics:
	- rouge
	widget:
	- text: 世界中では約120のワクチンの開発が進められている。英オックスフォード大学の専門家たちはすでに臨床試験を開始している。新しいアプローチ多くの従来のワクチンは、弱体化させたウイルスや改変したウイルスなどがもとになっている。しかし今回のワクチンは新しいアプローチに基づいたもので、遺伝子のRNA（リボ核酸）を使う。
	筋肉に注射すると、RNAは自己増殖し、新型ウイルスの表面にみられるスパイクタンパク質のコピーをつくるよう、体内の細胞に指示を出す。この方法で、COVID-19（新型ウイルスによる感染症）を発症することなく新型ウイルスを認識して戦うための免疫システムを訓練できるという。
	シャトック教授は、「我々はゼロからワクチンを製造し、わずか数カ月で臨床試験に持ち込むことができた」と述べた。
	- text: サッカーのワールドカップカタール大会、世界ランキング24位でグループEに属する日本は、23日の1次リーグ初戦において、世界11位で過去4回の優勝を誇るドイツと対戦しました。試合は前半、ドイツの一方的なペースではじまりましたが、後半、日本の森保監督は攻撃的な選手を積極的に動員して流れを変えました。結局、日本は前半に1点を奪われましたが、途中出場の堂安律選手と浅野拓磨選手が後半にゴールを決め、2対1で逆転勝ちしました。ゲームの流れをつかんだ森保采配が功を奏しました。
	base_model: google/mt5-small
	model-index:
	- name: mt5_summarize_japanese
	  results: []
	---


	# mt5_summarize_japanese

	(Japanese caption: 日本語の要約のモデル, i.e. "Japanese summarization model")

	This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) trained for Japanese summarization.

	The model was fine-tuned on BBC news articles from the [XL-Sum Japanese dataset](https://huggingface.co/datasets/csebuetnlp/xlsum/viewer/japanese), in which the first sentence (the headline sentence) serves as the summary and the remaining sentences serve as the article.<br>
	For best results, enter a news story (for example, one covering an event, its background, the outcome, and comments) as the source text in the inference widget. Other kinds of text, such as conversations, business documents, academic papers, or short stories, do not appear in the training set.
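
The headline/article split described above can be illustrated with a small sketch. This is not the dataset's or the author's preprocessing code: the function name `split_headline_and_body`, the choice of `。` as the sentence delimiter, and the sample text are all assumptions made for illustration.

```python
def split_headline_and_body(article: str, delimiter: str = "。"):
    """Split text into (first sentence, remaining text) at the first delimiter.

    Hypothetical helper: the delimiter choice is an assumption, not taken
    from the XL-Sum dataset or the fine-tuning notebook.
    """
    head, sep, rest = article.partition(delimiter)
    if not sep:
        # No delimiter found: return an empty summary and the whole text as body.
        return "", article.strip()
    return (head + sep).strip(), rest.strip()


# Made-up sample text, used only to show the shape of the output.
sample = "日本がドイツに逆転勝ちした。試合は前半、ドイツのペースで始まった。後半に2点を決めた。"
summary, body = split_headline_and_body(sample)
print(summary)
print(body)
```

The first returned value plays the role of the summary (headline sentence) and the second the role of the article.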

	It achieves the following results on the evaluation set:
	- Loss: 1.8952
	- Rouge1: 0.4625
	- Rouge2: 0.2866
	- Rougel: 0.3656
	- Rougelsum: 0.3868
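
To make the ROUGE numbers above easier to interpret, here is a minimal sketch of ROUGE-N as an F1 score over n-gram overlap. The card does not state how text was tokenized for scoring, so the character-level tokens used here are an assumption, and `rouge_n_f1` is a hypothetical helper rather than the scorer used to produce the reported results.

```python
from collections import Counter


def rouge_n_f1(reference_tokens, candidate_tokens, n=1):
    """F1 of n-gram overlap between a reference and a candidate token list."""

    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference_tokens), ngrams(candidate_tokens)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Assumption: character-level tokens; the sentences are made-up examples.
reference = list("日本が勝った")
candidate = list("日本が負けた")
print(round(rouge_n_f1(reference, candidate, n=1), 4))
print(round(rouge_n_f1(reference, candidate, n=2), 4))
```

Scores computed with a different tokenizer will differ, so values from this sketch are not directly comparable to the figures reported above.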

	## Intended uses

	```python
	from transformers import pipeline

	# Load the fine-tuned model from the Hugging Face Hub as a summarization pipeline.
	seq2seq = pipeline("summarization", model="tsmatz/mt5_summarize_japanese")
	sample_text = "サッカーのワールドカップカタール大会、世界ランキング24位でグループEに属する日本は、23日の1次リーグ初戦において、世界11位で過去4回の優勝を誇るドイツと対戦しました。試合は前半、ドイツの一方的なペースではじまりましたが、後半、日本の森保監督は攻撃的な選手を積極的に動員して流れを変えました。結局、日本は前半に1点を奪われましたが、途中出場の堂安律選手と浅野拓磨選手が後半にゴールを決め、2対1で逆転勝ちしました。ゲームの流れをつかんだ森保采配が功を奏しました。"
	# Run summarization and print the raw pipeline output.
	result = seq2seq(sample_text)
	print(result)
	```
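
The summarization pipeline returns a list with one dictionary per input, with the generated text under the `summary_text` key. The sketch below pulls out just the strings; `summaries_from` is a hypothetical helper name, and `fake_result` is a stand-in with placeholder text, not real model output.

```python
def summaries_from(result):
    """Return the summary strings from a summarization pipeline result."""
    return [item["summary_text"] for item in result]


# Stand-in for `result` from the snippet above; placeholder text, not model output.
fake_result = [{"summary_text": "placeholder summary"}]
print(summaries_from(fake_result)[0])
```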

	## Training procedure

	The source code for fine-tuning is available in [this notebook](https://github.com/tsmatz/huggingface-finetune-japanese/blob/master/02-summarize.ipynb).

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0005
	- train_batch_size: 2
	- eval_batch_size: 1
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 32
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 90
	- num_epochs: 10
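
Two of the values above follow from the others, and a short sketch shows the arithmetic. The total batch size is the per-step batch size times the gradient accumulation steps (2 × 16 = 32, which implies a single device). The linear scheduler ramps the learning rate up over the warmup steps and then decays it linearly to zero. The helper names are hypothetical, and `total_steps=2770` is an assumption estimated from the training log (step 2700 at epoch 9.74), not a value stated in the card.

```python
def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Number of examples contributing to each optimizer update."""
    return per_device_batch * accumulation_steps * num_devices


def linear_lr(step, base_lr=0.0005, warmup_steps=90, total_steps=2770):
    """Learning rate under linear warmup followed by linear decay to zero.

    total_steps is an assumed estimate, not a value given in the card.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(2, 16))
for step in (0, 45, 90, 1430, 2770):
    print(step, linear_lr(step))
```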

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
	| 4.2501 | 0.36 | 100 | 3.3685 | 0.3114 | 0.1654 | 0.2627 | 0.2694 |
	| 3.6436 | 0.72 | 200 | 3.0095 | 0.3023 | 0.1634 | 0.2684 | 0.2764 |
	| 3.3044 | 1.08 | 300 | 2.8025 | 0.3414 | 0.1789 | 0.2912 | 0.2984 |
	| 3.2693 | 1.44 | 400 | 2.6284 | 0.3616 | 0.1935 | 0.2979 | 0.3132 |
	| 3.2025 | 1.8 | 500 | 2.5271 | 0.3790 | 0.2042 | 0.3046 | 0.3192 |
	| 2.9772 | 2.17 | 600 | 2.4203 | 0.4083 | 0.2374 | 0.3422 | 0.3542 |
	| 2.9133 | 2.53 | 700 | 2.3863 | 0.3847 | 0.2096 | 0.3316 | 0.3406 |
	| 2.9383 | 2.89 | 800 | 2.3573 | 0.4016 | 0.2297 | 0.3361 | 0.3500 |
	| 2.7608 | 3.25 | 900 | 2.3223 | 0.3999 | 0.2249 | 0.3461 | 0.3566 |
	| 2.7864 | 3.61 | 1000 | 2.2293 | 0.3932 | 0.2219 | 0.3297 | 0.3445 |
	| 2.7846 | 3.97 | 1100 | 2.2097 | 0.4386 | 0.2617 | 0.3766 | 0.3826 |
	| 2.7495 | 4.33 | 1200 | 2.1879 | 0.4100 | 0.2449 | 0.3481 | 0.3551 |
	| 2.6092 | 4.69 | 1300 | 2.1515 | 0.4398 | 0.2714 | 0.3787 | 0.3842 |
	| 2.5598 | 5.05 | 1400 | 2.1195 | 0.4366 | 0.2545 | 0.3621 | 0.3736 |
	| 2.5283 | 5.41 | 1500 | 2.0637 | 0.4274 | 0.2551 | 0.3649 | 0.3753 |
	| 2.5947 | 5.77 | 1600 | 2.0588 | 0.4454 | 0.2800 | 0.3828 | 0.3921 |
	| 2.5354 | 6.14 | 1700 | 2.0357 | 0.4253 | 0.2582 | 0.3546 | 0.3687 |
	| 2.5203 | 6.5 | 1800 | 2.0263 | 0.4444 | 0.2686 | 0.3648 | 0.3764 |
	| 2.5303 | 6.86 | 1900 | 1.9926 | 0.4455 | 0.2771 | 0.3795 | 0.3948 |
	| 2.4953 | 7.22 | 2000 | 1.9576 | 0.4523 | 0.2873 | 0.3869 | 0.4053 |
	| 2.4271 | 7.58 | 2100 | 1.9384 | 0.4455 | 0.2811 | 0.3713 | 0.3862 |
	| 2.4462 | 7.94 | 2200 | 1.9230 | 0.4530 | 0.2846 | 0.3754 | 0.3947 |
	| 2.3303 | 8.3 | 2300 | 1.9311 | 0.4519 | 0.2814 | 0.3755 | 0.3887 |
	| 2.3916 | 8.66 | 2400 | 1.9213 | 0.4598 | 0.2897 | 0.3688 | 0.3889 |
	| 2.5995 | 9.03 | 2500 | 1.9060 | 0.4526 | 0.2820 | 0.3733 | 0.3946 |
	| 2.3348 | 9.39 | 2600 | 1.9021 | 0.4595 | 0.2856 | 0.3762 | 0.3988 |
	| 2.4035 | 9.74 | 2700 | 1.8952 | 0.4625 | 0.2866 | 0.3656 | 0.3868 |
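
The final checkpoint (step 2700) has the lowest validation loss and the highest Rouge1 in the log, but not the highest score on every metric. The sketch below compares three rows copied from the table; `best_step` and `COLUMNS` are hypothetical names introduced only for this illustration.

```python
# (step, validation_loss, rouge1, rouge2, rougeL, rougeLsum)
# Three rows copied from the training log above.
rows = [
    (2000, 1.9576, 0.4523, 0.2873, 0.3869, 0.4053),
    (2400, 1.9213, 0.4598, 0.2897, 0.3688, 0.3889),
    (2700, 1.8952, 0.4625, 0.2866, 0.3656, 0.3868),
]

COLUMNS = {"validation_loss": 1, "rouge1": 2, "rouge2": 3, "rougeL": 4, "rougeLsum": 5}


def best_step(rows, metric, lower_is_better=False):
    """Return the step whose row has the best value for the given metric."""
    idx = COLUMNS[metric]
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: row[idx])[0]


print(best_step(rows, "validation_loss", lower_is_better=True))
for metric in ("rouge1", "rouge2", "rougeL", "rougeLsum"):
    print(metric, best_step(rows, metric))
```

Which checkpoint counts as best therefore depends on the metric chosen for model selection.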


	### Framework versions

	- Transformers 4.23.1
	- PyTorch 1.12.1+cu102
	- Datasets 2.6.1
	- Tokenizers 0.13.1