	---
	license: cc-by-sa-4.0
	base_model: retrieva-jp/t5-base-long
	tags:
	- generated_from_trainer
	- summarization
	- t5
	datasets:
	- csebuetnlp/xlsum
	metrics:
	- rouge
	model-index:
	- name: t5-base-xlsum-ja
	  results:
	  - task:
	      name: Sequence-to-sequence Language Modeling
	      type: text2text-generation
	    dataset:
	      name: csebuetnlp/xlsum
	      type: xlsum
	      config: japanese
	      split: test
	      args: japanese
	    metrics:
	    - name: Rouge1
	      type: rouge
	      value: 0.2719700031314344
	    - name: Rouge2
	      type: rouge
	      value: 0.13633367129422308
	language:
	- ja
	library_name: transformers
	widget:
	- text: >-
	    ブラジルのジャイル・ボルソナロ大統領の新型ウイルス対策は、国内外で大きな批判を受けている
	    首都ブラジリアで自身の66歳の誕生日を祝うイベントに参加したボルソナロ大統領は、政府は新型ウイルス対策に全力を尽くしたとし、今は経済を再開させる時期だと述べた。
	    ブラジルでは先週、保健省の研究機関、オズワルド・クルズ財団（FIOCRUZ）が、同国の保健サービスが歴史的な崩壊に陥っていると警告。国内の病院では集中治療室が満杯になってしまっていると指摘したばかり。
	- text: >-
	    KAMITSUBAKI STUDIOの情報を網羅できる新たな配信プロジェクト、分散型放送局「神椿無電（KAMITSUBAKI
	    RADIO）」がスタートしました！「神椿無電」プロジェクトでは、KAMITSUBAKI
	    STUDIOに所属するアーティストやクリエイターの多彩なプログラムを集約。生放送のコンテンツを中心に、今後予定している配信番組をSCHEDULEページで一覧化が可能です。過去放送された配信番組情報もSCHEDULEページに記録されており、非公開になってしまった放送も含めてこれまでの配信の軌跡を辿ることができます。現在は2023年1月以降に放送された番組が記録されていますが、順次2022年以前の情報も更新していきますので今しばらくお待ちください。その他、PROGRAMページでは現在継続して放送されている番組情報がまとめられており、CHANNELページではKAMITSUBAKI
	    STUDIOに関連するアーティストやクリエイターのSNSリンクを集約。
	---


	# t5-base-xlsum-ja

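	This model is a fine-tuned version of [retrieva-jp/t5-base-long](https://huggingface.co/retrieva-jp/t5-base-long) on the Japanese configuration of the [csebuetnlp/xlsum](https://huggingface.co/datasets/csebuetnlp/xlsum) dataset.
	It achieves the following results on the evaluation set (these match the final logged checkpoint, step 800, in the training results table):
	- Loss: 2.6563
	- Rouge1: 0.3648
	- Rouge2: 0.1641
	- RougeL: 0.2965
	- RougeLsum: 0.3132

	The card metadata separately reports Rouge1 0.2720 and Rouge2 0.1363 on the `test` split.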
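
The card does not include a usage snippet, so the following is a minimal sketch of how a checkpoint like this is typically loaded with `transformers`. Several things here are assumptions and not taken from the card: the `<namespace>` part of `MODEL_ID` is a placeholder (the owning account is not stated), the decoding settings in `build_generation_kwargs` are illustrative only, and the 512-token input limit is a guess, since the card documents neither the evaluation decoding setup nor the maximum input length.

```python
# Placeholder repo id: replace <namespace> with the actual owner on the Hub.
MODEL_ID = "<namespace>/t5-base-xlsum-ja"


def build_generation_kwargs(max_new_tokens: int = 84, num_beams: int = 4) -> dict:
    """Illustrative decoding settings (assumed, not documented in the card)."""
    return {
        "max_new_tokens": max_new_tokens,
        "num_beams": num_beams,
        "no_repeat_ngram_size": 2,
    }


def summarize(text: str, model_id: str = MODEL_ID,
              max_input_tokens: int = 512, **overrides) -> str:
    """Generate a summary for one Japanese document."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long articles to the assumed input limit.
    inputs = tokenizer(text, return_tensors="pt",
                       truncation=True, max_length=max_input_tokens)
    generation_kwargs = {**build_generation_kwargs(), **overrides}
    output_ids = model.generate(**inputs, **generation_kwargs)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` with one of the widget texts from the metadata should return a short Japanese summary once `MODEL_ID` points at the real repository. Keyword overrides such as `num_beams=1` are passed through to `generate`.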

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

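	Fine-tuning used the `japanese` configuration of [csebuetnlp/xlsum](https://huggingface.co/datasets/csebuetnlp/xlsum), and the card metadata reports ROUGE on its `test` split. Which splits were used for training and for the evaluation results listed above is not otherwise documented.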

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.01
	- num_epochs: 15
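
To make the relationship between these values concrete, here is a small sketch in plain Python. It shows how `total_train_batch_size` follows from the per-device batch size and gradient accumulation, and what a linear-warmup-then-cosine learning-rate curve looks like with the listed `learning_rate` and `lr_scheduler_warmup_ratio`. It is an illustration of the usual schedule shape, not the Trainer's own code. The single-device default and the `total_steps=830` used in the usage note are assumptions, because the card states neither the device count nor the total number of optimizer steps.

```python
import math

LEARNING_RATE = 1e-4      # learning_rate
PER_DEVICE_BATCH = 8      # train_batch_size
GRAD_ACCUM_STEPS = 16     # gradient_accumulation_steps
WARMUP_RATIO = 0.01       # lr_scheduler_warmup_ratio


def effective_batch_size(num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * num_devices


def cosine_lr(step: int, total_steps: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With these definitions, `effective_batch_size()` returns 128, matching `total_train_batch_size`. For a hypothetical `total_steps=830`, warmup lasts 9 steps, the rate peaks at `1e-4` at step 9, and it decays towards 0 by the final step.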

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
	| 4.9166        | 1.8   | 100  | 3.4095          | 0.3569 | 0.1509 | 0.2416 | 0.3209    |
	| 4.1162        | 3.61  | 200  | 3.0980          | 0.3262 | 0.1354 | 0.2557 | 0.2805    |
	| 3.8578        | 5.41  | 300  | 2.8853          | 0.3428 | 0.1445 | 0.2628 | 0.2881    |
	| 3.7309        | 7.22  | 400  | 2.7714          | 0.3621 | 0.1615 | 0.2951 | 0.3151    |
	| 3.6716        | 9.02  | 500  | 2.7042          | 0.3727 | 0.1668 | 0.2982 | 0.3225    |
	| 3.6393        | 10.82 | 600  | 2.6666          | 0.3676 | 0.1592 | 0.2987 | 0.3206    |
	| 3.6291        | 12.63 | 700  | 2.6587          | 0.3654 | 0.1576 | 0.2955 | 0.3108    |
	| 3.6224        | 14.43 | 800  | 2.6563          | 0.3648 | 0.1641 | 0.2965 | 0.3132    |
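
The table shows validation loss falling at every logged step while the ROUGE scores peak earlier. The sketch below copies the logged rows into plain Python and looks up the best step per metric. The column names and the `best_step` helper are illustrative and not part of the original training script.

```python
# Rows copied from the table above:
# (step, validation loss, Rouge1, Rouge2, RougeL, RougeLsum)
COLUMNS = ("step", "val_loss", "rouge1", "rouge2", "rougeL", "rougeLsum")
ROWS = [
    (100, 3.4095, 0.3569, 0.1509, 0.2416, 0.3209),
    (200, 3.0980, 0.3262, 0.1354, 0.2557, 0.2805),
    (300, 2.8853, 0.3428, 0.1445, 0.2628, 0.2881),
    (400, 2.7714, 0.3621, 0.1615, 0.2951, 0.3151),
    (500, 2.7042, 0.3727, 0.1668, 0.2982, 0.3225),
    (600, 2.6666, 0.3676, 0.1592, 0.2987, 0.3206),
    (700, 2.6587, 0.3654, 0.1576, 0.2955, 0.3108),
    (800, 2.6563, 0.3648, 0.1641, 0.2965, 0.3132),
]


def best_step(metric: str, lower_is_better: bool = False) -> int:
    """Return the logged step with the best value for the given column."""
    idx = COLUMNS.index(metric)
    pick = min if lower_is_better else max
    return pick(ROWS, key=lambda row: row[idx])[0]


print(best_step("val_loss", lower_is_better=True))  # 800
print(best_step("rouge1"))                          # 500
print(best_step("rougeL"))                          # 600
```

The final checkpoint (step 800) has the lowest validation loss, but step 500 has the highest Rouge1, Rouge2 and RougeLsum among the logged steps. The differences are small, so either choice is defensible, and the card does not say which criterion was used to select the published weights.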


	### Framework versions

	- Transformers 4.34.0
	- PyTorch 2.0.0+cu118
	- Datasets 2.14.5
	- Tokenizers 0.14.0
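
When reproducing results, it can help to compare a local environment against the versions listed above. The following sketch uses only the standard library. The PyPI package names in `CARD_VERSIONS` (for example `torch` for PyTorch) are the usual distribution names and are an assumption about how the packages were installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by the assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.34.0",
    "torch": "2.0.0+cu118",
    "datasets": "2.14.5",
    "tokenizers": "0.14.0",
}


def compare_versions(expected: dict = CARD_VERSIONS) -> dict:
    """Map each package to (version in the card, installed version or None)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed locally
        report[package] = (wanted, installed)
    return report


if __name__ == "__main__":
    for package, (wanted, installed) in compare_versions().items():
        status = "ok" if installed == wanted else "differs"
        print(f"{package}: card={wanted} installed={installed} ({status})")
```

A mismatch does not necessarily break inference, but it is worth ruling out before comparing scores.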