	---
	license: apache-2.0
	base_model: google/mt5-large
	tags:
	- generated_from_keras_callback
	model-index:
	- name: pakawadeep/mt5-large-finetuned-ctfl
	  results: []
	---

	<!-- This model card has been generated automatically according to the information Keras had access to. You should
	probably proofread and complete it, then remove this comment. -->

	# pakawadeep/mt5-large-finetuned-ctfl

	This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on an unknown dataset.
	After the final epoch (27), it reports the following training and evaluation results:
	- Train Loss: 0.7379
	- Validation Loss: 0.8383
	- Train Rouge1: 8.9816
	- Train Rouge2: 2.3762
	- Train Rougel: 8.9109
	- Train Rougelsum: 8.9109
	- Train Gen Len: 11.9752
	- Epoch: 27
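As a minimal sketch of how the checkpoint might be loaded for inference: the snippet below assumes the repository hosts TensorFlow weights (suggested by the `generated_from_keras_callback` tag and the TensorFlow version listed in this card) and that a plain string is an acceptable input. Neither the task nor the expected input format is documented here, so the `generate` helper and its `max_new_tokens` default of 32 are illustrative assumptions, not the author's inference code.

```python
MODEL_ID = "pakawadeep/mt5-large-finetuned-ctfl"


def generate(text: str, max_new_tokens: int = 32) -> str:
    """Hypothetical helper: run one input string through the fine-tuned model."""
    # Imported lazily so this module loads even without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: the repository contains TensorFlow weights.
    model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize to TensorFlow tensors and decode the first generated sequence.
    inputs = tokenizer(text, return_tensors="tf", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` with an input in the format used for fine-tuning should return the decoded prediction. The reported generation length of about 12 tokens suggests that short outputs are expected, so the 32-token cap is only a loose upper bound.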

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
	- training_precision: float32
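The logged optimizer settings can be turned back into an optimizer object. The sketch below uses the `AdamWeightDecay` class from Transformers. The original training script is not part of this card, so `build_optimizer` is a hypothetical helper rather than the author's code, and the logged `'decay': 0.0` entry (legacy learning-rate decay, disabled) is left out on purpose.

```python
# Values copied from the optimizer entry in this card.
OPTIMIZER_CONFIG = {
    "learning_rate": 2e-05,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def build_optimizer():
    """Hypothetical helper: rebuild the optimizer from the logged config."""
    # Lazy import: requires transformers with TensorFlow installed.
    from transformers import AdamWeightDecay

    # The logged 'decay': 0.0 is omitted because it is a no-op.
    return AdamWeightDecay(**OPTIMIZER_CONFIG)
```

The returned object can be passed to `model.compile(optimizer=build_optimizer())` on a TensorFlow Transformers model. Training ran in `float32`, so no mixed-precision policy needs to be set.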

	### Training results

	| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
	|:----------:|:---------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:-----:|
	| 11.3596 | 5.2319 | 2.5366 | 0.5088 | 2.5148 | 2.4929 | 19.0 | 0 |
	| 6.1803 | 3.1508 | 2.7057 | 0.5265 | 2.6846 | 2.6643 | 19.0 | 1 |
	| 4.6767 | 2.5774 | 2.7471 | 0.5265 | 2.7054 | 2.6899 | 18.3218 | 2 |
	| 3.8698 | 2.8216 | 2.9763 | 0.2200 | 2.8792 | 2.8987 | 16.6238 | 3 |
	| 4.7045 | 2.7911 | 3.2793 | 0.5501 | 3.1484 | 3.2486 | 14.1881 | 4 |
	| 4.0342 | 2.4191 | 5.9406 | 0.6365 | 5.7206 | 5.8306 | 10.6980 | 5 |
	| 3.4642 | 2.1307 | 5.9406 | 0.9406 | 5.7756 | 5.8463 | 11.2228 | 6 |
	| 3.0690 | 1.9079 | 6.0644 | 0.9901 | 5.9406 | 6.0644 | 11.2228 | 7 |
	| 2.6140 | 1.7092 | 5.7756 | 0.8251 | 5.6518 | 5.8168 | 11.4604 | 8 |
	| 2.4520 | 1.6478 | 5.8581 | 0.8251 | 5.6931 | 5.8581 | 11.0842 | 9 |
	| 2.2701 | 1.5641 | 5.9406 | 0.8251 | 5.8581 | 5.8581 | 10.8465 | 10 |
	| 2.0735 | 1.4839 | 7.3020 | 1.0726 | 7.1370 | 7.2814 | 11.0891 | 11 |
	| 1.8757 | 1.3780 | 7.4670 | 1.0726 | 7.3020 | 7.4257 | 11.2228 | 12 |
	| 1.7313 | 1.3204 | 7.3020 | 1.0726 | 7.1370 | 7.2814 | 11.5842 | 13 |
	| 1.5944 | 1.2466 | 7.4670 | 1.0726 | 7.3020 | 7.4257 | 11.6485 | 14 |
	| 1.4894 | 1.1993 | 8.0858 | 1.5677 | 7.9208 | 8.1271 | 11.6139 | 15 |
	| 1.3939 | 1.1446 | 8.1271 | 2.0627 | 8.0033 | 8.0858 | 11.7030 | 16 |
	| 1.3065 | 1.0837 | 7.7558 | 1.5677 | 7.5083 | 7.5908 | 11.8168 | 17 |
	| 1.2367 | 1.0604 | 8.0387 | 1.9307 | 7.9915 | 7.9679 | 11.9356 | 18 |
	| 1.1569 | 1.0071 | 7.6143 | 1.4356 | 7.4257 | 7.4965 | 11.8515 | 19 |
	| 1.0732 | 0.9713 | 8.5809 | 1.9307 | 8.4158 | 8.4512 | 11.8465 | 20 |
	| 1.0204 | 0.9582 | 8.5809 | 1.9307 | 8.4158 | 8.4512 | 11.8317 | 21 |
	| 0.9636 | 0.9317 | 8.5809 | 1.9307 | 8.4158 | 8.4512 | 11.8416 | 22 |
	| 0.9054 | 0.8921 | 8.5809 | 1.9307 | 8.4158 | 8.4512 | 11.8663 | 23 |
	| 0.8685 | 0.8795 | 9.0759 | 2.4257 | 8.9109 | 8.9109 | 11.8861 | 24 |
	| 0.8100 | 0.8666 | 8.9816 | 2.3762 | 8.9109 | 8.9109 | 11.9455 | 25 |
	| 0.7749 | 0.8524 | 8.9816 | 2.3762 | 8.9109 | 8.9109 | 11.9505 | 26 |
	| 0.7379 | 0.8383 | 8.9816 | 2.3762 | 8.9109 | 8.9109 | 11.9752 | 27 |
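The loss columns can be checked programmatically when picking a checkpoint. The sketch below copies the last six rows of the table and looks for the epoch with the lowest validation loss and for the first epoch where train loss drops below validation loss. The helper names are illustrative, and the choice of these two diagnostics is an assumption about what a reader may want to inspect.

```python
# (epoch, train_loss, validation_loss) copied from the last rows of the table.
ROWS = [
    (22, 0.9636, 0.9317),
    (23, 0.9054, 0.8921),
    (24, 0.8685, 0.8795),
    (25, 0.8100, 0.8666),
    (26, 0.7749, 0.8524),
    (27, 0.7379, 0.8383),
]


def best_epoch(rows):
    """Return the epoch with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])[0]


def first_crossover(rows):
    """Return the first epoch where train loss is below validation loss, or None."""
    for epoch, train_loss, val_loss in rows:
        if train_loss < val_loss:
            return epoch
    return None


def loss_gap(rows):
    """Map each epoch to validation loss minus train loss."""
    return {epoch: round(val - train, 4) for epoch, train, val in rows}
```

On these rows, `best_epoch(ROWS)` returns 27 and `first_crossover(ROWS)` returns 24. Validation loss is still falling at the last logged epoch, while the gap between validation and train loss widens from epoch 24 onward.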


	### Framework versions

	- Transformers 4.38.2
	- TensorFlow 2.15.0
	- Datasets 2.18.0
	- Tokenizers 0.15.2