	---
	base_model: UBC-NLP/AraT5v2-base-1024
	tags:
	- generated_from_trainer
	datasets:
	- opus100
	metrics:
	- bleu
	model-index:
	- name: finetune-t5-base-on-opus100-Ar2En-without-optimization
	  results:
	  - task:
	      name: Sequence-to-sequence Language Modeling
	      type: text2text-generation
	    dataset:
	      name: opus100
	      type: opus100
	      config: ar-en
	      split: train[:7000]
	      args: ar-en
	    metrics:
	    - name: Bleu
	      type: bleu
	      value: 10.4288
	---


	# finetune-t5-base-on-opus100-Ar2En-without-optimization

	This model is a fine-tuned version of [UBC-NLP/AraT5v2-base-1024](https://huggingface.co/UBC-NLP/AraT5v2-base-1024) on the `ar-en` configuration of the opus100 dataset, for Arabic-to-English translation.
	After 18 epochs of training, it achieves the following results on the evaluation set:
	- Loss: 3.0042
	- Bleu: 10.4288
	- Gen Len: 10.739
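
To put the reported loss on a more familiar scale, it can be converted to perplexity. The sketch below assumes the evaluation loss is the mean per-token cross-entropy in nats, which is what the Trainer reports for T5-style models; the variable names are illustrative only.

```python
import math

# Evaluation loss reported above (assumed: mean per-token cross-entropy, natural log).
eval_loss = 3.0042

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

A lower perplexity means the model assigns higher probability to the reference translations; it is not directly comparable across tokenizers.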

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

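	The card metadata lists the `ar-en` configuration of opus100 with split `train[:7000]`. How that subset was divided between training and evaluation is not documented.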

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 10
	- eval_batch_size: 10
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 18
	- mixed_precision_training: Native AMP
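
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate over the run, here is a minimal sketch. It assumes zero warmup steps (the card lists none) and takes the total of 3780 optimizer steps and 210 steps per epoch from the training results table; `linear_lr` is a hypothetical helper written for this example, not a library function.

```python
def linear_lr(step: int,
              base_lr: float = 2e-5,
              total_steps: int = 3780,
              warmup_steps: int = 0) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, the midpoint, and the end of training.
for epoch in (0, 9, 18):
    step = epoch * 210  # 210 optimizer steps per epoch, per the results table
    print(f"epoch {epoch:>2} (step {step:>4}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 2e-05, is halved by the end of epoch 9, and reaches zero at the final step.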

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
	| 10.1448       | 1.0   | 210  | 3.9256          | 2.8335  | 9.4988  |
	| 4.9822        | 2.0   | 420  | 3.5760          | 4.9001  | 10.3329 |
	| 4.42          | 3.0   | 630  | 3.4037          | 5.6973  | 10.301  |
	| 4.1414        | 4.0   | 840  | 3.3057          | 6.5224  | 10.5559 |
	| 3.9451        | 5.0   | 1050 | 3.2169          | 7.409   | 10.7571 |
	| 3.7972        | 6.0   | 1260 | 3.1759          | 8.1445  | 10.5908 |
	| 3.6687        | 7.0   | 1470 | 3.1340          | 8.246   | 10.7451 |
	| 3.5494        | 8.0   | 1680 | 3.1098          | 8.5656  | 10.7616 |
	| 3.4748        | 9.0   | 1890 | 3.0749          | 9.052   | 10.8798 |
	| 3.3945        | 10.0  | 2100 | 3.0725          | 9.3223  | 10.6794 |
	| 3.314         | 11.0  | 2310 | 3.0511          | 9.67    | 10.6871 |
	| 3.2606        | 12.0  | 2520 | 3.0398          | 9.6105  | 10.6531 |
	| 3.2314        | 13.0  | 2730 | 3.0211          | 10.0661 | 10.752  |
	| 3.1557        | 14.0  | 2940 | 3.0188          | 10.0724 | 10.7188 |
	| 3.1571        | 15.0  | 3150 | 3.0148          | 10.3648 | 10.7596 |
	| 3.1213        | 16.0  | 3360 | 3.0061          | 10.4008 | 10.7784 |
	| 3.1111        | 17.0  | 3570 | 3.0077          | 10.4588 | 10.7155 |
	| 3.0851        | 18.0  | 3780 | 3.0042          | 10.4288 | 10.739  |
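
The training results table can be cross-checked with a little arithmetic. The sketch below copies the epoch, validation loss, and BLEU columns from the table and derives the total step count, an upper bound on examples seen per epoch, and the best epoch by each metric. The examples-per-epoch figure assumes a single device and no gradient accumulation, neither of which the card states.

```python
# (epoch, validation_loss, bleu) copied from the training results table.
results = [
    (1, 3.9256, 2.8335), (2, 3.5760, 4.9001), (3, 3.4037, 5.6973),
    (4, 3.3057, 6.5224), (5, 3.2169, 7.409), (6, 3.1759, 8.1445),
    (7, 3.1340, 8.246), (8, 3.1098, 8.5656), (9, 3.0749, 9.052),
    (10, 3.0725, 9.3223), (11, 3.0511, 9.67), (12, 3.0398, 9.6105),
    (13, 3.0211, 10.0661), (14, 3.0188, 10.0724), (15, 3.0148, 10.3648),
    (16, 3.0061, 10.4008), (17, 3.0077, 10.4588), (18, 3.0042, 10.4288),
]

steps_per_epoch = 210   # step count at epoch 1.0 in the table
train_batch_size = 10   # from the hyperparameter list
num_epochs = 18

total_steps = steps_per_epoch * num_epochs

# Upper bound on examples per epoch (assumes one device, no gradient accumulation).
examples_per_epoch = steps_per_epoch * train_batch_size

best_bleu_epoch = max(results, key=lambda row: row[2])
best_loss_epoch = min(results, key=lambda row: row[1])

print(f"total optimizer steps: {total_steps}")
print(f"examples per epoch (at most): {examples_per_epoch}")
print(f"best BLEU: {best_bleu_epoch[2]} at epoch {best_bleu_epoch[0]}")
print(f"lowest validation loss: {best_loss_epoch[1]} at epoch {best_loss_epoch[0]}")
```

The highest BLEU in the table (10.4588) occurs at epoch 17, while the headline figure of 10.4288 is the final-epoch value, which also has the lowest validation loss.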


	### Framework versions

	- Transformers 4.35.2
	- PyTorch 2.0.0
	- Datasets 2.1.0
	- Tokenizers 0.15.0