Trabis/Helsinki-NLPopus-mt-tc-big-en-moroccain_dialect


	---
	tags:
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: Helsinki-NLPopus-mt-tc-big-en-moroccain_dialect
	results: []
	pipeline_tag: translation
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	<!-- This model uses transfer learning to translate English into Moroccan dialect (Darija). -->

	<!-- Training data: about 18,000 English-Moroccan dialect sentence pairs. -->

	<!-- The model was trained in three rounds; the last round ran for one epoch. -->

	# Helsinki-NLPopus-mt-tc-big-en-moroccain_dialect

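	This model was fine-tuned with transfer learning to translate English into Moroccan dialect (Darija); the Trainer did not record the base checkpoint or the dataset name.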
	It achieves the following results on the evaluation set:
	- Loss: 0.6930
	- Bleu: 50.0607
	- Gen Len: 14.7048
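A minimal inference sketch follows. It assumes the checkpoint is available on the Hugging Face Hub under the repository id `Trabis/Helsinki-NLPopus-mt-tc-big-en-moroccain_dialect` (inferred from the repository name, not stated in the card) and that `transformers`, `torch`, and `sentencepiece` are installed. The helper name `translate` and the `max_new_tokens` default are illustrative choices, not part of the original training code.

```python
# Assumed Hub repository id, inferred from the repository name.
MODEL_ID = "Trabis/Helsinki-NLPopus-mt-tc-big-en-moroccain_dialect"


def translate(sentences, model_id=MODEL_ID, max_new_tokens=128):
    """Translate a list of English sentences with the fine-tuned Marian model.

    The import is done lazily so that merely defining this helper does not
    require transformers to be installed.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the whole batch, padding to the longest sentence.
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)

    # Beam size and other generation defaults come from the saved model config.
    generated = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# print(translate(["How are you today?"]))
```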

	## Model description

	MarianConfig {
	  "_name_or_path": "/content/drive/MyDrive/Colab Notebooks/big_helsinki_eng_dar",
	  "activation_dropout": 0.0,
	  "activation_function": "relu",
	  "architectures": [
	    "MarianMTModel"
	  ],
	  "attention_dropout": 0.0,
	  "bad_words_ids": [
	    [
	      61246
	    ]
	  ],
	  "bos_token_id": 0,
	  "classifier_dropout": 0.0,
	  "d_model": 1024,
	  "decoder_attention_heads": 16,
	  "decoder_ffn_dim": 4096,
	  "decoder_layerdrop": 0.0,
	  "decoder_layers": 6,
	  "decoder_start_token_id": 61246,
	  "decoder_vocab_size": 61247,
	  "dropout": 0.1,
	  "encoder_attention_heads": 16,
	  "encoder_ffn_dim": 4096,
	  "encoder_layerdrop": 0.0,
	  "encoder_layers": 6,
	  "eos_token_id": 25897,
	  "forced_eos_token_id": 25897,
	  "init_std": 0.02,
	  "is_encoder_decoder": true,
	  "max_length": 512,
	  "max_position_embeddings": 1024,
	  "model_type": "marian",
	  "normalize_embedding": false,
	  "num_beams": 4,
	  "num_hidden_layers": 6,
	  "pad_token_id": 61246,
	  "scale_embedding": true,
	  "share_encoder_decoder_embeddings": true,
	  "static_position_embeddings": true,
	  "torch_dtype": "float32",
	  "transformers_version": "4.28.0",
	  "use_cache": true,
	  "vocab_size": 61247
	}
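The configuration above is enough for a back-of-the-envelope size estimate. The sketch below is an approximation under stated assumptions: a standard Transformer encoder-decoder layout, one embedding matrix shared by encoder, decoder, and output projection, parameter-free sinusoidal position embeddings, and no count for the final logits bias. The helper names are illustrative.

```python
# Values copied from the MarianConfig printed above.
config = {
    "d_model": 1024,
    "vocab_size": 61247,
    "encoder_layers": 6,
    "decoder_layers": 6,
    "encoder_attention_heads": 16,
    "encoder_ffn_dim": 4096,
    "decoder_ffn_dim": 4096,
}


def attention_params(d):
    # Query, key, value, and output projections: d x d weights plus d biases each.
    return 4 * (d * d + d)


def ffn_params(d, ffn):
    # Two linear layers (d -> ffn and ffn -> d), each with a bias.
    return (d * ffn + ffn) + (ffn * d + d)


def layer_norm_params(d):
    # One scale and one shift vector.
    return 2 * d


def estimate_parameters(cfg):
    d = cfg["d_model"]
    # Assumption: a single embedding matrix is shared across the model.
    embedding = cfg["vocab_size"] * d
    # Encoder layer: self-attention, feed-forward, two layer norms.
    enc_layer = attention_params(d) + ffn_params(d, cfg["encoder_ffn_dim"]) + 2 * layer_norm_params(d)
    # Decoder layer: self-attention, cross-attention, feed-forward, three layer norms.
    dec_layer = 2 * attention_params(d) + ffn_params(d, cfg["decoder_ffn_dim"]) + 3 * layer_norm_params(d)
    return embedding + cfg["encoder_layers"] * enc_layer + cfg["decoder_layers"] * dec_layer


head_dim = config["d_model"] // config["encoder_attention_heads"]
total = estimate_parameters(config)
print(f"per-head dimension: {head_dim}")
print(f"estimated parameters: {total / 1e6:.1f}M")
```

Under these assumptions the estimate comes to roughly 239 million parameters, with a per-head dimension of 64.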

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	DatasetDict({
	    train: Dataset({
	        features: ['input_ids', 'attention_mask', 'labels'],
	        num_rows: 15443
	    })
	    test: Dataset({
	        features: ['input_ids', 'attention_mask', 'labels'],
	        num_rows: 813
	    })
	})
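As a quick check on the split sizes printed above, the arithmetic below computes the total number of tokenized pairs and the share held out for evaluation. Reading the result as a 5% hold-out is an interpretation; the card does not say how the split was produced.

```python
# Row counts copied from the DatasetDict printed above.
train_rows = 15443
test_rows = 813

total_rows = train_rows + test_rows
test_fraction = test_rows / total_rows

print(f"total pairs: {total_rows}")
print(f"test share:  {test_fraction:.2%}")
```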

	## Training procedure

	The model was trained with transfer learning because parallel data for the Moroccan dialect is limited.

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-07
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 4000
	- num_epochs: 1
	- mixed_precision_training: Native AMP
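The hyperparameters above interact in a way worth spelling out: with a batch size of 8 and 15,443 training rows, one epoch is fewer steps than the 4,000 warmup steps, so a linear warmup never reaches the configured peak learning rate. The sketch below illustrates this with a plain-Python version of a linear warmup and decay rule. It assumes a single device and no gradient accumulation, and `linear_schedule_lr` is an illustrative helper, not the Trainer's own code.

```python
import math


def linear_schedule_lr(step, peak_lr, warmup_steps, total_steps):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 towards peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


peak_lr = 2e-07        # learning_rate
warmup_steps = 4000    # lr_scheduler_warmup_steps
train_rows = 15443     # training set size
batch_size = 8         # train_batch_size

# Assumption: one optimizer step per batch on a single device.
steps_per_epoch = math.ceil(train_rows / batch_size)
final_lr = linear_schedule_lr(steps_per_epoch, peak_lr, warmup_steps, steps_per_epoch)

print(f"steps in one epoch: {steps_per_epoch}")
print(f"learning rate at the last step: {final_lr:.3e}")
```

Under these assumptions the run ends after 1,931 steps, still inside the warmup phase, at a learning rate just under half of the configured 2e-07.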

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
	| 0.617         | 1.0   | 1931 | 0.6930          | 50.0607 | 14.7048 |


	### Framework versions

	- Transformers 4.28.0
	- Pytorch 2.0.0+cu118
	- Datasets 2.12.0
	- Tokenizers 0.13.3