scat-marian-big-target-ctx4-cwd0-en-fr / translation_no_trainer /1687954715.63408 /hparams.yml

Training in progress epoch 0

3225018 over 1 year ago

1.17 kB

checkpointing_steps: '1000'
config_name: null
context_size: 4
context_word_dropout: 0.0
dataset_config_name: sentences
dataset_name: inseq/scat
gradient_accumulation_steps: 2
hub_model_id: context-mt/scat-marian-big-target-ctx4-cwd0-en-fr
hub_token: <redacted>
ignore_pad_token_for_loss: true
learning_rate: 5.0e-05
logging_steps: '200'
lr_scheduler_type: linear
max_length: 128
max_source_length: 512
max_target_length: 512
max_train_steps: 1388
model_name_or_path: context-mt/iwslt17-marian-big-target-ctx4-cwd0-en-fr
model_type: null
num_beams: 5
num_train_epochs: 2
num_warmup_steps: 0
output_dir: /scratch/p305238/scat-marian-big-target-ctx4-cwd0-en-fr
overwrite_cache: false
pad_to_max_length: true
per_device_eval_batch_size: 8
per_device_train_batch_size: 8
predict_with_generate: true
preprocessing_num_workers: null
push_to_hub: true
report_to: tensorboard
resume_from_checkpoint: null
sample_context: true
seed: null
source_lang: en_XX
target_lang: fr_XX
tokenizer_name: null
train_file: null
use_slow_tokenizer: false
use_target_context: true
val_max_target_length: null
validation_file: null
weight_decay: 0.0
with_tracking: true
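
The step-related values above can be cross-checked with a short sketch. It assumes, as in the Hugging Face `no_trainer` example scripts, that `max_train_steps` equals `num_train_epochs` times the number of optimizer updates per epoch, and that an update happens once every `gradient_accumulation_steps` batches. That relationship is not stated in this file, so treat the derived numbers as an illustration rather than a record of the actual run. The helper `as_step_count` and the inline `hparams` dict are hypothetical names introduced only for this example.

```python
import math

# Subset of the hyperparameters listed above, copied into a plain dict.
hparams = {
    "checkpointing_steps": "1000",
    "logging_steps": "200",
    "gradient_accumulation_steps": 2,
    "per_device_train_batch_size": 8,
    "num_train_epochs": 2,
    "max_train_steps": 1388,
}


def as_step_count(value):
    """Turn a quoted step value such as '1000' into an int.

    Non-numeric values (for example 'epoch') are returned unchanged.
    """
    if isinstance(value, str) and value.isdigit():
        return int(value)
    return value


accum = hparams["gradient_accumulation_steps"]

# Assumption: max_train_steps = num_train_epochs * updates_per_epoch.
updates_per_epoch = hparams["max_train_steps"] // hparams["num_train_epochs"]

# Assumption: updates_per_epoch = ceil(batches_per_epoch / accum).
# Invert the ceiling to get the batch counts consistent with that value.
possible_batches = range((updates_per_epoch - 1) * accum + 1,
                         updates_per_epoch * accum + 1)
assert all(math.ceil(n / accum) == updates_per_epoch for n in possible_batches)

# Sequences contributing to one optimizer update on a single device.
sequences_per_update = hparams["per_device_train_batch_size"] * accum

checkpoint_every = as_step_count(hparams["checkpointing_steps"])
log_every = as_step_count(hparams["logging_steps"])

print("updates per epoch:", updates_per_epoch)
print("batches per epoch (per process):", list(possible_batches))
print("sequences per update (per device):", sequences_per_update)
print("step checkpoints within the run:", hparams["max_train_steps"] // checkpoint_every)
print("log events within the run:", hparams["max_train_steps"] // log_every)
```

Under these assumptions, a checkpoint interval of 1000 steps falls inside the 1388-step run only once, so most of the saved state would come from whatever end-of-epoch or end-of-training saving the script performs. The number of devices is not recorded in the file, so the global batch size cannot be derived from it.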