ales/whisper-small-belarusian

	## Fine-tuning run 2

	This run tried to improve the model fine-tuned during run 1.

	Checkpoint used: `checkpoint-12000`

	* The learning rate picked for fine-tuning in run 2 turned out to be too small:
	WER did not improve compared to run 1.
	* Fine-tuning during run 2 followed the WER trajectory seen at the end of run 1,
	between `checkpoint-8000` and `checkpoint-10000`.
	* Run 2 was stopped after 3000 steps.
	* Checkpoints from this run are not uploaded.
	* Training stdout logs and TensorBoard logs are uploaded.

	## Advice

	* For the next fine-tuning run, use a higher learning rate.
	* For the LR scheduler, either:
	    * use a constant learning rate scheduler, or
	    * manually instantiate a linear scheduler with warmup (e.g. `get_linear_schedule_with_warmup` from `transformers`)
	    and set `num_training_steps` larger than the actual number of optimization steps in the run,
	    so that the LR at the end of the run is still much larger than 0.
	* Use a `seed` other than the one used during run 1, e.g. `seed=43`.<br>
	The seed actually used when reshuffling the train dataset depends on the epoch, which is advanced with:
	`train_dataloader.dataset.set_epoch(train_dataloader.dataset._epoch + 1)`<br>
	However, when training is resumed, `train_dataloader.dataset._epoch` is reset to 0,
	so the resumed run would reshuffle the data the same way as the original run.<br>
	A different seed therefore needs to be provided.
	* The original Mozilla Common Voice dataset can be used instead of the HuggingFace one.<br>
	The reason is that the original contains multiple recordings of the same sentence,
	so there is at least twice as much data.<br>
	To use this "additional" data, the train, validation, and test sets need to be enlarged using the `validated` set,
	which is absent from HuggingFace's CV11 dataset.
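
The LR scheduler advice above can be illustrated with a small pure-Python sketch. It reimplements the usual linear-with-warmup multiplier (which should match the shape produced by `get_linear_schedule_with_warmup` in `transformers`) rather than calling the library. The base LR, warmup length, and the 4x stretch factor are hypothetical values chosen only for illustration; they are not the settings used in run 1 or run 2.

```python
def linear_warmup_multiplier(step: int, num_warmup_steps: int, num_training_steps: int) -> float:
    """LR multiplier for a linear schedule with warmup (ramps 0 -> 1, then decays 1 -> 0)."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


# Hypothetical values, chosen only for illustration.
BASE_LR = 1e-5
WARMUP_STEPS = 500
ACTUAL_STEPS = 3000  # number of optimization steps the run really performs

# Schedule sized exactly to the run: the LR decays to 0 at the last step.
lr_exact = BASE_LR * linear_warmup_multiplier(ACTUAL_STEPS, WARMUP_STEPS, ACTUAL_STEPS)

# Schedule sized 4x longer than the run: the LR is still well above 0 at the last step.
lr_stretched = BASE_LR * linear_warmup_multiplier(ACTUAL_STEPS, WARMUP_STEPS, 4 * ACTUAL_STEPS)

print(f"final LR with num_training_steps={ACTUAL_STEPS}: {lr_exact:.2e}")
print(f"final LR with num_training_steps={4 * ACTUAL_STEPS}: {lr_stretched:.2e}")
```

With these assumed numbers, the stretched schedule ends the run at roughly 78% of the base LR instead of 0. How far to stretch `num_training_steps` is a tuning choice.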