metadata
language:
- ta
metrics:
- wer
library_name: transformers
pipeline_tag: automatic-speech-recognition
Model Card for Model ID
This is a fine-tuned version of the whisper-large-v2 model for the Tamil language.
Training Hyperparameters
- Training regime: fp16 mixed precision (`fp16=True`) with the 8-bit AdamW optimizer (`optim="adamw_bnb_8bit"`). The full training arguments were:

      from transformers import Seq2SeqTrainingArguments

      training_args = Seq2SeqTrainingArguments(
          output_dir="./pretrainedwhisper-medium-native-v2",  # change to a repo name of your choice
          per_device_train_batch_size=4,
          gradient_accumulation_steps=1,  # increase by 2x for every 2x decrease in batch size
          learning_rate=1e-5,
          warmup_steps=200,
          max_steps=2000,
          gradient_checkpointing=True,
          fp16=True,
          evaluation_strategy="steps",
          per_device_eval_batch_size=8,
          predict_with_generate=True,
          generation_max_length=225,
          save_steps=500,
          eval_steps=500,
          logging_steps=25,
          report_to=["tensorboard"],
          load_best_model_at_end=True,
          metric_for_best_model="wer",  # checkpoints are ranked by word error rate
          greater_is_better=False,  # lower WER is better
          push_to_hub=True,
          optim="adamw_bnb_8bit",
      )
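The arguments above select the best checkpoint by `wer` with `greater_is_better=False`, so a lower word error rate wins. As a rough illustration of what that metric measures, here is a minimal standalone sketch of WER as word-level edit distance divided by the reference length. The function name `word_error_rate` is hypothetical, and this is not the evaluation code used for this model; it also assumes plain whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.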
Model Architecture and Objective
The model follows the Whisper encoder-decoder architecture: the encoder creates embeddings from the speech input, and the decoder generates the text output from those embeddings.
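In practice, the encoder and decoder are usually run together through the `transformers` speech recognition pipeline. The sketch below shows one way this might look. The helper name `transcribe_tamil` is hypothetical, and `model_id` is left as a parameter because this card does not state the repository name. It assumes `transformers`, a backend such as PyTorch, and ffmpeg (for decoding audio files) are installed.

```python
def transcribe_tamil(audio_path: str, model_id: str) -> str:
    """Transcribe one audio file with a Whisper checkpoint (illustrative helper)."""
    # Imported inside the function so that defining the helper does not
    # require transformers to be installed.
    from transformers import pipeline

    # The pipeline handles feature extraction (encoder input) and
    # autoregressive decoding (decoder output) in one call.
    asr = pipeline("automatic-speech-recognition", model=model_id)

    # Ask Whisper to transcribe in Tamil instead of auto-detecting the
    # language or translating to English.
    result = asr(
        audio_path,
        generate_kwargs={"language": "tamil", "task": "transcribe"},
    )
    return result["text"]
```

A call would look like `transcribe_tamil("sample.wav", model_id)`, where `sample.wav` is a placeholder file name and `model_id` is the Hub repository of this model.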