---
license: mit
language:
- kbd
datasets:
- anzorq/kbd_speech
- anzorq/sixuxar_yijiri_mak7
metrics:
- wer
pipeline_tag: automatic-speech-recognition
---

# Circassian (Kabardian) ASR Model

This is a fine-tuned model for Automatic Speech Recognition (ASR) in Kabardian (`kbd`), based on `facebook/w2v-bert-2.0`. It was trained on a combination of the `anzorq/kbd_speech` dataset (filtered on `country=russia`) and the `anzorq/sixuxar_yijiri_mak7` dataset.

## Model Details

- **Base Model**: facebook/w2v-bert-2.0
- **Language**: Kabardian (kbd)
- **Task**: Automatic Speech Recognition (ASR)
- **Datasets**: anzorq/kbd_speech, anzorq/sixuxar_yijiri_mak7
- **Training Steps**: 5000

## Training

The model was fine-tuned using the following training arguments:

```python
TrainingArguments(
    output_dir='output',
    group_by_length=True,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    evaluation_strategy="steps",
    num_train_epochs=10,
    gradient_checkpointing=True,
    fp16=True,
    save_steps=1000,
    eval_steps=500,
    logging_steps=300,
    learning_rate=5e-5,
    warmup_steps=500,
    save_total_limit=2,
    push_to_hub=True,
    report_to="wandb",
)
```

## Performance

The model's performance during training is shown below. The validation loss overflowed to `inf` (likely a numerical overflow under fp16 training), so WER is the metric to track; it improves steadily from 0.87 at step 500 to 0.25 at step 5000.

| Step | Training Loss | Validation Loss | WER      |
|------|---------------|-----------------|----------|
| 500  | 2.859600      | inf             | 0.870362 |
| 1000 | 0.355500      | inf             | 0.703617 |
| 1500 | 0.247100      | inf             | 0.549942 |
| 2000 | 0.196700      | inf             | 0.471762 |
| 2500 | 0.181500      | inf             | 0.361494 |
| 3000 | 0.152200      | inf             | 0.314119 |
| 3500 | 0.135700      | inf             | 0.275146 |
| 4000 | 0.113400      | inf             | 0.252625 |
| 4500 | 0.102900      | inf             | 0.277013 |
| 5000 | 0.078500      | inf             | 0.250175 |
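The WER values above are word error rates (e.g. 0.25 means roughly a quarter of the reference words were wrong). As a minimal sketch of how WER is computed — word-level Levenshtein distance divided by reference length; this is an illustration, not the exact evaluation script used during training:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution out of four reference words -> WER 0.25
print(wer("one two three four", "one two tree four"))  # 0.25
```

In practice, evaluation pipelines typically use a library implementation (e.g. the `evaluate` package's `wer` metric), but the definition is the same.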