---
model-index:
- name: whisper-large-v2-ru
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: mozilla-foundation/common_voice_11_0 ru
      type: mozilla-foundation/common_voice_11_0
      config: ru
      split: test
      args: ru
    metrics:
    - type: wer
      value: 7.73
      name: WER
tags:
- whisper-event
---

Whisper model fine-tuned on audio data from the Open STT Russian dataset (https://github.com/snakers4/open_stt).

There is a difference in the tokenization of the source data: in our data normalization process, we replace punctuation with "" (the empty string), whereas Whisper replaces it with " " (a space). This mismatch leads to a slight degradation on Common Voice.
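
The effect of this mismatch can be illustrated with a minimal sketch. It is neither the normalization code used for fine-tuning nor Whisper's own normalizer: the helper `strip_punctuation` and the sample sentence are hypothetical, and the sketch only assumes that "punctuation" means characters in a Unicode `P*` category.

```python
import unicodedata


def strip_punctuation(text: str, replacement: str) -> str:
    """Replace each Unicode punctuation character with `replacement`.

    Hypothetical helper for illustration only; whitespace is collapsed
    afterwards so the two variants can be compared word by word.
    """
    replaced = "".join(
        replacement if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )
    return " ".join(replaced.split())


sample = "кто-то сказал: привет!"

# Punctuation replaced with "" (the convention described for our data):
# the hyphenated word is fused into a single token.
ours = strip_punctuation(sample, "")

# Punctuation replaced with " " (the convention attributed to Whisper):
# the hyphenated word is split into two tokens.
whisper_style = strip_punctuation(sample, " ")

print(ours)           # ктото сказал привет
print(whisper_style)  # кто то сказал привет
```

Because the two conventions produce different word sequences for hyphenated words, or wherever punctuation is not followed by a space, a word-level metric such as WER counts them as errors even when the underlying transcription is the same.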