---
model-index:
  - name: whisper-large-v2-uk
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: mozilla-foundation/common_voice_11_0
          type: mozilla-foundation/common_voice_11_0
          config: uk
          split: test
        metrics:
          - type: wer
            value: 13.01
            name: WER
tags:
  - whisper-event
---

Whisper model fine-tuned on audio data from the CommonVoice Ukrainian v10 train and dev sets, with additional semi-supervised data.

There is a difference in the tokenization of the source data: in our data normalization process, we replace punctuation with "" (the empty string), whereas Whisper replaces it with " " (a space). This mismatch leads to a slight degradation on CommonVoice.
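To illustrate why the replacement character matters, here is a minimal sketch of the two normalization styles and their effect on word error rate. It is not the actual training pipeline or Whisper's own normalizer: the `PUNCT` pattern, the `normalize` and `wer` helpers, and the sample sentence are all assumptions made for illustration.

```python
import re

# Assumption: "punctuation" is any character that is neither a word
# character nor whitespace. The real normalizers may differ in detail.
PUNCT = re.compile(r"[^\w\s]")


def normalize(text: str, replacement: str) -> str:
    """Replace punctuation with `replacement`, then collapse whitespace."""
    return " ".join(PUNCT.sub(replacement, text).split())


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sample: the apostrophe in "п'ять" is where the styles diverge.
text = "п'ять хвилин, будь ласка"

ours = normalize(text, "")            # punctuation removed: "пять ..."
whisper_style = normalize(text, " ")  # punctuation becomes a space: "п ять ..."

# The same sentence now splits into a different number of words, so a
# transcript in one style is penalized when scored against the other.
mismatch_wer = wer(ours, whisper_style)
```

In this sketch, one word with an internal apostrophe becomes two words under the space-replacement style, so a correct transcript still counts errors against the reference. Effects of this kind are one plausible source of the slight degradation noted above.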