mitchelldehaven committed
Commit 752264b
1 Parent(s): dd398db

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -20,4 +20,4 @@ tags:
 
 Whisper model finetuned using audio data from Open STT Russian Dataset (https://github.com/snakers4/open_stt).
 
- Due to differences in tokenization of source data (in our data normalization process, we replace punctucation with `""` rather than Whisper's `" "`), there is a slight degredation on CommonVoice.
+ There is a differences in tokenization of source data (in our data normalization process, we replace punctucation with "" rather than Whisper's " "). This mismatch leads to a slight degradation on CommonVoice.
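
The normalization mismatch described in the changed README line can be illustrated with a minimal sketch. The helper names `normalize_drop` and `normalize_space`, the punctuation regex, and the sample sentence are assumptions made for illustration; this is neither the model authors' preprocessing code nor Whisper's actual normalizer. It only shows how replacing punctuation with `""` versus `" "` can change word boundaries, which is one plausible way the two conventions could disagree when transcripts are scored.

```python
import re

# Hypothetical pattern: anything that is neither a word character nor whitespace.
# This is an assumed definition of "punctuation", not taken from either codebase.
_PUNCT = re.compile(r"[^\w\s]")


def normalize_drop(text: str) -> str:
    """Replace punctuation with "" (how the README describes the fine-tuning data)."""
    return " ".join(_PUNCT.sub("", text).lower().split())


def normalize_space(text: str) -> str:
    """Replace punctuation with " " (how the README describes Whisper's convention)."""
    return " ".join(_PUNCT.sub(" ", text).lower().split())


# Assumed sample sentence containing hyphenated words.
sample = "кто-то пришёл, что-то сказал"

dropped = normalize_drop(sample)   # hyphenated words are fused into one token
spaced = normalize_space(sample)   # hyphenated words are split into two tokens

print(dropped)
print(spaced)
print(len(dropped.split()), len(spaced.split()))
```

Under these assumptions the same sentence yields a different number of words under each convention, so a model trained on one convention and scored against references normalized with the other would be charged word errors even for an acoustically correct transcript.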