fitlemon committed on
Commit
20c3e8c
1 Parent(s): d27e933

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -41,7 +41,7 @@ More information needed

  ## Training and evaluation data
  ```python
- # datasets for each lang-id
+ # datasets for each language in the set {uz: Uzbek, en: English, ru: Russian}
  common_voice_train_uz = load_dataset("mozilla-foundation/common_voice_16_1", "uz", split='train', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
  common_voice_train_ru = load_dataset("mozilla-foundation/common_voice_16_1", "ru", split='train', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
  common_voice_train_en = load_dataset("mozilla-foundation/common_voice_16_1", "en", split='train', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
@@ -57,7 +57,7 @@ common_voice['train'] = concatenate_datasets([common_voice_train_uz, common_voic
  ## Training procedure

  Used Trainer from transformers.
- Training and evaluation process are described in the notebook from the repo:
+ The training and evaluation process is described in the notebook stored in the following GitHub repository:

  https://github.com/fitlemon/whisper-small-uz-en-ru-lang-id
