Commit ecb83a9
1 Parent(s): ec98b37
Update README.md (#7)
- Update README.md (f97bbd620972efd5f2c4d22652c2bbde29cd7746)
Co-authored-by: He Huang <steveheh@users.noreply.huggingface.co>
README.md
CHANGED
@@ -402,7 +402,7 @@ The model outputs the transcribed/translated text corresponding to the input aud
 ## Training
 
 Canary-1B is trained using the NVIDIA NeMo toolkit [4] for 150k steps with dynamic bucketing and a batch duration of 360s per GPU on 128 NVIDIA A100 80GB GPUs.
-The model can be trained using this [example script](https://github.com/NVIDIA/NeMo/blob/
+The model can be trained using this [example script](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/speech_multitask/speech_to_text_aed.py) and [base config](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/conf/speech_multitask/fast-conformer_aed.yaml).
 
 The tokenizers for these models were built using the text transcripts of the train set with this [script](https://github.com/NVIDIA/NeMo/blob/main/scripts/tokenizers/process_asr_text_tokenizer.py).
 
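
For a sense of scale, the training figures quoted in the README context lines (150k steps, 360s batch duration per GPU, 128 GPUs) can be turned into a rough audio budget. This is a back-of-envelope sketch, not something stated in the README: it assumes the 360s batch duration is an upper bound that dynamic bucketing fills at most, that each step consumes one batch per GPU, and that there is no gradient accumulation. The function name `audio_budget` is made up for illustration.

```python
# Back-of-envelope audio budget implied by the README's training description.
# Assumptions (not from the README): one batch per GPU per step, no gradient
# accumulation, and 360 s treated as a cap on audio per batch.

BATCH_DURATION_S = 360   # per-GPU batch duration, from the README
NUM_GPUS = 128           # from the README
NUM_STEPS = 150_000      # from the README


def audio_budget(batch_duration_s, num_gpus, num_steps):
    """Return (max seconds of audio per step across all GPUs,
    max total hours of audio over the whole run)."""
    per_step_s = batch_duration_s * num_gpus
    total_hours = per_step_s * num_steps / 3600
    return per_step_s, total_hours


per_step_s, total_hours = audio_budget(BATCH_DURATION_S, NUM_GPUS, NUM_STEPS)
print(f"at most {per_step_s} s ({per_step_s / 3600:.1f} h) of audio per step")
print(f"at most {total_hours:,.0f} h of audio over {NUM_STEPS:,} steps")
```

Under these assumptions each step covers at most 46,080 s (12.8 h) of audio, and the full run at most 1,920,000 h. The real figure is likely lower, since buckets rarely fill the duration cap exactly, and the total counts repeated passes over the same data.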