JairamKanna
committed on
Commit
·
40ffc7e
1
Parent(s):
d27ff0e
Create README.md
README.md
ADDED
@@ -0,0 +1,49 @@
|
1 |
+
---
|
2 |
+
language:
|
3 |
+
- ta
|
4 |
+
metrics:
|
5 |
+
- wer
|
6 |
+
library_name: transformers
|
7 |
+
pipeline_tag: automatic-speech-recognition
|
8 |
+
---
|
9 |
+
# Model Card for Model ID
|
10 |
+
|
11 |
+
<!-- Provide a quick summary of what the model is/does. -->
|
12 |
+
|
13 |
+
This is a fine-tuned version of the whisper-large-v2 model for the Tamil language.
|
14 |
+
|
15 |
+
|
16 |
+
#### Training Hyperparameters
|
17 |
+
|
18 |
+
- **Training regime:** fp16 mixed precision (`fp16=True`) <training_args = Seq2SeqTrainingArguments(
|
19 |
+
output_dir="./pretrainedwhisper-medium-native-v2", # change to a repo name of your choice
|
20 |
+
per_device_train_batch_size=4,
|
21 |
+
gradient_accumulation_steps=1, # increase by 2x for every 2x decrease in batch size
|
22 |
+
learning_rate=1e-5,
|
23 |
+
warmup_steps=200,
|
24 |
+
max_steps=2000,
|
25 |
+
gradient_checkpointing=True,
|
26 |
+
fp16=True,
|
27 |
+
evaluation_strategy="steps",
|
28 |
+
per_device_eval_batch_size=8,
|
29 |
+
predict_with_generate=True,
|
30 |
+
generation_max_length=225,
|
31 |
+
save_steps=500,
|
32 |
+
eval_steps=500,
|
33 |
+
logging_steps=25,
|
34 |
+
report_to=["tensorboard"],
|
35 |
+
load_best_model_at_end=True,
|
36 |
+
metric_for_best_model="wer",
|
37 |
+
greater_is_better=False,
|
38 |
+
push_to_hub=True,
|
39 |
+
optim="adamw_bnb_8bit"
|
40 |
+
)>
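
The arithmetic behind these settings can be sketched in plain Python. This is a rough illustration, not the training code: it assumes a single GPU (`num_devices=1` is a hypothetical default, the card does not say how many devices were used) and the library's default linear schedule with warmup, since no `lr_scheduler_type` is set above. The helper names are made up for this example.

```python
# Values copied from the Seq2SeqTrainingArguments listed above.
PER_DEVICE_TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 1
LEARNING_RATE = 1e-5
WARMUP_STEPS = 200
MAX_STEPS = 2000
EVAL_STEPS = 500


def effective_batch_size(per_device, accumulation, num_devices=1):
    """Samples contributing to one optimizer step (num_devices is an assumption)."""
    return per_device * accumulation * num_devices


def linear_schedule_lr(step, peak_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=MAX_STEPS):
    """Linear warmup to peak_lr, then linear decay to zero at `total` steps.

    Assumes the default linear scheduler; the card does not name one.
    """
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    return peak_lr * max(0.0, (total - step) / max(1, total - warmup))


def num_evaluations(total=MAX_STEPS, every=EVAL_STEPS):
    """How many times evaluation (and WER-based checkpoint selection) runs."""
    return total // every


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(4, 1))
    # Halving the batch size while doubling accumulation keeps it constant.
    print("after halving batch size:", effective_batch_size(2, 2))
    for step in (0, 100, 200, 1100, 2000):
        print(step, linear_schedule_lr(step))
    print("evaluations:", num_evaluations())
```

Under these assumptions the learning rate peaks at step 200, and because `greater_is_better=False` the checkpoint with the lowest WER among the evaluation points is the one reloaded at the end.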
|
41 |
+
|
42 |
+
|
43 |
+
|
44 |
+
### Model Architecture and Objective
|
45 |
+
|
46 |
+
The model follows the Whisper encoder-decoder architecture: the encoder creates embeddings from the speech input, and the decoder generates the text output from those embeddings.
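
To make the data flow concrete, the tensor shapes passing through the encoder can be worked out with simple arithmetic. The constants below are not stated in this card; they are taken from the published Whisper large-v2 configuration (16 kHz audio, 30-second windows, 80 Mel bins, hidden size 1280) and should be treated as assumptions here. The function names are hypothetical.

```python
# Assumed Whisper large-v2 front-end and model constants (not from this card).
SAMPLE_RATE = 16_000         # audio samples per second
CHUNK_SECONDS = 30           # each input is padded or trimmed to 30 s
HOP_LENGTH = 160             # samples between successive spectrogram frames
N_MELS = 80                  # Mel frequency bins per frame
CONV_STRIDE = 2              # the encoder's convolutional stem halves the time axis
D_MODEL = 1280               # hidden size of the large checkpoints
MAX_TARGET_POSITIONS = 448   # longest token sequence the decoder can emit


def encoder_input_shape():
    """Shape of the log-Mel spectrogram fed to the encoder: (mel bins, frames)."""
    n_samples = SAMPLE_RATE * CHUNK_SECONDS
    n_frames = n_samples // HOP_LENGTH
    return (N_MELS, n_frames)


def encoder_output_shape():
    """Shape of the speech embeddings the decoder attends to: (positions, hidden)."""
    _, n_frames = encoder_input_shape()
    return (n_frames // CONV_STRIDE, D_MODEL)


def fits_decoder(generation_max_length):
    """Check that a generation length stays within the decoder's position limit."""
    return generation_max_length <= MAX_TARGET_POSITIONS


if __name__ == "__main__":
    print("encoder input:", encoder_input_shape())
    print("encoder output:", encoder_output_shape())
    print("generation_max_length=225 fits:", fits_decoder(225))
```

The decoder then produces text tokens one at a time while cross-attending to the encoder output, which is why `generation_max_length=225` from the training arguments has to stay below the decoder's position limit.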
|
47 |
+
|
48 |
+
|
49 |
+
|