---
license: apache-2.0
base_model: openai/whisper-tiny
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: whisper-tiny-myanmar
  results: []
datasets:
- chuuhtetnaing/myanmar-speech-dataset-openslr-80
language:
- my
pipeline_tag: automatic-speech-recognition
library_name: transformers
---


# whisper-tiny-myanmar

This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the [chuuhtetnaing/myanmar-speech-dataset-openslr-80](https://huggingface.co/datasets/chuuhtetnaing/myanmar-speech-dataset-openslr-80) dataset.
After the final training epoch (epoch 30), it achieves the following results on the evaluation set:
- Loss: 0.2353
- Wer: 61.8878
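
The card does not say how the WER figure was computed. As a rough illustration only, the sketch below implements the standard word error rate (word-level edit distance divided by the number of reference words) and scales it to a percentage, which is an assumption about how the numbers above are reported. The function name `word_error_rate` and whitespace tokenization are also assumptions made for this example. Because insertions count as errors, the value can exceed 100, as it does in the first rows of the training results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using whitespace-separated words.

    Assumption: words are split on whitespace; the real evaluation may
    have used a different tokenization or text normalization.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one previous row.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current

    return 100.0 * previous[-1] / len(ref_words)


# One substitution among four reference words.
one_substitution = word_error_rate("a b c d", "a x c d")
# Three extra words against a two-word reference: WER goes above 100.
many_insertions = word_error_rate("a b", "a b c d e")
```

For Burmese, the result depends heavily on how the text is segmented into words, so scores computed with a different segmentation are not directly comparable.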

## Usage

```python
from datasets import Audio, load_dataset
from transformers import pipeline

# Load the dataset and resample the audio to 16 kHz, the rate Whisper expects
dataset = load_dataset("chuuhtetnaing/myanmar-speech-dataset-openslr-80")
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))

# Pick one sample from the test split
test_dataset = dataset["test"]
input_speech = test_dataset[42]["audio"]

# Build the speech recognition pipeline from the fine-tuned checkpoint
pipe = pipeline(
    task="automatic-speech-recognition",
    model="chuuhtetnaing/whisper-tiny-myanmar",
)

# Force Burmese transcription instead of relying on language detection
output = pipe(input_speech, generate_kwargs={"language": "myanmar", "task": "transcribe"})
print(output["text"])  # ကျွန်မ ပြည်ပ မှာ ပညာ သင် တော့ စာမြီးပွဲ ကို တပတ်တခါ စစ်တယ်
```

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- num_epochs: 30
- mixed_precision_training: Native AMP
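
To make the schedule concrete, here is a minimal sketch of a linear schedule with warmup using the values listed above. It assumes the usual shape (linear ramp from 0 to the peak learning rate, then linear decay to 0) and a total of 540 optimizer steps, taken from the last row of the training results table. The function name `linear_warmup_lr` is made up for this example and is not part of any library.

```python
def linear_warmup_lr(
    step: int,
    peak_lr: float = 3e-4,    # learning_rate
    warmup_steps: int = 200,  # lr_scheduler_warmup_steps
    total_steps: int = 540,   # assumption: 30 epochs x 18 steps per epoch
) -> float:
    """Learning rate at a given optimizer step for warmup plus linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from the peak to 0 at the last step.
    remaining_steps = max(0, total_steps - step)
    return peak_lr * remaining_steps / max(1, total_steps - warmup_steps)


# Print the learning rate at a few points in training.
for step in (0, 100, 200, 370, 540):
    print(step, linear_warmup_lr(step))
```

Under these assumptions the warmup covers 200 of 540 steps, so the peak learning rate is only reached partway through epoch 12.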

### Training results

| Training Loss | Epoch | Step | Validation Loss | Wer      |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 18   | 1.2679          | 357.6135 |
| 1.483         | 2.0   | 36   | 1.0660          | 102.5378 |
| 1.0703        | 3.0   | 54   | 0.9530          | 106.3669 |
| 1.0703        | 4.0   | 72   | 0.8399          | 100.5343 |
| 0.8951        | 5.0   | 90   | 0.7728          | 107.6581 |
| 0.7857        | 6.0   | 108  | 0.7143          | 107.5245 |
| 0.6614        | 7.0   | 126  | 0.5174          | 104.4078 |
| 0.6614        | 8.0   | 144  | 0.3004          | 90.3384  |
| 0.3519        | 9.0   | 162  | 0.2447          | 82.4577  |
| 0.2165        | 10.0  | 180  | 0.2333          | 83.8825  |
| 0.2165        | 11.0  | 198  | 0.2022          | 77.0258  |
| 0.1532        | 12.0  | 216  | 0.1759          | 73.0632  |
| 0.1039        | 13.0  | 234  | 0.1852          | 72.0837  |
| 0.0675        | 14.0  | 252  | 0.1902          | 71.2823  |
| 0.0675        | 15.0  | 270  | 0.1882          | 70.5254  |
| 0.0517        | 16.0  | 288  | 0.2002          | 69.7240  |
| 0.0522        | 17.0  | 306  | 0.1965          | 67.7649  |
| 0.0522        | 18.0  | 324  | 0.1935          | 68.2102  |
| 0.0404        | 19.0  | 342  | 0.2132          | 67.9430  |
| 0.0308        | 20.0  | 360  | 0.2110          | 66.6963  |
| 0.0236        | 21.0  | 378  | 0.2141          | 65.9394  |
| 0.0236        | 22.0  | 396  | 0.2200          | 64.4702  |
| 0.0116        | 23.0  | 414  | 0.2227          | 63.4016  |
| 0.0055        | 24.0  | 432  | 0.2244          | 64.1585  |
| 0.0025        | 25.0  | 450  | 0.2254          | 62.4666  |
| 0.0025        | 26.0  | 468  | 0.2282          | 63.1790  |
| 0.0006        | 27.0  | 486  | 0.2320          | 61.7097  |
| 0.0002        | 28.0  | 504  | 0.2342          | 62.0659  |
| 0.0002        | 29.0  | 522  | 0.2350          | 62.0214  |
| 0.0001        | 30.0  | 540  | 0.2353          | 61.8878  |
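
The table can also be read programmatically. The sketch below copies the epoch, validation loss, and WER columns from the table above and finds the epoch that minimizes each one. The variable names are illustrative, and the card does not say which checkpoint was actually published.

```python
# (epoch, validation_loss, wer) copied from the training results table.
results = [
    (1, 1.2679, 357.6135), (2, 1.0660, 102.5378), (3, 0.9530, 106.3669),
    (4, 0.8399, 100.5343), (5, 0.7728, 107.6581), (6, 0.7143, 107.5245),
    (7, 0.5174, 104.4078), (8, 0.3004, 90.3384), (9, 0.2447, 82.4577),
    (10, 0.2333, 83.8825), (11, 0.2022, 77.0258), (12, 0.1759, 73.0632),
    (13, 0.1852, 72.0837), (14, 0.1902, 71.2823), (15, 0.1882, 70.5254),
    (16, 0.2002, 69.7240), (17, 0.1965, 67.7649), (18, 0.1935, 68.2102),
    (19, 0.2132, 67.9430), (20, 0.2110, 66.6963), (21, 0.2141, 65.9394),
    (22, 0.2200, 64.4702), (23, 0.2227, 63.4016), (24, 0.2244, 64.1585),
    (25, 0.2254, 62.4666), (26, 0.2282, 63.1790), (27, 0.2320, 61.7097),
    (28, 0.2342, 62.0659), (29, 0.2350, 62.0214), (30, 0.2353, 61.8878),
]

# Row with the lowest validation loss, and row with the lowest WER.
best_by_loss = min(results, key=lambda row: row[1])
best_by_wer = min(results, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("lowest WER:", best_by_wer)
```

Validation loss is lowest at epoch 12 and rises slowly afterwards, while WER keeps falling until epoch 27. The two metrics therefore point to different checkpoints, and the final-epoch figures reported at the top of the card are close to, but not exactly, the best WER.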


### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.15.1