DewiBrynJones committed
Commit b364455 (1 parent: a676134)

Model save

Files changed (2):
  1. README.md (+22, -25)
  2. generation_config.json (+1, -1)
README.md CHANGED
@@ -1,16 +1,13 @@
 ---
+license: apache-2.0
+base_model: openai/whisper-tiny
 tags:
 - generated_from_trainer
 metrics:
 - wer
 model-index:
-- name: whisper-tiny-ft-cy
+- name: whisper-tiny-ft-cy-en
   results: []
-license: apache-2.0
-language:
-- cy
-- en
-pipeline_tag: automatic-speech-recognition
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -18,21 +15,22 @@ should probably proofread and complete it, then remove this comment. -->
 
 # whisper-tiny-ft-cy-en
 
-This model is a fine-tune of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) using custom splits from
-Common Voice 16.1 Welsh and English datasets as well as normalized verbatim transcriptions from
-[techiaith/banc-trawsgrifiadau-bangor](https://huggingface.co/datasets/techiaith/banc-trawsgrifiadau-bangor)
+This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.5668
+- Wer: 36.8865
+
+## Model description
+
+More information needed
 
 ## Intended uses & limitations
 
-Due to its small size, this model is intended to be used as the basis for offline speech recognition on devices such as
-Android phones.
+More information needed
 
 ## Training and evaluation data
 
-It achieves the following results on the evaluation set:
-
-- Loss: 0.7176
-- Wer: 53.1135
+More information needed
 
 ## Training procedure
 
@@ -41,28 +39,27 @@ It achieves the following results on the evaluation set:
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
 - train_batch_size: 4
-- eval_batch_size: 8
+- eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 8
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 500
-- training_steps: 4000
+- training_steps: 3000
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Wer     |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|
-| 0.8115        | 1.41  | 1000 | 0.8426          | 60.0795 |
-| 0.6396        | 2.83  | 2000 | 0.7508          | 54.4259 |
-| 0.5259        | 4.24  | 3000 | 0.7255          | 53.1328 |
-| 0.4854        | 5.66  | 4000 | 0.7176          | 53.1135 |
+| 0.7039        | 0.25  | 1000 | 0.6932          | 43.4217 |
+| 0.5689        | 0.5   | 2000 | 0.5930          | 38.4145 |
+| 0.5255        | 0.75  | 3000 | 0.5668          | 36.8865 |
 
 
 ### Framework versions
 
-- Transformers 4.37.2
-- Pytorch 2.2.0+cu121
-- Datasets 2.16.1
-- Tokenizers 0.15.1
+- Transformers 4.39.3
+- Pytorch 2.2.2+cu121
+- Datasets 2.18.0
+- Tokenizers 0.15.2
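The hyperparameters above imply an effective batch size of 32 (train_batch_size 4 × gradient_accumulation_steps 8) and a linear learning-rate schedule with 500 warmup steps over 3000 training steps. A minimal sketch of that schedule, assuming the usual linear warmup-then-decay shape (the helper name below is ours; it mirrors, but does not call, transformers' `get_linear_schedule_with_warmup`):

```python
def linear_lr(step, peak_lr=1e-5, warmup_steps=500, total_steps=3000):
    """Learning rate at a given optimizer step: linear ramp from 0 up to
    peak_lr over warmup_steps, then linear decay back to 0 at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# Effective (total) batch size = train_batch_size * gradient_accumulation_steps
effective_batch = 4 * 8  # matches total_train_batch_size: 32
```

With these values the learning rate peaks at 1e-05 exactly at step 500 and reaches 0 at step 3000, the final training step in the table above.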
generation_config.json CHANGED
@@ -244,5 +244,5 @@
     "transcribe": 50359,
     "translate": 50358
   },
-  "transformers_version": "4.37.2"
+  "transformers_version": "4.39.3"
 }
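The Wer figures in the model card above are word error rates (edit distance over words, as a percentage of reference length). A minimal sketch of the core computation, assuming no text normalization (real scoring pipelines, e.g. the `evaluate` library's wer metric, typically normalize casing and punctuation first, so this is an illustration rather than the exact scoring used for this model):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length * 100."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table: d[i][j] = edit distance between
    # the first i reference words and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = d[i - 1][j] + 1
            insertion = d[i][j - 1] + 1
            d[i][j] = min(substitution, deletion, insertion)
    return 100.0 * d[len(ref)][len(hyp)] / len(ref)
```

For example, one substituted word in a three-word reference gives a WER of about 33.33.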