yesj1234 committed on
Commit 8b8b1c8
1 Parent(s): 03bc364

Upload folder using huggingface_hub

Files changed (43)
  1. README.md +110 -0
  2. added_tokens.json +6 -0
  3. all_results.json +14 -0
  4. checkpoint-6750/added_tokens.json +6 -0
  5. checkpoint-6750/config.json +117 -0
  6. checkpoint-6750/optimizer.pt +3 -0
  7. checkpoint-6750/preprocessor_config.json +10 -0
  8. checkpoint-6750/pytorch_model.bin +3 -0
  9. checkpoint-6750/rng_state_0.pth +3 -0
  10. checkpoint-6750/rng_state_1.pth +3 -0
  11. checkpoint-6750/rng_state_2.pth +3 -0
  12. checkpoint-6750/rng_state_3.pth +3 -0
  13. checkpoint-6750/scheduler.pt +3 -0
  14. checkpoint-6750/special_tokens_map.json +10 -0
  15. checkpoint-6750/tokenizer_config.json +56 -0
  16. checkpoint-6750/trainer_state.json +1234 -0
  17. checkpoint-6750/training_args.bin +3 -0
  18. checkpoint-6750/vocab.json +679 -0
  19. checkpoint-6900/added_tokens.json +6 -0
  20. checkpoint-6900/config.json +117 -0
  21. checkpoint-6900/optimizer.pt +3 -0
  22. checkpoint-6900/preprocessor_config.json +10 -0
  23. checkpoint-6900/pytorch_model.bin +3 -0
  24. checkpoint-6900/rng_state_0.pth +3 -0
  25. checkpoint-6900/rng_state_1.pth +3 -0
  26. checkpoint-6900/rng_state_2.pth +3 -0
  27. checkpoint-6900/rng_state_3.pth +3 -0
  28. checkpoint-6900/scheduler.pt +3 -0
  29. checkpoint-6900/special_tokens_map.json +10 -0
  30. checkpoint-6900/tokenizer_config.json +56 -0
  31. checkpoint-6900/trainer_state.json +1261 -0
  32. checkpoint-6900/training_args.bin +3 -0
  33. checkpoint-6900/vocab.json +679 -0
  34. config.json +117 -0
  35. eval_results.json +9 -0
  36. preprocessor_config.json +10 -0
  37. pytorch_model.bin +3 -0
  38. special_tokens_map.json +10 -0
  39. tokenizer_config.json +56 -0
  40. train_results.json +8 -0
  41. trainer_state.json +1270 -0
  42. training_args.bin +3 -0
  43. vocab.json +679 -0
README.md ADDED
@@ -0,0 +1,110 @@
+ ---
+ license: apache-2.0
+ base_model: facebook/wav2vec2-large-xlsr-53
+ tags:
+ - automatic-speech-recognition
+ - ./sample_speech.py
+ - generated_from_trainer
+ model-index:
+ - name: ja-xlsr
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # ja-xlsr
+
+ This model is a fine-tuned version of [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on the ./SAMPLE_SPEECH.PY - NA dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 2.5952
+ - Cer: 0.3240
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0003
+ - train_batch_size: 4
+ - eval_batch_size: 4
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 4
+ - total_train_batch_size: 16
+ - total_eval_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 50
+ - num_epochs: 300
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Cer |
+ |:-------------:|:------:|:----:|:---------------:|:------:|
+ | 4.9138 | 6.52 | 150 | 4.7965 | 1.0 |
+ | 4.7484 | 13.04 | 300 | 4.6081 | 1.0 |
+ | 4.5894 | 19.57 | 450 | 4.4697 | 0.9851 |
+ | 4.2024 | 26.09 | 600 | 4.0373 | 0.9077 |
+ | 2.7314 | 32.61 | 750 | 2.5507 | 0.5341 |
+ | 1.2293 | 39.13 | 900 | 2.0146 | 0.4139 |
+ | 0.5544 | 45.65 | 1050 | 1.9821 | 0.3556 |
+ | 0.3224 | 52.17 | 1200 | 2.0190 | 0.3587 |
+ | 0.1951 | 58.7 | 1350 | 2.1229 | 0.3612 |
+ | 0.1539 | 65.22 | 1500 | 2.1114 | 0.3470 |
+ | 0.1165 | 71.74 | 1650 | 2.2748 | 0.3315 |
+ | 0.1119 | 78.26 | 1800 | 2.2391 | 0.3488 |
+ | 0.0989 | 84.78 | 1950 | 2.3438 | 0.3383 |
+ | 0.0915 | 91.3 | 2100 | 2.1218 | 0.3587 |
+ | 0.0721 | 97.83 | 2250 | 2.2428 | 0.3519 |
+ | 0.0742 | 104.35 | 2400 | 2.2293 | 0.3364 |
+ | 0.0629 | 110.87 | 2550 | 2.2878 | 0.3371 |
+ | 0.0495 | 117.39 | 2700 | 2.2672 | 0.3408 |
+ | 0.0466 | 123.91 | 2850 | 2.2532 | 0.3525 |
+ | 0.0424 | 130.43 | 3000 | 2.2844 | 0.3259 |
+ | 0.0446 | 136.96 | 3150 | 2.2763 | 0.3253 |
+ | 0.0411 | 143.48 | 3300 | 2.3011 | 0.3302 |
+ | 0.0419 | 150.0 | 3450 | 2.3201 | 0.3420 |
+ | 0.0333 | 156.52 | 3600 | 2.3644 | 0.3439 |
+ | 0.0384 | 163.04 | 3750 | 2.3685 | 0.3532 |
+ | 0.0367 | 169.57 | 3900 | 2.3970 | 0.3470 |
+ | 0.0307 | 176.09 | 4050 | 2.3530 | 0.3309 |
+ | 0.0328 | 182.61 | 4200 | 2.3415 | 0.3315 |
+ | 0.0271 | 189.13 | 4350 | 2.4165 | 0.3309 |
+ | 0.0213 | 195.65 | 4500 | 2.4478 | 0.3451 |
+ | 0.0193 | 202.17 | 4650 | 2.5241 | 0.3556 |
+ | 0.0204 | 208.7 | 4800 | 2.5700 | 0.3463 |
+ | 0.0185 | 215.22 | 4950 | 2.5837 | 0.3178 |
+ | 0.0161 | 221.74 | 5100 | 2.5139 | 0.3377 |
+ | 0.0167 | 228.26 | 5250 | 2.5288 | 0.3352 |
+ | 0.0148 | 234.78 | 5400 | 2.5741 | 0.3389 |
+ | 0.0141 | 241.3 | 5550 | 2.5174 | 0.3389 |
+ | 0.0122 | 247.83 | 5700 | 2.5573 | 0.3352 |
+ | 0.0115 | 254.35 | 5850 | 2.5790 | 0.3296 |
+ | 0.0141 | 260.87 | 6000 | 2.5774 | 0.3203 |
+ | 0.0123 | 267.39 | 6150 | 2.6147 | 0.3309 |
+ | 0.0214 | 273.91 | 6300 | 2.6202 | 0.3302 |
+ | 0.0107 | 280.43 | 6450 | 2.6264 | 0.3234 |
+ | 0.0086 | 286.96 | 6600 | 2.6075 | 0.3216 |
+ | 0.0106 | 293.48 | 6750 | 2.5960 | 0.3247 |
+ | 0.0085 | 300.0 | 6900 | 2.5952 | 0.3240 |
+
+
+ ### Framework versions
+
+ - Transformers 4.34.0
+ - Pytorch 2.1.0+cu121
+ - Datasets 2.14.5
+ - Tokenizers 0.14.1
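The README lists a peak learning rate of 0.0003, a linear scheduler and 50 warmup steps, and its results table ends at step 6900. A minimal sketch of the schedule those settings imply, assuming the usual rule (ramp linearly to the peak during warmup, then decay linearly to zero at the last step). The function name `linear_lr` is hypothetical, and this is an illustration, not the Trainer's own code:

```python
def linear_lr(step, peak_lr=3e-4, warmup_steps=50, total_steps=6900):
    """Learning rate after `step` optimizer updates (assumed linear warmup/decay)."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from the peak to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (50, 100, 3450, 6900):
        print(step, linear_lr(step))
```

Under these assumptions the rate at step 100 comes out near 2.978e-4, which agrees with the `learning_rate` logged for step 100 in `checkpoint-6750/trainer_state.json`.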
added_tokens.json ADDED
@@ -0,0 +1,6 @@
+ {
+ "</s>": 678,
+ "<s>": 677,
+ "[PAD]": 676,
+ "[UNK]": 675
+ }
all_results.json ADDED
@@ -0,0 +1,14 @@
+ {
+ "epoch": 300.0,
+ "eval_cer": 0.32403965303593557,
+ "eval_loss": 2.5951595306396484,
+ "eval_runtime": 1.15,
+ "eval_samples": 45,
+ "eval_samples_per_second": 39.131,
+ "eval_steps_per_second": 2.609,
+ "train_loss": 0.8083851718038753,
+ "train_runtime": 4592.71,
+ "train_samples": 359,
+ "train_samples_per_second": 23.45,
+ "train_steps_per_second": 1.502
+ }
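The figures in `all_results.json` can be cross-checked against the README hyperparameters with a little arithmetic. The sketch below assumes the last partial batch of each epoch still counts as one step (an assumption about the data loader, not something the commit states). Variable names are illustrative:

```python
import math

train_samples = 359      # "train_samples" in all_results.json
per_device_batch = 4     # train_batch_size in the README
num_devices = 4          # num_devices in the README
num_epochs = 300         # num_epochs in the README
train_runtime = 4592.71  # seconds, "train_runtime" in all_results.json

# Effective batch size across all devices.
total_batch = per_device_batch * num_devices

# Assumption: a final partial batch is kept, hence the ceiling.
steps_per_epoch = math.ceil(train_samples / total_batch)
total_steps = steps_per_epoch * num_epochs

steps_per_second = round(total_steps / train_runtime, 3)
samples_per_second = round(train_samples * num_epochs / train_runtime, 2)

print(total_batch, steps_per_epoch, total_steps)
print(steps_per_second, samples_per_second)
```

With these inputs the step count matches the final step in the README table (6900), and the two throughput values match `train_steps_per_second` and `train_samples_per_second`.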
checkpoint-6750/added_tokens.json ADDED
@@ -0,0 +1,6 @@
+ {
+ "</s>": 678,
+ "<s>": 677,
+ "[PAD]": 676,
+ "[UNK]": 675
+ }
checkpoint-6750/config.json ADDED
@@ -0,0 +1,117 @@
+ {
+ "_name_or_path": "facebook/wav2vec2-large-xlsr-53",
+ "activation_dropout": 0.0,
+ "adapter_attn_dim": null,
+ "adapter_kernel_size": 3,
+ "adapter_stride": 2,
+ "add_adapter": false,
+ "apply_spec_augment": true,
+ "architectures": [
+ "Wav2Vec2ForCTC"
+ ],
+ "attention_dropout": 0.1,
+ "bos_token_id": 1,
+ "classifier_proj_size": 256,
+ "codevector_dim": 768,
+ "contrastive_logits_temperature": 0.1,
+ "conv_bias": true,
+ "conv_dim": [
+ 512,
+ 512,
+ 512,
+ 512,
+ 512,
+ 512,
+ 512
+ ],
+ "conv_kernel": [
+ 10,
+ 3,
+ 3,
+ 3,
+ 3,
+ 2,
+ 2
+ ],
+ "conv_stride": [
+ 5,
+ 2,
+ 2,
+ 2,
+ 2,
+ 2,
+ 2
+ ],
+ "ctc_loss_reduction": "mean",
+ "ctc_zero_infinity": false,
+ "diversity_loss_weight": 0.1,
+ "do_stable_layer_norm": true,
+ "eos_token_id": 2,
+ "feat_extract_activation": "gelu",
+ "feat_extract_dropout": 0.0,
+ "feat_extract_norm": "layer",
+ "feat_proj_dropout": 0.05,
+ "feat_quantizer_dropout": 0.0,
+ "final_dropout": 0.0,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout": 0.05,
+ "hidden_size": 1024,
+ "initializer_range": 0.02,
+ "intermediate_size": 4096,
+ "layer_norm_eps": 1e-05,
+ "layerdrop": 0.05,
+ "mask_channel_length": 10,
+ "mask_channel_min_space": 1,
+ "mask_channel_other": 0.0,
+ "mask_channel_prob": 0.0,
+ "mask_channel_selection": "static",
+ "mask_feature_length": 10,
+ "mask_feature_min_masks": 0,
+ "mask_feature_prob": 0.0,
+ "mask_time_length": 10,
+ "mask_time_min_masks": 2,
+ "mask_time_min_space": 1,
+ "mask_time_other": 0.0,
+ "mask_time_prob": 0.05,
+ "mask_time_selection": "static",
+ "model_type": "wav2vec2",
+ "num_adapter_layers": 3,
+ "num_attention_heads": 16,
+ "num_codevector_groups": 2,
+ "num_codevectors_per_group": 320,
+ "num_conv_pos_embedding_groups": 16,
+ "num_conv_pos_embeddings": 128,
+ "num_feat_extract_layers": 7,
+ "num_hidden_layers": 24,
+ "num_negatives": 100,
+ "output_hidden_size": 1024,
+ "pad_token_id": 676,
+ "proj_codevector_dim": 768,
+ "tdnn_dilation": [
+ 1,
+ 2,
+ 3,
+ 1,
+ 1
+ ],
+ "tdnn_dim": [
+ 512,
+ 512,
+ 512,
+ 512,
+ 1500
+ ],
+ "tdnn_kernel": [
+ 5,
+ 3,
+ 3,
+ 1,
+ 1
+ ],
+ "torch_dtype": "float32",
+ "transformers_version": "4.34.0",
+ "use_weighted_layer_sum": false,
+ "vocab_size": 679,
+ "xvector_output_dim": 512
+ }
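The `conv_kernel` and `conv_stride` lists in this config determine how many feature frames the encoder produces per second of audio. A small sketch of that calculation, assuming each of the seven layers is an unpadded 1-D convolution so its output length is `floor((L - kernel) / stride) + 1`. The helper name `conv_output_length` is hypothetical:

```python
import math

CONV_KERNEL = [10, 3, 3, 3, 3, 2, 2]  # "conv_kernel" from config.json
CONV_STRIDE = [5, 2, 2, 2, 2, 2, 2]   # "conv_stride" from config.json
SAMPLING_RATE = 16000                 # "sampling_rate" from preprocessor_config.json


def conv_output_length(num_samples, kernels=CONV_KERNEL, strides=CONV_STRIDE):
    """Frames left after the conv stack (assumes no padding in any layer)."""
    length = num_samples
    for kernel, stride in zip(kernels, strides):
        length = (length - kernel) // stride + 1
    return length


# Overall downsampling factor is the product of the strides.
total_stride = math.prod(CONV_STRIDE)

if __name__ == "__main__":
    print(total_stride, conv_output_length(SAMPLING_RATE))
```

Under that assumption one second of 16 kHz audio maps to 49 frames, roughly one frame every 20 ms, which is the time resolution at which the CTC head emits tokens.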
checkpoint-6750/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c8daec975379a76bfdf399bf91e3a3bbd6c15f5ba6d356d3ce5e603c47d25404
+ size 2495727542
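The binary files in this commit appear as three-line Git LFS pointers (`version`, `oid`, `size`) rather than as their contents. A minimal sketch of reading such a pointer, using the `optimizer.pt` pointer text above. The helper name `parse_lfs_pointer` is hypothetical, and the parser only handles the three keys shown here:

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:c8daec975379a76bfdf399bf91e3a3bbd6c15f5ba6d356d3ce5e603c47d25404
size 2495727542
"""

info = parse_lfs_pointer(POINTER)

if __name__ == "__main__":
    print(info["hash_algo"], info["size"], round(info["size"] / 2**30, 2), "GiB")
```

The `size` field gives the size of the stored object in bytes, so the optimizer state is about twice as large as the 1264686250-byte `pytorch_model.bin` listed next.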
checkpoint-6750/preprocessor_config.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "do_normalize": true,
+ "feature_extractor_type": "Wav2Vec2FeatureExtractor",
+ "feature_size": 1,
+ "padding_side": "right",
+ "padding_value": 0,
+ "processor_class": "Wav2Vec2Processor",
+ "return_attention_mask": true,
+ "sampling_rate": 16000
+ }
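The preprocessor settings above (`do_normalize`, right-side padding with value 0, attention mask returned) describe how raw waveforms are prepared before they reach the model. A rough pure-Python sketch of that preparation, assuming normalization means zero mean and unit variance per utterance. The helper names and the `eps` constant are assumptions for illustration, not the library's code:

```python
import math


def normalize_waveform(samples, eps=1e-7):
    """Scale one utterance to zero mean and unit variance (eps is an assumed constant)."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    scale = math.sqrt(var + eps)
    return [(x - mean) / scale for x in samples]


def pad_right(batch, padding_value=0.0):
    """Right-pad utterances to a common length and build 1/0 attention masks."""
    max_len = max(len(seq) for seq in batch)
    padded = [list(seq) + [padding_value] * (max_len - len(seq)) for seq in batch]
    masks = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return padded, masks


if __name__ == "__main__":
    utterances = [[0.1, -0.2, 0.4, 0.0], [0.3, 0.3, -0.6]]
    normalized = [normalize_waveform(u) for u in utterances]
    inputs, attention_mask = pad_right(normalized)
    print(inputs)
    print(attention_mask)
```

Normalizing before padding keeps the padded zeros from shifting the statistics of the real samples, and the mask lets the model ignore the padded positions.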
checkpoint-6750/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b17410927892415a7457024d513c3b2a1c577f6e7069f28e38903bc839265581
+ size 1264686250
checkpoint-6750/rng_state_0.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9463127f1583b222ce5d7b9ac8b3b7a262658dbfc27fe01496ed3b356881b274
+ size 15024
checkpoint-6750/rng_state_1.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:712146ec2b46d8aa5a69d7d6c52e8f3df3aeda3960e40b713509a3f29fd0b8c7
+ size 15088
checkpoint-6750/rng_state_2.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fe1c33fa4f9b3268e0fa630c5389145b247b3ef88df73b2a672b869dbe1f14f2
+ size 15024
checkpoint-6750/rng_state_3.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ae6bc567733fbdec5df2244a91a8731abd7cca88032ec6e1be21ba8eadc79e86
+ size 15024
checkpoint-6750/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a15bc5149d23b4fbf0dec535f6501039fd33564ee898ebce9b8b537524a2f244
+ size 1064
checkpoint-6750/special_tokens_map.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "additional_special_tokens": [
+ "<s>",
+ "</s>"
+ ],
+ "bos_token": "<s>",
+ "eos_token": "</s>",
+ "pad_token": "[PAD]",
+ "unk_token": "[UNK]"
+ }
checkpoint-6750/tokenizer_config.json ADDED
@@ -0,0 +1,56 @@
+ {
+ "added_tokens_decoder": {
+ "675": {
+ "content": "[UNK]",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": false
+ },
+ "676": {
+ "content": "[PAD]",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": false
+ },
+ "677": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "678": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "additional_special_tokens": [
+ "<s>",
+ "</s>"
+ ],
+ "bos_token": "<s>",
+ "clean_up_tokenization_spaces": true,
+ "config": null,
+ "do_lower_case": false,
+ "eos_token": "</s>",
+ "model_max_length": 1000000000000000019884624838656,
+ "pad_token": "[PAD]",
+ "processor_class": "Wav2Vec2Processor",
+ "replace_word_delimiter_char": " ",
+ "target_lang": null,
+ "tokenizer_class": "Wav2Vec2CTCTokenizer",
+ "tokenizer_file": null,
+ "tokenizer_type": "wav2vec2",
+ "trust_remote_code": false,
+ "unk_token": "[UNK]",
+ "word_delimiter_token": "|"
+ }
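The tokenizer settings above (`[PAD]` at id 676, `word_delimiter_token` of `|`, replaced by a space) are the pieces a CTC decoder needs. A minimal sketch of greedy CTC decoding under the assumption that the pad token doubles as the CTC blank. The function name and the toy vocabulary are hypothetical, and only the pad id and delimiter come from the configs in this commit:

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, pad_id, word_delimiter="|"):
    """Collapse repeated frame predictions, drop blanks, and join into text."""
    # Step 1: merge runs of identical ids (one symbol may span many frames).
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: remove the blank symbol (assumed to be the pad token).
    tokens = [id_to_token[i] for i in collapsed if i != pad_id]
    # Step 3: turn the word delimiter into a space.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Toy vocabulary for illustration only; it is not taken from vocab.json.
TOY_VOCAB = {0: "|", 1: "a", 2: "b", 676: "[PAD]"}

if __name__ == "__main__":
    frames = [1, 1, 676, 1, 2, 2, 0, 0, 2, 676]
    print(ctc_greedy_decode(frames, TOY_VOCAB, pad_id=676))
```

The blank between the first two `a` runs is what lets the decoder keep a genuine double letter instead of merging it into one.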
checkpoint-6750/trainer_state.json ADDED
@@ -0,0 +1,1234 @@
+ {
+ "best_metric": null,
+ "best_model_checkpoint": null,
+ "epoch": 293.4782608695652,
+ "eval_steps": 150,
+ "global_step": 6750,
+ "is_hyper_param_search": false,
+ "is_local_process_zero": true,
+ "is_world_process_zero": true,
+ "log_history": [
+ {
+ "epoch": 2.17,
+ "learning_rate": 0.0003,
+ "loss": 35.2887,
+ "step": 50
+ },
+ {
+ "epoch": 4.35,
+ "learning_rate": 0.00029781021897810217,
+ "loss": 5.9569,
+ "step": 100
+ },
+ {
+ "epoch": 6.52,
+ "learning_rate": 0.00029562043795620436,
+ "loss": 4.9138,
+ "step": 150
+ },
+ {
+ "epoch": 6.52,
+ "eval_cer": 1.0,
+ "eval_loss": 4.7965407371521,
+ "eval_runtime": 1.256,
+ "eval_samples_per_second": 35.828,
+ "eval_steps_per_second": 2.389,
+ "step": 150
+ },
+ {
+ "epoch": 8.7,
+ "learning_rate": 0.00029343065693430656,
+ "loss": 4.887,
+ "step": 200
+ },
+ {
+ "epoch": 10.87,
+ "learning_rate": 0.00029124087591240875,
+ "loss": 4.8447,
+ "step": 250
+ },
+ {
+ "epoch": 13.04,
+ "learning_rate": 0.00028905109489051094,
+ "loss": 4.7484,
+ "step": 300
+ },
+ {
+ "epoch": 13.04,
+ "eval_cer": 1.0,
+ "eval_loss": 4.608075141906738,
+ "eval_runtime": 1.2451,
+ "eval_samples_per_second": 36.142,
+ "eval_steps_per_second": 2.409,
+ "step": 300
+ },
+ {
+ "epoch": 15.22,
+ "learning_rate": 0.00028686131386861314,
+ "loss": 4.6529,
+ "step": 350
+ },
+ {
+ "epoch": 17.39,
+ "learning_rate": 0.0002846715328467153,
+ "loss": 4.6373,
+ "step": 400
+ },
+ {
+ "epoch": 19.57,
+ "learning_rate": 0.00028248175182481747,
+ "loss": 4.5894,
+ "step": 450
+ },
+ {
+ "epoch": 19.57,
+ "eval_cer": 0.9851301115241635,
+ "eval_loss": 4.469708442687988,
+ "eval_runtime": 1.2325,
+ "eval_samples_per_second": 36.51,
+ "eval_steps_per_second": 2.434,
+ "step": 450
+ },
+ {
+ "epoch": 21.74,
+ "learning_rate": 0.00028029197080291966,
+ "loss": 4.5045,
+ "step": 500
+ },
+ {
+ "epoch": 23.91,
+ "learning_rate": 0.00027810218978102186,
+ "loss": 4.4076,
+ "step": 550
+ },
+ {
+ "epoch": 26.09,
+ "learning_rate": 0.00027591240875912405,
+ "loss": 4.2024,
+ "step": 600
+ },
+ {
+ "epoch": 26.09,
+ "eval_cer": 0.9076827757125155,
+ "eval_loss": 4.037315845489502,
+ "eval_runtime": 1.2357,
+ "eval_samples_per_second": 36.417,
+ "eval_steps_per_second": 2.428,
+ "step": 600
+ },
+ {
+ "epoch": 28.26,
+ "learning_rate": 0.00027372262773722625,
+ "loss": 3.8743,
+ "step": 650
+ },
+ {
+ "epoch": 30.43,
+ "learning_rate": 0.00027153284671532844,
+ "loss": 3.3488,
+ "step": 700
+ },
+ {
+ "epoch": 32.61,
+ "learning_rate": 0.00026934306569343063,
+ "loss": 2.7314,
+ "step": 750
+ },
+ {
+ "epoch": 32.61,
+ "eval_cer": 0.5340768277571252,
+ "eval_loss": 2.5507473945617676,
+ "eval_runtime": 1.2278,
+ "eval_samples_per_second": 36.651,
+ "eval_steps_per_second": 2.443,
+ "step": 750
+ },
+ {
+ "epoch": 34.78,
+ "learning_rate": 0.00026715328467153283,
+ "loss": 2.1968,
+ "step": 800
+ },
+ {
+ "epoch": 36.96,
+ "learning_rate": 0.000264963503649635,
+ "loss": 1.6522,
+ "step": 850
+ },
+ {
+ "epoch": 39.13,
+ "learning_rate": 0.0002627737226277372,
+ "loss": 1.2293,
+ "step": 900
+ },
+ {
+ "epoch": 39.13,
+ "eval_cer": 0.4138785625774473,
+ "eval_loss": 2.01461124420166,
+ "eval_runtime": 1.2246,
+ "eval_samples_per_second": 36.746,
+ "eval_steps_per_second": 2.45,
+ "step": 900
+ },
+ {
+ "epoch": 41.3,
+ "learning_rate": 0.0002605839416058394,
+ "loss": 0.9292,
+ "step": 950
+ },
+ {
+ "epoch": 43.48,
+ "learning_rate": 0.00025839416058394155,
+ "loss": 0.7208,
+ "step": 1000
+ },
+ {
+ "epoch": 45.65,
+ "learning_rate": 0.00025620437956204374,
+ "loss": 0.5544,
+ "step": 1050
+ },
+ {
+ "epoch": 45.65,
+ "eval_cer": 0.355638166047088,
+ "eval_loss": 1.9821244478225708,
+ "eval_runtime": 1.2073,
+ "eval_samples_per_second": 37.275,
+ "eval_steps_per_second": 2.485,
+ "step": 1050
+ },
+ {
+ "epoch": 47.83,
+ "learning_rate": 0.00025401459854014594,
+ "loss": 0.4757,
+ "step": 1100
+ },
+ {
+ "epoch": 50.0,
+ "learning_rate": 0.00025182481751824813,
+ "loss": 0.3895,
+ "step": 1150
+ },
+ {
+ "epoch": 52.17,
+ "learning_rate": 0.0002496350364963503,
+ "loss": 0.3224,
+ "step": 1200
+ },
+ {
+ "epoch": 52.17,
+ "eval_cer": 0.3587360594795539,
+ "eval_loss": 2.0189881324768066,
+ "eval_runtime": 1.1983,
+ "eval_samples_per_second": 37.554,
+ "eval_steps_per_second": 2.504,
+ "step": 1200
+ },
+ {
+ "epoch": 54.35,
+ "learning_rate": 0.0002474452554744525,
+ "loss": 0.279,
+ "step": 1250
+ },
+ {
+ "epoch": 56.52,
+ "learning_rate": 0.0002452554744525547,
+ "loss": 0.2285,
+ "step": 1300
+ },
+ {
+ "epoch": 58.7,
+ "learning_rate": 0.0002430656934306569,
+ "loss": 0.1951,
+ "step": 1350
+ },
+ {
+ "epoch": 58.7,
+ "eval_cer": 0.36121437422552666,
+ "eval_loss": 2.1229116916656494,
+ "eval_runtime": 1.2603,
+ "eval_samples_per_second": 35.706,
+ "eval_steps_per_second": 2.38,
+ "step": 1350
+ },
+ {
+ "epoch": 60.87,
+ "learning_rate": 0.0002408759124087591,
+ "loss": 0.1964,
+ "step": 1400
+ },
+ {
+ "epoch": 63.04,
+ "learning_rate": 0.0002386861313868613,
+ "loss": 0.1622,
+ "step": 1450
+ },
+ {
+ "epoch": 65.22,
+ "learning_rate": 0.0002364963503649635,
+ "loss": 0.1539,
+ "step": 1500
+ },
+ {
+ "epoch": 65.22,
+ "eval_cer": 0.3469640644361834,
+ "eval_loss": 2.111368179321289,
+ "eval_runtime": 1.2194,
+ "eval_samples_per_second": 36.903,
+ "eval_steps_per_second": 2.46,
+ "step": 1500
+ },
+ {
+ "epoch": 67.39,
+ "learning_rate": 0.00023430656934306568,
+ "loss": 0.1492,
+ "step": 1550
+ },
+ {
+ "epoch": 69.57,
+ "learning_rate": 0.00023211678832116788,
+ "loss": 0.1404,
+ "step": 1600
+ },
+ {
+ "epoch": 71.74,
+ "learning_rate": 0.00022992700729927004,
+ "loss": 0.1165,
+ "step": 1650
+ },
+ {
+ "epoch": 71.74,
+ "eval_cer": 0.33147459727385375,
+ "eval_loss": 2.274796485900879,
+ "eval_runtime": 1.1874,
+ "eval_samples_per_second": 37.898,
+ "eval_steps_per_second": 2.527,
+ "step": 1650
+ },
+ {
+ "epoch": 73.91,
+ "learning_rate": 0.00022773722627737224,
+ "loss": 0.1268,
+ "step": 1700
+ },
+ {
+ "epoch": 76.09,
+ "learning_rate": 0.00022554744525547443,
+ "loss": 0.1186,
+ "step": 1750
+ },
+ {
+ "epoch": 78.26,
+ "learning_rate": 0.00022335766423357663,
+ "loss": 0.1119,
+ "step": 1800
+ },
+ {
+ "epoch": 78.26,
+ "eval_cer": 0.34882280049566294,
+ "eval_loss": 2.2390518188476562,
+ "eval_runtime": 1.3465,
+ "eval_samples_per_second": 33.42,
+ "eval_steps_per_second": 2.228,
+ "step": 1800
+ },
+ {
+ "epoch": 80.43,
+ "learning_rate": 0.00022116788321167882,
+ "loss": 0.0988,
+ "step": 1850
+ },
+ {
+ "epoch": 82.61,
+ "learning_rate": 0.00021897810218978101,
+ "loss": 0.112,
+ "step": 1900
+ },
+ {
+ "epoch": 84.78,
+ "learning_rate": 0.0002167883211678832,
+ "loss": 0.0989,
+ "step": 1950
+ },
+ {
+ "epoch": 84.78,
+ "eval_cer": 0.3382899628252788,
+ "eval_loss": 2.343754529953003,
+ "eval_runtime": 1.2055,
+ "eval_samples_per_second": 37.329,
+ "eval_steps_per_second": 2.489,
+ "step": 1950
+ },
+ {
+ "epoch": 86.96,
+ "learning_rate": 0.00021459854014598537,
+ "loss": 0.097,
+ "step": 2000
+ },
+ {
+ "epoch": 89.13,
+ "learning_rate": 0.00021240875912408757,
+ "loss": 0.0854,
+ "step": 2050
+ },
+ {
+ "epoch": 91.3,
+ "learning_rate": 0.00021021897810218976,
+ "loss": 0.0915,
+ "step": 2100
+ },
+ {
+ "epoch": 91.3,
+ "eval_cer": 0.3587360594795539,
+ "eval_loss": 2.121840000152588,
+ "eval_runtime": 1.2037,
+ "eval_samples_per_second": 37.386,
+ "eval_steps_per_second": 2.492,
+ "step": 2100
+ },
+ {
+ "epoch": 93.48,
+ "learning_rate": 0.00020802919708029196,
+ "loss": 0.078,
+ "step": 2150
+ },
+ {
+ "epoch": 95.65,
+ "learning_rate": 0.00020583941605839415,
+ "loss": 0.0857,
+ "step": 2200
+ },
+ {
+ "epoch": 97.83,
+ "learning_rate": 0.00020364963503649632,
+ "loss": 0.0721,
+ "step": 2250
+ },
+ {
+ "epoch": 97.83,
+ "eval_cer": 0.35192069392812886,
+ "eval_loss": 2.242812395095825,
+ "eval_runtime": 1.1964,
+ "eval_samples_per_second": 37.614,
+ "eval_steps_per_second": 2.508,
+ "step": 2250
+ },
+ {
+ "epoch": 100.0,
+ "learning_rate": 0.0002014598540145985,
+ "loss": 0.0799,
+ "step": 2300
+ },
+ {
+ "epoch": 102.17,
+ "learning_rate": 0.0001992700729927007,
+ "loss": 0.0798,
+ "step": 2350
+ },
+ {
+ "epoch": 104.35,
+ "learning_rate": 0.0001970802919708029,
+ "loss": 0.0742,
+ "step": 2400
+ },
+ {
+ "epoch": 104.35,
+ "eval_cer": 0.33643122676579923,
+ "eval_loss": 2.229339838027954,
+ "eval_runtime": 1.2156,
+ "eval_samples_per_second": 37.019,
+ "eval_steps_per_second": 2.468,
+ "step": 2400
+ },
+ {
+ "epoch": 106.52,
+ "learning_rate": 0.0001948905109489051,
+ "loss": 0.0692,
+ "step": 2450
+ },
+ {
+ "epoch": 108.7,
+ "learning_rate": 0.0001927007299270073,
+ "loss": 0.0664,
+ "step": 2500
+ },
+ {
+ "epoch": 110.87,
+ "learning_rate": 0.00019051094890510948,
+ "loss": 0.0629,
+ "step": 2550
+ },
+ {
+ "epoch": 110.87,
+ "eval_cer": 0.33705080545229243,
+ "eval_loss": 2.2878150939941406,
+ "eval_runtime": 1.2044,
+ "eval_samples_per_second": 37.364,
+ "eval_steps_per_second": 2.491,
+ "step": 2550
+ },
+ {
+ "epoch": 113.04,
+ "learning_rate": 0.00018832116788321167,
+ "loss": 0.0619,
+ "step": 2600
+ },
+ {
+ "epoch": 115.22,
+ "learning_rate": 0.00018613138686131387,
+ "loss": 0.0582,
+ "step": 2650
+ },
+ {
+ "epoch": 117.39,
+ "learning_rate": 0.00018394160583941606,
+ "loss": 0.0495,
+ "step": 2700
+ },
+ {
+ "epoch": 117.39,
+ "eval_cer": 0.34076827757125155,
+ "eval_loss": 2.2671637535095215,
+ "eval_runtime": 1.2039,
+ "eval_samples_per_second": 37.379,
+ "eval_steps_per_second": 2.492,
+ "step": 2700
+ },
+ {
+ "epoch": 119.57,
+ "learning_rate": 0.00018175182481751826,
+ "loss": 0.0614,
+ "step": 2750
+ },
+ {
+ "epoch": 121.74,
+ "learning_rate": 0.00017956204379562042,
+ "loss": 0.0565,
+ "step": 2800
+ },
+ {
+ "epoch": 123.91,
+ "learning_rate": 0.00017737226277372262,
+ "loss": 0.0466,
+ "step": 2850
+ },
+ {
+ "epoch": 123.91,
+ "eval_cer": 0.35254027261462206,
+ "eval_loss": 2.2532107830047607,
+ "eval_runtime": 1.3563,
+ "eval_samples_per_second": 33.179,
+ "eval_steps_per_second": 2.212,
+ "step": 2850
+ },
+ {
+ "epoch": 126.09,
+ "learning_rate": 0.00017518248175182478,
+ "loss": 0.0465,
+ "step": 2900
+ },
+ {
+ "epoch": 128.26,
+ "learning_rate": 0.00017299270072992698,
+ "loss": 0.0496,
+ "step": 2950
+ },
+ {
+ "epoch": 130.43,
+ "learning_rate": 0.00017080291970802917,
+ "loss": 0.0424,
+ "step": 3000
+ },
+ {
+ "epoch": 130.43,
+ "eval_cer": 0.32589838909541513,
+ "eval_loss": 2.2844393253326416,
+ "eval_runtime": 1.2006,
+ "eval_samples_per_second": 37.48,
+ "eval_steps_per_second": 2.499,
+ "step": 3000
+ },
+ {
+ "epoch": 132.61,
+ "learning_rate": 0.00016861313868613137,
+ "loss": 0.0483,
+ "step": 3050
+ },
+ {
+ "epoch": 134.78,
+ "learning_rate": 0.00016642335766423356,
+ "loss": 0.0488,
+ "step": 3100
+ },
+ {
+ "epoch": 136.96,
+ "learning_rate": 0.00016423357664233575,
+ "loss": 0.0446,
+ "step": 3150
+ },
+ {
+ "epoch": 136.96,
+ "eval_cer": 0.3252788104089219,
+ "eval_loss": 2.2763445377349854,
+ "eval_runtime": 1.2043,
+ "eval_samples_per_second": 37.368,
+ "eval_steps_per_second": 2.491,
+ "step": 3150
+ },
+ {
+ "epoch": 139.13,
+ "learning_rate": 0.00016204379562043795,
+ "loss": 0.0424,
+ "step": 3200
+ },
+ {
+ "epoch": 141.3,
+ "learning_rate": 0.00015985401459854014,
+ "loss": 0.0429,
+ "step": 3250
+ },
+ {
+ "epoch": 143.48,
+ "learning_rate": 0.00015766423357664234,
+ "loss": 0.0411,
+ "step": 3300
+ },
+ {
+ "epoch": 143.48,
+ "eval_cer": 0.3302354399008674,
+ "eval_loss": 2.301079034805298,
+ "eval_runtime": 1.345,
+ "eval_samples_per_second": 33.458,
+ "eval_steps_per_second": 2.231,
+ "step": 3300
+ },
+ {
+ "epoch": 145.65,
+ "learning_rate": 0.00015547445255474453,
+ "loss": 0.0392,
+ "step": 3350
+ },
+ {
+ "epoch": 147.83,
+ "learning_rate": 0.00015328467153284672,
+ "loss": 0.0426,
+ "step": 3400
+ },
+ {
+ "epoch": 150.0,
+ "learning_rate": 0.00015109489051094892,
+ "loss": 0.0419,
+ "step": 3450
+ },
+ {
+ "epoch": 150.0,
+ "eval_cer": 0.3420074349442379,
+ "eval_loss": 2.320059299468994,
+ "eval_runtime": 1.2411,
+ "eval_samples_per_second": 36.259,
+ "eval_steps_per_second": 2.417,
+ "step": 3450
+ },
+ {
+ "epoch": 152.17,
+ "learning_rate": 0.00014890510948905108,
+ "loss": 0.0386,
+ "step": 3500
+ },
+ {
+ "epoch": 154.35,
+ "learning_rate": 0.00014671532846715328,
+ "loss": 0.0402,
+ "step": 3550
+ },
+ {
+ "epoch": 156.52,
+ "learning_rate": 0.00014452554744525547,
+ "loss": 0.0333,
+ "step": 3600
+ },
+ {
+ "epoch": 156.52,
+ "eval_cer": 0.34386617100371747,
+ "eval_loss": 2.364445209503174,
+ "eval_runtime": 1.2337,
+ "eval_samples_per_second": 36.475,
+ "eval_steps_per_second": 2.432,
+ "step": 3600
+ },
+ {
+ "epoch": 158.7,
+ "learning_rate": 0.00014233576642335764,
+ "loss": 0.0434,
+ "step": 3650
+ },
+ {
+ "epoch": 160.87,
+ "learning_rate": 0.00014014598540145983,
+ "loss": 0.0393,
+ "step": 3700
+ },
+ {
+ "epoch": 163.04,
+ "learning_rate": 0.00013795620437956203,
+ "loss": 0.0384,
+ "step": 3750
+ },
+ {
+ "epoch": 163.04,
+ "eval_cer": 0.35315985130111527,
+ "eval_loss": 2.3685200214385986,
+ "eval_runtime": 1.2136,
+ "eval_samples_per_second": 37.081,
+ "eval_steps_per_second": 2.472,
+ "step": 3750
+ },
+ {
+ "epoch": 165.22,
+ "learning_rate": 0.00013576642335766422,
+ "loss": 0.0324,
+ "step": 3800
+ },
+ {
+ "epoch": 167.39,
+ "learning_rate": 0.00013357664233576641,
+ "loss": 0.0438,
+ "step": 3850
+ },
+ {
+ "epoch": 169.57,
+ "learning_rate": 0.0001313868613138686,
+ "loss": 0.0367,
+ "step": 3900
+ },
+ {
+ "epoch": 169.57,
+ "eval_cer": 0.3469640644361834,
+ "eval_loss": 2.397036552429199,
+ "eval_runtime": 1.2259,
+ "eval_samples_per_second": 36.708,
+ "eval_steps_per_second": 2.447,
+ "step": 3900
+ },
+ {
+ "epoch": 171.74,
+ "learning_rate": 0.00012919708029197077,
+ "loss": 0.0336,
+ "step": 3950
+ },
+ {
+ "epoch": 173.91,
+ "learning_rate": 0.00012700729927007297,
+ "loss": 0.037,
+ "step": 4000
+ },
+ {
+ "epoch": 176.09,
+ "learning_rate": 0.00012481751824817516,
+ "loss": 0.0307,
+ "step": 4050
+ },
+ {
+ "epoch": 176.09,
+ "eval_cer": 0.3308550185873606,
+ "eval_loss": 2.3530125617980957,
+ "eval_runtime": 1.2484,
+ "eval_samples_per_second": 36.047,
+ "eval_steps_per_second": 2.403,
+ "step": 4050
+ },
+ {
+ "epoch": 178.26,
+ "learning_rate": 0.00012262773722627736,
+ "loss": 0.0284,
+ "step": 4100
+ },
+ {
+ "epoch": 180.43,
+ "learning_rate": 0.00012043795620437955,
+ "loss": 0.0233,
+ "step": 4150
+ },
+ {
+ "epoch": 182.61,
+ "learning_rate": 0.00011824817518248174,
+ "loss": 0.0328,
+ "step": 4200
+ },
+ {
+ "epoch": 182.61,
+ "eval_cer": 0.33147459727385375,
+ "eval_loss": 2.3414556980133057,
+ "eval_runtime": 1.2281,
+ "eval_samples_per_second": 36.64,
+ "eval_steps_per_second": 2.443,
+ "step": 4200
+ },
+ {
+ "epoch": 184.78,
+ "learning_rate": 0.00011605839416058394,
+ "loss": 0.0285,
+ "step": 4250
+ },
+ {
+ "epoch": 186.96,
+ "learning_rate": 0.00011386861313868612,
+ "loss": 0.0222,
+ "step": 4300
+ },
+ {
+ "epoch": 189.13,
+ "learning_rate": 0.00011167883211678831,
+ "loss": 0.0271,
+ "step": 4350
+ },
+ {
+ "epoch": 189.13,
+ "eval_cer": 0.3308550185873606,
788
+ "eval_loss": 2.4165024757385254,
789
+ "eval_runtime": 1.1891,
790
+ "eval_samples_per_second": 37.844,
791
+ "eval_steps_per_second": 2.523,
792
+ "step": 4350
793
+ },
794
+ {
795
+ "epoch": 191.3,
796
+ "learning_rate": 0.00010948905109489051,
797
+ "loss": 0.0307,
798
+ "step": 4400
799
+ },
800
+ {
801
+ "epoch": 193.48,
802
+ "learning_rate": 0.00010729927007299269,
803
+ "loss": 0.023,
804
+ "step": 4450
805
+ },
806
+ {
807
+ "epoch": 195.65,
808
+ "learning_rate": 0.00010510948905109488,
809
+ "loss": 0.0213,
810
+ "step": 4500
811
+ },
812
+ {
813
+ "epoch": 195.65,
814
+ "eval_cer": 0.3451053283767038,
815
+ "eval_loss": 2.447828769683838,
816
+ "eval_runtime": 1.1406,
817
+ "eval_samples_per_second": 39.452,
818
+ "eval_steps_per_second": 2.63,
819
+ "step": 4500
820
+ },
821
+ {
822
+ "epoch": 197.83,
823
+ "learning_rate": 0.00010291970802919708,
824
+ "loss": 0.021,
825
+ "step": 4550
826
+ },
827
+ {
828
+ "epoch": 200.0,
829
+ "learning_rate": 0.00010072992700729926,
830
+ "loss": 0.0246,
831
+ "step": 4600
832
+ },
833
+ {
834
+ "epoch": 202.17,
835
+ "learning_rate": 9.854014598540145e-05,
836
+ "loss": 0.0193,
837
+ "step": 4650
838
+ },
839
+ {
840
+ "epoch": 202.17,
841
+ "eval_cer": 0.355638166047088,
842
+ "eval_loss": 2.524061918258667,
843
+ "eval_runtime": 1.203,
844
+ "eval_samples_per_second": 37.406,
845
+ "eval_steps_per_second": 2.494,
846
+ "step": 4650
847
+ },
848
+ {
849
+ "epoch": 204.35,
850
+ "learning_rate": 9.635036496350364e-05,
851
+ "loss": 0.0223,
852
+ "step": 4700
853
+ },
854
+ {
855
+ "epoch": 206.52,
856
+ "learning_rate": 9.416058394160584e-05,
857
+ "loss": 0.0223,
858
+ "step": 4750
859
+ },
860
+ {
861
+ "epoch": 208.7,
862
+ "learning_rate": 9.197080291970803e-05,
863
+ "loss": 0.0204,
864
+ "step": 4800
865
+ },
866
+ {
867
+ "epoch": 208.7,
868
+ "eval_cer": 0.34634448574969023,
869
+ "eval_loss": 2.570009708404541,
870
+ "eval_runtime": 1.2664,
871
+ "eval_samples_per_second": 35.533,
872
+ "eval_steps_per_second": 2.369,
873
+ "step": 4800
874
+ },
875
+ {
876
+ "epoch": 210.87,
877
+ "learning_rate": 8.978102189781021e-05,
878
+ "loss": 0.0202,
879
+ "step": 4850
880
+ },
881
+ {
882
+ "epoch": 213.04,
883
+ "learning_rate": 8.759124087591239e-05,
884
+ "loss": 0.0193,
885
+ "step": 4900
886
+ },
887
+ {
888
+ "epoch": 215.22,
889
+ "learning_rate": 8.540145985401459e-05,
890
+ "loss": 0.0185,
891
+ "step": 4950
892
+ },
893
+ {
894
+ "epoch": 215.22,
895
+ "eval_cer": 0.31784386617100374,
896
+ "eval_loss": 2.583724021911621,
897
+ "eval_runtime": 1.2549,
898
+ "eval_samples_per_second": 35.859,
899
+ "eval_steps_per_second": 2.391,
900
+ "step": 4950
901
+ },
902
+ {
903
+ "epoch": 217.39,
904
+ "learning_rate": 8.321167883211678e-05,
905
+ "loss": 0.0191,
906
+ "step": 5000
907
+ },
908
+ {
909
+ "epoch": 219.57,
910
+ "learning_rate": 8.102189781021897e-05,
911
+ "loss": 0.0169,
912
+ "step": 5050
913
+ },
914
+ {
915
+ "epoch": 221.74,
916
+ "learning_rate": 7.883211678832117e-05,
917
+ "loss": 0.0161,
918
+ "step": 5100
919
+ },
920
+ {
921
+ "epoch": 221.74,
922
+ "eval_cer": 0.33767038413878564,
923
+ "eval_loss": 2.513859987258911,
924
+ "eval_runtime": 1.2515,
925
+ "eval_samples_per_second": 35.958,
926
+ "eval_steps_per_second": 2.397,
927
+ "step": 5100
928
+ },
929
+ {
930
+ "epoch": 223.91,
931
+ "learning_rate": 7.664233576642336e-05,
932
+ "loss": 0.0183,
933
+ "step": 5150
934
+ },
935
+ {
936
+ "epoch": 226.09,
937
+ "learning_rate": 7.445255474452554e-05,
938
+ "loss": 0.0228,
939
+ "step": 5200
940
+ },
941
+ {
942
+ "epoch": 228.26,
943
+ "learning_rate": 7.226277372262774e-05,
944
+ "loss": 0.0167,
945
+ "step": 5250
946
+ },
947
+ {
948
+ "epoch": 228.26,
949
+ "eval_cer": 0.3351920693928129,
950
+ "eval_loss": 2.5287766456604004,
951
+ "eval_runtime": 1.2044,
952
+ "eval_samples_per_second": 37.363,
953
+ "eval_steps_per_second": 2.491,
954
+ "step": 5250
955
+ },
956
+ {
957
+ "epoch": 230.43,
958
+ "learning_rate": 7.007299270072992e-05,
959
+ "loss": 0.0181,
960
+ "step": 5300
961
+ },
962
+ {
963
+ "epoch": 232.61,
964
+ "learning_rate": 6.788321167883211e-05,
965
+ "loss": 0.0144,
966
+ "step": 5350
967
+ },
968
+ {
969
+ "epoch": 234.78,
970
+ "learning_rate": 6.56934306569343e-05,
971
+ "loss": 0.0148,
972
+ "step": 5400
973
+ },
974
+ {
975
+ "epoch": 234.78,
976
+ "eval_cer": 0.338909541511772,
977
+ "eval_loss": 2.574066400527954,
978
+ "eval_runtime": 1.2534,
979
+ "eval_samples_per_second": 35.904,
980
+ "eval_steps_per_second": 2.394,
981
+ "step": 5400
982
+ },
983
+ {
984
+ "epoch": 236.96,
985
+ "learning_rate": 6.350364963503648e-05,
986
+ "loss": 0.0143,
987
+ "step": 5450
988
+ },
989
+ {
990
+ "epoch": 239.13,
991
+ "learning_rate": 6.131386861313868e-05,
992
+ "loss": 0.0197,
993
+ "step": 5500
994
+ },
995
+ {
996
+ "epoch": 241.3,
997
+ "learning_rate": 5.912408759124087e-05,
998
+ "loss": 0.0141,
999
+ "step": 5550
1000
+ },
1001
+ {
1002
+ "epoch": 241.3,
1003
+ "eval_cer": 0.338909541511772,
1004
+ "eval_loss": 2.5173895359039307,
1005
+ "eval_runtime": 1.1989,
1006
+ "eval_samples_per_second": 37.536,
1007
+ "eval_steps_per_second": 2.502,
1008
+ "step": 5550
1009
+ },
1010
+ {
1011
+ "epoch": 243.48,
1012
+ "learning_rate": 5.693430656934306e-05,
1013
+ "loss": 0.0165,
1014
+ "step": 5600
1015
+ },
1016
+ {
1017
+ "epoch": 245.65,
1018
+ "learning_rate": 5.4744525547445253e-05,
1019
+ "loss": 0.0127,
1020
+ "step": 5650
1021
+ },
1022
+ {
1023
+ "epoch": 247.83,
1024
+ "learning_rate": 5.255474452554744e-05,
1025
+ "loss": 0.0122,
1026
+ "step": 5700
1027
+ },
1028
+ {
1029
+ "epoch": 247.83,
1030
+ "eval_cer": 0.3351920693928129,
1031
+ "eval_loss": 2.5573315620422363,
1032
+ "eval_runtime": 1.2363,
1033
+ "eval_samples_per_second": 36.4,
1034
+ "eval_steps_per_second": 2.427,
1035
+ "step": 5700
1036
+ },
1037
+ {
1038
+ "epoch": 250.0,
1039
+ "learning_rate": 5.036496350364963e-05,
1040
+ "loss": 0.0135,
1041
+ "step": 5750
1042
+ },
1043
+ {
1044
+ "epoch": 252.17,
1045
+ "learning_rate": 4.817518248175182e-05,
1046
+ "loss": 0.0116,
1047
+ "step": 5800
1048
+ },
1049
+ {
1050
+ "epoch": 254.35,
1051
+ "learning_rate": 4.5985401459854016e-05,
1052
+ "loss": 0.0115,
1053
+ "step": 5850
1054
+ },
1055
+ {
1056
+ "epoch": 254.35,
1057
+ "eval_cer": 0.32961586121437425,
1058
+ "eval_loss": 2.579023838043213,
1059
+ "eval_runtime": 1.2327,
1060
+ "eval_samples_per_second": 36.506,
1061
+ "eval_steps_per_second": 2.434,
1062
+ "step": 5850
1063
+ },
1064
+ {
1065
+ "epoch": 256.52,
1066
+ "learning_rate": 4.3795620437956196e-05,
1067
+ "loss": 0.0141,
1068
+ "step": 5900
1069
+ },
1070
+ {
1071
+ "epoch": 258.7,
1072
+ "learning_rate": 4.160583941605839e-05,
1073
+ "loss": 0.0143,
1074
+ "step": 5950
1075
+ },
1076
+ {
1077
+ "epoch": 260.87,
1078
+ "learning_rate": 3.9416058394160584e-05,
1079
+ "loss": 0.0141,
1080
+ "step": 6000
1081
+ },
1082
+ {
1083
+ "epoch": 260.87,
1084
+ "eval_cer": 0.32032218091697645,
1085
+ "eval_loss": 2.577375888824463,
1086
+ "eval_runtime": 1.2321,
1087
+ "eval_samples_per_second": 36.524,
1088
+ "eval_steps_per_second": 2.435,
1089
+ "step": 6000
1090
+ },
1091
+ {
1092
+ "epoch": 263.04,
1093
+ "learning_rate": 3.722627737226277e-05,
1094
+ "loss": 0.0116,
1095
+ "step": 6050
1096
+ },
1097
+ {
1098
+ "epoch": 265.22,
1099
+ "learning_rate": 3.503649635036496e-05,
1100
+ "loss": 0.0101,
1101
+ "step": 6100
1102
+ },
1103
+ {
1104
+ "epoch": 267.39,
1105
+ "learning_rate": 3.284671532846715e-05,
1106
+ "loss": 0.0123,
1107
+ "step": 6150
1108
+ },
1109
+ {
1110
+ "epoch": 267.39,
1111
+ "eval_cer": 0.3308550185873606,
1112
+ "eval_loss": 2.614670753479004,
1113
+ "eval_runtime": 1.1319,
1114
+ "eval_samples_per_second": 39.755,
1115
+ "eval_steps_per_second": 2.65,
1116
+ "step": 6150
1117
+ },
1118
+ {
1119
+ "epoch": 269.57,
1120
+ "learning_rate": 3.065693430656934e-05,
1121
+ "loss": 0.0151,
1122
+ "step": 6200
1123
+ },
1124
+ {
1125
+ "epoch": 271.74,
1126
+ "learning_rate": 2.846715328467153e-05,
1127
+ "loss": 0.0099,
1128
+ "step": 6250
1129
+ },
1130
+ {
1131
+ "epoch": 273.91,
1132
+ "learning_rate": 2.627737226277372e-05,
1133
+ "loss": 0.0214,
1134
+ "step": 6300
1135
+ },
1136
+ {
1137
+ "epoch": 273.91,
1138
+ "eval_cer": 0.3302354399008674,
1139
+ "eval_loss": 2.620166778564453,
1140
+ "eval_runtime": 1.262,
1141
+ "eval_samples_per_second": 35.657,
1142
+ "eval_steps_per_second": 2.377,
1143
+ "step": 6300
1144
+ },
1145
+ {
1146
+ "epoch": 276.09,
1147
+ "learning_rate": 2.408759124087591e-05,
1148
+ "loss": 0.0085,
1149
+ "step": 6350
1150
+ },
1151
+ {
1152
+ "epoch": 278.26,
1153
+ "learning_rate": 2.1897810218978098e-05,
1154
+ "loss": 0.0119,
1155
+ "step": 6400
1156
+ },
1157
+ {
1158
+ "epoch": 280.43,
1159
+ "learning_rate": 1.9708029197080292e-05,
1160
+ "loss": 0.0107,
1161
+ "step": 6450
1162
+ },
1163
+ {
1164
+ "epoch": 280.43,
1165
+ "eval_cer": 0.32342007434944237,
1166
+ "eval_loss": 2.6263809204101562,
1167
+ "eval_runtime": 1.2547,
1168
+ "eval_samples_per_second": 35.867,
1169
+ "eval_steps_per_second": 2.391,
1170
+ "step": 6450
1171
+ },
1172
+ {
1173
+ "epoch": 282.61,
1174
+ "learning_rate": 1.751824817518248e-05,
1175
+ "loss": 0.0107,
1176
+ "step": 6500
1177
+ },
1178
+ {
1179
+ "epoch": 284.78,
1180
+ "learning_rate": 1.532846715328467e-05,
1181
+ "loss": 0.0105,
1182
+ "step": 6550
1183
+ },
1184
+ {
1185
+ "epoch": 286.96,
1186
+ "learning_rate": 1.313868613138686e-05,
1187
+ "loss": 0.0086,
1188
+ "step": 6600
1189
+ },
1190
+ {
1191
+ "epoch": 286.96,
1192
+ "eval_cer": 0.3215613382899628,
1193
+ "eval_loss": 2.607461452484131,
1194
+ "eval_runtime": 1.204,
1195
+ "eval_samples_per_second": 37.374,
1196
+ "eval_steps_per_second": 2.492,
1197
+ "step": 6600
1198
+ },
1199
+ {
1200
+ "epoch": 289.13,
1201
+ "learning_rate": 1.0948905109489049e-05,
1202
+ "loss": 0.0095,
1203
+ "step": 6650
1204
+ },
1205
+ {
1206
+ "epoch": 291.3,
1207
+ "learning_rate": 8.75912408759124e-06,
1208
+ "loss": 0.0108,
1209
+ "step": 6700
1210
+ },
1211
+ {
1212
+ "epoch": 293.48,
1213
+ "learning_rate": 6.56934306569343e-06,
1214
+ "loss": 0.0106,
1215
+ "step": 6750
1216
+ },
1217
+ {
1218
+ "epoch": 293.48,
1219
+ "eval_cer": 0.3246592317224288,
1220
+ "eval_loss": 2.595982789993286,
1221
+ "eval_runtime": 1.1323,
1222
+ "eval_samples_per_second": 39.741,
1223
+ "eval_steps_per_second": 2.649,
1224
+ "step": 6750
1225
+ }
1226
+ ],
1227
+ "logging_steps": 50,
1228
+ "max_steps": 6900,
1229
+ "num_train_epochs": 300,
1230
+ "save_steps": 150,
1231
+ "total_flos": 2.260883648900445e+19,
1232
+ "trial_name": null,
1233
+ "trial_params": null
1234
+ }
checkpoint-6750/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c0741fe1648758c067baeb587c00ff9d0528d818e60814b62c8d0f8ca82d1c4d
+ size 4472
checkpoint-6750/vocab.json ADDED
@@ -0,0 +1,679 @@
1
+ {
2
+ "0": 1,
3
+ "1": 2,
4
+ "2": 3,
5
+ "3": 4,
6
+ "4": 5,
7
+ "5": 6,
8
+ "6": 7,
9
+ "7": 8,
10
+ "8": 9,
11
+ "9": 10,
12
+ "[PAD]": 676,
13
+ "[UNK]": 675,
14
+ "|": 0,
15
+ " ": 11,
16
+ "、": 12,
17
+ "。": 13,
18
+ "々": 14,
19
+ "ぁ": 15,
20
+ "あ": 16,
21
+ "い": 17,
22
+ "う": 18,
23
+ "え": 19,
24
+ "お": 20,
25
+ "か": 21,
26
+ "が": 22,
27
+ "き": 23,
28
+ "ぎ": 24,
29
+ "く": 25,
30
+ "ぐ": 26,
31
+ "け": 27,
32
+ "げ": 28,
33
+ "こ": 29,
34
+ "ご": 30,
35
+ "さ": 31,
36
+ "ざ": 32,
37
+ "し": 33,
38
+ "じ": 34,
39
+ "す": 35,
40
+ "ず": 36,
41
+ "せ": 37,
42
+ "ぜ": 38,
43
+ "そ": 39,
44
+ "た": 40,
45
+ "だ": 41,
46
+ "ち": 42,
47
+ "っ": 43,
48
+ "つ": 44,
49
+ "て": 45,
50
+ "で": 46,
51
+ "と": 47,
52
+ "ど": 48,
53
+ "な": 49,
54
+ "に": 50,
55
+ "ぬ": 51,
56
+ "ね": 52,
57
+ "の": 53,
58
+ "は": 54,
59
+ "ば": 55,
60
+ "ぱ": 56,
61
+ "ひ": 57,
62
+ "び": 58,
63
+ "ふ": 59,
64
+ "ぶ": 60,
65
+ "ぷ": 61,
66
+ "へ": 62,
67
+ "べ": 63,
68
+ "ほ": 64,
69
+ "ぼ": 65,
70
+ "ぽ": 66,
71
+ "ま": 67,
72
+ "み": 68,
73
+ "む": 69,
74
+ "め": 70,
75
+ "も": 71,
76
+ "ゃ": 72,
77
+ "や": 73,
78
+ "ゆ": 74,
79
+ "ょ": 75,
80
+ "よ": 76,
81
+ "ら": 77,
82
+ "り": 78,
83
+ "る": 79,
84
+ "れ": 80,
85
+ "ろ": 81,
86
+ "わ": 82,
87
+ "を": 83,
88
+ "ん": 84,
89
+ "ァ": 85,
90
+ "ア": 86,
91
+ "ィ": 87,
92
+ "イ": 88,
93
+ "ウ": 89,
94
+ "ェ": 90,
95
+ "エ": 91,
96
+ "ォ": 92,
97
+ "オ": 93,
98
+ "カ": 94,
99
+ "ガ": 95,
100
+ "キ": 96,
101
+ "ギ": 97,
102
+ "ク": 98,
103
+ "グ": 99,
104
+ "ケ": 100,
105
+ "ゲ": 101,
106
+ "コ": 102,
107
+ "ゴ": 103,
108
+ "サ": 104,
109
+ "ザ": 105,
110
+ "シ": 106,
111
+ "ジ": 107,
112
+ "ス": 108,
113
+ "ズ": 109,
114
+ "セ": 110,
115
+ "ソ": 111,
116
+ "タ": 112,
117
+ "ダ": 113,
118
+ "チ": 114,
119
+ "ッ": 115,
120
+ "ツ": 116,
121
+ "テ": 117,
122
+ "デ": 118,
123
+ "ト": 119,
124
+ "ド": 120,
125
+ "ナ": 121,
126
+ "ニ": 122,
127
+ "ネ": 123,
128
+ "ノ": 124,
129
+ "ハ": 125,
130
+ "バ": 126,
131
+ "パ": 127,
132
+ "ヒ": 128,
133
+ "ビ": 129,
134
+ "ピ": 130,
135
+ "フ": 131,
136
+ "ブ": 132,
137
+ "プ": 133,
138
+ "ベ": 134,
139
+ "ペ": 135,
140
+ "ホ": 136,
141
+ "ボ": 137,
142
+ "ポ": 138,
143
+ "マ": 139,
144
+ "ミ": 140,
145
+ "ム": 141,
146
+ "メ": 142,
147
+ "モ": 143,
148
+ "ャ": 144,
149
+ "ヤ": 145,
150
+ "ュ": 146,
151
+ "ヨ": 147,
152
+ "ラ": 148,
153
+ "リ": 149,
154
+ "ル": 150,
155
+ "レ": 151,
156
+ "ロ": 152,
157
+ "ワ": 153,
158
+ "ン": 154,
159
+ "ヶ": 155,
160
+ "ー": 156,
161
+ "一": 157,
162
+ "万": 158,
163
+ "丈": 159,
164
+ "三": 160,
165
+ "上": 161,
166
+ "下": 162,
167
+ "不": 163,
168
+ "中": 164,
169
+ "丸": 165,
170
+ "主": 166,
171
+ "久": 167,
172
+ "九": 168,
173
+ "乾": 169,
174
+ "予": 170,
175
+ "事": 171,
176
+ "二": 172,
177
+ "五": 173,
178
+ "井": 174,
179
+ "交": 175,
180
+ "京": 176,
181
+ "人": 177,
182
+ "今": 178,
183
+ "仏": 179,
184
+ "仕": 180,
185
+ "他": 181,
186
+ "付": 182,
187
+ "代": 183,
188
+ "以": 184,
189
+ "件": 185,
190
+ "企": 186,
191
+ "伊": 187,
192
+ "休": 188,
193
+ "会": 189,
194
+ "伸": 190,
195
+ "住": 191,
196
+ "体": 192,
197
+ "何": 193,
198
+ "余": 194,
199
+ "作": 195,
200
+ "使": 196,
201
+ "例": 197,
202
+ "保": 198,
203
+ "信": 199,
204
+ "俣": 200,
205
+ "個": 201,
206
+ "倒": 202,
207
+ "候": 203,
208
+ "健": 204,
209
+ "備": 205,
210
+ "元": 206,
211
+ "充": 207,
212
+ "先": 208,
213
+ "入": 209,
214
+ "全": 210,
215
+ "公": 211,
216
+ "共": 212,
217
+ "内": 213,
218
+ "円": 214,
219
+ "写": 215,
220
+ "冬": 216,
221
+ "冷": 217,
222
+ "凍": 218,
223
+ "出": 219,
224
+ "分": 220,
225
+ "切": 221,
226
+ "初": 222,
227
+ "到": 223,
228
+ "制": 224,
229
+ "前": 225,
230
+ "力": 226,
231
+ "加": 227,
232
+ "動": 228,
233
+ "募": 229,
234
+ "勧": 230,
235
+ "化": 231,
236
+ "北": 232,
237
+ "南": 233,
238
+ "厚": 234,
239
+ "原": 235,
240
+ "去": 236,
241
+ "参": 237,
242
+ "友": 238,
243
+ "取": 239,
244
+ "口": 240,
245
+ "古": 241,
246
+ "可": 242,
247
+ "台": 243,
248
+ "号": 244,
249
+ "司": 245,
250
+ "合": 246,
251
+ "吉": 247,
252
+ "吊": 248,
253
+ "同": 249,
254
+ "名": 250,
255
+ "吹": 251,
256
+ "味": 252,
257
+ "呼": 253,
258
+ "和": 254,
259
+ "品": 255,
260
+ "唇": 256,
261
+ "商": 257,
262
+ "問": 258,
263
+ "噌": 259,
264
+ "回": 260,
265
+ "固": 261,
266
+ "国": 262,
267
+ "園": 263,
268
+ "地": 264,
269
+ "型": 265,
270
+ "域": 266,
271
+ "報": 267,
272
+ "場": 268,
273
+ "塗": 269,
274
+ "増": 270,
275
+ "声": 271,
276
+ "売": 272,
277
+ "変": 273,
278
+ "夏": 274,
279
+ "外": 275,
280
+ "多": 276,
281
+ "大": 277,
282
+ "天": 278,
283
+ "太": 279,
284
+ "夫": 280,
285
+ "失": 281,
286
+ "奈": 282,
287
+ "奥": 283,
288
+ "女": 284,
289
+ "好": 285,
290
+ "始": 286,
291
+ "嫌": 287,
292
+ "嬉": 288,
293
+ "子": 289,
294
+ "存": 290,
295
+ "孝": 291,
296
+ "学": 292,
297
+ "定": 293,
298
+ "実": 294,
299
+ "室": 295,
300
+ "宮": 296,
301
+ "家": 297,
302
+ "容": 298,
303
+ "寝": 299,
304
+ "寺": 300,
305
+ "対": 301,
306
+ "小": 302,
307
+ "少": 303,
308
+ "尾": 304,
309
+ "局": 305,
310
+ "届": 306,
311
+ "屋": 307,
312
+ "山": 308,
313
+ "岐": 309,
314
+ "岡": 310,
315
+ "岩": 311,
316
+ "岳": 312,
317
+ "島": 313,
318
+ "川": 314,
319
+ "帰": 315,
320
+ "常": 316,
321
+ "平": 317,
322
+ "年": 318,
323
+ "幻": 319,
324
+ "広": 320,
325
+ "底": 321,
326
+ "店": 322,
327
+ "座": 323,
328
+ "庫": 324,
329
+ "弁": 325,
330
+ "式": 326,
331
+ "張": 327,
332
+ "強": 328,
333
+ "当": 329,
334
+ "形": 330,
335
+ "影": 331,
336
+ "待": 332,
337
+ "後": 333,
338
+ "得": 334,
339
+ "忘": 335,
340
+ "応": 336,
341
+ "思": 337,
342
+ "怠": 338,
343
+ "恥": 339,
344
+ "悪": 340,
345
+ "情": 341,
346
+ "想": 342,
347
+ "意": 343,
348
+ "愛": 344,
349
+ "感": 345,
350
+ "慢": 346,
351
+ "成": 347,
352
+ "我": 348,
353
+ "戦": 349,
354
+ "戻": 350,
355
+ "所": 351,
356
+ "手": 352,
357
+ "打": 353,
358
+ "抜": 354,
359
+ "押": 355,
360
+ "拝": 356,
361
+ "拶": 357,
362
+ "持": 358,
363
+ "指": 359,
364
+ "挨": 360,
365
+ "掃": 361,
366
+ "援": 362,
367
+ "教": 363,
368
+ "数": 364,
369
+ "文": 365,
370
+ "料": 366,
371
+ "断": 367,
372
+ "新": 368,
373
+ "方": 369,
374
+ "旗": 370,
375
+ "日": 371,
376
+ "旦": 372,
377
+ "早": 373,
378
+ "明": 374,
379
+ "映": 375,
380
+ "春": 376,
381
+ "昨": 377,
382
+ "是": 378,
383
+ "昼": 379,
384
+ "時": 380,
385
+ "普": 381,
386
+ "景": 382,
387
+ "晴": 383,
388
+ "暑": 384,
389
+ "暗": 385,
390
+ "書": 386,
391
+ "最": 387,
392
+ "月": 388,
393
+ "有": 389,
394
+ "望": 390,
395
+ "期": 391,
396
+ "木": 392,
397
+ "本": 393,
398
+ "机": 394,
399
+ "村": 395,
400
+ "来": 396,
401
+ "杯": 397,
402
+ "東": 398,
403
+ "林": 399,
404
+ "枚": 400,
405
+ "柴": 401,
406
+ "校": 402,
407
+ "梨": 403,
408
+ "棒": 404,
409
+ "森": 405,
410
+ "椿": 406,
411
+ "楽": 407,
412
+ "構": 408,
413
+ "横": 409,
414
+ "樹": 410,
415
+ "機": 411,
416
+ "欄": 412,
417
+ "次": 413,
418
+ "欲": 414,
419
+ "正": 415,
420
+ "残": 416,
421
+ "段": 417,
422
+ "母": 418,
423
+ "毎": 419,
424
+ "比": 420,
425
+ "毛": 421,
426
+ "気": 422,
427
+ "水": 423,
428
+ "汁": 424,
429
+ "汗": 425,
430
+ "況": 426,
431
+ "泉": 427,
432
+ "泊": 428,
433
+ "法": 429,
434
+ "注": 430,
435
+ "洋": 431,
436
+ "活": 432,
437
+ "流": 433,
438
+ "海": 434,
439
+ "消": 435,
440
+ "減": 436,
441
+ "渡": 437,
442
+ "温": 438,
443
+ "準": 439,
444
+ "漫": 440,
445
+ "激": 441,
446
+ "濃": 442,
447
+ "瀬": 443,
448
+ "火": 444,
449
+ "炎": 445,
450
+ "炭": 446,
451
+ "焚": 447,
452
+ "焦": 448,
453
+ "然": 449,
454
+ "焼": 450,
455
+ "照": 451,
456
+ "煮": 452,
457
+ "熊": 453,
458
+ "熱": 454,
459
+ "燃": 455,
460
+ "燕": 456,
461
+ "燥": 457,
462
+ "父": 458,
463
+ "物": 459,
464
+ "特": 460,
465
+ "犬": 461,
466
+ "状": 462,
467
+ "狙": 463,
468
+ "独": 464,
469
+ "狭": 465,
470
+ "猫": 466,
471
+ "獣": 467,
472
+ "王": 468,
473
+ "球": 469,
474
+ "理": 470,
475
+ "生": 471,
476
+ "用": 472,
477
+ "田": 473,
478
+ "甲": 474,
479
+ "申": 475,
480
+ "町": 476,
481
+ "画": 477,
482
+ "界": 478,
483
+ "留": 479,
484
+ "番": 480,
485
+ "疲": 481,
486
+ "癒": 482,
487
+ "発": 483,
488
+ "登": 484,
489
+ "白": 485,
490
+ "百": 486,
491
+ "的": 487,
492
+ "皆": 488,
493
+ "皿": 489,
494
+ "監": 490,
495
+ "目": 491,
496
+ "直": 492,
497
+ "相": 493,
498
+ "省": 494,
499
+ "県": 495,
500
+ "真": 496,
501
+ "督": 497,
502
+ "瞬": 498,
503
+ "知": 499,
504
+ "硬": 500,
505
+ "確": 501,
506
+ "礼": 502,
507
+ "社": 503,
508
+ "神": 504,
509
+ "福": 505,
510
+ "私": 506,
511
+ "移": 507,
512
+ "稲": 508,
513
+ "穂": 509,
514
+ "空": 510,
515
+ "立": 511,
516
+ "端": 512,
517
+ "答": 513,
518
+ "箇": 514,
519
+ "箱": 515,
520
+ "籍": 516,
521
+ "米": 517,
522
+ "粛": 518,
523
+ "精": 519,
524
+ "糖": 520,
525
+ "系": 521,
526
+ "納": 522,
527
+ "素": 523,
528
+ "細": 524,
529
+ "終": 525,
530
+ "結": 526,
531
+ "絶": 527,
532
+ "継": 528,
533
+ "綺": 529,
534
+ "綿": 530,
535
+ "緒": 531,
536
+ "締": 532,
537
+ "練": 533,
538
+ "縁": 534,
539
+ "繰": 535,
540
+ "缶": 536,
541
+ "置": 537,
542
+ "羊": 538,
543
+ "美": 539,
544
+ "義": 540,
545
+ "考": 541,
546
+ "者": 542,
547
+ "耳": 543,
548
+ "聞": 544,
549
+ "肉": 545,
550
+ "育": 546,
551
+ "腹": 547,
552
+ "自": 548,
553
+ "良": 549,
554
+ "色": 550,
555
+ "若": 551,
556
+ "茶": 552,
557
+ "荒": 553,
558
+ "荘": 554,
559
+ "荷": 555,
560
+ "落": 556,
561
+ "蔵": 557,
562
+ "薬": 558,
563
+ "蝶": 559,
564
+ "行": 560,
565
+ "街": 561,
566
+ "褒": 562,
567
+ "西": 563,
568
+ "要": 564,
569
+ "見": 565,
570
+ "視": 566,
571
+ "覧": 567,
572
+ "親": 568,
573
+ "観": 569,
574
+ "言": 570,
575
+ "記": 571,
576
+ "設": 572,
577
+ "許": 573,
578
+ "訳": 574,
579
+ "試": 575,
580
+ "話": 576,
581
+ "詳": 577,
582
+ "説": 578,
583
+ "読": 579,
584
+ "誰": 580,
585
+ "調": 581,
586
+ "請": 582,
587
+ "謝": 583,
588
+ "識": 584,
589
+ "議": 585,
590
+ "谷": 586,
591
+ "買": 587,
592
+ "質": 588,
593
+ "赤": 589,
594
+ "走": 590,
595
+ "越": 591,
596
+ "路": 592,
597
+ "身": 593,
598
+ "車": 594,
599
+ "転": 595,
600
+ "載": 596,
601
+ "辛": 597,
602
+ "辺": 598,
603
+ "込": 599,
604
+ "近": 600,
605
+ "返": 601,
606
+ "追": 602,
607
+ "途": 603,
608
+ "通": 604,
609
+ "速": 605,
610
+ "連": 606,
611
+ "週": 607,
612
+ "遅": 608,
613
+ "運": 609,
614
+ "過": 610,
615
+ "達": 611,
616
+ "違": 612,
617
+ "適": 613,
618
+ "選": 614,
619
+ "郎": 615,
620
+ "部": 616,
621
+ "配": 617,
622
+ "酒": 618,
623
+ "重": 619,
624
+ "野": 620,
625
+ "量": 621,
626
+ "釣": 622,
627
+ "録": 623,
628
+ "鍵": 624,
629
+ "長": 625,
630
+ "開": 626,
631
+ "間": 627,
632
+ "関": 628,
633
+ "閣": 629,
634
+ "阜": 630,
635
+ "降": 631,
636
+ "限": 632,
637
+ "院": 633,
638
+ "除": 634,
639
+ "陸": 635,
640
+ "雅": 636,
641
+ "集": 637,
642
+ "雉": 638,
643
+ "難": 639,
644
+ "雨": 640,
645
+ "雪": 641,
646
+ "電": 642,
647
+ "青": 643,
648
+ "非": 644,
649
+ "面": 645,
650
+ "音": 646,
651
+ "響": 647,
652
+ "頂": 648,
653
+ "頃": 649,
654
+ "順": 650,
655
+ "頼": 651,
656
+ "顔": 652,
657
+ "風": 653,
658
+ "食": 654,
659
+ "飲": 655,
660
+ "飼": 656,
661
+ "馬": 657,
662
+ "験": 658,
663
+ "驚": 659,
664
+ "高": 660,
665
+ "髪": 661,
666
+ "鬼": 662,
667
+ "鶏": 663,
668
+ "鹿": 664,
669
+ "麗": 665,
670
+ "！": 666,
671
+ "（": 667,
672
+ "）": 668,
673
+ "／": 669,
674
+ "１": 670,
675
+ "２": 671,
676
+ "３": 672,
677
+ "？": 673,
678
+ "ｍ": 674
679
+ }
checkpoint-6900/added_tokens.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "</s>": 678,
+   "<s>": 677,
+   "[PAD]": 676,
+   "[UNK]": 675
+ }
checkpoint-6900/config.json ADDED
@@ -0,0 +1,117 @@
+ {
+   "_name_or_path": "facebook/wav2vec2-large-xlsr-53",
+   "activation_dropout": 0.0,
+   "adapter_attn_dim": null,
+   "adapter_kernel_size": 3,
+   "adapter_stride": 2,
+   "add_adapter": false,
+   "apply_spec_augment": true,
+   "architectures": [
+     "Wav2Vec2ForCTC"
+   ],
+   "attention_dropout": 0.1,
+   "bos_token_id": 1,
+   "classifier_proj_size": 256,
+   "codevector_dim": 768,
+   "contrastive_logits_temperature": 0.1,
+   "conv_bias": true,
+   "conv_dim": [
+     512,
+     512,
+     512,
+     512,
+     512,
+     512,
+     512
+   ],
+   "conv_kernel": [
+     10,
+     3,
+     3,
+     3,
+     3,
+     2,
+     2
+   ],
+   "conv_stride": [
+     5,
+     2,
+     2,
+     2,
+     2,
+     2,
+     2
+   ],
+   "ctc_loss_reduction": "mean",
+   "ctc_zero_infinity": false,
+   "diversity_loss_weight": 0.1,
+   "do_stable_layer_norm": true,
+   "eos_token_id": 2,
+   "feat_extract_activation": "gelu",
+   "feat_extract_dropout": 0.0,
+   "feat_extract_norm": "layer",
+   "feat_proj_dropout": 0.05,
+   "feat_quantizer_dropout": 0.0,
+   "final_dropout": 0.0,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout": 0.05,
+   "hidden_size": 1024,
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "layer_norm_eps": 1e-05,
+   "layerdrop": 0.05,
+   "mask_channel_length": 10,
+   "mask_channel_min_space": 1,
+   "mask_channel_other": 0.0,
+   "mask_channel_prob": 0.0,
+   "mask_channel_selection": "static",
+   "mask_feature_length": 10,
+   "mask_feature_min_masks": 0,
+   "mask_feature_prob": 0.0,
+   "mask_time_length": 10,
+   "mask_time_min_masks": 2,
+   "mask_time_min_space": 1,
+   "mask_time_other": 0.0,
+   "mask_time_prob": 0.05,
+   "mask_time_selection": "static",
+   "model_type": "wav2vec2",
+   "num_adapter_layers": 3,
+   "num_attention_heads": 16,
+   "num_codevector_groups": 2,
+   "num_codevectors_per_group": 320,
+   "num_conv_pos_embedding_groups": 16,
+   "num_conv_pos_embeddings": 128,
+   "num_feat_extract_layers": 7,
+   "num_hidden_layers": 24,
+   "num_negatives": 100,
+   "output_hidden_size": 1024,
+   "pad_token_id": 676,
+   "proj_codevector_dim": 768,
+   "tdnn_dilation": [
+     1,
+     2,
+     3,
+     1,
+     1
+   ],
+   "tdnn_dim": [
+     512,
+     512,
+     512,
+     512,
+     1500
+   ],
+   "tdnn_kernel": [
+     5,
+     3,
+     3,
+     1,
+     1
+   ],
+   "torch_dtype": "float32",
+   "transformers_version": "4.34.0",
+   "use_weighted_layer_sum": false,
+   "vocab_size": 679,
+   "xvector_output_dim": 512
+ }
checkpoint-6900/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0f04f9516a42bb196e6cc56b5cd1d974b58484227ad668914a82979adf8cd0a7
+ size 2495727542
checkpoint-6900/preprocessor_config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "do_normalize": true,
+   "feature_extractor_type": "Wav2Vec2FeatureExtractor",
+   "feature_size": 1,
+   "padding_side": "right",
+   "padding_value": 0,
+   "processor_class": "Wav2Vec2Processor",
+   "return_attention_mask": true,
+   "sampling_rate": 16000
+ }
checkpoint-6900/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9b61e6d7c21931997e82553ee1094451457e43f95812c2ee82aeedf4e89cd76d
+ size 1264686250
checkpoint-6900/rng_state_0.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ff4da61760aaf758f2249f49785073e751c7c6b42d97b629295e03370b0a75be
+ size 15024
checkpoint-6900/rng_state_1.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f76c96944cc56b3748e8c49262f1123ebca7ad10cc063156d199e514526afb6c
+ size 15024
checkpoint-6900/rng_state_2.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dae4d9f5026d88180e0d86b0cbfe9e17a45f3cdd811ef47e83023919fadebe6f
+ size 15088
checkpoint-6900/rng_state_3.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:33ce8c97c0237f1aab733e8352d232b013d6ac9640ccd659e47357857fe20a3c
+ size 15024
checkpoint-6900/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4aff4c539403cd22a610cf466ddff5a536ad52bb5e611a9ba04bb4e840639794
+ size 1064
checkpoint-6900/special_tokens_map.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "additional_special_tokens": [
+     "<s>",
+     "</s>"
+   ],
+   "bos_token": "<s>",
+   "eos_token": "</s>",
+   "pad_token": "[PAD]",
+   "unk_token": "[UNK]"
+ }
checkpoint-6900/tokenizer_config.json ADDED
@@ -0,0 +1,56 @@
+ {
+   "added_tokens_decoder": {
+     "675": {
+       "content": "[UNK]",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": true,
+       "single_word": false,
+       "special": false
+     },
+     "676": {
+       "content": "[PAD]",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": true,
+       "single_word": false,
+       "special": false
+     },
+     "677": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "678": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "additional_special_tokens": [
+     "<s>",
+     "</s>"
+   ],
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "config": null,
+   "do_lower_case": false,
+   "eos_token": "</s>",
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "[PAD]",
+   "processor_class": "Wav2Vec2Processor",
+   "replace_word_delimiter_char": " ",
+   "target_lang": null,
+   "tokenizer_class": "Wav2Vec2CTCTokenizer",
+   "tokenizer_file": null,
+   "tokenizer_type": "wav2vec2",
+   "trust_remote_code": false,
+   "unk_token": "[UNK]",
+   "word_delimiter_token": "|"
+ }
checkpoint-6900/trainer_state.json ADDED
@@ -0,0 +1,1261 @@
1
+ {
2
+ "best_metric": null,
3
+ "best_model_checkpoint": null,
4
+ "epoch": 300.0,
5
+ "eval_steps": 150,
6
+ "global_step": 6900,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [
11
+ {
12
+ "epoch": 2.17,
13
+ "learning_rate": 0.0003,
14
+ "loss": 35.2887,
15
+ "step": 50
16
+ },
17
+ {
18
+ "epoch": 4.35,
19
+ "learning_rate": 0.00029781021897810217,
20
+ "loss": 5.9569,
21
+ "step": 100
22
+ },
23
+ {
24
+ "epoch": 6.52,
25
+ "learning_rate": 0.00029562043795620436,
26
+ "loss": 4.9138,
27
+ "step": 150
28
+ },
29
+ {
30
+ "epoch": 6.52,
31
+ "eval_cer": 1.0,
32
+ "eval_loss": 4.7965407371521,
33
+ "eval_runtime": 1.256,
34
+ "eval_samples_per_second": 35.828,
35
+ "eval_steps_per_second": 2.389,
36
+ "step": 150
37
+ },
38
+ {
39
+ "epoch": 8.7,
40
+ "learning_rate": 0.00029343065693430656,
41
+ "loss": 4.887,
42
+ "step": 200
43
+ },
44
+ {
45
+ "epoch": 10.87,
46
+ "learning_rate": 0.00029124087591240875,
47
+ "loss": 4.8447,
48
+ "step": 250
49
+ },
50
+ {
51
+ "epoch": 13.04,
52
+ "learning_rate": 0.00028905109489051094,
53
+ "loss": 4.7484,
54
+ "step": 300
55
+ },
56
+ {
57
+ "epoch": 13.04,
58
+ "eval_cer": 1.0,
59
+ "eval_loss": 4.608075141906738,
60
+ "eval_runtime": 1.2451,
61
+ "eval_samples_per_second": 36.142,
62
+ "eval_steps_per_second": 2.409,
63
+ "step": 300
64
+ },
65
+ {
66
+ "epoch": 15.22,
67
+ "learning_rate": 0.00028686131386861314,
68
+ "loss": 4.6529,
69
+ "step": 350
70
+ },
71
+ {
72
+ "epoch": 17.39,
73
+ "learning_rate": 0.0002846715328467153,
74
+ "loss": 4.6373,
75
+ "step": 400
76
+ },
77
+ {
78
+ "epoch": 19.57,
79
+ "learning_rate": 0.00028248175182481747,
80
+ "loss": 4.5894,
81
+ "step": 450
82
+ },
83
+ {
84
+ "epoch": 19.57,
85
+ "eval_cer": 0.9851301115241635,
86
+ "eval_loss": 4.469708442687988,
87
+ "eval_runtime": 1.2325,
88
+ "eval_samples_per_second": 36.51,
89
+ "eval_steps_per_second": 2.434,
90
+ "step": 450
91
+ },
92
+ {
93
+ "epoch": 21.74,
94
+ "learning_rate": 0.00028029197080291966,
95
+ "loss": 4.5045,
96
+ "step": 500
97
+ },
98
+ {
99
+ "epoch": 23.91,
100
+ "learning_rate": 0.00027810218978102186,
101
+ "loss": 4.4076,
102
+ "step": 550
103
+ },
104
+ {
105
+ "epoch": 26.09,
106
+ "learning_rate": 0.00027591240875912405,
107
+ "loss": 4.2024,
108
+ "step": 600
109
+ },
110
+ {
111
+ "epoch": 26.09,
112
+ "eval_cer": 0.9076827757125155,
113
+ "eval_loss": 4.037315845489502,
114
+ "eval_runtime": 1.2357,
115
+ "eval_samples_per_second": 36.417,
116
+ "eval_steps_per_second": 2.428,
117
+ "step": 600
118
+ },
119
+ {
120
+ "epoch": 28.26,
121
+ "learning_rate": 0.00027372262773722625,
122
+ "loss": 3.8743,
123
+ "step": 650
124
+ },
125
+ {
126
+ "epoch": 30.43,
127
+ "learning_rate": 0.00027153284671532844,
128
+ "loss": 3.3488,
129
+ "step": 700
130
+ },
131
+ {
132
+ "epoch": 32.61,
133
+ "learning_rate": 0.00026934306569343063,
134
+ "loss": 2.7314,
135
+ "step": 750
136
+ },
137
+ {
138
+ "epoch": 32.61,
139
+ "eval_cer": 0.5340768277571252,
140
+ "eval_loss": 2.5507473945617676,
141
+ "eval_runtime": 1.2278,
142
+ "eval_samples_per_second": 36.651,
143
+ "eval_steps_per_second": 2.443,
144
+ "step": 750
145
+ },
146
+ {
147
+ "epoch": 34.78,
148
+ "learning_rate": 0.00026715328467153283,
149
+ "loss": 2.1968,
150
+ "step": 800
151
+ },
152
+ {
153
+ "epoch": 36.96,
154
+ "learning_rate": 0.000264963503649635,
155
+ "loss": 1.6522,
156
+ "step": 850
157
+ },
158
+ {
159
+ "epoch": 39.13,
160
+ "learning_rate": 0.0002627737226277372,
161
+ "loss": 1.2293,
162
+ "step": 900
163
+ },
164
+ {
165
+ "epoch": 39.13,
166
+ "eval_cer": 0.4138785625774473,
167
+ "eval_loss": 2.01461124420166,
168
+ "eval_runtime": 1.2246,
169
+ "eval_samples_per_second": 36.746,
170
+ "eval_steps_per_second": 2.45,
171
+ "step": 900
172
+ },
173
+ {
174
+ "epoch": 41.3,
175
+ "learning_rate": 0.0002605839416058394,
176
+ "loss": 0.9292,
177
+ "step": 950
178
+ },
179
+ {
180
+ "epoch": 43.48,
181
+ "learning_rate": 0.00025839416058394155,
182
+ "loss": 0.7208,
183
+ "step": 1000
184
+ },
185
+ {
186
+ "epoch": 45.65,
187
+ "learning_rate": 0.00025620437956204374,
188
+ "loss": 0.5544,
189
+ "step": 1050
190
+ },
191
+ {
192
+ "epoch": 45.65,
193
+ "eval_cer": 0.355638166047088,
194
+ "eval_loss": 1.9821244478225708,
195
+ "eval_runtime": 1.2073,
196
+ "eval_samples_per_second": 37.275,
197
+ "eval_steps_per_second": 2.485,
198
+ "step": 1050
199
+ },
200
+ {
201
+ "epoch": 47.83,
202
+ "learning_rate": 0.00025401459854014594,
203
+ "loss": 0.4757,
204
+ "step": 1100
205
+ },
206
+ {
207
+ "epoch": 50.0,
208
+ "learning_rate": 0.00025182481751824813,
209
+ "loss": 0.3895,
210
+ "step": 1150
211
+ },
212
+ {
213
+ "epoch": 52.17,
214
+ "learning_rate": 0.0002496350364963503,
215
+ "loss": 0.3224,
216
+ "step": 1200
217
+ },
218
+ {
219
+ "epoch": 52.17,
220
+ "eval_cer": 0.3587360594795539,
221
+ "eval_loss": 2.0189881324768066,
222
+ "eval_runtime": 1.1983,
223
+ "eval_samples_per_second": 37.554,
224
+ "eval_steps_per_second": 2.504,
225
+ "step": 1200
226
+ },
227
+ {
228
+ "epoch": 54.35,
229
+ "learning_rate": 0.0002474452554744525,
230
+ "loss": 0.279,
231
+ "step": 1250
232
+ },
233
+ {
234
+ "epoch": 56.52,
235
+ "learning_rate": 0.0002452554744525547,
236
+ "loss": 0.2285,
237
+ "step": 1300
238
+ },
239
+ {
240
+ "epoch": 58.7,
241
+ "learning_rate": 0.0002430656934306569,
242
+ "loss": 0.1951,
243
+ "step": 1350
244
+ },
245
+ {
246
+ "epoch": 58.7,
247
+ "eval_cer": 0.36121437422552666,
248
+ "eval_loss": 2.1229116916656494,
249
+ "eval_runtime": 1.2603,
250
+ "eval_samples_per_second": 35.706,
251
+ "eval_steps_per_second": 2.38,
252
+ "step": 1350
253
+ },
254
+ {
255
+ "epoch": 60.87,
256
+ "learning_rate": 0.0002408759124087591,
257
+ "loss": 0.1964,
258
+ "step": 1400
259
+ },
260
+ {
261
+ "epoch": 63.04,
262
+ "learning_rate": 0.0002386861313868613,
263
+ "loss": 0.1622,
264
+ "step": 1450
265
+ },
266
+ {
267
+ "epoch": 65.22,
268
+ "learning_rate": 0.0002364963503649635,
269
+ "loss": 0.1539,
270
+ "step": 1500
271
+ },
272
+ {
273
+ "epoch": 65.22,
274
+ "eval_cer": 0.3469640644361834,
275
+ "eval_loss": 2.111368179321289,
276
+ "eval_runtime": 1.2194,
277
+ "eval_samples_per_second": 36.903,
278
+ "eval_steps_per_second": 2.46,
279
+ "step": 1500
280
+ },
281
+ {
282
+ "epoch": 67.39,
283
+ "learning_rate": 0.00023430656934306568,
284
+ "loss": 0.1492,
285
+ "step": 1550
286
+ },
287
+ {
288
+ "epoch": 69.57,
289
+ "learning_rate": 0.00023211678832116788,
290
+ "loss": 0.1404,
291
+ "step": 1600
292
+ },
293
+ {
294
+ "epoch": 71.74,
295
+ "learning_rate": 0.00022992700729927004,
296
+ "loss": 0.1165,
297
+ "step": 1650
298
+ },
299
+ {
300
+ "epoch": 71.74,
301
+ "eval_cer": 0.33147459727385375,
302
+ "eval_loss": 2.274796485900879,
303
+ "eval_runtime": 1.1874,
304
+ "eval_samples_per_second": 37.898,
305
+ "eval_steps_per_second": 2.527,
306
+ "step": 1650
307
+ },
308
+ {
309
+ "epoch": 73.91,
310
+ "learning_rate": 0.00022773722627737224,
311
+ "loss": 0.1268,
312
+ "step": 1700
313
+ },
314
+ {
315
+ "epoch": 76.09,
316
+ "learning_rate": 0.00022554744525547443,
317
+ "loss": 0.1186,
318
+ "step": 1750
319
+ },
320
+ {
321
+ "epoch": 78.26,
322
+ "learning_rate": 0.00022335766423357663,
323
+ "loss": 0.1119,
324
+ "step": 1800
325
+ },
326
+ {
327
+ "epoch": 78.26,
328
+ "eval_cer": 0.34882280049566294,
329
+ "eval_loss": 2.2390518188476562,
330
+ "eval_runtime": 1.3465,
331
+ "eval_samples_per_second": 33.42,
332
+ "eval_steps_per_second": 2.228,
333
+ "step": 1800
334
+ },
335
+ {
336
+ "epoch": 80.43,
337
+ "learning_rate": 0.00022116788321167882,
338
+ "loss": 0.0988,
339
+ "step": 1850
340
+ },
341
+ {
342
+ "epoch": 82.61,
343
+ "learning_rate": 0.00021897810218978101,
344
+ "loss": 0.112,
345
+ "step": 1900
346
+ },
347
+ {
348
+ "epoch": 84.78,
349
+ "learning_rate": 0.0002167883211678832,
350
+ "loss": 0.0989,
351
+ "step": 1950
352
+ },
353
+ {
354
+ "epoch": 84.78,
355
+ "eval_cer": 0.3382899628252788,
356
+ "eval_loss": 2.343754529953003,
357
+ "eval_runtime": 1.2055,
358
+ "eval_samples_per_second": 37.329,
359
+ "eval_steps_per_second": 2.489,
360
+ "step": 1950
361
+ },
362
+ {
363
+ "epoch": 86.96,
364
+ "learning_rate": 0.00021459854014598537,
365
+ "loss": 0.097,
366
+ "step": 2000
367
+ },
368
+ {
369
+ "epoch": 89.13,
370
+ "learning_rate": 0.00021240875912408757,
371
+ "loss": 0.0854,
372
+ "step": 2050
373
+ },
374
+ {
375
+ "epoch": 91.3,
376
+ "learning_rate": 0.00021021897810218976,
377
+ "loss": 0.0915,
378
+ "step": 2100
379
+ },
380
+ {
381
+ "epoch": 91.3,
382
+ "eval_cer": 0.3587360594795539,
383
+ "eval_loss": 2.121840000152588,
384
+ "eval_runtime": 1.2037,
385
+ "eval_samples_per_second": 37.386,
386
+ "eval_steps_per_second": 2.492,
387
+ "step": 2100
388
+ },
389
+ {
390
+ "epoch": 93.48,
391
+ "learning_rate": 0.00020802919708029196,
392
+ "loss": 0.078,
393
+ "step": 2150
394
+ },
395
+ {
396
+ "epoch": 95.65,
397
+ "learning_rate": 0.00020583941605839415,
398
+ "loss": 0.0857,
399
+ "step": 2200
400
+ },
401
+ {
402
+ "epoch": 97.83,
403
+ "learning_rate": 0.00020364963503649632,
404
+ "loss": 0.0721,
405
+ "step": 2250
406
+ },
407
+ {
408
+ "epoch": 97.83,
409
+ "eval_cer": 0.35192069392812886,
410
+ "eval_loss": 2.242812395095825,
411
+ "eval_runtime": 1.1964,
412
+ "eval_samples_per_second": 37.614,
413
+ "eval_steps_per_second": 2.508,
414
+ "step": 2250
415
+ },
416
+ {
417
+ "epoch": 100.0,
418
+ "learning_rate": 0.0002014598540145985,
419
+ "loss": 0.0799,
420
+ "step": 2300
421
+ },
422
+ {
423
+ "epoch": 102.17,
424
+ "learning_rate": 0.0001992700729927007,
425
+ "loss": 0.0798,
426
+ "step": 2350
427
+ },
428
+ {
429
+ "epoch": 104.35,
430
+ "learning_rate": 0.0001970802919708029,
431
+ "loss": 0.0742,
432
+ "step": 2400
433
+ },
434
+ {
435
+ "epoch": 104.35,
436
+ "eval_cer": 0.33643122676579923,
437
+ "eval_loss": 2.229339838027954,
438
+ "eval_runtime": 1.2156,
439
+ "eval_samples_per_second": 37.019,
440
+ "eval_steps_per_second": 2.468,
441
+ "step": 2400
442
+ },
443
+ {
444
+ "epoch": 106.52,
445
+ "learning_rate": 0.0001948905109489051,
446
+ "loss": 0.0692,
447
+ "step": 2450
448
+ },
449
+ {
450
+ "epoch": 108.7,
451
+ "learning_rate": 0.0001927007299270073,
452
+ "loss": 0.0664,
453
+ "step": 2500
454
+ },
455
+ {
456
+ "epoch": 110.87,
457
+ "learning_rate": 0.00019051094890510948,
458
+ "loss": 0.0629,
459
+ "step": 2550
460
+ },
461
+ {
462
+ "epoch": 110.87,
463
+ "eval_cer": 0.33705080545229243,
464
+ "eval_loss": 2.2878150939941406,
465
+ "eval_runtime": 1.2044,
466
+ "eval_samples_per_second": 37.364,
467
+ "eval_steps_per_second": 2.491,
468
+ "step": 2550
469
+ },
470
+ {
471
+ "epoch": 113.04,
472
+ "learning_rate": 0.00018832116788321167,
473
+ "loss": 0.0619,
474
+ "step": 2600
475
+ },
476
+ {
477
+ "epoch": 115.22,
478
+ "learning_rate": 0.00018613138686131387,
479
+ "loss": 0.0582,
480
+ "step": 2650
481
+ },
482
+ {
483
+ "epoch": 117.39,
484
+ "learning_rate": 0.00018394160583941606,
485
+ "loss": 0.0495,
486
+ "step": 2700
487
+ },
488
+ {
489
+ "epoch": 117.39,
490
+ "eval_cer": 0.34076827757125155,
491
+ "eval_loss": 2.2671637535095215,
492
+ "eval_runtime": 1.2039,
493
+ "eval_samples_per_second": 37.379,
494
+ "eval_steps_per_second": 2.492,
495
+ "step": 2700
496
+ },
497
+ {
498
+ "epoch": 119.57,
499
+ "learning_rate": 0.00018175182481751826,
500
+ "loss": 0.0614,
501
+ "step": 2750
502
+ },
503
+ {
504
+ "epoch": 121.74,
505
+ "learning_rate": 0.00017956204379562042,
506
+ "loss": 0.0565,
507
+ "step": 2800
508
+ },
509
+ {
510
+ "epoch": 123.91,
511
+ "learning_rate": 0.00017737226277372262,
512
+ "loss": 0.0466,
513
+ "step": 2850
514
+ },
515
+ {
516
+ "epoch": 123.91,
517
+ "eval_cer": 0.35254027261462206,
518
+ "eval_loss": 2.2532107830047607,
519
+ "eval_runtime": 1.3563,
520
+ "eval_samples_per_second": 33.179,
521
+ "eval_steps_per_second": 2.212,
522
+ "step": 2850
523
+ },
524
+ {
525
+ "epoch": 126.09,
526
+ "learning_rate": 0.00017518248175182478,
527
+ "loss": 0.0465,
528
+ "step": 2900
529
+ },
530
+ {
531
+ "epoch": 128.26,
532
+ "learning_rate": 0.00017299270072992698,
533
+ "loss": 0.0496,
534
+ "step": 2950
535
+ },
536
+ {
537
+ "epoch": 130.43,
538
+ "learning_rate": 0.00017080291970802917,
539
+ "loss": 0.0424,
540
+ "step": 3000
541
+ },
542
+ {
543
+ "epoch": 130.43,
544
+ "eval_cer": 0.32589838909541513,
545
+ "eval_loss": 2.2844393253326416,
546
+ "eval_runtime": 1.2006,
547
+ "eval_samples_per_second": 37.48,
548
+ "eval_steps_per_second": 2.499,
549
+ "step": 3000
550
+ },
551
+ {
552
+ "epoch": 132.61,
553
+ "learning_rate": 0.00016861313868613137,
554
+ "loss": 0.0483,
555
+ "step": 3050
556
+ },
557
+ {
558
+ "epoch": 134.78,
559
+ "learning_rate": 0.00016642335766423356,
560
+ "loss": 0.0488,
561
+ "step": 3100
562
+ },
563
+ {
564
+ "epoch": 136.96,
565
+ "learning_rate": 0.00016423357664233575,
566
+ "loss": 0.0446,
567
+ "step": 3150
568
+ },
569
+ {
570
+ "epoch": 136.96,
571
+ "eval_cer": 0.3252788104089219,
572
+ "eval_loss": 2.2763445377349854,
573
+ "eval_runtime": 1.2043,
574
+ "eval_samples_per_second": 37.368,
575
+ "eval_steps_per_second": 2.491,
576
+ "step": 3150
577
+ },
578
+ {
579
+ "epoch": 139.13,
580
+ "learning_rate": 0.00016204379562043795,
581
+ "loss": 0.0424,
582
+ "step": 3200
583
+ },
584
+ {
585
+ "epoch": 141.3,
586
+ "learning_rate": 0.00015985401459854014,
587
+ "loss": 0.0429,
588
+ "step": 3250
589
+ },
590
+ {
591
+ "epoch": 143.48,
592
+ "learning_rate": 0.00015766423357664234,
593
+ "loss": 0.0411,
594
+ "step": 3300
595
+ },
596
+ {
597
+ "epoch": 143.48,
598
+ "eval_cer": 0.3302354399008674,
599
+ "eval_loss": 2.301079034805298,
600
+ "eval_runtime": 1.345,
601
+ "eval_samples_per_second": 33.458,
602
+ "eval_steps_per_second": 2.231,
603
+ "step": 3300
604
+ },
605
+ {
606
+ "epoch": 145.65,
607
+ "learning_rate": 0.00015547445255474453,
608
+ "loss": 0.0392,
609
+ "step": 3350
610
+ },
611
+ {
612
+ "epoch": 147.83,
613
+ "learning_rate": 0.00015328467153284672,
614
+ "loss": 0.0426,
615
+ "step": 3400
616
+ },
617
+ {
618
+ "epoch": 150.0,
619
+ "learning_rate": 0.00015109489051094892,
620
+ "loss": 0.0419,
621
+ "step": 3450
622
+ },
623
+ {
624
+ "epoch": 150.0,
625
+ "eval_cer": 0.3420074349442379,
626
+ "eval_loss": 2.320059299468994,
627
+ "eval_runtime": 1.2411,
628
+ "eval_samples_per_second": 36.259,
629
+ "eval_steps_per_second": 2.417,
630
+ "step": 3450
631
+ },
632
+ {
633
+ "epoch": 152.17,
634
+ "learning_rate": 0.00014890510948905108,
635
+ "loss": 0.0386,
636
+ "step": 3500
637
+ },
638
+ {
639
+ "epoch": 154.35,
640
+ "learning_rate": 0.00014671532846715328,
641
+ "loss": 0.0402,
642
+ "step": 3550
643
+ },
644
+ {
645
+ "epoch": 156.52,
646
+ "learning_rate": 0.00014452554744525547,
647
+ "loss": 0.0333,
648
+ "step": 3600
649
+ },
650
+ {
651
+ "epoch": 156.52,
652
+ "eval_cer": 0.34386617100371747,
653
+ "eval_loss": 2.364445209503174,
654
+ "eval_runtime": 1.2337,
655
+ "eval_samples_per_second": 36.475,
656
+ "eval_steps_per_second": 2.432,
657
+ "step": 3600
658
+ },
659
+ {
660
+ "epoch": 158.7,
661
+ "learning_rate": 0.00014233576642335764,
662
+ "loss": 0.0434,
663
+ "step": 3650
664
+ },
665
+ {
666
+ "epoch": 160.87,
667
+ "learning_rate": 0.00014014598540145983,
668
+ "loss": 0.0393,
669
+ "step": 3700
670
+ },
671
+ {
672
+ "epoch": 163.04,
673
+ "learning_rate": 0.00013795620437956203,
674
+ "loss": 0.0384,
675
+ "step": 3750
676
+ },
677
+ {
678
+ "epoch": 163.04,
679
+ "eval_cer": 0.35315985130111527,
680
+ "eval_loss": 2.3685200214385986,
681
+ "eval_runtime": 1.2136,
682
+ "eval_samples_per_second": 37.081,
683
+ "eval_steps_per_second": 2.472,
684
+ "step": 3750
685
+ },
686
+ {
687
+ "epoch": 165.22,
688
+ "learning_rate": 0.00013576642335766422,
689
+ "loss": 0.0324,
690
+ "step": 3800
691
+ },
692
+ {
693
+ "epoch": 167.39,
694
+ "learning_rate": 0.00013357664233576641,
695
+ "loss": 0.0438,
696
+ "step": 3850
697
+ },
698
+ {
699
+ "epoch": 169.57,
700
+ "learning_rate": 0.0001313868613138686,
701
+ "loss": 0.0367,
702
+ "step": 3900
703
+ },
704
+ {
705
+ "epoch": 169.57,
706
+ "eval_cer": 0.3469640644361834,
707
+ "eval_loss": 2.397036552429199,
708
+ "eval_runtime": 1.2259,
709
+ "eval_samples_per_second": 36.708,
710
+ "eval_steps_per_second": 2.447,
711
+ "step": 3900
712
+ },
713
+ {
714
+ "epoch": 171.74,
715
+ "learning_rate": 0.00012919708029197077,
716
+ "loss": 0.0336,
717
+ "step": 3950
718
+ },
719
+ {
720
+ "epoch": 173.91,
721
+ "learning_rate": 0.00012700729927007297,
722
+ "loss": 0.037,
723
+ "step": 4000
724
+ },
725
+ {
726
+ "epoch": 176.09,
727
+ "learning_rate": 0.00012481751824817516,
728
+ "loss": 0.0307,
729
+ "step": 4050
730
+ },
731
+ {
732
+ "epoch": 176.09,
733
+ "eval_cer": 0.3308550185873606,
734
+ "eval_loss": 2.3530125617980957,
735
+ "eval_runtime": 1.2484,
736
+ "eval_samples_per_second": 36.047,
737
+ "eval_steps_per_second": 2.403,
738
+ "step": 4050
739
+ },
740
+ {
741
+ "epoch": 178.26,
742
+ "learning_rate": 0.00012262773722627736,
743
+ "loss": 0.0284,
744
+ "step": 4100
745
+ },
746
+ {
747
+ "epoch": 180.43,
748
+ "learning_rate": 0.00012043795620437955,
749
+ "loss": 0.0233,
750
+ "step": 4150
751
+ },
752
+ {
753
+ "epoch": 182.61,
754
+ "learning_rate": 0.00011824817518248174,
755
+ "loss": 0.0328,
756
+ "step": 4200
757
+ },
758
+ {
759
+ "epoch": 182.61,
760
+ "eval_cer": 0.33147459727385375,
761
+ "eval_loss": 2.3414556980133057,
762
+ "eval_runtime": 1.2281,
763
+ "eval_samples_per_second": 36.64,
764
+ "eval_steps_per_second": 2.443,
765
+ "step": 4200
766
+ },
767
+ {
768
+ "epoch": 184.78,
769
+ "learning_rate": 0.00011605839416058394,
770
+ "loss": 0.0285,
771
+ "step": 4250
772
+ },
773
+ {
774
+ "epoch": 186.96,
775
+ "learning_rate": 0.00011386861313868612,
776
+ "loss": 0.0222,
777
+ "step": 4300
778
+ },
779
+ {
780
+ "epoch": 189.13,
781
+ "learning_rate": 0.00011167883211678831,
782
+ "loss": 0.0271,
783
+ "step": 4350
784
+ },
785
+ {
786
+ "epoch": 189.13,
787
+ "eval_cer": 0.3308550185873606,
788
+ "eval_loss": 2.4165024757385254,
789
+ "eval_runtime": 1.1891,
790
+ "eval_samples_per_second": 37.844,
791
+ "eval_steps_per_second": 2.523,
792
+ "step": 4350
793
+ },
794
+ {
795
+ "epoch": 191.3,
796
+ "learning_rate": 0.00010948905109489051,
797
+ "loss": 0.0307,
798
+ "step": 4400
799
+ },
800
+ {
801
+ "epoch": 193.48,
802
+ "learning_rate": 0.00010729927007299269,
803
+ "loss": 0.023,
804
+ "step": 4450
805
+ },
806
+ {
807
+ "epoch": 195.65,
808
+ "learning_rate": 0.00010510948905109488,
809
+ "loss": 0.0213,
810
+ "step": 4500
811
+ },
812
+ {
813
+ "epoch": 195.65,
814
+ "eval_cer": 0.3451053283767038,
815
+ "eval_loss": 2.447828769683838,
816
+ "eval_runtime": 1.1406,
817
+ "eval_samples_per_second": 39.452,
818
+ "eval_steps_per_second": 2.63,
819
+ "step": 4500
820
+ },
821
+ {
822
+ "epoch": 197.83,
823
+ "learning_rate": 0.00010291970802919708,
824
+ "loss": 0.021,
825
+ "step": 4550
826
+ },
827
+ {
828
+ "epoch": 200.0,
829
+ "learning_rate": 0.00010072992700729926,
830
+ "loss": 0.0246,
831
+ "step": 4600
832
+ },
833
+ {
834
+ "epoch": 202.17,
835
+ "learning_rate": 9.854014598540145e-05,
836
+ "loss": 0.0193,
837
+ "step": 4650
838
+ },
839
+ {
840
+ "epoch": 202.17,
841
+ "eval_cer": 0.355638166047088,
842
+ "eval_loss": 2.524061918258667,
843
+ "eval_runtime": 1.203,
844
+ "eval_samples_per_second": 37.406,
845
+ "eval_steps_per_second": 2.494,
846
+ "step": 4650
847
+ },
848
+ {
849
+ "epoch": 204.35,
850
+ "learning_rate": 9.635036496350364e-05,
851
+ "loss": 0.0223,
852
+ "step": 4700
853
+ },
854
+ {
855
+ "epoch": 206.52,
856
+ "learning_rate": 9.416058394160584e-05,
857
+ "loss": 0.0223,
858
+ "step": 4750
859
+ },
860
+ {
861
+ "epoch": 208.7,
862
+ "learning_rate": 9.197080291970803e-05,
863
+ "loss": 0.0204,
864
+ "step": 4800
865
+ },
866
+ {
867
+ "epoch": 208.7,
868
+ "eval_cer": 0.34634448574969023,
869
+ "eval_loss": 2.570009708404541,
870
+ "eval_runtime": 1.2664,
871
+ "eval_samples_per_second": 35.533,
872
+ "eval_steps_per_second": 2.369,
873
+ "step": 4800
874
+ },
875
+ {
876
+ "epoch": 210.87,
877
+ "learning_rate": 8.978102189781021e-05,
878
+ "loss": 0.0202,
879
+ "step": 4850
880
+ },
881
+ {
882
+ "epoch": 213.04,
883
+ "learning_rate": 8.759124087591239e-05,
884
+ "loss": 0.0193,
885
+ "step": 4900
886
+ },
887
+ {
888
+ "epoch": 215.22,
889
+ "learning_rate": 8.540145985401459e-05,
890
+ "loss": 0.0185,
891
+ "step": 4950
892
+ },
893
+ {
894
+ "epoch": 215.22,
895
+ "eval_cer": 0.31784386617100374,
896
+ "eval_loss": 2.583724021911621,
897
+ "eval_runtime": 1.2549,
898
+ "eval_samples_per_second": 35.859,
899
+ "eval_steps_per_second": 2.391,
900
+ "step": 4950
901
+ },
902
+ {
903
+ "epoch": 217.39,
904
+ "learning_rate": 8.321167883211678e-05,
905
+ "loss": 0.0191,
906
+ "step": 5000
907
+ },
908
+ {
909
+ "epoch": 219.57,
910
+ "learning_rate": 8.102189781021897e-05,
911
+ "loss": 0.0169,
912
+ "step": 5050
913
+ },
914
+ {
915
+ "epoch": 221.74,
916
+ "learning_rate": 7.883211678832117e-05,
917
+ "loss": 0.0161,
918
+ "step": 5100
919
+ },
920
+ {
921
+ "epoch": 221.74,
922
+ "eval_cer": 0.33767038413878564,
923
+ "eval_loss": 2.513859987258911,
924
+ "eval_runtime": 1.2515,
925
+ "eval_samples_per_second": 35.958,
926
+ "eval_steps_per_second": 2.397,
927
+ "step": 5100
928
+ },
929
+ {
930
+ "epoch": 223.91,
931
+ "learning_rate": 7.664233576642336e-05,
932
+ "loss": 0.0183,
933
+ "step": 5150
934
+ },
935
+ {
936
+ "epoch": 226.09,
937
+ "learning_rate": 7.445255474452554e-05,
938
+ "loss": 0.0228,
939
+ "step": 5200
940
+ },
941
+ {
942
+ "epoch": 228.26,
943
+ "learning_rate": 7.226277372262774e-05,
944
+ "loss": 0.0167,
945
+ "step": 5250
946
+ },
947
+ {
948
+ "epoch": 228.26,
949
+ "eval_cer": 0.3351920693928129,
950
+ "eval_loss": 2.5287766456604004,
951
+ "eval_runtime": 1.2044,
952
+ "eval_samples_per_second": 37.363,
953
+ "eval_steps_per_second": 2.491,
954
+ "step": 5250
955
+ },
956
+ {
957
+ "epoch": 230.43,
958
+ "learning_rate": 7.007299270072992e-05,
959
+ "loss": 0.0181,
960
+ "step": 5300
961
+ },
962
+ {
963
+ "epoch": 232.61,
964
+ "learning_rate": 6.788321167883211e-05,
965
+ "loss": 0.0144,
966
+ "step": 5350
967
+ },
968
+ {
969
+ "epoch": 234.78,
970
+ "learning_rate": 6.56934306569343e-05,
971
+ "loss": 0.0148,
972
+ "step": 5400
973
+ },
974
+ {
975
+ "epoch": 234.78,
976
+ "eval_cer": 0.338909541511772,
977
+ "eval_loss": 2.574066400527954,
978
+ "eval_runtime": 1.2534,
979
+ "eval_samples_per_second": 35.904,
980
+ "eval_steps_per_second": 2.394,
981
+ "step": 5400
982
+ },
983
+ {
984
+ "epoch": 236.96,
985
+ "learning_rate": 6.350364963503648e-05,
986
+ "loss": 0.0143,
987
+ "step": 5450
988
+ },
989
+ {
990
+ "epoch": 239.13,
991
+ "learning_rate": 6.131386861313868e-05,
992
+ "loss": 0.0197,
993
+ "step": 5500
994
+ },
995
+ {
996
+ "epoch": 241.3,
997
+ "learning_rate": 5.912408759124087e-05,
998
+ "loss": 0.0141,
999
+ "step": 5550
1000
+ },
1001
+ {
1002
+ "epoch": 241.3,
1003
+ "eval_cer": 0.338909541511772,
1004
+ "eval_loss": 2.5173895359039307,
1005
+ "eval_runtime": 1.1989,
1006
+ "eval_samples_per_second": 37.536,
1007
+ "eval_steps_per_second": 2.502,
1008
+ "step": 5550
1009
+ },
1010
+ {
1011
+ "epoch": 243.48,
1012
+ "learning_rate": 5.693430656934306e-05,
1013
+ "loss": 0.0165,
1014
+ "step": 5600
1015
+ },
1016
+ {
1017
+ "epoch": 245.65,
1018
+ "learning_rate": 5.4744525547445253e-05,
1019
+ "loss": 0.0127,
1020
+ "step": 5650
1021
+ },
1022
+ {
1023
+ "epoch": 247.83,
1024
+ "learning_rate": 5.255474452554744e-05,
1025
+ "loss": 0.0122,
1026
+ "step": 5700
1027
+ },
1028
+ {
1029
+ "epoch": 247.83,
1030
+ "eval_cer": 0.3351920693928129,
1031
+ "eval_loss": 2.5573315620422363,
1032
+ "eval_runtime": 1.2363,
1033
+ "eval_samples_per_second": 36.4,
1034
+ "eval_steps_per_second": 2.427,
1035
+ "step": 5700
1036
+ },
1037
+ {
1038
+ "epoch": 250.0,
1039
+ "learning_rate": 5.036496350364963e-05,
1040
+ "loss": 0.0135,
1041
+ "step": 5750
1042
+ },
1043
+ {
1044
+ "epoch": 252.17,
1045
+ "learning_rate": 4.817518248175182e-05,
1046
+ "loss": 0.0116,
1047
+ "step": 5800
1048
+ },
1049
+ {
1050
+ "epoch": 254.35,
1051
+ "learning_rate": 4.5985401459854016e-05,
1052
+ "loss": 0.0115,
1053
+ "step": 5850
1054
+ },
1055
+ {
1056
+ "epoch": 254.35,
1057
+ "eval_cer": 0.32961586121437425,
1058
+ "eval_loss": 2.579023838043213,
1059
+ "eval_runtime": 1.2327,
1060
+ "eval_samples_per_second": 36.506,
1061
+ "eval_steps_per_second": 2.434,
1062
+ "step": 5850
1063
+ },
1064
+ {
1065
+ "epoch": 256.52,
1066
+ "learning_rate": 4.3795620437956196e-05,
1067
+ "loss": 0.0141,
1068
+ "step": 5900
1069
+ },
1070
+ {
1071
+ "epoch": 258.7,
1072
+ "learning_rate": 4.160583941605839e-05,
1073
+ "loss": 0.0143,
1074
+ "step": 5950
1075
+ },
1076
+ {
1077
+ "epoch": 260.87,
1078
+ "learning_rate": 3.9416058394160584e-05,
1079
+ "loss": 0.0141,
1080
+ "step": 6000
1081
+ },
1082
+ {
1083
+ "epoch": 260.87,
1084
+ "eval_cer": 0.32032218091697645,
1085
+ "eval_loss": 2.577375888824463,
1086
+ "eval_runtime": 1.2321,
1087
+ "eval_samples_per_second": 36.524,
1088
+ "eval_steps_per_second": 2.435,
1089
+ "step": 6000
1090
+ },
1091
+ {
1092
+ "epoch": 263.04,
1093
+ "learning_rate": 3.722627737226277e-05,
1094
+ "loss": 0.0116,
1095
+ "step": 6050
1096
+ },
1097
+ {
1098
+ "epoch": 265.22,
1099
+ "learning_rate": 3.503649635036496e-05,
1100
+ "loss": 0.0101,
1101
+ "step": 6100
1102
+ },
1103
+ {
1104
+ "epoch": 267.39,
1105
+ "learning_rate": 3.284671532846715e-05,
1106
+ "loss": 0.0123,
1107
+ "step": 6150
1108
+ },
1109
+ {
1110
+ "epoch": 267.39,
1111
+ "eval_cer": 0.3308550185873606,
1112
+ "eval_loss": 2.614670753479004,
1113
+ "eval_runtime": 1.1319,
1114
+ "eval_samples_per_second": 39.755,
1115
+ "eval_steps_per_second": 2.65,
1116
+ "step": 6150
1117
+ },
1118
+ {
1119
+ "epoch": 269.57,
1120
+ "learning_rate": 3.065693430656934e-05,
1121
+ "loss": 0.0151,
1122
+ "step": 6200
1123
+ },
1124
+ {
1125
+ "epoch": 271.74,
1126
+ "learning_rate": 2.846715328467153e-05,
1127
+ "loss": 0.0099,
1128
+ "step": 6250
1129
+ },
1130
+ {
1131
+ "epoch": 273.91,
1132
+ "learning_rate": 2.627737226277372e-05,
1133
+ "loss": 0.0214,
1134
+ "step": 6300
1135
+ },
1136
+ {
1137
+ "epoch": 273.91,
1138
+ "eval_cer": 0.3302354399008674,
1139
+ "eval_loss": 2.620166778564453,
1140
+ "eval_runtime": 1.262,
1141
+ "eval_samples_per_second": 35.657,
1142
+ "eval_steps_per_second": 2.377,
1143
+ "step": 6300
1144
+ },
1145
+ {
1146
+ "epoch": 276.09,
1147
+ "learning_rate": 2.408759124087591e-05,
1148
+ "loss": 0.0085,
1149
+ "step": 6350
1150
+ },
1151
+ {
1152
+ "epoch": 278.26,
1153
+ "learning_rate": 2.1897810218978098e-05,
1154
+ "loss": 0.0119,
1155
+ "step": 6400
1156
+ },
1157
+ {
1158
+ "epoch": 280.43,
1159
+ "learning_rate": 1.9708029197080292e-05,
1160
+ "loss": 0.0107,
1161
+ "step": 6450
1162
+ },
1163
+ {
1164
+ "epoch": 280.43,
1165
+ "eval_cer": 0.32342007434944237,
1166
+ "eval_loss": 2.6263809204101562,
1167
+ "eval_runtime": 1.2547,
1168
+ "eval_samples_per_second": 35.867,
1169
+ "eval_steps_per_second": 2.391,
1170
+ "step": 6450
1171
+ },
1172
+ {
1173
+ "epoch": 282.61,
1174
+ "learning_rate": 1.751824817518248e-05,
1175
+ "loss": 0.0107,
1176
+ "step": 6500
1177
+ },
1178
+ {
1179
+ "epoch": 284.78,
1180
+ "learning_rate": 1.532846715328467e-05,
1181
+ "loss": 0.0105,
1182
+ "step": 6550
1183
+ },
1184
+ {
1185
+ "epoch": 286.96,
1186
+ "learning_rate": 1.313868613138686e-05,
1187
+ "loss": 0.0086,
1188
+ "step": 6600
1189
+ },
1190
+ {
1191
+ "epoch": 286.96,
1192
+ "eval_cer": 0.3215613382899628,
1193
+ "eval_loss": 2.607461452484131,
1194
+ "eval_runtime": 1.204,
1195
+ "eval_samples_per_second": 37.374,
1196
+ "eval_steps_per_second": 2.492,
1197
+ "step": 6600
1198
+ },
1199
+ {
1200
+ "epoch": 289.13,
1201
+ "learning_rate": 1.0948905109489049e-05,
1202
+ "loss": 0.0095,
1203
+ "step": 6650
1204
+ },
1205
+ {
1206
+ "epoch": 291.3,
1207
+ "learning_rate": 8.75912408759124e-06,
1208
+ "loss": 0.0108,
1209
+ "step": 6700
1210
+ },
1211
+ {
1212
+ "epoch": 293.48,
1213
+ "learning_rate": 6.56934306569343e-06,
1214
+ "loss": 0.0106,
1215
+ "step": 6750
1216
+ },
1217
+ {
1218
+ "epoch": 293.48,
1219
+ "eval_cer": 0.3246592317224288,
1220
+ "eval_loss": 2.595982789993286,
1221
+ "eval_runtime": 1.1323,
1222
+ "eval_samples_per_second": 39.741,
1223
+ "eval_steps_per_second": 2.649,
1224
+ "step": 6750
1225
+ },
1226
+ {
1227
+ "epoch": 295.65,
1228
+ "learning_rate": 4.37956204379562e-06,
1229
+ "loss": 0.0143,
1230
+ "step": 6800
1231
+ },
1232
+ {
1233
+ "epoch": 297.83,
1234
+ "learning_rate": 2.18978102189781e-06,
1235
+ "loss": 0.0105,
1236
+ "step": 6850
1237
+ },
1238
+ {
1239
+ "epoch": 300.0,
1240
+ "learning_rate": 0.0,
1241
+ "loss": 0.0085,
1242
+ "step": 6900
1243
+ },
1244
+ {
1245
+ "epoch": 300.0,
1246
+ "eval_cer": 0.32403965303593557,
1247
+ "eval_loss": 2.5951595306396484,
1248
+ "eval_runtime": 1.2068,
1249
+ "eval_samples_per_second": 37.288,
1250
+ "eval_steps_per_second": 2.486,
1251
+ "step": 6900
1252
+ }
1253
+ ],
1254
+ "logging_steps": 50,
1255
+ "max_steps": 6900,
1256
+ "num_train_epochs": 300,
1257
+ "save_steps": 150,
1258
+ "total_flos": 2.3112928880616276e+19,
1259
+ "trial_name": null,
1260
+ "trial_params": null
1261
+ }
checkpoint-6900/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c0741fe1648758c067baeb587c00ff9d0528d818e60814b62c8d0f8ca82d1c4d
+ size 4472
checkpoint-6900/vocab.json ADDED
@@ -0,0 +1,679 @@
1
+ {
2
+ "0": 1,
3
+ "1": 2,
4
+ "2": 3,
5
+ "3": 4,
6
+ "4": 5,
7
+ "5": 6,
8
+ "6": 7,
9
+ "7": 8,
10
+ "8": 9,
11
+ "9": 10,
12
+ "[PAD]": 676,
13
+ "[UNK]": 675,
14
+ "|": 0,
15
+ " ": 11,
16
+ "、": 12,
17
+ "。": 13,
18
+ "々": 14,
19
+ "ぁ": 15,
20
+ "あ": 16,
21
+ "い": 17,
22
+ "う": 18,
23
+ "え": 19,
24
+ "お": 20,
25
+ "か": 21,
26
+ "が": 22,
27
+ "き": 23,
28
+ "ぎ": 24,
29
+ "く": 25,
30
+ "ぐ": 26,
31
+ "け": 27,
32
+ "げ": 28,
33
+ "こ": 29,
34
+ "ご": 30,
35
+ "さ": 31,
36
+ "ざ": 32,
37
+ "し": 33,
38
+ "じ": 34,
39
+ "す": 35,
40
+ "ず": 36,
41
+ "せ": 37,
42
+ "ぜ": 38,
43
+ "そ": 39,
44
+ "た": 40,
45
+ "だ": 41,
46
+ "ち": 42,
47
+ "っ": 43,
48
+ "つ": 44,
49
+ "て": 45,
50
+ "で": 46,
51
+ "と": 47,
52
+ "ど": 48,
53
+ "な": 49,
54
+ "に": 50,
55
+ "ぬ": 51,
56
+ "ね": 52,
57
+ "の": 53,
58
+ "は": 54,
59
+ "ば": 55,
60
+ "ぱ": 56,
61
+ "ひ": 57,
62
+ "び": 58,
63
+ "ふ": 59,
64
+ "ぶ": 60,
65
+ "ぷ": 61,
66
+ "へ": 62,
67
+ "べ": 63,
68
+ "ほ": 64,
69
+ "ぼ": 65,
70
+ "ぽ": 66,
71
+ "ま": 67,
72
+ "み": 68,
73
+ "む": 69,
74
+ "め": 70,
75
+ "も": 71,
76
+ "ゃ": 72,
77
+ "や": 73,
78
+ "ゆ": 74,
79
+ "ょ": 75,
80
+ "よ": 76,
81
+ "ら": 77,
82
+ "り": 78,
83
+ "る": 79,
84
+ "れ": 80,
85
+ "ろ": 81,
86
+ "わ": 82,
87
+ "を": 83,
88
+ "ん": 84,
89
+ "ァ": 85,
90
+ "ア": 86,
91
+ "ィ": 87,
92
+ "イ": 88,
93
+ "ウ": 89,
94
+ "ェ": 90,
95
+ "エ": 91,
96
+ "ォ": 92,
97
+ "オ": 93,
98
+ "カ": 94,
99
+ "ガ": 95,
100
+ "キ": 96,
101
+ "ギ": 97,
102
+ "ク": 98,
103
+ "グ": 99,
104
+ "ケ": 100,
105
+ "ゲ": 101,
106
+ "コ": 102,
107
+ "ゴ": 103,
108
+ "サ": 104,
109
+ "ザ": 105,
110
+ "シ": 106,
111
+ "ジ": 107,
112
+ "ス": 108,
113
+ "ズ": 109,
114
+ "セ": 110,
115
+ "ソ": 111,
116
+ "タ": 112,
117
+ "ダ": 113,
118
+ "チ": 114,
119
+ "ッ": 115,
120
+ "ツ": 116,
121
+ "テ": 117,
122
+ "デ": 118,
123
+ "ト": 119,
124
+ "ド": 120,
125
+ "ナ": 121,
126
+ "ニ": 122,
127
+ "ネ": 123,
128
+ "ノ": 124,
129
+ "ハ": 125,
130
+ "バ": 126,
131
+ "パ": 127,
132
+ "ヒ": 128,
133
+ "ビ": 129,
134
+ "ピ": 130,
135
+ "フ": 131,
136
+ "ブ": 132,
137
+ "プ": 133,
138
+ "ベ": 134,
139
+ "ペ": 135,
140
+ "ホ": 136,
141
+ "ボ": 137,
142
+ "ポ": 138,
143
+ "マ": 139,
144
+ "ミ": 140,
145
+ "ム": 141,
146
+ "メ": 142,
147
+ "モ": 143,
148
+ "ャ": 144,
149
+ "ヤ": 145,
150
+ "ュ": 146,
151
+ "ヨ": 147,
152
+ "ラ": 148,
153
+ "リ": 149,
154
+ "ル": 150,
155
+ "レ": 151,
156
+ "ロ": 152,
157
+ "ワ": 153,
158
+ "ン": 154,
159
+ "ヶ": 155,
160
+ "ー": 156,
161
+ "一": 157,
162
+ "万": 158,
163
+ "丈": 159,
164
+ "三": 160,
165
+ "上": 161,
166
+ "下": 162,
167
+ "不": 163,
168
+ "中": 164,
169
+ "丸": 165,
170
+ "主": 166,
171
+ "久": 167,
172
+ "九": 168,
173
+ "乾": 169,
174
+ "予": 170,
175
+ "事": 171,
176
+ "二": 172,
177
+ "五": 173,
178
+ "井": 174,
179
+ "交": 175,
180
+ "京": 176,
181
+ "人": 177,
182
+ "今": 178,
183
+ "仏": 179,
184
+ "仕": 180,
185
+ "他": 181,
186
+ "付": 182,
187
+ "代": 183,
188
+ "以": 184,
189
+ "件": 185,
190
+ "企": 186,
191
+ "伊": 187,
192
+ "休": 188,
193
+ "会": 189,
194
+ "伸": 190,
195
+ "住": 191,
196
+ "体": 192,
197
+ "何": 193,
198
+ "余": 194,
199
+ "作": 195,
200
+ "使": 196,
201
+ "例": 197,
202
+ "保": 198,
203
+ "信": 199,
204
+ "俣": 200,
205
+ "個": 201,
206
+ "倒": 202,
207
+ "候": 203,
208
+ "健": 204,
209
+ "備": 205,
210
+ "元": 206,
211
+ "充": 207,
212
+ "先": 208,
213
+ "入": 209,
214
+ "全": 210,
215
+ "公": 211,
216
+ "共": 212,
217
+ "内": 213,
218
+ "円": 214,
219
+ "写": 215,
220
+ "冬": 216,
221
+ "冷": 217,
222
+ "凍": 218,
223
+ "出": 219,
224
+ "分": 220,
225
+ "切": 221,
226
+ "初": 222,
227
+ "到": 223,
228
+ "制": 224,
229
+ "前": 225,
230
+ "力": 226,
231
+ "加": 227,
232
+ "動": 228,
233
+ "募": 229,
234
+ "勧": 230,
235
+ "化": 231,
236
+ "北": 232,
237
+ "南": 233,
238
+ "厚": 234,
239
+ "原": 235,
240
+ "去": 236,
241
+ "参": 237,
242
+ "友": 238,
243
+ "取": 239,
244
+ "口": 240,
245
+ "古": 241,
246
+ "可": 242,
247
+ "台": 243,
248
+ "号": 244,
249
+ "司": 245,
250
+ "合": 246,
251
+ "吉": 247,
252
+ "吊": 248,
253
+ "同": 249,
254
+ "名": 250,
255
+ "吹": 251,
256
+ "味": 252,
257
+ "呼": 253,
258
+ "和": 254,
259
+ "品": 255,
260
+ "唇": 256,
261
+ "商": 257,
262
+ "問": 258,
263
+ "噌": 259,
264
+ "回": 260,
265
+ "固": 261,
266
+ "国": 262,
267
+ "園": 263,
268
+ "地": 264,
269
+ "型": 265,
270
+ "域": 266,
271
+ "報": 267,
272
+ "場": 268,
273
+ "塗": 269,
274
+ "増": 270,
275
+ "声": 271,
276
+ "売": 272,
277
+ "変": 273,
278
+ "夏": 274,
279
+ "外": 275,
280
+ "多": 276,
281
+ "大": 277,
282
+ "天": 278,
283
+ "太": 279,
284
+ "夫": 280,
285
+ "失": 281,
286
+ "奈": 282,
287
+ "奥": 283,
288
+ "女": 284,
289
+ "好": 285,
290
+ "始": 286,
291
+ "嫌": 287,
292
+ "嬉": 288,
293
+ "子": 289,
294
+ "存": 290,
295
+ "孝": 291,
296
+ "学": 292,
297
+ "定": 293,
298
+ "実": 294,
299
+ "室": 295,
300
+ "宮": 296,
301
+ "家": 297,
302
+ "容": 298,
303
+ "寝": 299,
304
+ "寺": 300,
305
+ "対": 301,
306
+ "小": 302,
307
+ "少": 303,
308
+ "尾": 304,
309
+ "局": 305,
310
+ "届": 306,
311
+ "屋": 307,
312
+ "山": 308,
313
+ "岐": 309,
314
+ "岡": 310,
315
+ "岩": 311,
316
+ "岳": 312,
317
+ "島": 313,
318
+ "川": 314,
319
+ "帰": 315,
320
+ "常": 316,
321
+ "平": 317,
322
+ "年": 318,
323
+ "幻": 319,
324
+ "広": 320,
325
+ "底": 321,
326
+ "店": 322,
327
+ "座": 323,
328
+ "庫": 324,
329
+ "弁": 325,
330
+ "式": 326,
331
+ "張": 327,
332
+ "強": 328,
333
+ "当": 329,
334
+ "形": 330,
335
+ "影": 331,
336
+ "待": 332,
337
+ "後": 333,
338
+ "得": 334,
339
+ "忘": 335,
340
+ "応": 336,
341
+ "思": 337,
342
+ "怠": 338,
343
+ "恥": 339,
344
+ "悪": 340,
345
+ "情": 341,
346
+ "想": 342,
347
+ "意": 343,
348
+ "愛": 344,
349
+ "感": 345,
350
+ "慢": 346,
351
+ "成": 347,
352
+ "我": 348,
353
+ "戦": 349,
354
+ "戻": 350,
355
+ "所": 351,
356
+ "手": 352,
357
+ "打": 353,
358
+ "抜": 354,
359
+ "押": 355,
360
+ "拝": 356,
361
+ "拶": 357,
362
+ "持": 358,
363
+ "指": 359,
364
+ "挨": 360,
365
+ "掃": 361,
366
+ "援": 362,
367
+ "教": 363,
368
+ "数": 364,
369
+ "文": 365,
370
+ "料": 366,
371
+ "断": 367,
372
+ "新": 368,
373
+ "方": 369,
374
+ "旗": 370,
375
+ "日": 371,
376
+ "旦": 372,
377
+ "早": 373,
378
+ "明": 374,
379
+ "映": 375,
380
+ "春": 376,
381
+ "昨": 377,
382
+ "是": 378,
383
+ "昼": 379,
384
+ "時": 380,
385
+ "普": 381,
386
+ "景": 382,
387
+ "晴": 383,
388
+ "暑": 384,
389
+ "暗": 385,
390
+ "書": 386,
391
+ "最": 387,
392
+ "月": 388,
393
+ "有": 389,
394
+ "望": 390,
395
+ "期": 391,
396
+ "木": 392,
397
+ "本": 393,
398
+ "机": 394,
399
+ "村": 395,
400
+ "来": 396,
401
+ "杯": 397,
402
+ "東": 398,
403
+ "林": 399,
404
+ "枚": 400,
405
+ "柴": 401,
406
+ "校": 402,
407
+ "梨": 403,
408
+ "棒": 404,
409
+ "森": 405,
410
+ "椿": 406,
411
+ "楽": 407,
412
+ "構": 408,
413
+ "横": 409,
414
+ "樹": 410,
415
+ "機": 411,
416
+ "欄": 412,
417
+ "次": 413,
418
+ "欲": 414,
419
+ "正": 415,
420
+ "残": 416,
421
+ "段": 417,
422
+ "母": 418,
423
+ "毎": 419,
424
+ "比": 420,
425
+ "毛": 421,
426
+ "気": 422,
427
+ "水": 423,
428
+ "汁": 424,
429
+ "汗": 425,
430
+ "況": 426,
431
+ "泉": 427,
432
+ "泊": 428,
433
+ "法": 429,
434
+ "注": 430,
435
+ "洋": 431,
436
+ "活": 432,
437
+ "流": 433,
438
+ "海": 434,
439
+ "消": 435,
440
+ "減": 436,
441
+ "渡": 437,
442
+ "温": 438,
443
+ "準": 439,
444
+ "漫": 440,
445
+ "激": 441,
446
+ "濃": 442,
447
+ "瀬": 443,
448
+ "火": 444,
449
+ "炎": 445,
450
+ "炭": 446,
451
+ "焚": 447,
452
+ "焦": 448,
453
+ "然": 449,
454
+ "焼": 450,
455
+ "照": 451,
456
+ "煮": 452,
457
+ "熊": 453,
458
+ "熱": 454,
459
+ "燃": 455,
460
+ "燕": 456,
461
+ "燥": 457,
462
+ "父": 458,
463
+ "物": 459,
464
+ "特": 460,
465
+ "犬": 461,
466
+ "状": 462,
467
+ "狙": 463,
468
+ "独": 464,
469
+ "狭": 465,
470
+ "猫": 466,
471
+ "獣": 467,
472
+ "王": 468,
473
+ "球": 469,
474
+ "理": 470,
475
+ "生": 471,
476
+ "用": 472,
477
+ "田": 473,
478
+ "甲": 474,
479
+ "申": 475,
480
+ "町": 476,
481
+ "画": 477,
482
+ "界": 478,
483
+ "留": 479,
484
+ "番": 480,
485
+ "疲": 481,
486
+ "癒": 482,
487
+ "発": 483,
488
+ "登": 484,
489
+ "白": 485,
490
+ "百": 486,
491
+ "的": 487,
492
+ "皆": 488,
493
+ "皿": 489,
494
+ "監": 490,
495
+ "目": 491,
496
+ "直": 492,
497
+ "相": 493,
498
+ "省": 494,
499
+ "県": 495,
500
+ "真": 496,
501
+ "督": 497,
502
+ "瞬": 498,
503
+ "知": 499,
504
+ "硬": 500,
505
+ "確": 501,
506
+ "礼": 502,
507
+ "社": 503,
508
+ "神": 504,
509
+ "福": 505,
510
+ "私": 506,
511
+ "移": 507,
512
+ "稲": 508,
513
+ "穂": 509,
514
+ "空": 510,
515
+ "立": 511,
516
+ "端": 512,
517
+ "答": 513,
518
+ "箇": 514,
519
+ "箱": 515,
520
+ "籍": 516,
521
+ "米": 517,
522
+ "粛": 518,
523
+ "精": 519,
524
+ "糖": 520,
525
+ "系": 521,
526
+ "納": 522,
527
+ "素": 523,
528
+ "細": 524,
529
+ "終": 525,
530
+ "結": 526,
531
+ "絶": 527,
532
+ "継": 528,
533
+ "綺": 529,
534
+ "綿": 530,
535
+ "緒": 531,
536
+ "締": 532,
537
+ "練": 533,
538
+ "縁": 534,
539
+ "繰": 535,
540
+ "缶": 536,
541
+ "置": 537,
542
+ "羊": 538,
543
+ "美": 539,
544
+ "義": 540,
545
+ "考": 541,
546
+ "者": 542,
547
+ "耳": 543,
548
+ "聞": 544,
549
+ "肉": 545,
550
+ "育": 546,
551
+ "腹": 547,
552
+ "自": 548,
553
+ "良": 549,
554
+ "色": 550,
555
+ "若": 551,
556
+ "茶": 552,
557
+ "荒": 553,
558
+ "荘": 554,
559
+ "荷": 555,
560
+ "落": 556,
561
+ "蔵": 557,
562
+ "薬": 558,
563
+ "蝶": 559,
564
+ "行": 560,
565
+ "街": 561,
566
+ "褒": 562,
567
+ "西": 563,
568
+ "要": 564,
569
+ "見": 565,
570
+ "視": 566,
571
+ "覧": 567,
572
+ "親": 568,
573
+ "観": 569,
574
+ "言": 570,
575
+ "記": 571,
576
+ "設": 572,
577
+ "許": 573,
578
+ "訳": 574,
579
+ "試": 575,
580
+ "話": 576,
581
+ "詳": 577,
582
+ "説": 578,
583
+ "読": 579,
584
+ "誰": 580,
585
+ "調": 581,
586
+ "請": 582,
587
+ "謝": 583,
588
+ "識": 584,
589
+ "議": 585,
590
+ "谷": 586,
591
+ "買": 587,
592
+ "質": 588,
593
+ "赤": 589,
594
+ "走": 590,
595
+ "越": 591,
596
+ "路": 592,
597
+ "身": 593,
598
+ "車": 594,
599
+ "転": 595,
600
+ "載": 596,
601
+ "辛": 597,
602
+ "辺": 598,
603
+ "込": 599,
604
+ "近": 600,
605
+ "返": 601,
606
+ "追": 602,
607
+ "途": 603,
608
+ "通": 604,
609
+ "速": 605,
610
+ "連": 606,
611
+ "週": 607,
612
+ "遅": 608,
613
+ "運": 609,
614
+ "過": 610,
615
+ "達": 611,
616
+ "違": 612,
617
+ "適": 613,
618
+ "選": 614,
619
+ "郎": 615,
620
+ "部": 616,
621
+ "配": 617,
622
+ "酒": 618,
623
+ "重": 619,
624
+ "野": 620,
625
+ "量": 621,
626
+ "釣": 622,
627
+ "録": 623,
628
+ "鍵": 624,
629
+ "長": 625,
630
+ "開": 626,
631
+ "間": 627,
632
+ "関": 628,
633
+ "閣": 629,
634
+ "阜": 630,
635
+ "降": 631,
636
+ "限": 632,
637
+ "院": 633,
638
+ "除": 634,
639
+ "陸": 635,
640
+ "雅": 636,
641
+ "集": 637,
642
+ "雉": 638,
643
+ "難": 639,
644
+ "雨": 640,
645
+ "雪": 641,
646
+ "電": 642,
647
+ "青": 643,
648
+ "非": 644,
649
+ "面": 645,
650
+ "音": 646,
651
+ "響": 647,
652
+ "頂": 648,
653
+ "頃": 649,
654
+ "順": 650,
655
+ "頼": 651,
656
+ "顔": 652,
657
+ "風": 653,
658
+ "食": 654,
659
+ "飲": 655,
660
+ "飼": 656,
661
+ "馬": 657,
662
+ "験": 658,
663
+ "驚": 659,
664
+ "高": 660,
665
+ "髪": 661,
666
+ "鬼": 662,
667
+ "鶏": 663,
668
+ "鹿": 664,
669
+ "麗": 665,
670
+ "!": 666,
671
+ "(": 667,
672
+ ")": 668,
673
+ "/": 669,
674
+ "1": 670,
675
+ "2": 671,
676
+ "3": 672,
677
+ "?": 673,
678
+ "m": 674
679
+ }
config.json ADDED
@@ -0,0 +1,117 @@
1
+ {
2
+ "_name_or_path": "facebook/wav2vec2-large-xlsr-53",
3
+ "activation_dropout": 0.0,
4
+ "adapter_attn_dim": null,
5
+ "adapter_kernel_size": 3,
6
+ "adapter_stride": 2,
7
+ "add_adapter": false,
8
+ "apply_spec_augment": true,
9
+ "architectures": [
10
+ "Wav2Vec2ForCTC"
11
+ ],
12
+ "attention_dropout": 0.1,
13
+ "bos_token_id": 1,
14
+ "classifier_proj_size": 256,
15
+ "codevector_dim": 768,
16
+ "contrastive_logits_temperature": 0.1,
17
+ "conv_bias": true,
18
+ "conv_dim": [
19
+ 512,
20
+ 512,
21
+ 512,
22
+ 512,
23
+ 512,
24
+ 512,
25
+ 512
26
+ ],
27
+ "conv_kernel": [
28
+ 10,
29
+ 3,
30
+ 3,
31
+ 3,
32
+ 3,
33
+ 2,
34
+ 2
35
+ ],
36
+ "conv_stride": [
37
+ 5,
38
+ 2,
39
+ 2,
40
+ 2,
41
+ 2,
42
+ 2,
43
+ 2
44
+ ],
45
+ "ctc_loss_reduction": "mean",
46
+ "ctc_zero_infinity": false,
47
+ "diversity_loss_weight": 0.1,
48
+ "do_stable_layer_norm": true,
49
+ "eos_token_id": 2,
50
+ "feat_extract_activation": "gelu",
51
+ "feat_extract_dropout": 0.0,
52
+ "feat_extract_norm": "layer",
53
+ "feat_proj_dropout": 0.05,
54
+ "feat_quantizer_dropout": 0.0,
55
+ "final_dropout": 0.0,
56
+ "gradient_checkpointing": false,
57
+ "hidden_act": "gelu",
58
+ "hidden_dropout": 0.05,
59
+ "hidden_size": 1024,
60
+ "initializer_range": 0.02,
61
+ "intermediate_size": 4096,
62
+ "layer_norm_eps": 1e-05,
63
+ "layerdrop": 0.05,
64
+ "mask_channel_length": 10,
65
+ "mask_channel_min_space": 1,
66
+ "mask_channel_other": 0.0,
67
+ "mask_channel_prob": 0.0,
68
+ "mask_channel_selection": "static",
69
+ "mask_feature_length": 10,
70
+ "mask_feature_min_masks": 0,
71
+ "mask_feature_prob": 0.0,
72
+ "mask_time_length": 10,
73
+ "mask_time_min_masks": 2,
74
+ "mask_time_min_space": 1,
75
+ "mask_time_other": 0.0,
76
+ "mask_time_prob": 0.05,
77
+ "mask_time_selection": "static",
78
+ "model_type": "wav2vec2",
79
+ "num_adapter_layers": 3,
80
+ "num_attention_heads": 16,
81
+ "num_codevector_groups": 2,
82
+ "num_codevectors_per_group": 320,
83
+ "num_conv_pos_embedding_groups": 16,
84
+ "num_conv_pos_embeddings": 128,
85
+ "num_feat_extract_layers": 7,
86
+ "num_hidden_layers": 24,
87
+ "num_negatives": 100,
88
+ "output_hidden_size": 1024,
89
+ "pad_token_id": 676,
90
+ "proj_codevector_dim": 768,
91
+ "tdnn_dilation": [
92
+ 1,
93
+ 2,
94
+ 3,
95
+ 1,
96
+ 1
97
+ ],
98
+ "tdnn_dim": [
99
+ 512,
100
+ 512,
101
+ 512,
102
+ 512,
103
+ 1500
104
+ ],
105
+ "tdnn_kernel": [
106
+ 5,
107
+ 3,
108
+ 3,
109
+ 1,
110
+ 1
111
+ ],
112
+ "torch_dtype": "float32",
113
+ "transformers_version": "4.34.0",
114
+ "use_weighted_layer_sum": false,
115
+ "vocab_size": 679,
116
+ "xvector_output_dim": 512
117
+ }
eval_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+ "epoch": 300.0,
+ "eval_cer": 0.32403965303593557,
+ "eval_loss": 2.5951595306396484,
+ "eval_runtime": 1.15,
+ "eval_samples": 45,
+ "eval_samples_per_second": 39.131,
+ "eval_steps_per_second": 2.609
+ }
preprocessor_config.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "do_normalize": true,
+ "feature_extractor_type": "Wav2Vec2FeatureExtractor",
+ "feature_size": 1,
+ "padding_side": "right",
+ "padding_value": 0,
+ "processor_class": "Wav2Vec2Processor",
+ "return_attention_mask": true,
+ "sampling_rate": 16000
+ }
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9b61e6d7c21931997e82553ee1094451457e43f95812c2ee82aeedf4e89cd76d
+ size 1264686250
special_tokens_map.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "additional_special_tokens": [
+ "<s>",
+ "</s>"
+ ],
+ "bos_token": "<s>",
+ "eos_token": "</s>",
+ "pad_token": "[PAD]",
+ "unk_token": "[UNK]"
+ }
tokenizer_config.json ADDED
@@ -0,0 +1,56 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "675": {
4
+ "content": "[UNK]",
5
+ "lstrip": true,
6
+ "normalized": false,
7
+ "rstrip": true,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "676": {
12
+ "content": "[PAD]",
13
+ "lstrip": true,
14
+ "normalized": false,
15
+ "rstrip": true,
16
+ "single_word": false,
17
+ "special": false
18
+ },
19
+ "677": {
20
+ "content": "<s>",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "678": {
28
+ "content": "</s>",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ }
35
+ },
36
+ "additional_special_tokens": [
37
+ "<s>",
38
+ "</s>"
39
+ ],
40
+ "bos_token": "<s>",
41
+ "clean_up_tokenization_spaces": true,
42
+ "config": null,
43
+ "do_lower_case": false,
44
+ "eos_token": "</s>",
45
+ "model_max_length": 1000000000000000019884624838656,
46
+ "pad_token": "[PAD]",
47
+ "processor_class": "Wav2Vec2Processor",
48
+ "replace_word_delimiter_char": " ",
49
+ "target_lang": null,
50
+ "tokenizer_class": "Wav2Vec2CTCTokenizer",
51
+ "tokenizer_file": null,
52
+ "tokenizer_type": "wav2vec2",
53
+ "trust_remote_code": false,
54
+ "unk_token": "[UNK]",
55
+ "word_delimiter_token": "|"
56
+ }
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "epoch": 300.0,
+ "train_loss": 0.8083851718038753,
+ "train_runtime": 4592.71,
+ "train_samples": 359,
+ "train_samples_per_second": 23.45,
+ "train_steps_per_second": 1.502
+ }
trainer_state.json ADDED
@@ -0,0 +1,1270 @@
1
+ {
2
+ "best_metric": null,
3
+ "best_model_checkpoint": null,
4
+ "epoch": 300.0,
5
+ "eval_steps": 150,
6
+ "global_step": 6900,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [
11
+ {
12
+ "epoch": 2.17,
13
+ "learning_rate": 0.0003,
14
+ "loss": 35.2887,
15
+ "step": 50
16
+ },
17
+ {
18
+ "epoch": 4.35,
19
+ "learning_rate": 0.00029781021897810217,
20
+ "loss": 5.9569,
21
+ "step": 100
22
+ },
23
+ {
24
+ "epoch": 6.52,
25
+ "learning_rate": 0.00029562043795620436,
26
+ "loss": 4.9138,
27
+ "step": 150
28
+ },
29
+ {
30
+ "epoch": 6.52,
31
+ "eval_cer": 1.0,
32
+ "eval_loss": 4.7965407371521,
33
+ "eval_runtime": 1.256,
34
+ "eval_samples_per_second": 35.828,
35
+ "eval_steps_per_second": 2.389,
36
+ "step": 150
37
+ },
38
+ {
39
+ "epoch": 8.7,
40
+ "learning_rate": 0.00029343065693430656,
41
+ "loss": 4.887,
42
+ "step": 200
43
+ },
44
+ {
45
+ "epoch": 10.87,
46
+ "learning_rate": 0.00029124087591240875,
47
+ "loss": 4.8447,
48
+ "step": 250
49
+ },
50
+ {
51
+ "epoch": 13.04,
52
+ "learning_rate": 0.00028905109489051094,
53
+ "loss": 4.7484,
54
+ "step": 300
55
+ },
56
+ {
57
+ "epoch": 13.04,
58
+ "eval_cer": 1.0,
59
+ "eval_loss": 4.608075141906738,
60
+ "eval_runtime": 1.2451,
61
+ "eval_samples_per_second": 36.142,
62
+ "eval_steps_per_second": 2.409,
63
+ "step": 300
64
+ },
65
+ {
66
+ "epoch": 15.22,
67
+ "learning_rate": 0.00028686131386861314,
68
+ "loss": 4.6529,
69
+ "step": 350
70
+ },
71
+ {
72
+ "epoch": 17.39,
73
+ "learning_rate": 0.0002846715328467153,
74
+ "loss": 4.6373,
75
+ "step": 400
76
+ },
77
+ {
78
+ "epoch": 19.57,
79
+ "learning_rate": 0.00028248175182481747,
80
+ "loss": 4.5894,
81
+ "step": 450
82
+ },
83
+ {
84
+ "epoch": 19.57,
85
+ "eval_cer": 0.9851301115241635,
86
+ "eval_loss": 4.469708442687988,
87
+ "eval_runtime": 1.2325,
88
+ "eval_samples_per_second": 36.51,
89
+ "eval_steps_per_second": 2.434,
90
+ "step": 450
91
+ },
92
+ {
93
+ "epoch": 21.74,
94
+ "learning_rate": 0.00028029197080291966,
95
+ "loss": 4.5045,
96
+ "step": 500
97
+ },
98
+ {
99
+ "epoch": 23.91,
100
+ "learning_rate": 0.00027810218978102186,
101
+ "loss": 4.4076,
102
+ "step": 550
103
+ },
104
+ {
105
+ "epoch": 26.09,
106
+ "learning_rate": 0.00027591240875912405,
107
+ "loss": 4.2024,
108
+ "step": 600
109
+ },
110
+ {
111
+ "epoch": 26.09,
112
+ "eval_cer": 0.9076827757125155,
113
+ "eval_loss": 4.037315845489502,
114
+ "eval_runtime": 1.2357,
115
+ "eval_samples_per_second": 36.417,
116
+ "eval_steps_per_second": 2.428,
117
+ "step": 600
118
+ },
119
+ {
120
+ "epoch": 28.26,
121
+ "learning_rate": 0.00027372262773722625,
122
+ "loss": 3.8743,
123
+ "step": 650
124
+ },
125
+ {
126
+ "epoch": 30.43,
127
+ "learning_rate": 0.00027153284671532844,
128
+ "loss": 3.3488,
129
+ "step": 700
130
+ },
131
+ {
132
+ "epoch": 32.61,
133
+ "learning_rate": 0.00026934306569343063,
134
+ "loss": 2.7314,
135
+ "step": 750
136
+ },
137
+ {
138
+ "epoch": 32.61,
139
+ "eval_cer": 0.5340768277571252,
140
+ "eval_loss": 2.5507473945617676,
141
+ "eval_runtime": 1.2278,
142
+ "eval_samples_per_second": 36.651,
143
+ "eval_steps_per_second": 2.443,
144
+ "step": 750
145
+ },
146
+ {
147
+ "epoch": 34.78,
148
+ "learning_rate": 0.00026715328467153283,
149
+ "loss": 2.1968,
150
+ "step": 800
151
+ },
152
+ {
153
+ "epoch": 36.96,
154
+ "learning_rate": 0.000264963503649635,
155
+ "loss": 1.6522,
156
+ "step": 850
157
+ },
158
+ {
159
+ "epoch": 39.13,
160
+ "learning_rate": 0.0002627737226277372,
161
+ "loss": 1.2293,
162
+ "step": 900
163
+ },
164
+ {
165
+ "epoch": 39.13,
166
+ "eval_cer": 0.4138785625774473,
167
+ "eval_loss": 2.01461124420166,
168
+ "eval_runtime": 1.2246,
169
+ "eval_samples_per_second": 36.746,
170
+ "eval_steps_per_second": 2.45,
171
+ "step": 900
172
+ },
173
+ {
174
+ "epoch": 41.3,
175
+ "learning_rate": 0.0002605839416058394,
176
+ "loss": 0.9292,
177
+ "step": 950
178
+ },
179
+ {
180
+ "epoch": 43.48,
181
+ "learning_rate": 0.00025839416058394155,
182
+ "loss": 0.7208,
183
+ "step": 1000
184
+ },
185
+ {
186
+ "epoch": 45.65,
187
+ "learning_rate": 0.00025620437956204374,
188
+ "loss": 0.5544,
189
+ "step": 1050
190
+ },
191
+ {
192
+ "epoch": 45.65,
193
+ "eval_cer": 0.355638166047088,
194
+ "eval_loss": 1.9821244478225708,
195
+ "eval_runtime": 1.2073,
196
+ "eval_samples_per_second": 37.275,
197
+ "eval_steps_per_second": 2.485,
198
+ "step": 1050
199
+ },
200
+ {
201
+ "epoch": 47.83,
202
+ "learning_rate": 0.00025401459854014594,
203
+ "loss": 0.4757,
204
+ "step": 1100
205
+ },
206
+ {
207
+ "epoch": 50.0,
208
+ "learning_rate": 0.00025182481751824813,
209
+ "loss": 0.3895,
210
+ "step": 1150
211
+ },
212
+ {
213
+ "epoch": 52.17,
214
+ "learning_rate": 0.0002496350364963503,
215
+ "loss": 0.3224,
216
+ "step": 1200
217
+ },
218
+ {
219
+ "epoch": 52.17,
220
+ "eval_cer": 0.3587360594795539,
221
+ "eval_loss": 2.0189881324768066,
222
+ "eval_runtime": 1.1983,
223
+ "eval_samples_per_second": 37.554,
224
+ "eval_steps_per_second": 2.504,
225
+ "step": 1200
226
+ },
227
+ {
228
+ "epoch": 54.35,
229
+ "learning_rate": 0.0002474452554744525,
230
+ "loss": 0.279,
231
+ "step": 1250
232
+ },
233
+ {
234
+ "epoch": 56.52,
235
+ "learning_rate": 0.0002452554744525547,
236
+ "loss": 0.2285,
237
+ "step": 1300
238
+ },
239
+ {
240
+ "epoch": 58.7,
241
+ "learning_rate": 0.0002430656934306569,
242
+ "loss": 0.1951,
243
+ "step": 1350
244
+ },
245
+ {
246
+ "epoch": 58.7,
247
+ "eval_cer": 0.36121437422552666,
248
+ "eval_loss": 2.1229116916656494,
249
+ "eval_runtime": 1.2603,
250
+ "eval_samples_per_second": 35.706,
251
+ "eval_steps_per_second": 2.38,
252
+ "step": 1350
253
+ },
254
+ {
255
+ "epoch": 60.87,
256
+ "learning_rate": 0.0002408759124087591,
257
+ "loss": 0.1964,
258
+ "step": 1400
259
+ },
260
+ {
261
+ "epoch": 63.04,
262
+ "learning_rate": 0.0002386861313868613,
263
+ "loss": 0.1622,
264
+ "step": 1450
265
+ },
266
+ {
267
+ "epoch": 65.22,
268
+ "learning_rate": 0.0002364963503649635,
269
+ "loss": 0.1539,
270
+ "step": 1500
271
+ },
272
+ {
273
+ "epoch": 65.22,
274
+ "eval_cer": 0.3469640644361834,
275
+ "eval_loss": 2.111368179321289,
276
+ "eval_runtime": 1.2194,
277
+ "eval_samples_per_second": 36.903,
278
+ "eval_steps_per_second": 2.46,
279
+ "step": 1500
280
+ },
281
+ {
282
+ "epoch": 67.39,
283
+ "learning_rate": 0.00023430656934306568,
284
+ "loss": 0.1492,
285
+ "step": 1550
286
+ },
287
+ {
288
+ "epoch": 69.57,
289
+ "learning_rate": 0.00023211678832116788,
290
+ "loss": 0.1404,
291
+ "step": 1600
292
+ },
293
+ {
294
+ "epoch": 71.74,
295
+ "learning_rate": 0.00022992700729927004,
296
+ "loss": 0.1165,
297
+ "step": 1650
298
+ },
299
+ {
300
+ "epoch": 71.74,
301
+ "eval_cer": 0.33147459727385375,
302
+ "eval_loss": 2.274796485900879,
303
+ "eval_runtime": 1.1874,
304
+ "eval_samples_per_second": 37.898,
305
+ "eval_steps_per_second": 2.527,
306
+ "step": 1650
307
+ },
308
+ {
309
+ "epoch": 73.91,
310
+ "learning_rate": 0.00022773722627737224,
311
+ "loss": 0.1268,
312
+ "step": 1700
313
+ },
314
+ {
315
+ "epoch": 76.09,
316
+ "learning_rate": 0.00022554744525547443,
317
+ "loss": 0.1186,
318
+ "step": 1750
319
+ },
320
+ {
321
+ "epoch": 78.26,
322
+ "learning_rate": 0.00022335766423357663,
323
+ "loss": 0.1119,
324
+ "step": 1800
325
+ },
326
+ {
327
+ "epoch": 78.26,
328
+ "eval_cer": 0.34882280049566294,
329
+ "eval_loss": 2.2390518188476562,
330
+ "eval_runtime": 1.3465,
331
+ "eval_samples_per_second": 33.42,
332
+ "eval_steps_per_second": 2.228,
333
+ "step": 1800
334
+ },
335
+ {
336
+ "epoch": 80.43,
337
+ "learning_rate": 0.00022116788321167882,
338
+ "loss": 0.0988,
339
+ "step": 1850
340
+ },
341
+ {
342
+ "epoch": 82.61,
343
+ "learning_rate": 0.00021897810218978101,
344
+ "loss": 0.112,
345
+ "step": 1900
346
+ },
347
+ {
348
+ "epoch": 84.78,
349
+ "learning_rate": 0.0002167883211678832,
350
+ "loss": 0.0989,
351
+ "step": 1950
352
+ },
353
+ {
354
+ "epoch": 84.78,
355
+ "eval_cer": 0.3382899628252788,
356
+ "eval_loss": 2.343754529953003,
357
+ "eval_runtime": 1.2055,
358
+ "eval_samples_per_second": 37.329,
359
+ "eval_steps_per_second": 2.489,
360
+ "step": 1950
361
+ },
362
+ {
363
+ "epoch": 86.96,
364
+ "learning_rate": 0.00021459854014598537,
365
+ "loss": 0.097,
366
+ "step": 2000
367
+ },
368
+ {
369
+ "epoch": 89.13,
370
+ "learning_rate": 0.00021240875912408757,
371
+ "loss": 0.0854,
372
+ "step": 2050
373
+ },
374
+ {
375
+ "epoch": 91.3,
376
+ "learning_rate": 0.00021021897810218976,
377
+ "loss": 0.0915,
378
+ "step": 2100
379
+ },
380
+ {
381
+ "epoch": 91.3,
382
+ "eval_cer": 0.3587360594795539,
383
+ "eval_loss": 2.121840000152588,
384
+ "eval_runtime": 1.2037,
385
+ "eval_samples_per_second": 37.386,
386
+ "eval_steps_per_second": 2.492,
387
+ "step": 2100
388
+ },
389
+ {
390
+ "epoch": 93.48,
391
+ "learning_rate": 0.00020802919708029196,
392
+ "loss": 0.078,
393
+ "step": 2150
394
+ },
395
+ {
396
+ "epoch": 95.65,
397
+ "learning_rate": 0.00020583941605839415,
398
+ "loss": 0.0857,
399
+ "step": 2200
400
+ },
401
+ {
402
+ "epoch": 97.83,
403
+ "learning_rate": 0.00020364963503649632,
404
+ "loss": 0.0721,
405
+ "step": 2250
406
+ },
407
+ {
408
+ "epoch": 97.83,
409
+ "eval_cer": 0.35192069392812886,
410
+ "eval_loss": 2.242812395095825,
411
+ "eval_runtime": 1.1964,
412
+ "eval_samples_per_second": 37.614,
413
+ "eval_steps_per_second": 2.508,
414
+ "step": 2250
415
+ },
416
+ {
417
+ "epoch": 100.0,
418
+ "learning_rate": 0.0002014598540145985,
419
+ "loss": 0.0799,
420
+ "step": 2300
421
+ },
422
+ {
423
+ "epoch": 102.17,
424
+ "learning_rate": 0.0001992700729927007,
425
+ "loss": 0.0798,
426
+ "step": 2350
427
+ },
428
+ {
429
+ "epoch": 104.35,
430
+ "learning_rate": 0.0001970802919708029,
431
+ "loss": 0.0742,
432
+ "step": 2400
433
+ },
434
+ {
435
+ "epoch": 104.35,
436
+ "eval_cer": 0.33643122676579923,
437
+ "eval_loss": 2.229339838027954,
438
+ "eval_runtime": 1.2156,
439
+ "eval_samples_per_second": 37.019,
440
+ "eval_steps_per_second": 2.468,
441
+ "step": 2400
442
+ },
443
+ {
444
+ "epoch": 106.52,
445
+ "learning_rate": 0.0001948905109489051,
446
+ "loss": 0.0692,
447
+ "step": 2450
448
+ },
449
+ {
450
+ "epoch": 108.7,
451
+ "learning_rate": 0.0001927007299270073,
452
+ "loss": 0.0664,
453
+ "step": 2500
454
+ },
455
+ {
456
+ "epoch": 110.87,
457
+ "learning_rate": 0.00019051094890510948,
458
+ "loss": 0.0629,
459
+ "step": 2550
460
+ },
461
+ {
462
+ "epoch": 110.87,
463
+ "eval_cer": 0.33705080545229243,
464
+ "eval_loss": 2.2878150939941406,
465
+ "eval_runtime": 1.2044,
466
+ "eval_samples_per_second": 37.364,
467
+ "eval_steps_per_second": 2.491,
468
+ "step": 2550
469
+ },
470
+ {
471
+ "epoch": 113.04,
472
+ "learning_rate": 0.00018832116788321167,
473
+ "loss": 0.0619,
474
+ "step": 2600
475
+ },
476
+ {
477
+ "epoch": 115.22,
478
+ "learning_rate": 0.00018613138686131387,
479
+ "loss": 0.0582,
480
+ "step": 2650
481
+ },
482
+ {
483
+ "epoch": 117.39,
484
+ "learning_rate": 0.00018394160583941606,
485
+ "loss": 0.0495,
486
+ "step": 2700
487
+ },
488
+ {
489
+ "epoch": 117.39,
490
+ "eval_cer": 0.34076827757125155,
491
+ "eval_loss": 2.2671637535095215,
492
+ "eval_runtime": 1.2039,
493
+ "eval_samples_per_second": 37.379,
494
+ "eval_steps_per_second": 2.492,
495
+ "step": 2700
496
+ },
497
+ {
498
+ "epoch": 119.57,
499
+ "learning_rate": 0.00018175182481751826,
500
+ "loss": 0.0614,
501
+ "step": 2750
502
+ },
503
+ {
504
+ "epoch": 121.74,
505
+ "learning_rate": 0.00017956204379562042,
506
+ "loss": 0.0565,
507
+ "step": 2800
508
+ },
509
+ {
510
+ "epoch": 123.91,
511
+ "learning_rate": 0.00017737226277372262,
512
+ "loss": 0.0466,
513
+ "step": 2850
514
+ },
515
+ {
516
+ "epoch": 123.91,
517
+ "eval_cer": 0.35254027261462206,
518
+ "eval_loss": 2.2532107830047607,
519
+ "eval_runtime": 1.3563,
520
+ "eval_samples_per_second": 33.179,
521
+ "eval_steps_per_second": 2.212,
522
+ "step": 2850
523
+ },
524
+ {
525
+ "epoch": 126.09,
526
+ "learning_rate": 0.00017518248175182478,
527
+ "loss": 0.0465,
528
+ "step": 2900
529
+ },
530
+ {
531
+ "epoch": 128.26,
532
+ "learning_rate": 0.00017299270072992698,
533
+ "loss": 0.0496,
534
+ "step": 2950
535
+ },
536
+ {
537
+ "epoch": 130.43,
538
+ "learning_rate": 0.00017080291970802917,
539
+ "loss": 0.0424,
540
+ "step": 3000
541
+ },
542
+ {
543
+ "epoch": 130.43,
544
+ "eval_cer": 0.32589838909541513,
545
+ "eval_loss": 2.2844393253326416,
546
+ "eval_runtime": 1.2006,
547
+ "eval_samples_per_second": 37.48,
548
+ "eval_steps_per_second": 2.499,
549
+ "step": 3000
550
+ },
551
+ {
552
+ "epoch": 132.61,
553
+ "learning_rate": 0.00016861313868613137,
554
+ "loss": 0.0483,
555
+ "step": 3050
556
+ },
557
+ {
558
+ "epoch": 134.78,
559
+ "learning_rate": 0.00016642335766423356,
560
+ "loss": 0.0488,
561
+ "step": 3100
562
+ },
563
+ {
564
+ "epoch": 136.96,
565
+ "learning_rate": 0.00016423357664233575,
566
+ "loss": 0.0446,
567
+ "step": 3150
568
+ },
569
+ {
570
+ "epoch": 136.96,
571
+ "eval_cer": 0.3252788104089219,
572
+ "eval_loss": 2.2763445377349854,
573
+ "eval_runtime": 1.2043,
574
+ "eval_samples_per_second": 37.368,
575
+ "eval_steps_per_second": 2.491,
576
+ "step": 3150
577
+ },
578
+ {
579
+ "epoch": 139.13,
580
+ "learning_rate": 0.00016204379562043795,
581
+ "loss": 0.0424,
582
+ "step": 3200
583
+ },
584
+ {
585
+ "epoch": 141.3,
586
+ "learning_rate": 0.00015985401459854014,
587
+ "loss": 0.0429,
588
+ "step": 3250
589
+ },
590
+ {
591
+ "epoch": 143.48,
592
+ "learning_rate": 0.00015766423357664234,
593
+ "loss": 0.0411,
594
+ "step": 3300
595
+ },
596
+ {
597
+ "epoch": 143.48,
598
+ "eval_cer": 0.3302354399008674,
599
+ "eval_loss": 2.301079034805298,
600
+ "eval_runtime": 1.345,
601
+ "eval_samples_per_second": 33.458,
602
+ "eval_steps_per_second": 2.231,
603
+ "step": 3300
604
+ },
605
+ {
606
+ "epoch": 145.65,
607
+ "learning_rate": 0.00015547445255474453,
608
+ "loss": 0.0392,
609
+ "step": 3350
610
+ },
611
+ {
612
+ "epoch": 147.83,
613
+ "learning_rate": 0.00015328467153284672,
614
+ "loss": 0.0426,
615
+ "step": 3400
616
+ },
617
+ {
618
+ "epoch": 150.0,
619
+ "learning_rate": 0.00015109489051094892,
620
+ "loss": 0.0419,
621
+ "step": 3450
622
+ },
623
+ {
624
+ "epoch": 150.0,
625
+ "eval_cer": 0.3420074349442379,
626
+ "eval_loss": 2.320059299468994,
627
+ "eval_runtime": 1.2411,
628
+ "eval_samples_per_second": 36.259,
629
+ "eval_steps_per_second": 2.417,
630
+ "step": 3450
631
+ },
632
+ {
633
+ "epoch": 152.17,
634
+ "learning_rate": 0.00014890510948905108,
635
+ "loss": 0.0386,
636
+ "step": 3500
637
+ },
638
+ {
639
+ "epoch": 154.35,
640
+ "learning_rate": 0.00014671532846715328,
641
+ "loss": 0.0402,
642
+ "step": 3550
643
+ },
644
+ {
645
+ "epoch": 156.52,
646
+ "learning_rate": 0.00014452554744525547,
647
+ "loss": 0.0333,
648
+ "step": 3600
649
+ },
650
+ {
651
+ "epoch": 156.52,
652
+ "eval_cer": 0.34386617100371747,
653
+ "eval_loss": 2.364445209503174,
654
+ "eval_runtime": 1.2337,
655
+ "eval_samples_per_second": 36.475,
656
+ "eval_steps_per_second": 2.432,
657
+ "step": 3600
658
+ },
659
+ {
660
+ "epoch": 158.7,
661
+ "learning_rate": 0.00014233576642335764,
662
+ "loss": 0.0434,
663
+ "step": 3650
664
+ },
665
+ {
666
+ "epoch": 160.87,
667
+ "learning_rate": 0.00014014598540145983,
668
+ "loss": 0.0393,
669
+ "step": 3700
670
+ },
671
+ {
672
+ "epoch": 163.04,
673
+ "learning_rate": 0.00013795620437956203,
674
+ "loss": 0.0384,
675
+ "step": 3750
676
+ },
677
+ {
678
+ "epoch": 163.04,
679
+ "eval_cer": 0.35315985130111527,
680
+ "eval_loss": 2.3685200214385986,
681
+ "eval_runtime": 1.2136,
682
+ "eval_samples_per_second": 37.081,
683
+ "eval_steps_per_second": 2.472,
684
+ "step": 3750
685
+ },
686
+ {
687
+ "epoch": 165.22,
688
+ "learning_rate": 0.00013576642335766422,
689
+ "loss": 0.0324,
690
+ "step": 3800
691
+ },
692
+ {
693
+ "epoch": 167.39,
694
+ "learning_rate": 0.00013357664233576641,
695
+ "loss": 0.0438,
696
+ "step": 3850
697
+ },
698
+ {
699
+ "epoch": 169.57,
700
+ "learning_rate": 0.0001313868613138686,
701
+ "loss": 0.0367,
702
+ "step": 3900
703
+ },
704
+ {
705
+ "epoch": 169.57,
706
+ "eval_cer": 0.3469640644361834,
707
+ "eval_loss": 2.397036552429199,
708
+ "eval_runtime": 1.2259,
709
+ "eval_samples_per_second": 36.708,
710
+ "eval_steps_per_second": 2.447,
711
+ "step": 3900
712
+ },
713
+ {
714
+ "epoch": 171.74,
715
+ "learning_rate": 0.00012919708029197077,
716
+ "loss": 0.0336,
717
+ "step": 3950
718
+ },
719
+ {
720
+ "epoch": 173.91,
721
+ "learning_rate": 0.00012700729927007297,
722
+ "loss": 0.037,
723
+ "step": 4000
724
+ },
725
+ {
726
+ "epoch": 176.09,
727
+ "learning_rate": 0.00012481751824817516,
728
+ "loss": 0.0307,
729
+ "step": 4050
730
+ },
731
+ {
732
+ "epoch": 176.09,
733
+ "eval_cer": 0.3308550185873606,
734
+ "eval_loss": 2.3530125617980957,
735
+ "eval_runtime": 1.2484,
736
+ "eval_samples_per_second": 36.047,
737
+ "eval_steps_per_second": 2.403,
738
+ "step": 4050
739
+ },
740
+ {
741
+ "epoch": 178.26,
742
+ "learning_rate": 0.00012262773722627736,
743
+ "loss": 0.0284,
744
+ "step": 4100
745
+ },
746
+ {
747
+ "epoch": 180.43,
748
+ "learning_rate": 0.00012043795620437955,
749
+ "loss": 0.0233,
750
+ "step": 4150
751
+ },
752
+ {
753
+ "epoch": 182.61,
754
+ "learning_rate": 0.00011824817518248174,
755
+ "loss": 0.0328,
756
+ "step": 4200
757
+ },
758
+ {
759
+ "epoch": 182.61,
760
+ "eval_cer": 0.33147459727385375,
761
+ "eval_loss": 2.3414556980133057,
762
+ "eval_runtime": 1.2281,
763
+ "eval_samples_per_second": 36.64,
764
+ "eval_steps_per_second": 2.443,
765
+ "step": 4200
766
+ },
767
+ {
768
+ "epoch": 184.78,
769
+ "learning_rate": 0.00011605839416058394,
770
+ "loss": 0.0285,
771
+ "step": 4250
772
+ },
773
+ {
774
+ "epoch": 186.96,
775
+ "learning_rate": 0.00011386861313868612,
776
+ "loss": 0.0222,
777
+ "step": 4300
778
+ },
779
+ {
780
+ "epoch": 189.13,
781
+ "learning_rate": 0.00011167883211678831,
782
+ "loss": 0.0271,
783
+ "step": 4350
784
+ },
785
+ {
786
+ "epoch": 189.13,
787
+ "eval_cer": 0.3308550185873606,
788
+ "eval_loss": 2.4165024757385254,
789
+ "eval_runtime": 1.1891,
790
+ "eval_samples_per_second": 37.844,
791
+ "eval_steps_per_second": 2.523,
792
+ "step": 4350
793
+ },
794
+ {
795
+ "epoch": 191.3,
796
+ "learning_rate": 0.00010948905109489051,
797
+ "loss": 0.0307,
798
+ "step": 4400
799
+ },
800
+ {
801
+ "epoch": 193.48,
802
+ "learning_rate": 0.00010729927007299269,
803
+ "loss": 0.023,
804
+ "step": 4450
805
+ },
806
+ {
807
+ "epoch": 195.65,
808
+ "learning_rate": 0.00010510948905109488,
809
+ "loss": 0.0213,
810
+ "step": 4500
811
+ },
812
+ {
813
+ "epoch": 195.65,
814
+ "eval_cer": 0.3451053283767038,
815
+ "eval_loss": 2.447828769683838,
816
+ "eval_runtime": 1.1406,
817
+ "eval_samples_per_second": 39.452,
818
+ "eval_steps_per_second": 2.63,
819
+ "step": 4500
820
+ },
821
+ {
822
+ "epoch": 197.83,
823
+ "learning_rate": 0.00010291970802919708,
824
+ "loss": 0.021,
825
+ "step": 4550
826
+ },
827
+ {
828
+ "epoch": 200.0,
829
+ "learning_rate": 0.00010072992700729926,
830
+ "loss": 0.0246,
831
+ "step": 4600
832
+ },
833
+ {
834
+ "epoch": 202.17,
835
+ "learning_rate": 9.854014598540145e-05,
836
+ "loss": 0.0193,
837
+ "step": 4650
838
+ },
839
+ {
840
+ "epoch": 202.17,
841
+ "eval_cer": 0.355638166047088,
842
+ "eval_loss": 2.524061918258667,
843
+ "eval_runtime": 1.203,
844
+ "eval_samples_per_second": 37.406,
845
+ "eval_steps_per_second": 2.494,
846
+ "step": 4650
847
+ },
848
+ {
849
+ "epoch": 204.35,
850
+ "learning_rate": 9.635036496350364e-05,
851
+ "loss": 0.0223,
852
+ "step": 4700
853
+ },
854
+ {
855
+ "epoch": 206.52,
856
+ "learning_rate": 9.416058394160584e-05,
857
+ "loss": 0.0223,
858
+ "step": 4750
859
+ },
860
+ {
861
+ "epoch": 208.7,
862
+ "learning_rate": 9.197080291970803e-05,
863
+ "loss": 0.0204,
864
+ "step": 4800
865
+ },
866
+ {
867
+ "epoch": 208.7,
868
+ "eval_cer": 0.34634448574969023,
869
+ "eval_loss": 2.570009708404541,
870
+ "eval_runtime": 1.2664,
871
+ "eval_samples_per_second": 35.533,
872
+ "eval_steps_per_second": 2.369,
873
+ "step": 4800
874
+ },
875
+ {
876
+ "epoch": 210.87,
877
+ "learning_rate": 8.978102189781021e-05,
878
+ "loss": 0.0202,
879
+ "step": 4850
880
+ },
881
+ {
882
+ "epoch": 213.04,
883
+ "learning_rate": 8.759124087591239e-05,
884
+ "loss": 0.0193,
885
+ "step": 4900
886
+ },
887
+ {
888
+ "epoch": 215.22,
889
+ "learning_rate": 8.540145985401459e-05,
890
+ "loss": 0.0185,
891
+ "step": 4950
892
+ },
893
+ {
894
+ "epoch": 215.22,
895
+ "eval_cer": 0.31784386617100374,
896
+ "eval_loss": 2.583724021911621,
897
+ "eval_runtime": 1.2549,
898
+ "eval_samples_per_second": 35.859,
899
+ "eval_steps_per_second": 2.391,
900
+ "step": 4950
901
+ },
902
+ {
903
+ "epoch": 217.39,
904
+ "learning_rate": 8.321167883211678e-05,
905
+ "loss": 0.0191,
906
+ "step": 5000
907
+ },
908
+ {
909
+ "epoch": 219.57,
910
+ "learning_rate": 8.102189781021897e-05,
911
+ "loss": 0.0169,
912
+ "step": 5050
913
+ },
914
+ {
915
+ "epoch": 221.74,
916
+ "learning_rate": 7.883211678832117e-05,
917
+ "loss": 0.0161,
918
+ "step": 5100
919
+ },
920
+ {
921
+ "epoch": 221.74,
922
+ "eval_cer": 0.33767038413878564,
923
+ "eval_loss": 2.513859987258911,
924
+ "eval_runtime": 1.2515,
925
+ "eval_samples_per_second": 35.958,
926
+ "eval_steps_per_second": 2.397,
927
+ "step": 5100
928
+ },
929
+ {
930
+ "epoch": 223.91,
931
+ "learning_rate": 7.664233576642336e-05,
932
+ "loss": 0.0183,
933
+ "step": 5150
934
+ },
935
+ {
936
+ "epoch": 226.09,
937
+ "learning_rate": 7.445255474452554e-05,
938
+ "loss": 0.0228,
939
+ "step": 5200
940
+ },
941
+ {
942
+ "epoch": 228.26,
943
+ "learning_rate": 7.226277372262774e-05,
944
+ "loss": 0.0167,
945
+ "step": 5250
946
+ },
947
+ {
948
+ "epoch": 228.26,
949
+ "eval_cer": 0.3351920693928129,
950
+ "eval_loss": 2.5287766456604004,
951
+ "eval_runtime": 1.2044,
952
+ "eval_samples_per_second": 37.363,
953
+ "eval_steps_per_second": 2.491,
954
+ "step": 5250
955
+ },
956
+ {
957
+ "epoch": 230.43,
958
+ "learning_rate": 7.007299270072992e-05,
959
+ "loss": 0.0181,
960
+ "step": 5300
961
+ },
962
+ {
963
+ "epoch": 232.61,
964
+ "learning_rate": 6.788321167883211e-05,
965
+ "loss": 0.0144,
966
+ "step": 5350
967
+ },
968
+ {
969
+ "epoch": 234.78,
970
+ "learning_rate": 6.56934306569343e-05,
971
+ "loss": 0.0148,
972
+ "step": 5400
973
+ },
974
+ {
975
+ "epoch": 234.78,
976
+ "eval_cer": 0.338909541511772,
977
+ "eval_loss": 2.574066400527954,
978
+ "eval_runtime": 1.2534,
979
+ "eval_samples_per_second": 35.904,
980
+ "eval_steps_per_second": 2.394,
981
+ "step": 5400
982
+ },
983
+ {
984
+ "epoch": 236.96,
985
+ "learning_rate": 6.350364963503648e-05,
986
+ "loss": 0.0143,
987
+ "step": 5450
988
+ },
989
+ {
990
+ "epoch": 239.13,
991
+ "learning_rate": 6.131386861313868e-05,
992
+ "loss": 0.0197,
993
+ "step": 5500
994
+ },
995
+ {
996
+ "epoch": 241.3,
997
+ "learning_rate": 5.912408759124087e-05,
998
+ "loss": 0.0141,
999
+ "step": 5550
1000
+ },
1001
+ {
1002
+ "epoch": 241.3,
1003
+ "eval_cer": 0.338909541511772,
1004
+ "eval_loss": 2.5173895359039307,
1005
+ "eval_runtime": 1.1989,
1006
+ "eval_samples_per_second": 37.536,
1007
+ "eval_steps_per_second": 2.502,
1008
+ "step": 5550
1009
+ },
1010
+ {
1011
+ "epoch": 243.48,
1012
+ "learning_rate": 5.693430656934306e-05,
1013
+ "loss": 0.0165,
1014
+ "step": 5600
1015
+ },
1016
+ {
1017
+ "epoch": 245.65,
1018
+ "learning_rate": 5.4744525547445253e-05,
1019
+ "loss": 0.0127,
1020
+ "step": 5650
1021
+ },
1022
+ {
1023
+ "epoch": 247.83,
1024
+ "learning_rate": 5.255474452554744e-05,
1025
+ "loss": 0.0122,
1026
+ "step": 5700
1027
+ },
1028
+ {
1029
+ "epoch": 247.83,
1030
+ "eval_cer": 0.3351920693928129,
1031
+ "eval_loss": 2.5573315620422363,
1032
+ "eval_runtime": 1.2363,
1033
+ "eval_samples_per_second": 36.4,
1034
+ "eval_steps_per_second": 2.427,
1035
+ "step": 5700
1036
+ },
1037
+ {
1038
+ "epoch": 250.0,
1039
+ "learning_rate": 5.036496350364963e-05,
1040
+ "loss": 0.0135,
1041
+ "step": 5750
1042
+ },
1043
+ {
1044
+ "epoch": 252.17,
1045
+ "learning_rate": 4.817518248175182e-05,
1046
+ "loss": 0.0116,
1047
+ "step": 5800
1048
+ },
1049
+ {
1050
+ "epoch": 254.35,
1051
+ "learning_rate": 4.5985401459854016e-05,
1052
+ "loss": 0.0115,
1053
+ "step": 5850
1054
+ },
1055
+ {
1056
+ "epoch": 254.35,
1057
+ "eval_cer": 0.32961586121437425,
1058
+ "eval_loss": 2.579023838043213,
1059
+ "eval_runtime": 1.2327,
1060
+ "eval_samples_per_second": 36.506,
1061
+ "eval_steps_per_second": 2.434,
1062
+ "step": 5850
1063
+ },
1064
+ {
1065
+ "epoch": 256.52,
1066
+ "learning_rate": 4.3795620437956196e-05,
1067
+ "loss": 0.0141,
1068
+ "step": 5900
1069
+ },
1070
+ {
1071
+ "epoch": 258.7,
1072
+ "learning_rate": 4.160583941605839e-05,
1073
+ "loss": 0.0143,
1074
+ "step": 5950
1075
+ },
1076
+ {
1077
+ "epoch": 260.87,
1078
+ "learning_rate": 3.9416058394160584e-05,
1079
+ "loss": 0.0141,
1080
+ "step": 6000
1081
+ },
1082
+ {
1083
+ "epoch": 260.87,
1084
+ "eval_cer": 0.32032218091697645,
1085
+ "eval_loss": 2.577375888824463,
1086
+ "eval_runtime": 1.2321,
1087
+ "eval_samples_per_second": 36.524,
1088
+ "eval_steps_per_second": 2.435,
1089
+ "step": 6000
1090
+ },
1091
+ {
1092
+ "epoch": 263.04,
1093
+ "learning_rate": 3.722627737226277e-05,
1094
+ "loss": 0.0116,
1095
+ "step": 6050
1096
+ },
1097
+ {
1098
+ "epoch": 265.22,
1099
+ "learning_rate": 3.503649635036496e-05,
1100
+ "loss": 0.0101,
1101
+ "step": 6100
1102
+ },
1103
+ {
1104
+ "epoch": 267.39,
1105
+ "learning_rate": 3.284671532846715e-05,
1106
+ "loss": 0.0123,
1107
+ "step": 6150
1108
+ },
1109
+ {
1110
+ "epoch": 267.39,
1111
+ "eval_cer": 0.3308550185873606,
1112
+ "eval_loss": 2.614670753479004,
1113
+ "eval_runtime": 1.1319,
1114
+ "eval_samples_per_second": 39.755,
1115
+ "eval_steps_per_second": 2.65,
1116
+ "step": 6150
1117
+ },
1118
+ {
1119
+ "epoch": 269.57,
1120
+ "learning_rate": 3.065693430656934e-05,
1121
+ "loss": 0.0151,
1122
+ "step": 6200
1123
+ },
1124
+ {
1125
+ "epoch": 271.74,
1126
+ "learning_rate": 2.846715328467153e-05,
1127
+ "loss": 0.0099,
1128
+ "step": 6250
1129
+ },
1130
+ {
1131
+ "epoch": 273.91,
1132
+ "learning_rate": 2.627737226277372e-05,
1133
+ "loss": 0.0214,
1134
+ "step": 6300
1135
+ },
1136
+ {
1137
+ "epoch": 273.91,
1138
+ "eval_cer": 0.3302354399008674,
1139
+ "eval_loss": 2.620166778564453,
1140
+ "eval_runtime": 1.262,
1141
+ "eval_samples_per_second": 35.657,
1142
+ "eval_steps_per_second": 2.377,
1143
+ "step": 6300
1144
+ },
1145
+ {
1146
+ "epoch": 276.09,
1147
+ "learning_rate": 2.408759124087591e-05,
1148
+ "loss": 0.0085,
1149
+ "step": 6350
1150
+ },
1151
+ {
1152
+ "epoch": 278.26,
1153
+ "learning_rate": 2.1897810218978098e-05,
1154
+ "loss": 0.0119,
1155
+ "step": 6400
1156
+ },
1157
+ {
1158
+ "epoch": 280.43,
1159
+ "learning_rate": 1.9708029197080292e-05,
1160
+ "loss": 0.0107,
1161
+ "step": 6450
1162
+ },
1163
+ {
1164
+ "epoch": 280.43,
1165
+ "eval_cer": 0.32342007434944237,
1166
+ "eval_loss": 2.6263809204101562,
1167
+ "eval_runtime": 1.2547,
1168
+ "eval_samples_per_second": 35.867,
1169
+ "eval_steps_per_second": 2.391,
1170
+ "step": 6450
1171
+ },
1172
+ {
1173
+ "epoch": 282.61,
1174
+ "learning_rate": 1.751824817518248e-05,
1175
+ "loss": 0.0107,
1176
+ "step": 6500
1177
+ },
1178
+ {
1179
+ "epoch": 284.78,
1180
+ "learning_rate": 1.532846715328467e-05,
1181
+ "loss": 0.0105,
1182
+ "step": 6550
1183
+ },
1184
+ {
1185
+ "epoch": 286.96,
1186
+ "learning_rate": 1.313868613138686e-05,
1187
+ "loss": 0.0086,
1188
+ "step": 6600
1189
+ },
1190
+ {
1191
+ "epoch": 286.96,
1192
+ "eval_cer": 0.3215613382899628,
1193
+ "eval_loss": 2.607461452484131,
1194
+ "eval_runtime": 1.204,
1195
+ "eval_samples_per_second": 37.374,
1196
+ "eval_steps_per_second": 2.492,
1197
+ "step": 6600
1198
+ },
1199
+ {
1200
+ "epoch": 289.13,
1201
+ "learning_rate": 1.0948905109489049e-05,
1202
+ "loss": 0.0095,
1203
+ "step": 6650
1204
+ },
1205
+ {
1206
+ "epoch": 291.3,
1207
+ "learning_rate": 8.75912408759124e-06,
1208
+ "loss": 0.0108,
1209
+ "step": 6700
1210
+ },
1211
+ {
1212
+ "epoch": 293.48,
1213
+ "learning_rate": 6.56934306569343e-06,
1214
+ "loss": 0.0106,
1215
+ "step": 6750
1216
+ },
1217
+ {
1218
+ "epoch": 293.48,
1219
+ "eval_cer": 0.3246592317224288,
1220
+ "eval_loss": 2.595982789993286,
1221
+ "eval_runtime": 1.1323,
1222
+ "eval_samples_per_second": 39.741,
1223
+ "eval_steps_per_second": 2.649,
1224
+ "step": 6750
1225
+ },
1226
+ {
1227
+ "epoch": 295.65,
1228
+ "learning_rate": 4.37956204379562e-06,
1229
+ "loss": 0.0143,
1230
+ "step": 6800
1231
+ },
1232
+ {
1233
+ "epoch": 297.83,
1234
+ "learning_rate": 2.18978102189781e-06,
1235
+ "loss": 0.0105,
1236
+ "step": 6850
1237
+ },
1238
+ {
1239
+ "epoch": 300.0,
1240
+ "learning_rate": 0.0,
1241
+ "loss": 0.0085,
1242
+ "step": 6900
1243
+ },
1244
+ {
1245
+ "epoch": 300.0,
1246
+ "eval_cer": 0.32403965303593557,
1247
+ "eval_loss": 2.5951595306396484,
1248
+ "eval_runtime": 1.2068,
1249
+ "eval_samples_per_second": 37.288,
1250
+ "eval_steps_per_second": 2.486,
1251
+ "step": 6900
1252
+ },
1253
+ {
1254
+ "epoch": 300.0,
1255
+ "step": 6900,
1256
+ "total_flos": 2.3112928880616276e+19,
1257
+ "train_loss": 0.8083851718038753,
1258
+ "train_runtime": 4592.71,
1259
+ "train_samples_per_second": 23.45,
1260
+ "train_steps_per_second": 1.502
1261
+ }
1262
+ ],
1263
+ "logging_steps": 50,
1264
+ "max_steps": 6900,
1265
+ "num_train_epochs": 300,
1266
+ "save_steps": 150,
1267
+ "total_flos": 2.3112928880616276e+19,
1268
+ "trial_name": null,
1269
+ "trial_params": null
1270
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c0741fe1648758c067baeb587c00ff9d0528d818e60814b62c8d0f8ca82d1c4d
3
+ size 4472
vocab.json ADDED
@@ -0,0 +1,679 @@
1
+ {
2
+ "0": 1,
3
+ "1": 2,
4
+ "2": 3,
5
+ "3": 4,
6
+ "4": 5,
7
+ "5": 6,
8
+ "6": 7,
9
+ "7": 8,
10
+ "8": 9,
11
+ "9": 10,
12
+ "[PAD]": 676,
13
+ "[UNK]": 675,
14
+ "|": 0,
15
+ "　": 11,
16
+ "、": 12,
17
+ "。": 13,
18
+ "々": 14,
19
+ "ぁ": 15,
20
+ "あ": 16,
21
+ "い": 17,
22
+ "う": 18,
23
+ "え": 19,
24
+ "お": 20,
25
+ "か": 21,
26
+ "が": 22,
27
+ "き": 23,
28
+ "ぎ": 24,
29
+ "く": 25,
30
+ "ぐ": 26,
31
+ "け": 27,
32
+ "げ": 28,
33
+ "こ": 29,
34
+ "ご": 30,
35
+ "さ": 31,
36
+ "ざ": 32,
37
+ "し": 33,
38
+ "じ": 34,
39
+ "す": 35,
40
+ "ず": 36,
41
+ "せ": 37,
42
+ "ぜ": 38,
43
+ "そ": 39,
44
+ "た": 40,
45
+ "だ": 41,
46
+ "ち": 42,
47
+ "っ": 43,
48
+ "つ": 44,
49
+ "て": 45,
50
+ "で": 46,
51
+ "と": 47,
52
+ "ど": 48,
53
+ "な": 49,
54
+ "に": 50,
55
+ "ぬ": 51,
56
+ "ね": 52,
57
+ "の": 53,
58
+ "は": 54,
59
+ "ば": 55,
60
+ "ぱ": 56,
61
+ "ひ": 57,
62
+ "び": 58,
63
+ "ふ": 59,
64
+ "ぶ": 60,
65
+ "ぷ": 61,
66
+ "へ": 62,
67
+ "べ": 63,
68
+ "ほ": 64,
69
+ "ぼ": 65,
70
+ "ぽ": 66,
71
+ "ま": 67,
72
+ "み": 68,
73
+ "む": 69,
74
+ "め": 70,
75
+ "も": 71,
76
+ "ゃ": 72,
77
+ "や": 73,
78
+ "ゆ": 74,
79
+ "ょ": 75,
80
+ "よ": 76,
81
+ "ら": 77,
82
+ "り": 78,
83
+ "る": 79,
84
+ "れ": 80,
85
+ "ろ": 81,
86
+ "わ": 82,
87
+ "を": 83,
88
+ "ん": 84,
89
+ "ァ": 85,
90
+ "ア": 86,
91
+ "ィ": 87,
92
+ "イ": 88,
93
+ "ウ": 89,
94
+ "ェ": 90,
95
+ "エ": 91,
96
+ "ォ": 92,
97
+ "オ": 93,
98
+ "カ": 94,
99
+ "ガ": 95,
100
+ "キ": 96,
101
+ "ギ": 97,
102
+ "ク": 98,
103
+ "グ": 99,
104
+ "ケ": 100,
105
+ "ゲ": 101,
106
+ "コ": 102,
107
+ "ゴ": 103,
108
+ "サ": 104,
109
+ "ザ": 105,
110
+ "シ": 106,
111
+ "ジ": 107,
112
+ "ス": 108,
113
+ "ズ": 109,
114
+ "セ": 110,
115
+ "ソ": 111,
116
+ "タ": 112,
117
+ "ダ": 113,
118
+ "チ": 114,
119
+ "ッ": 115,
120
+ "ツ": 116,
121
+ "テ": 117,
122
+ "デ": 118,
123
+ "ト": 119,
124
+ "ド": 120,
125
+ "ナ": 121,
126
+ "ニ": 122,
127
+ "ネ": 123,
128
+ "ノ": 124,
129
+ "ハ": 125,
130
+ "バ": 126,
131
+ "パ": 127,
132
+ "ヒ": 128,
133
+ "ビ": 129,
134
+ "ピ": 130,
135
+ "フ": 131,
136
+ "ブ": 132,
137
+ "プ": 133,
138
+ "ベ": 134,
139
+ "ペ": 135,
140
+ "ホ": 136,
141
+ "ボ": 137,
142
+ "ポ": 138,
143
+ "マ": 139,
144
+ "ミ": 140,
145
+ "ム": 141,
146
+ "メ": 142,
147
+ "モ": 143,
148
+ "ャ": 144,
149
+ "ヤ": 145,
150
+ "ュ": 146,
151
+ "ヨ": 147,
152
+ "ラ": 148,
153
+ "リ": 149,
154
+ "ル": 150,
155
+ "レ": 151,
156
+ "ロ": 152,
157
+ "ワ": 153,
158
+ "ン": 154,
159
+ "ヶ": 155,
160
+ "ー": 156,
161
+ "一": 157,
162
+ "万": 158,
163
+ "丈": 159,
164
+ "三": 160,
165
+ "上": 161,
166
+ "下": 162,
167
+ "不": 163,
168
+ "中": 164,
169
+ "丸": 165,
170
+ "主": 166,
171
+ "久": 167,
172
+ "九": 168,
173
+ "乾": 169,
174
+ "予": 170,
175
+ "事": 171,
176
+ "二": 172,
177
+ "五": 173,
178
+ "井": 174,
179
+ "交": 175,
180
+ "京": 176,
181
+ "人": 177,
182
+ "今": 178,
183
+ "仏": 179,
184
+ "仕": 180,
185
+ "他": 181,
186
+ "付": 182,
187
+ "代": 183,
188
+ "以": 184,
189
+ "件": 185,
190
+ "企": 186,
191
+ "伊": 187,
192
+ "休": 188,
193
+ "会": 189,
194
+ "伸": 190,
195
+ "住": 191,
196
+ "体": 192,
197
+ "何": 193,
198
+ "余": 194,
199
+ "作": 195,
200
+ "使": 196,
201
+ "例": 197,
202
+ "保": 198,
203
+ "信": 199,
204
+ "俣": 200,
205
+ "個": 201,
206
+ "倒": 202,
207
+ "候": 203,
208
+ "健": 204,
209
+ "備": 205,
210
+ "元": 206,
211
+ "充": 207,
212
+ "先": 208,
213
+ "入": 209,
214
+ "全": 210,
215
+ "公": 211,
216
+ "共": 212,
217
+ "内": 213,
218
+ "円": 214,
219
+ "写": 215,
220
+ "冬": 216,
221
+ "冷": 217,
222
+ "凍": 218,
223
+ "出": 219,
224
+ "分": 220,
225
+ "切": 221,
226
+ "初": 222,
227
+ "到": 223,
228
+ "制": 224,
229
+ "前": 225,
230
+ "力": 226,
231
+ "加": 227,
232
+ "動": 228,
233
+ "募": 229,
234
+ "勧": 230,
235
+ "化": 231,
236
+ "北": 232,
237
+ "南": 233,
238
+ "厚": 234,
239
+ "原": 235,
240
+ "去": 236,
241
+ "参": 237,
242
+ "友": 238,
243
+ "取": 239,
244
+ "口": 240,
245
+ "古": 241,
246
+ "可": 242,
247
+ "台": 243,
248
+ "号": 244,
249
+ "司": 245,
250
+ "合": 246,
251
+ "吉": 247,
252
+ "吊": 248,
253
+ "同": 249,
254
+ "名": 250,
255
+ "吹": 251,
256
+ "味": 252,
257
+ "呼": 253,
258
+ "和": 254,
259
+ "品": 255,
260
+ "唇": 256,
261
+ "商": 257,
262
+ "問": 258,
263
+ "噌": 259,
264
+ "回": 260,
265
+ "固": 261,
266
+ "国": 262,
267
+ "園": 263,
268
+ "地": 264,
269
+ "型": 265,
270
+ "域": 266,
271
+ "報": 267,
272
+ "場": 268,
273
+ "塗": 269,
274
+ "増": 270,
275
+ "声": 271,
276
+ "売": 272,
277
+ "変": 273,
278
+ "夏": 274,
279
+ "外": 275,
280
+ "多": 276,
281
+ "大": 277,
282
+ "天": 278,
283
+ "太": 279,
284
+ "夫": 280,
285
+ "失": 281,
286
+ "奈": 282,
287
+ "奥": 283,
288
+ "女": 284,
289
+ "好": 285,
290
+ "始": 286,
291
+ "嫌": 287,
292
+ "嬉": 288,
293
+ "子": 289,
294
+ "存": 290,
295
+ "孝": 291,
296
+ "学": 292,
297
+ "定": 293,
298
+ "実": 294,
299
+ "室": 295,
300
+ "宮": 296,
301
+ "家": 297,
302
+ "容": 298,
303
+ "寝": 299,
304
+ "寺": 300,
305
+ "対": 301,
306
+ "小": 302,
307
+ "少": 303,
308
+ "尾": 304,
309
+ "局": 305,
310
+ "届": 306,
311
+ "屋": 307,
312
+ "山": 308,
313
+ "岐": 309,
314
+ "岡": 310,
315
+ "岩": 311,
316
+ "岳": 312,
317
+ "島": 313,
318
+ "川": 314,
319
+ "帰": 315,
320
+ "常": 316,
321
+ "平": 317,
322
+ "年": 318,
323
+ "幻": 319,
324
+ "広": 320,
325
+ "底": 321,
326
+ "店": 322,
327
+ "座": 323,
328
+ "庫": 324,
329
+ "弁": 325,
330
+ "式": 326,
331
+ "張": 327,
332
+ "強": 328,
333
+ "当": 329,
334
+ "形": 330,
335
+ "影": 331,
336
+ "待": 332,
337
+ "後": 333,
338
+ "得": 334,
339
+ "忘": 335,
340
+ "応": 336,
341
+ "思": 337,
342
+ "怠": 338,
343
+ "恥": 339,
344
+ "悪": 340,
345
+ "情": 341,
346
+ "想": 342,
347
+ "意": 343,
348
+ "愛": 344,
349
+ "感": 345,
350
+ "慢": 346,
351
+ "成": 347,
352
+ "我": 348,
353
+ "戦": 349,
354
+ "戻": 350,
355
+ "所": 351,
356
+ "手": 352,
357
+ "打": 353,
358
+ "抜": 354,
359
+ "押": 355,
360
+ "拝": 356,
361
+ "拶": 357,
362
+ "持": 358,
363
+ "指": 359,
364
+ "挨": 360,
365
+ "掃": 361,
366
+ "援": 362,
367
+ "教": 363,
368
+ "数": 364,
369
+ "文": 365,
370
+ "料": 366,
371
+ "断": 367,
372
+ "新": 368,
373
+ "方": 369,
374
+ "旗": 370,
375
+ "日": 371,
376
+ "旦": 372,
377
+ "早": 373,
378
+ "明": 374,
379
+ "映": 375,
380
+ "春": 376,
381
+ "昨": 377,
382
+ "是": 378,
383
+ "昼": 379,
384
+ "時": 380,
385
+ "普": 381,
386
+ "景": 382,
387
+ "晴": 383,
388
+ "暑": 384,
389
+ "暗": 385,
390
+ "書": 386,
391
+ "最": 387,
392
+ "月": 388,
393
+ "有": 389,
394
+ "望": 390,
395
+ "期": 391,
396
+ "木": 392,
397
+ "本": 393,
398
+ "机": 394,
399
+ "村": 395,
400
+ "来": 396,
401
+ "杯": 397,
402
+ "東": 398,
403
+ "林": 399,
404
+ "枚": 400,
405
+ "柴": 401,
406
+ "校": 402,
407
+ "梨": 403,
408
+ "棒": 404,
409
+ "森": 405,
410
+ "椿": 406,
411
+ "楽": 407,
412
+ "構": 408,
413
+ "横": 409,
414
+ "樹": 410,
415
+ "機": 411,
416
+ "欄": 412,
417
+ "次": 413,
418
+ "欲": 414,
419
+ "正": 415,
420
+ "残": 416,
421
+ "段": 417,
422
+ "母": 418,
423
+ "毎": 419,
424
+ "比": 420,
425
+ "毛": 421,
426
+ "気": 422,
427
+ "水": 423,
428
+ "汁": 424,
429
+ "汗": 425,
430
+ "況": 426,
431
+ "泉": 427,
432
+ "泊": 428,
433
+ "法": 429,
434
+ "注": 430,
435
+ "洋": 431,
436
+ "活": 432,
437
+ "流": 433,
438
+ "海": 434,
439
+ "消": 435,
440
+ "減": 436,
441
+ "渡": 437,
442
+ "温": 438,
443
+ "準": 439,
444
+ "漫": 440,
445
+ "激": 441,
446
+ "濃": 442,
447
+ "瀬": 443,
448
+ "火": 444,
449
+ "炎": 445,
450
+ "炭": 446,
451
+ "焚": 447,
452
+ "焦": 448,
453
+ "然": 449,
454
+ "焼": 450,
455
+ "照": 451,
456
+ "煮": 452,
457
+ "熊": 453,
458
+ "熱": 454,
459
+ "燃": 455,
460
+ "燕": 456,
461
+ "燥": 457,
462
+ "父": 458,
463
+ "物": 459,
464
+ "特": 460,
465
+ "犬": 461,
466
+ "状": 462,
467
+ "狙": 463,
468
+ "独": 464,
469
+ "狭": 465,
470
+ "猫": 466,
471
+ "獣": 467,
472
+ "王": 468,
473
+ "球": 469,
474
+ "理": 470,
475
+ "生": 471,
476
+ "用": 472,
477
+ "田": 473,
478
+ "甲": 474,
479
+ "申": 475,
480
+ "町": 476,
481
+ "画": 477,
482
+ "界": 478,
483
+ "留": 479,
484
+ "番": 480,
485
+ "疲": 481,
486
+ "癒": 482,
487
+ "発": 483,
488
+ "登": 484,
489
+ "白": 485,
490
+ "百": 486,
491
+ "的": 487,
492
+ "皆": 488,
493
+ "皿": 489,
494
+ "監": 490,
495
+ "目": 491,
496
+ "直": 492,
497
+ "相": 493,
498
+ "省": 494,
499
+ "県": 495,
500
+ "真": 496,
501
+ "督": 497,
502
+ "瞬": 498,
503
+ "知": 499,
504
+ "硬": 500,
505
+ "確": 501,
506
+ "礼": 502,
507
+ "社": 503,
508
+ "神": 504,
509
+ "福": 505,
510
+ "私": 506,
511
+ "移": 507,
512
+ "稲": 508,
513
+ "穂": 509,
514
+ "空": 510,
515
+ "立": 511,
516
+ "端": 512,
517
+ "答": 513,
518
+ "箇": 514,
519
+ "箱": 515,
520
+ "籍": 516,
521
+ "米": 517,
522
+ "粛": 518,
523
+ "精": 519,
524
+ "糖": 520,
525
+ "系": 521,
526
+ "納": 522,
527
+ "素": 523,
528
+ "細": 524,
529
+ "終": 525,
530
+ "結": 526,
531
+ "絶": 527,
532
+ "継": 528,
533
+ "綺": 529,
534
+ "綿": 530,
535
+ "緒": 531,
536
+ "締": 532,
537
+ "練": 533,
538
+ "縁": 534,
539
+ "繰": 535,
540
+ "缶": 536,
541
+ "置": 537,
542
+ "羊": 538,
543
+ "美": 539,
544
+ "義": 540,
545
+ "考": 541,
546
+ "者": 542,
547
+ "耳": 543,
548
+ "聞": 544,
549
+ "肉": 545,
550
+ "育": 546,
551
+ "腹": 547,
552
+ "自": 548,
553
+ "良": 549,
554
+ "色": 550,
555
+ "若": 551,
556
+ "茶": 552,
557
+ "荒": 553,
558
+ "荘": 554,
559
+ "荷": 555,
560
+ "落": 556,
561
+ "蔵": 557,
562
+ "薬": 558,
563
+ "蝶": 559,
564
+ "行": 560,
565
+ "街": 561,
566
+ "褒": 562,
567
+ "西": 563,
568
+ "要": 564,
569
+ "見": 565,
570
+ "視": 566,
571
+ "覧": 567,
572
+ "親": 568,
573
+ "観": 569,
574
+ "言": 570,
575
+ "記": 571,
576
+ "設": 572,
577
+ "許": 573,
578
+ "訳": 574,
579
+ "試": 575,
580
+ "話": 576,
581
+ "詳": 577,
582
+ "説": 578,
583
+ "読": 579,
584
+ "誰": 580,
585
+ "調": 581,
586
+ "請": 582,
587
+ "謝": 583,
588
+ "識": 584,
589
+ "議": 585,
590
+ "谷": 586,
591
+ "買": 587,
592
+ "質": 588,
593
+ "赤": 589,
594
+ "走": 590,
595
+ "越": 591,
596
+ "路": 592,
597
+ "身": 593,
598
+ "車": 594,
599
+ "転": 595,
600
+ "載": 596,
601
+ "辛": 597,
602
+ "辺": 598,
603
+ "込": 599,
604
+ "近": 600,
605
+ "返": 601,
606
+ "追": 602,
607
+ "途": 603,
608
+ "通": 604,
609
+ "速": 605,
610
+ "連": 606,
611
+ "週": 607,
612
+ "遅": 608,
613
+ "運": 609,
614
+ "過": 610,
615
+ "達": 611,
616
+ "違": 612,
617
+ "適": 613,
618
+ "選": 614,
619
+ "郎": 615,
620
+ "部": 616,
621
+ "配": 617,
622
+ "酒": 618,
623
+ "重": 619,
624
+ "野": 620,
625
+ "量": 621,
626
+ "釣": 622,
627
+ "録": 623,
628
+ "鍵": 624,
629
+ "長": 625,
630
+ "開": 626,
631
+ "間": 627,
632
+ "関": 628,
633
+ "閣": 629,
634
+ "阜": 630,
635
+ "降": 631,
636
+ "限": 632,
637
+ "院": 633,
638
+ "除": 634,
639
+ "陸": 635,
640
+ "雅": 636,
641
+ "集": 637,
642
+ "雉": 638,
643
+ "難": 639,
644
+ "雨": 640,
645
+ "雪": 641,
646
+ "電": 642,
647
+ "青": 643,
648
+ "非": 644,
649
+ "面": 645,
650
+ "音": 646,
651
+ "響": 647,
652
+ "頂": 648,
653
+ "頃": 649,
654
+ "順": 650,
655
+ "頼": 651,
656
+ "顔": 652,
657
+ "風": 653,
658
+ "食": 654,
659
+ "飲": 655,
660
+ "飼": 656,
661
+ "馬": 657,
662
+ "験": 658,
663
+ "驚": 659,
664
+ "高": 660,
665
+ "髪": 661,
666
+ "鬼": 662,
667
+ "鶏": 663,
668
+ "鹿": 664,
669
+ "麗": 665,
670
+ "！": 666,
671
+ "（": 667,
672
+ "）": 668,
673
+ "／": 669,
674
+ "１": 670,
675
+ "２": 671,
676
+ "３": 672,
677
+ "？": 673,
678
+ "ｍ": 674
679
+ }