2024-03-26 15:40:12,486 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:12,486 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 15:40:12,486 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:12,487 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 15:40:12,487 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:12,487 Train:  758 sentences
2024-03-26 15:40:12,487         (train_with_dev=False, train_with_test=False)
2024-03-26 15:40:12,487 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:12,487 Training Params:
2024-03-26 15:40:12,487  - learning_rate: "3e-05" 
2024-03-26 15:40:12,487  - mini_batch_size: "8"
2024-03-26 15:40:12,487  - max_epochs: "10"
2024-03-26 15:40:12,487  - shuffle: "True"
2024-03-26 15:40:12,487 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:12,487 Plugins:
2024-03-26 15:40:12,487  - TensorboardLogger
2024-03-26 15:40:12,487  - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 15:40:12,487 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:12,487 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 15:40:12,487  - metric: "('micro avg', 'f1-score')"
2024-03-26 15:40:12,487 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:12,487 Computation:
2024-03-26 15:40:12,487  - compute on device: cuda:0
2024-03-26 15:40:12,487  - embedding storage: none
2024-03-26 15:40:12,487 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:12,487 Model training base path: "flair-co-funer-german_dbmdz_bert_base-bs8-e10-lr3e-05-2"
2024-03-26 15:40:12,487 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:12,487 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:12,487 Logging anything other than scalars to TensorBoard is currently not supported.
2024-03-26 15:40:14,307 epoch 1 - iter 9/95 - loss 3.06540800 - time (sec): 1.82 - samples/sec: 1935.79 - lr: 0.000003 - momentum: 0.000000
2024-03-26 15:40:16,395 epoch 1 - iter 18/95 - loss 2.98543676 - time (sec): 3.91 - samples/sec: 1844.22 - lr: 0.000005 - momentum: 0.000000
2024-03-26 15:40:17,950 epoch 1 - iter 27/95 - loss 2.82818511 - time (sec): 5.46 - samples/sec: 1845.71 - lr: 0.000008 - momentum: 0.000000
2024-03-26 15:40:19,874 epoch 1 - iter 36/95 - loss 2.64677852 - time (sec): 7.39 - samples/sec: 1868.41 - lr: 0.000011 - momentum: 0.000000
2024-03-26 15:40:21,944 epoch 1 - iter 45/95 - loss 2.48493169 - time (sec): 9.46 - samples/sec: 1802.88 - lr: 0.000014 - momentum: 0.000000
2024-03-26 15:40:23,899 epoch 1 - iter 54/95 - loss 2.34259258 - time (sec): 11.41 - samples/sec: 1779.43 - lr: 0.000017 - momentum: 0.000000
2024-03-26 15:40:25,435 epoch 1 - iter 63/95 - loss 2.22525861 - time (sec): 12.95 - samples/sec: 1786.68 - lr: 0.000020 - momentum: 0.000000
2024-03-26 15:40:26,692 epoch 1 - iter 72/95 - loss 2.10777242 - time (sec): 14.21 - samples/sec: 1840.95 - lr: 0.000022 - momentum: 0.000000
2024-03-26 15:40:28,235 epoch 1 - iter 81/95 - loss 1.99428411 - time (sec): 15.75 - samples/sec: 1867.99 - lr: 0.000025 - momentum: 0.000000
2024-03-26 15:40:30,167 epoch 1 - iter 90/95 - loss 1.88297203 - time (sec): 17.68 - samples/sec: 1844.46 - lr: 0.000028 - momentum: 0.000000
2024-03-26 15:40:31,213 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:31,213 EPOCH 1 done: loss 1.8180 - lr: 0.000028
2024-03-26 15:40:32,187 DEV : loss 0.5155203342437744 - f1-score (micro avg)  0.6383
2024-03-26 15:40:32,188 saving best model
2024-03-26 15:40:32,460 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:33,765 epoch 2 - iter 9/95 - loss 0.73800033 - time (sec): 1.30 - samples/sec: 2486.66 - lr: 0.000030 - momentum: 0.000000
2024-03-26 15:40:35,631 epoch 2 - iter 18/95 - loss 0.59456126 - time (sec): 3.17 - samples/sec: 2166.51 - lr: 0.000029 - momentum: 0.000000
2024-03-26 15:40:38,431 epoch 2 - iter 27/95 - loss 0.49498182 - time (sec): 5.97 - samples/sec: 1935.89 - lr: 0.000029 - momentum: 0.000000
2024-03-26 15:40:40,488 epoch 2 - iter 36/95 - loss 0.46708892 - time (sec): 8.03 - samples/sec: 1851.97 - lr: 0.000029 - momentum: 0.000000
2024-03-26 15:40:42,235 epoch 2 - iter 45/95 - loss 0.43499089 - time (sec): 9.77 - samples/sec: 1838.88 - lr: 0.000028 - momentum: 0.000000
2024-03-26 15:40:44,315 epoch 2 - iter 54/95 - loss 0.42013650 - time (sec): 11.85 - samples/sec: 1794.88 - lr: 0.000028 - momentum: 0.000000
2024-03-26 15:40:45,836 epoch 2 - iter 63/95 - loss 0.42234655 - time (sec): 13.38 - samples/sec: 1819.56 - lr: 0.000028 - momentum: 0.000000
2024-03-26 15:40:47,314 epoch 2 - iter 72/95 - loss 0.41658737 - time (sec): 14.85 - samples/sec: 1846.84 - lr: 0.000028 - momentum: 0.000000
2024-03-26 15:40:48,477 epoch 2 - iter 81/95 - loss 0.41208706 - time (sec): 16.02 - samples/sec: 1881.01 - lr: 0.000027 - momentum: 0.000000
2024-03-26 15:40:49,745 epoch 2 - iter 90/95 - loss 0.40602769 - time (sec): 17.28 - samples/sec: 1903.37 - lr: 0.000027 - momentum: 0.000000
2024-03-26 15:40:50,697 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:50,697 EPOCH 2 done: loss 0.3967 - lr: 0.000027
2024-03-26 15:40:51,586 DEV : loss 0.26970556378364563 - f1-score (micro avg)  0.8238
2024-03-26 15:40:51,587 saving best model
2024-03-26 15:40:52,052 ----------------------------------------------------------------------------------------------------
2024-03-26 15:40:54,053 epoch 3 - iter 9/95 - loss 0.20835677 - time (sec): 2.00 - samples/sec: 1665.21 - lr: 0.000026 - momentum: 0.000000
2024-03-26 15:40:56,093 epoch 3 - iter 18/95 - loss 0.23115316 - time (sec): 4.04 - samples/sec: 1797.95 - lr: 0.000026 - momentum: 0.000000
2024-03-26 15:40:57,044 epoch 3 - iter 27/95 - loss 0.25465988 - time (sec): 4.99 - samples/sec: 1928.66 - lr: 0.000026 - momentum: 0.000000
2024-03-26 15:40:58,765 epoch 3 - iter 36/95 - loss 0.25192352 - time (sec): 6.71 - samples/sec: 1890.69 - lr: 0.000025 - momentum: 0.000000
2024-03-26 15:41:00,010 epoch 3 - iter 45/95 - loss 0.25628493 - time (sec): 7.96 - samples/sec: 1935.91 - lr: 0.000025 - momentum: 0.000000
2024-03-26 15:41:02,022 epoch 3 - iter 54/95 - loss 0.24802067 - time (sec): 9.97 - samples/sec: 1875.48 - lr: 0.000025 - momentum: 0.000000
2024-03-26 15:41:03,625 epoch 3 - iter 63/95 - loss 0.24391783 - time (sec): 11.57 - samples/sec: 1886.39 - lr: 0.000025 - momentum: 0.000000
2024-03-26 15:41:05,125 epoch 3 - iter 72/95 - loss 0.23907559 - time (sec): 13.07 - samples/sec: 1895.63 - lr: 0.000024 - momentum: 0.000000
2024-03-26 15:41:06,904 epoch 3 - iter 81/95 - loss 0.23408928 - time (sec): 14.85 - samples/sec: 1882.53 - lr: 0.000024 - momentum: 0.000000
2024-03-26 15:41:09,491 epoch 3 - iter 90/95 - loss 0.21349430 - time (sec): 17.44 - samples/sec: 1875.49 - lr: 0.000024 - momentum: 0.000000
2024-03-26 15:41:10,573 ----------------------------------------------------------------------------------------------------
2024-03-26 15:41:10,573 EPOCH 3 done: loss 0.2095 - lr: 0.000024
2024-03-26 15:41:11,463 DEV : loss 0.23282712697982788 - f1-score (micro avg)  0.8548
2024-03-26 15:41:11,464 saving best model
2024-03-26 15:41:11,911 ----------------------------------------------------------------------------------------------------
2024-03-26 15:41:13,575 epoch 4 - iter 9/95 - loss 0.18254107 - time (sec): 1.66 - samples/sec: 1934.22 - lr: 0.000023 - momentum: 0.000000
2024-03-26 15:41:15,532 epoch 4 - iter 18/95 - loss 0.15909112 - time (sec): 3.62 - samples/sec: 1861.36 - lr: 0.000023 - momentum: 0.000000
2024-03-26 15:41:16,741 epoch 4 - iter 27/95 - loss 0.15496899 - time (sec): 4.83 - samples/sec: 1949.69 - lr: 0.000022 - momentum: 0.000000
2024-03-26 15:41:18,372 epoch 4 - iter 36/95 - loss 0.15784319 - time (sec): 6.46 - samples/sec: 1922.03 - lr: 0.000022 - momentum: 0.000000
2024-03-26 15:41:20,490 epoch 4 - iter 45/95 - loss 0.15558124 - time (sec): 8.58 - samples/sec: 1861.76 - lr: 0.000022 - momentum: 0.000000
2024-03-26 15:41:22,003 epoch 4 - iter 54/95 - loss 0.15932292 - time (sec): 10.09 - samples/sec: 1870.48 - lr: 0.000022 - momentum: 0.000000
2024-03-26 15:41:24,413 epoch 4 - iter 63/95 - loss 0.15131129 - time (sec): 12.50 - samples/sec: 1823.95 - lr: 0.000021 - momentum: 0.000000
2024-03-26 15:41:26,887 epoch 4 - iter 72/95 - loss 0.14104034 - time (sec): 14.97 - samples/sec: 1786.51 - lr: 0.000021 - momentum: 0.000000
2024-03-26 15:41:28,301 epoch 4 - iter 81/95 - loss 0.14018074 - time (sec): 16.39 - samples/sec: 1793.26 - lr: 0.000021 - momentum: 0.000000
2024-03-26 15:41:30,052 epoch 4 - iter 90/95 - loss 0.13938696 - time (sec): 18.14 - samples/sec: 1793.47 - lr: 0.000020 - momentum: 0.000000
2024-03-26 15:41:31,154 ----------------------------------------------------------------------------------------------------
2024-03-26 15:41:31,154 EPOCH 4 done: loss 0.1367 - lr: 0.000020
2024-03-26 15:41:32,048 DEV : loss 0.20248137414455414 - f1-score (micro avg)  0.897
2024-03-26 15:41:32,049 saving best model
2024-03-26 15:41:32,514 ----------------------------------------------------------------------------------------------------
2024-03-26 15:41:33,481 epoch 5 - iter 9/95 - loss 0.08003747 - time (sec): 0.96 - samples/sec: 2134.72 - lr: 0.000020 - momentum: 0.000000
2024-03-26 15:41:35,145 epoch 5 - iter 18/95 - loss 0.10141185 - time (sec): 2.63 - samples/sec: 2024.88 - lr: 0.000019 - momentum: 0.000000
2024-03-26 15:41:37,601 epoch 5 - iter 27/95 - loss 0.10479908 - time (sec): 5.08 - samples/sec: 1792.94 - lr: 0.000019 - momentum: 0.000000
2024-03-26 15:41:39,428 epoch 5 - iter 36/95 - loss 0.09976634 - time (sec): 6.91 - samples/sec: 1795.08 - lr: 0.000019 - momentum: 0.000000
2024-03-26 15:41:41,371 epoch 5 - iter 45/95 - loss 0.09450519 - time (sec): 8.85 - samples/sec: 1765.26 - lr: 0.000019 - momentum: 0.000000
2024-03-26 15:41:42,973 epoch 5 - iter 54/95 - loss 0.09640069 - time (sec): 10.46 - samples/sec: 1801.98 - lr: 0.000018 - momentum: 0.000000
2024-03-26 15:41:45,300 epoch 5 - iter 63/95 - loss 0.09570810 - time (sec): 12.78 - samples/sec: 1791.60 - lr: 0.000018 - momentum: 0.000000
2024-03-26 15:41:46,701 epoch 5 - iter 72/95 - loss 0.10087366 - time (sec): 14.18 - samples/sec: 1810.86 - lr: 0.000018 - momentum: 0.000000
2024-03-26 15:41:48,581 epoch 5 - iter 81/95 - loss 0.09656308 - time (sec): 16.06 - samples/sec: 1787.55 - lr: 0.000017 - momentum: 0.000000
2024-03-26 15:41:50,426 epoch 5 - iter 90/95 - loss 0.09544348 - time (sec): 17.91 - samples/sec: 1789.75 - lr: 0.000017 - momentum: 0.000000
2024-03-26 15:41:51,773 ----------------------------------------------------------------------------------------------------
2024-03-26 15:41:51,773 EPOCH 5 done: loss 0.0957 - lr: 0.000017
2024-03-26 15:41:52,658 DEV : loss 0.18282510340213776 - f1-score (micro avg)  0.9018
2024-03-26 15:41:52,659 saving best model
2024-03-26 15:41:53,118 ----------------------------------------------------------------------------------------------------
2024-03-26 15:41:54,488 epoch 6 - iter 9/95 - loss 0.05876429 - time (sec): 1.37 - samples/sec: 2104.44 - lr: 0.000016 - momentum: 0.000000
2024-03-26 15:41:56,637 epoch 6 - iter 18/95 - loss 0.06157639 - time (sec): 3.52 - samples/sec: 2038.65 - lr: 0.000016 - momentum: 0.000000
2024-03-26 15:41:58,197 epoch 6 - iter 27/95 - loss 0.05890095 - time (sec): 5.08 - samples/sec: 1977.70 - lr: 0.000016 - momentum: 0.000000
2024-03-26 15:42:00,164 epoch 6 - iter 36/95 - loss 0.06546844 - time (sec): 7.04 - samples/sec: 1916.54 - lr: 0.000016 - momentum: 0.000000
2024-03-26 15:42:02,301 epoch 6 - iter 45/95 - loss 0.07740664 - time (sec): 9.18 - samples/sec: 1934.90 - lr: 0.000015 - momentum: 0.000000
2024-03-26 15:42:03,487 epoch 6 - iter 54/95 - loss 0.07626467 - time (sec): 10.37 - samples/sec: 1950.01 - lr: 0.000015 - momentum: 0.000000
2024-03-26 15:42:04,553 epoch 6 - iter 63/95 - loss 0.07604980 - time (sec): 11.43 - samples/sec: 1969.14 - lr: 0.000015 - momentum: 0.000000
2024-03-26 15:42:06,088 epoch 6 - iter 72/95 - loss 0.07068612 - time (sec): 12.97 - samples/sec: 1968.89 - lr: 0.000014 - momentum: 0.000000
2024-03-26 15:42:08,085 epoch 6 - iter 81/95 - loss 0.06914491 - time (sec): 14.97 - samples/sec: 1953.49 - lr: 0.000014 - momentum: 0.000000
2024-03-26 15:42:10,065 epoch 6 - iter 90/95 - loss 0.06921921 - time (sec): 16.95 - samples/sec: 1940.92 - lr: 0.000014 - momentum: 0.000000
2024-03-26 15:42:10,988 ----------------------------------------------------------------------------------------------------
2024-03-26 15:42:10,988 EPOCH 6 done: loss 0.0677 - lr: 0.000014
2024-03-26 15:42:11,881 DEV : loss 0.18957725167274475 - f1-score (micro avg)  0.9152
2024-03-26 15:42:11,882 saving best model
2024-03-26 15:42:12,331 ----------------------------------------------------------------------------------------------------
2024-03-26 15:42:13,762 epoch 7 - iter 9/95 - loss 0.04478296 - time (sec): 1.43 - samples/sec: 1859.67 - lr: 0.000013 - momentum: 0.000000
2024-03-26 15:42:15,536 epoch 7 - iter 18/95 - loss 0.05377288 - time (sec): 3.20 - samples/sec: 1809.15 - lr: 0.000013 - momentum: 0.000000
2024-03-26 15:42:17,129 epoch 7 - iter 27/95 - loss 0.05538811 - time (sec): 4.80 - samples/sec: 1901.87 - lr: 0.000013 - momentum: 0.000000
2024-03-26 15:42:18,829 epoch 7 - iter 36/95 - loss 0.05372464 - time (sec): 6.50 - samples/sec: 1851.31 - lr: 0.000012 - momentum: 0.000000
2024-03-26 15:42:20,180 epoch 7 - iter 45/95 - loss 0.05305431 - time (sec): 7.85 - samples/sec: 1868.49 - lr: 0.000012 - momentum: 0.000000
2024-03-26 15:42:22,222 epoch 7 - iter 54/95 - loss 0.05387578 - time (sec): 9.89 - samples/sec: 1812.68 - lr: 0.000012 - momentum: 0.000000
2024-03-26 15:42:24,451 epoch 7 - iter 63/95 - loss 0.05309062 - time (sec): 12.12 - samples/sec: 1761.72 - lr: 0.000011 - momentum: 0.000000
2024-03-26 15:42:26,980 epoch 7 - iter 72/95 - loss 0.06066650 - time (sec): 14.65 - samples/sec: 1760.08 - lr: 0.000011 - momentum: 0.000000
2024-03-26 15:42:28,904 epoch 7 - iter 81/95 - loss 0.06401076 - time (sec): 16.57 - samples/sec: 1768.68 - lr: 0.000011 - momentum: 0.000000
2024-03-26 15:42:30,861 epoch 7 - iter 90/95 - loss 0.06500645 - time (sec): 18.53 - samples/sec: 1769.11 - lr: 0.000010 - momentum: 0.000000
2024-03-26 15:42:31,750 ----------------------------------------------------------------------------------------------------
2024-03-26 15:42:31,750 EPOCH 7 done: loss 0.0635 - lr: 0.000010
2024-03-26 15:42:32,643 DEV : loss 0.1907779574394226 - f1-score (micro avg)  0.9114
2024-03-26 15:42:32,644 ----------------------------------------------------------------------------------------------------
2024-03-26 15:42:34,896 epoch 8 - iter 9/95 - loss 0.04036110 - time (sec): 2.25 - samples/sec: 1682.42 - lr: 0.000010 - momentum: 0.000000
2024-03-26 15:42:36,441 epoch 8 - iter 18/95 - loss 0.04551092 - time (sec): 3.80 - samples/sec: 1814.98 - lr: 0.000010 - momentum: 0.000000
2024-03-26 15:42:38,602 epoch 8 - iter 27/95 - loss 0.06111466 - time (sec): 5.96 - samples/sec: 1774.69 - lr: 0.000009 - momentum: 0.000000
2024-03-26 15:42:40,144 epoch 8 - iter 36/95 - loss 0.05495416 - time (sec): 7.50 - samples/sec: 1799.10 - lr: 0.000009 - momentum: 0.000000
2024-03-26 15:42:42,001 epoch 8 - iter 45/95 - loss 0.04865017 - time (sec): 9.36 - samples/sec: 1779.30 - lr: 0.000009 - momentum: 0.000000
2024-03-26 15:42:43,679 epoch 8 - iter 54/95 - loss 0.05233134 - time (sec): 11.03 - samples/sec: 1790.25 - lr: 0.000008 - momentum: 0.000000
2024-03-26 15:42:45,475 epoch 8 - iter 63/95 - loss 0.05181467 - time (sec): 12.83 - samples/sec: 1789.75 - lr: 0.000008 - momentum: 0.000000
2024-03-26 15:42:46,781 epoch 8 - iter 72/95 - loss 0.05009695 - time (sec): 14.14 - samples/sec: 1809.59 - lr: 0.000008 - momentum: 0.000000
2024-03-26 15:42:48,603 epoch 8 - iter 81/95 - loss 0.04968315 - time (sec): 15.96 - samples/sec: 1834.68 - lr: 0.000007 - momentum: 0.000000
2024-03-26 15:42:51,006 epoch 8 - iter 90/95 - loss 0.04619173 - time (sec): 18.36 - samples/sec: 1796.39 - lr: 0.000007 - momentum: 0.000000
2024-03-26 15:42:51,812 ----------------------------------------------------------------------------------------------------
2024-03-26 15:42:51,812 EPOCH 8 done: loss 0.0468 - lr: 0.000007
2024-03-26 15:42:52,710 DEV : loss 0.2047257125377655 - f1-score (micro avg)  0.9243
2024-03-26 15:42:52,712 saving best model
2024-03-26 15:42:53,147 ----------------------------------------------------------------------------------------------------
2024-03-26 15:42:54,962 epoch 9 - iter 9/95 - loss 0.06070442 - time (sec): 1.81 - samples/sec: 1872.51 - lr: 0.000007 - momentum: 0.000000
2024-03-26 15:42:57,140 epoch 9 - iter 18/95 - loss 0.04271149 - time (sec): 3.99 - samples/sec: 1737.27 - lr: 0.000006 - momentum: 0.000000
2024-03-26 15:42:59,012 epoch 9 - iter 27/95 - loss 0.04904370 - time (sec): 5.86 - samples/sec: 1779.69 - lr: 0.000006 - momentum: 0.000000
2024-03-26 15:43:00,548 epoch 9 - iter 36/95 - loss 0.04849692 - time (sec): 7.40 - samples/sec: 1794.15 - lr: 0.000006 - momentum: 0.000000
2024-03-26 15:43:01,950 epoch 9 - iter 45/95 - loss 0.04227572 - time (sec): 8.80 - samples/sec: 1832.77 - lr: 0.000005 - momentum: 0.000000
2024-03-26 15:43:03,351 epoch 9 - iter 54/95 - loss 0.03939595 - time (sec): 10.20 - samples/sec: 1885.07 - lr: 0.000005 - momentum: 0.000000
2024-03-26 15:43:05,159 epoch 9 - iter 63/95 - loss 0.04440370 - time (sec): 12.01 - samples/sec: 1892.31 - lr: 0.000005 - momentum: 0.000000
2024-03-26 15:43:07,159 epoch 9 - iter 72/95 - loss 0.04368057 - time (sec): 14.01 - samples/sec: 1864.39 - lr: 0.000004 - momentum: 0.000000
2024-03-26 15:43:09,427 epoch 9 - iter 81/95 - loss 0.04508413 - time (sec): 16.28 - samples/sec: 1823.20 - lr: 0.000004 - momentum: 0.000000
2024-03-26 15:43:11,154 epoch 9 - iter 90/95 - loss 0.04367290 - time (sec): 18.01 - samples/sec: 1838.28 - lr: 0.000004 - momentum: 0.000000
2024-03-26 15:43:11,741 ----------------------------------------------------------------------------------------------------
2024-03-26 15:43:11,741 EPOCH 9 done: loss 0.0428 - lr: 0.000004
2024-03-26 15:43:12,644 DEV : loss 0.2012164443731308 - f1-score (micro avg)  0.9366
2024-03-26 15:43:12,645 saving best model
2024-03-26 15:43:13,084 ----------------------------------------------------------------------------------------------------
2024-03-26 15:43:15,135 epoch 10 - iter 9/95 - loss 0.01028236 - time (sec): 2.05 - samples/sec: 1883.95 - lr: 0.000003 - momentum: 0.000000
2024-03-26 15:43:16,888 epoch 10 - iter 18/95 - loss 0.02145361 - time (sec): 3.80 - samples/sec: 1868.01 - lr: 0.000003 - momentum: 0.000000
2024-03-26 15:43:17,991 epoch 10 - iter 27/95 - loss 0.01929684 - time (sec): 4.91 - samples/sec: 1941.57 - lr: 0.000003 - momentum: 0.000000
2024-03-26 15:43:19,424 epoch 10 - iter 36/95 - loss 0.02733041 - time (sec): 6.34 - samples/sec: 1975.04 - lr: 0.000002 - momentum: 0.000000
2024-03-26 15:43:21,362 epoch 10 - iter 45/95 - loss 0.03522743 - time (sec): 8.28 - samples/sec: 1907.13 - lr: 0.000002 - momentum: 0.000000
2024-03-26 15:43:22,448 epoch 10 - iter 54/95 - loss 0.03959733 - time (sec): 9.36 - samples/sec: 1955.17 - lr: 0.000002 - momentum: 0.000000
2024-03-26 15:43:23,670 epoch 10 - iter 63/95 - loss 0.03591018 - time (sec): 10.58 - samples/sec: 1982.36 - lr: 0.000001 - momentum: 0.000000
2024-03-26 15:43:25,572 epoch 10 - iter 72/95 - loss 0.03543953 - time (sec): 12.49 - samples/sec: 1977.49 - lr: 0.000001 - momentum: 0.000000
2024-03-26 15:43:28,209 epoch 10 - iter 81/95 - loss 0.03611102 - time (sec): 15.12 - samples/sec: 1936.80 - lr: 0.000001 - momentum: 0.000000
2024-03-26 15:43:30,214 epoch 10 - iter 90/95 - loss 0.03608582 - time (sec): 17.13 - samples/sec: 1915.57 - lr: 0.000000 - momentum: 0.000000
2024-03-26 15:43:31,130 ----------------------------------------------------------------------------------------------------
2024-03-26 15:43:31,130 EPOCH 10 done: loss 0.0358 - lr: 0.000000
2024-03-26 15:43:32,026 DEV : loss 0.20790590345859528 - f1-score (micro avg)  0.9267
2024-03-26 15:43:32,318 ----------------------------------------------------------------------------------------------------
2024-03-26 15:43:32,319 Loading model from best epoch ...
2024-03-26 15:43:33,177 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 15:43:33,917 
Results:
- F-score (micro) 0.9072
- F-score (macro) 0.6888
- Accuracy 0.8336

By class:
              precision    recall  f1-score   support

 Unternehmen     0.8969    0.8835    0.8902       266
 Auslagerung     0.8707    0.9197    0.8945       249
         Ort     0.9565    0.9851    0.9706       134
    Software     0.0000    0.0000    0.0000         0

   micro avg     0.8962    0.9183    0.9072       649
   macro avg     0.6810    0.6971    0.6888       649
weighted avg     0.8992    0.9183    0.9084       649

2024-03-26 15:43:33,917 ----------------------------------------------------------------------------------------------------