stefan-it committed on
Commit
7c730ba
1 Parent(s): 1662fa2

Upload ./training.log with huggingface_hub

Files changed (1)
  1. training.log +243 -0
training.log ADDED
@@ -0,0 +1,243 @@
+ 2023-10-25 21:05:36,363 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:36,364 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(64001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:36,364 MultiCorpus: 1166 train + 165 dev + 415 test sentences
+ - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
+ 2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:36,364 Train: 1166 sentences
+ 2023-10-25 21:05:36,364 (train_with_dev=False, train_with_test=False)
+ 2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:36,364 Training Params:
+ 2023-10-25 21:05:36,364 - learning_rate: "3e-05"
+ 2023-10-25 21:05:36,364 - mini_batch_size: "8"
+ 2023-10-25 21:05:36,364 - max_epochs: "10"
+ 2023-10-25 21:05:36,364 - shuffle: "True"
+ 2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:36,364 Plugins:
+ 2023-10-25 21:05:36,364 - TensorboardLogger
+ 2023-10-25 21:05:36,364 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:36,364 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-25 21:05:36,364 - metric: "('micro avg', 'f1-score')"
+ 2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:36,365 Computation:
+ 2023-10-25 21:05:36,365 - compute on device: cuda:0
+ 2023-10-25 21:05:36,365 - embedding storage: none
+ 2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:36,365 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
+ 2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:36,365 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-25 21:05:37,164 epoch 1 - iter 14/146 - loss 2.83025878 - time (sec): 0.80 - samples/sec: 4267.88 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-25 21:05:37,955 epoch 1 - iter 28/146 - loss 2.46691922 - time (sec): 1.59 - samples/sec: 4479.04 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-25 21:05:38,902 epoch 1 - iter 42/146 - loss 1.86829011 - time (sec): 2.54 - samples/sec: 4669.08 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-25 21:05:39,724 epoch 1 - iter 56/146 - loss 1.54384758 - time (sec): 3.36 - samples/sec: 4663.76 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-25 21:05:40,514 epoch 1 - iter 70/146 - loss 1.34624853 - time (sec): 4.15 - samples/sec: 4705.75 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-25 21:05:41,496 epoch 1 - iter 84/146 - loss 1.20137129 - time (sec): 5.13 - samples/sec: 4668.86 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-25 21:05:42,434 epoch 1 - iter 98/146 - loss 1.07011610 - time (sec): 6.07 - samples/sec: 4743.31 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-25 21:05:43,498 epoch 1 - iter 112/146 - loss 0.97878778 - time (sec): 7.13 - samples/sec: 4737.41 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-25 21:05:44,331 epoch 1 - iter 126/146 - loss 0.90038816 - time (sec): 7.96 - samples/sec: 4778.42 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-25 21:05:45,275 epoch 1 - iter 140/146 - loss 0.82667252 - time (sec): 8.91 - samples/sec: 4803.31 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-25 21:05:45,697 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:45,697 EPOCH 1 done: loss 0.8075 - lr: 0.000029
+ 2023-10-25 21:05:46,358 DEV : loss 0.17039310932159424 - f1-score (micro avg) 0.5702
+ 2023-10-25 21:05:46,362 saving best model
+ 2023-10-25 21:05:46,879 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:47,843 epoch 2 - iter 14/146 - loss 0.20229098 - time (sec): 0.96 - samples/sec: 4761.19 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-25 21:05:48,858 epoch 2 - iter 28/146 - loss 0.18766990 - time (sec): 1.98 - samples/sec: 4776.95 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-25 21:05:49,755 epoch 2 - iter 42/146 - loss 0.18648775 - time (sec): 2.88 - samples/sec: 4775.20 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-25 21:05:50,655 epoch 2 - iter 56/146 - loss 0.19029101 - time (sec): 3.77 - samples/sec: 4741.42 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-25 21:05:51,431 epoch 2 - iter 70/146 - loss 0.18913930 - time (sec): 4.55 - samples/sec: 4763.79 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-25 21:05:52,200 epoch 2 - iter 84/146 - loss 0.19230771 - time (sec): 5.32 - samples/sec: 4757.70 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-25 21:05:53,030 epoch 2 - iter 98/146 - loss 0.18747787 - time (sec): 6.15 - samples/sec: 4744.20 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-25 21:05:54,055 epoch 2 - iter 112/146 - loss 0.18004954 - time (sec): 7.18 - samples/sec: 4732.79 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-25 21:05:54,915 epoch 2 - iter 126/146 - loss 0.17380432 - time (sec): 8.03 - samples/sec: 4783.21 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-25 21:05:55,761 epoch 2 - iter 140/146 - loss 0.17394433 - time (sec): 8.88 - samples/sec: 4827.33 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-25 21:05:56,111 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:56,111 EPOCH 2 done: loss 0.1735 - lr: 0.000027
+ 2023-10-25 21:05:57,015 DEV : loss 0.10457519441843033 - f1-score (micro avg) 0.7177
+ 2023-10-25 21:05:57,019 saving best model
+ 2023-10-25 21:05:57,704 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:05:58,621 epoch 3 - iter 14/146 - loss 0.09931197 - time (sec): 0.91 - samples/sec: 4567.22 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-25 21:05:59,407 epoch 3 - iter 28/146 - loss 0.09405829 - time (sec): 1.70 - samples/sec: 4431.34 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-25 21:06:00,370 epoch 3 - iter 42/146 - loss 0.08962058 - time (sec): 2.66 - samples/sec: 4500.54 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-25 21:06:01,259 epoch 3 - iter 56/146 - loss 0.08813612 - time (sec): 3.55 - samples/sec: 4372.04 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-25 21:06:02,395 epoch 3 - iter 70/146 - loss 0.09225973 - time (sec): 4.69 - samples/sec: 4529.05 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-25 21:06:03,276 epoch 3 - iter 84/146 - loss 0.09278810 - time (sec): 5.57 - samples/sec: 4642.69 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-25 21:06:04,165 epoch 3 - iter 98/146 - loss 0.09139854 - time (sec): 6.46 - samples/sec: 4698.69 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-25 21:06:04,901 epoch 3 - iter 112/146 - loss 0.09379452 - time (sec): 7.19 - samples/sec: 4736.93 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-25 21:06:05,723 epoch 3 - iter 126/146 - loss 0.09468401 - time (sec): 8.02 - samples/sec: 4729.48 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-25 21:06:06,622 epoch 3 - iter 140/146 - loss 0.09287278 - time (sec): 8.91 - samples/sec: 4744.25 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-25 21:06:07,059 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:06:07,060 EPOCH 3 done: loss 0.0934 - lr: 0.000024
+ 2023-10-25 21:06:08,132 DEV : loss 0.09595039486885071 - f1-score (micro avg) 0.7332
+ 2023-10-25 21:06:08,137 saving best model
+ 2023-10-25 21:06:08,810 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:06:09,790 epoch 4 - iter 14/146 - loss 0.07411911 - time (sec): 0.98 - samples/sec: 5160.10 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-25 21:06:10,610 epoch 4 - iter 28/146 - loss 0.06840387 - time (sec): 1.80 - samples/sec: 4874.67 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-25 21:06:11,437 epoch 4 - iter 42/146 - loss 0.07292162 - time (sec): 2.62 - samples/sec: 4818.23 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-25 21:06:12,396 epoch 4 - iter 56/146 - loss 0.06469956 - time (sec): 3.58 - samples/sec: 4740.65 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-25 21:06:13,143 epoch 4 - iter 70/146 - loss 0.06518597 - time (sec): 4.33 - samples/sec: 4699.77 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-25 21:06:14,098 epoch 4 - iter 84/146 - loss 0.06696645 - time (sec): 5.29 - samples/sec: 4659.86 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-25 21:06:14,934 epoch 4 - iter 98/146 - loss 0.06661296 - time (sec): 6.12 - samples/sec: 4711.64 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-25 21:06:15,737 epoch 4 - iter 112/146 - loss 0.06367292 - time (sec): 6.92 - samples/sec: 4697.66 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-25 21:06:16,739 epoch 4 - iter 126/146 - loss 0.06160560 - time (sec): 7.93 - samples/sec: 4707.84 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-25 21:06:17,616 epoch 4 - iter 140/146 - loss 0.06032160 - time (sec): 8.80 - samples/sec: 4821.21 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-25 21:06:17,933 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:06:17,933 EPOCH 4 done: loss 0.0601 - lr: 0.000020
+ 2023-10-25 21:06:18,846 DEV : loss 0.10524275153875351 - f1-score (micro avg) 0.7642
+ 2023-10-25 21:06:18,850 saving best model
+ 2023-10-25 21:06:19,534 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:06:20,415 epoch 5 - iter 14/146 - loss 0.02996552 - time (sec): 0.88 - samples/sec: 5280.96 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-25 21:06:21,159 epoch 5 - iter 28/146 - loss 0.03043510 - time (sec): 1.62 - samples/sec: 5157.31 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-25 21:06:22,023 epoch 5 - iter 42/146 - loss 0.03731623 - time (sec): 2.48 - samples/sec: 5263.23 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-25 21:06:22,930 epoch 5 - iter 56/146 - loss 0.03644991 - time (sec): 3.39 - samples/sec: 5056.90 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-25 21:06:23,888 epoch 5 - iter 70/146 - loss 0.03446408 - time (sec): 4.35 - samples/sec: 4852.99 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-25 21:06:24,758 epoch 5 - iter 84/146 - loss 0.03395159 - time (sec): 5.22 - samples/sec: 4784.45 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-25 21:06:25,718 epoch 5 - iter 98/146 - loss 0.03646640 - time (sec): 6.18 - samples/sec: 4713.73 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-25 21:06:26,607 epoch 5 - iter 112/146 - loss 0.03898602 - time (sec): 7.07 - samples/sec: 4726.24 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-25 21:06:27,672 epoch 5 - iter 126/146 - loss 0.03853003 - time (sec): 8.13 - samples/sec: 4730.88 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-25 21:06:28,445 epoch 5 - iter 140/146 - loss 0.03950165 - time (sec): 8.91 - samples/sec: 4785.32 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-25 21:06:28,835 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:06:28,836 EPOCH 5 done: loss 0.0395 - lr: 0.000017
+ 2023-10-25 21:06:29,746 DEV : loss 0.10796511173248291 - f1-score (micro avg) 0.7617
+ 2023-10-25 21:06:29,751 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:06:30,562 epoch 6 - iter 14/146 - loss 0.02045471 - time (sec): 0.81 - samples/sec: 5141.92 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-25 21:06:31,512 epoch 6 - iter 28/146 - loss 0.02431460 - time (sec): 1.76 - samples/sec: 4759.68 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-25 21:06:32,460 epoch 6 - iter 42/146 - loss 0.02349163 - time (sec): 2.71 - samples/sec: 4864.11 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-25 21:06:33,331 epoch 6 - iter 56/146 - loss 0.02207213 - time (sec): 3.58 - samples/sec: 4826.86 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-25 21:06:34,276 epoch 6 - iter 70/146 - loss 0.02538588 - time (sec): 4.52 - samples/sec: 4762.20 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-25 21:06:35,169 epoch 6 - iter 84/146 - loss 0.02401518 - time (sec): 5.42 - samples/sec: 4691.30 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-25 21:06:36,103 epoch 6 - iter 98/146 - loss 0.02319494 - time (sec): 6.35 - samples/sec: 4741.03 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-25 21:06:37,157 epoch 6 - iter 112/146 - loss 0.02399453 - time (sec): 7.41 - samples/sec: 4643.43 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-25 21:06:38,007 epoch 6 - iter 126/146 - loss 0.02529112 - time (sec): 8.26 - samples/sec: 4654.13 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-25 21:06:38,896 epoch 6 - iter 140/146 - loss 0.02430354 - time (sec): 9.14 - samples/sec: 4677.55 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-25 21:06:39,258 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:06:39,258 EPOCH 6 done: loss 0.0249 - lr: 0.000014
+ 2023-10-25 21:06:40,322 DEV : loss 0.11456426978111267 - f1-score (micro avg) 0.7849
+ 2023-10-25 21:06:40,327 saving best model
+ 2023-10-25 21:06:40,998 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:06:41,831 epoch 7 - iter 14/146 - loss 0.01377810 - time (sec): 0.83 - samples/sec: 5025.09 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-25 21:06:43,030 epoch 7 - iter 28/146 - loss 0.02000949 - time (sec): 2.03 - samples/sec: 4964.90 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-25 21:06:43,818 epoch 7 - iter 42/146 - loss 0.02159133 - time (sec): 2.82 - samples/sec: 4774.67 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-25 21:06:44,680 epoch 7 - iter 56/146 - loss 0.02141280 - time (sec): 3.68 - samples/sec: 4696.42 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-25 21:06:45,530 epoch 7 - iter 70/146 - loss 0.02033340 - time (sec): 4.53 - samples/sec: 4722.77 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-25 21:06:46,479 epoch 7 - iter 84/146 - loss 0.01873330 - time (sec): 5.48 - samples/sec: 4804.21 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-25 21:06:47,374 epoch 7 - iter 98/146 - loss 0.01877254 - time (sec): 6.37 - samples/sec: 4815.74 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-25 21:06:48,168 epoch 7 - iter 112/146 - loss 0.01963305 - time (sec): 7.17 - samples/sec: 4765.05 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-25 21:06:48,984 epoch 7 - iter 126/146 - loss 0.01898464 - time (sec): 7.98 - samples/sec: 4793.92 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-25 21:06:49,883 epoch 7 - iter 140/146 - loss 0.01845447 - time (sec): 8.88 - samples/sec: 4782.28 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-25 21:06:50,299 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:06:50,299 EPOCH 7 done: loss 0.0181 - lr: 0.000010
+ 2023-10-25 21:06:51,208 DEV : loss 0.13842682540416718 - f1-score (micro avg) 0.7788
+ 2023-10-25 21:06:51,213 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:06:52,098 epoch 8 - iter 14/146 - loss 0.02179360 - time (sec): 0.88 - samples/sec: 4362.26 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-25 21:06:53,100 epoch 8 - iter 28/146 - loss 0.01457476 - time (sec): 1.89 - samples/sec: 4502.85 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-25 21:06:54,176 epoch 8 - iter 42/146 - loss 0.01363156 - time (sec): 2.96 - samples/sec: 4560.96 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-25 21:06:55,065 epoch 8 - iter 56/146 - loss 0.01358221 - time (sec): 3.85 - samples/sec: 4529.92 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-25 21:06:55,987 epoch 8 - iter 70/146 - loss 0.01502044 - time (sec): 4.77 - samples/sec: 4552.97 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-25 21:06:56,784 epoch 8 - iter 84/146 - loss 0.01489034 - time (sec): 5.57 - samples/sec: 4658.20 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-25 21:06:57,539 epoch 8 - iter 98/146 - loss 0.01449699 - time (sec): 6.32 - samples/sec: 4659.56 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-25 21:06:58,527 epoch 8 - iter 112/146 - loss 0.01427531 - time (sec): 7.31 - samples/sec: 4717.80 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-25 21:06:59,324 epoch 8 - iter 126/146 - loss 0.01369188 - time (sec): 8.11 - samples/sec: 4719.35 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-25 21:07:00,154 epoch 8 - iter 140/146 - loss 0.01269696 - time (sec): 8.94 - samples/sec: 4804.98 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-25 21:07:00,462 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:07:00,462 EPOCH 8 done: loss 0.0128 - lr: 0.000007
+ 2023-10-25 21:07:01,376 DEV : loss 0.15397675335407257 - f1-score (micro avg) 0.7315
+ 2023-10-25 21:07:01,380 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:07:02,243 epoch 9 - iter 14/146 - loss 0.00725404 - time (sec): 0.86 - samples/sec: 5396.63 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-25 21:07:03,063 epoch 9 - iter 28/146 - loss 0.00858283 - time (sec): 1.68 - samples/sec: 5303.83 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-25 21:07:03,844 epoch 9 - iter 42/146 - loss 0.00697012 - time (sec): 2.46 - samples/sec: 5096.55 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-25 21:07:04,896 epoch 9 - iter 56/146 - loss 0.00943578 - time (sec): 3.51 - samples/sec: 4980.37 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-25 21:07:05,865 epoch 9 - iter 70/146 - loss 0.01141288 - time (sec): 4.48 - samples/sec: 4965.81 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-25 21:07:06,742 epoch 9 - iter 84/146 - loss 0.01134343 - time (sec): 5.36 - samples/sec: 4920.44 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-25 21:07:07,668 epoch 9 - iter 98/146 - loss 0.01085725 - time (sec): 6.29 - samples/sec: 4919.30 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-25 21:07:08,425 epoch 9 - iter 112/146 - loss 0.01140709 - time (sec): 7.04 - samples/sec: 4880.19 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-25 21:07:09,309 epoch 9 - iter 126/146 - loss 0.01114475 - time (sec): 7.93 - samples/sec: 4855.84 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-25 21:07:10,174 epoch 9 - iter 140/146 - loss 0.01073435 - time (sec): 8.79 - samples/sec: 4862.91 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-25 21:07:10,483 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:07:10,483 EPOCH 9 done: loss 0.0109 - lr: 0.000004
+ 2023-10-25 21:07:11,390 DEV : loss 0.15072251856327057 - f1-score (micro avg) 0.7598
+ 2023-10-25 21:07:11,395 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:07:12,200 epoch 10 - iter 14/146 - loss 0.00398947 - time (sec): 0.80 - samples/sec: 5186.37 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-25 21:07:13,172 epoch 10 - iter 28/146 - loss 0.00875898 - time (sec): 1.78 - samples/sec: 4781.14 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-25 21:07:14,022 epoch 10 - iter 42/146 - loss 0.00922190 - time (sec): 2.63 - samples/sec: 4759.26 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-25 21:07:14,955 epoch 10 - iter 56/146 - loss 0.01137857 - time (sec): 3.56 - samples/sec: 4758.01 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-25 21:07:15,813 epoch 10 - iter 70/146 - loss 0.00974451 - time (sec): 4.42 - samples/sec: 4783.44 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-25 21:07:16,574 epoch 10 - iter 84/146 - loss 0.00891124 - time (sec): 5.18 - samples/sec: 4755.93 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-25 21:07:17,576 epoch 10 - iter 98/146 - loss 0.01023278 - time (sec): 6.18 - samples/sec: 4727.60 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-25 21:07:18,489 epoch 10 - iter 112/146 - loss 0.00961012 - time (sec): 7.09 - samples/sec: 4783.79 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-25 21:07:19,586 epoch 10 - iter 126/146 - loss 0.00898474 - time (sec): 8.19 - samples/sec: 4684.43 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-25 21:07:20,438 epoch 10 - iter 140/146 - loss 0.00851157 - time (sec): 9.04 - samples/sec: 4730.83 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-25 21:07:20,797 ----------------------------------------------------------------------------------------------------
+ 2023-10-25 21:07:20,797 EPOCH 10 done: loss 0.0087 - lr: 0.000000
+ 2023-10-25 21:07:21,711 DEV : loss 0.1518029123544693 - f1-score (micro avg) 0.7588
+ 2023-10-25 21:07:22,231 ----------------------------------------------------------------------------------------------------
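The "saving best model" markers in the epochs above follow the dev micro F1: a checkpoint is written only when the score beats the previous best. As a minimal sketch, assuming the log line layout shown here stays stable, the per-epoch dev scores can be pulled out with two regular expressions. The helper names `dev_scores` and `best_epoch` and the `SAMPLE` excerpt are illustrative and not part of Flair.

```python
import re

# Patterns for the two line types of interest. The uppercase "EPOCH" keeps
# the per-iteration "epoch N - iter ..." lines from matching.
EPOCH_RE = re.compile(r"EPOCH (\d+) done")
DEV_RE = re.compile(r"DEV : loss ([\d.]+) - f1-score \(micro avg\)\s+([\d.]+)")

# Three epochs copied from the log above (hypothetical excerpt for the demo).
SAMPLE = """\
2023-10-25 21:06:28,836 EPOCH 5 done: loss 0.0395 - lr: 0.000017
2023-10-25 21:06:29,746 DEV : loss 0.10796511173248291 - f1-score (micro avg) 0.7617
2023-10-25 21:06:39,258 EPOCH 6 done: loss 0.0249 - lr: 0.000014
2023-10-25 21:06:40,322 DEV : loss 0.11456426978111267 - f1-score (micro avg) 0.7849
2023-10-25 21:06:50,299 EPOCH 7 done: loss 0.0181 - lr: 0.000010
2023-10-25 21:06:51,208 DEV : loss 0.13842682540416718 - f1-score (micro avg) 0.7788
"""


def dev_scores(log_text):
    """Map epoch number -> dev micro F1, pairing each DEV line with the
    most recent 'EPOCH N done' line that precedes it."""
    scores = {}
    current_epoch = None
    for line in log_text.splitlines():
        match = EPOCH_RE.search(line)
        if match:
            current_epoch = int(match.group(1))
            continue
        match = DEV_RE.search(line)
        if match and current_epoch is not None:
            scores[current_epoch] = float(match.group(2))
    return scores


def best_epoch(scores):
    """Return (epoch, f1) for the highest dev F1; the earliest epoch wins ties."""
    return max(scores.items(), key=lambda item: item[1])


scores = dev_scores(SAMPLE)
print(best_epoch(scores))
```

Run over the full log, the same scan picks epoch 6 (dev micro F1 0.7849), which is the last epoch followed by "saving best model".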
+ 2023-10-25 21:07:22,233 Loading model from best epoch ...
+ 2023-10-25 21:07:23,959 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
+ 2023-10-25 21:07:25,507
+ Results:
+ - F-score (micro) 0.7581
+ - F-score (macro) 0.6581
+ - Accuracy 0.6331
+
+ By class:
+               precision    recall  f1-score   support
+
+          PER     0.7855    0.8420    0.8128       348
+          LOC     0.6943    0.8352    0.7583       261
+          ORG     0.4348    0.3846    0.4082        52
+    HumanProd     0.5926    0.7273    0.6531        22
+
+    micro avg     0.7197    0.8009    0.7581       683
+    macro avg     0.6268    0.6973    0.6581       683
+ weighted avg     0.7177    0.8009    0.7560       683
+
+ 2023-10-25 21:07:25,507 ----------------------------------------------------------------------------------------------------
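The summary rows of the "By class" table can be cross-checked against the per-class rows. The sketch below recomputes the macro and weighted F1 from the four class rows and the micro F1 from the reported micro precision and recall. It assumes the usual definitions (unweighted mean, support-weighted mean, harmonic mean). The inputs are the 4-decimal values printed in the log, so agreement is only expected up to rounding. The names `BY_CLASS`, `macro_avg`, `weighted_avg` and `f1_from` are illustrative.

```python
# (label, precision, recall, f1, support) copied from the "By class" table.
BY_CLASS = [
    ("PER", 0.7855, 0.8420, 0.8128, 348),
    ("LOC", 0.6943, 0.8352, 0.7583, 261),
    ("ORG", 0.4348, 0.3846, 0.4082, 52),
    ("HumanProd", 0.5926, 0.7273, 0.6531, 22),
]

F1_COL, SUPPORT_COL = 3, 4


def macro_avg(rows, col):
    """Unweighted mean of one metric column across classes."""
    return sum(row[col] for row in rows) / len(rows)


def weighted_avg(rows, col):
    """Mean of one metric column weighted by each class's support."""
    total = sum(row[SUPPORT_COL] for row in rows)
    return sum(row[col] * row[SUPPORT_COL] for row in rows) / total


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[SUPPORT_COL] for row in BY_CLASS)
macro_f1 = macro_avg(BY_CLASS, F1_COL)
weighted_f1 = weighted_avg(BY_CLASS, F1_COL)
micro_f1 = f1_from(0.7197, 0.8009)  # micro precision and recall from the table

print(f"support={total_support} macro={macro_f1:.4f} "
      f"weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The gap between micro F1 (0.7581) and macro F1 (0.6581) comes mostly from ORG: with 52 of 683 entities it barely moves the micro average, but its F1 of 0.4082 pulls the unweighted macro average down.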