Upload ./training.log with huggingface_hub
training.log (added, +243 -0)
@@ -0,0 +1,243 @@
2023-10-25 21:05:36,363 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Train:  1166 sentences
2023-10-25 21:05:36,364         (train_with_dev=False, train_with_test=False)
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Training Params:
2023-10-25 21:05:36,364  - learning_rate: "3e-05"
2023-10-25 21:05:36,364  - mini_batch_size: "8"
2023-10-25 21:05:36,364  - max_epochs: "10"
2023-10-25 21:05:36,364  - shuffle: "True"
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Plugins:
2023-10-25 21:05:36,364  - TensorboardLogger
2023-10-25 21:05:36,364  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:05:36,364  - metric: "('micro avg', 'f1-score')"
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 Computation:
2023-10-25 21:05:36,365  - compute on device: cuda:0
2023-10-25 21:05:36,365  - embedding storage: none
2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:05:37,164 epoch 1 - iter 14/146 - loss 2.83025878 - time (sec): 0.80 - samples/sec: 4267.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:05:37,955 epoch 1 - iter 28/146 - loss 2.46691922 - time (sec): 1.59 - samples/sec: 4479.04 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:05:38,902 epoch 1 - iter 42/146 - loss 1.86829011 - time (sec): 2.54 - samples/sec: 4669.08 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:05:39,724 epoch 1 - iter 56/146 - loss 1.54384758 - time (sec): 3.36 - samples/sec: 4663.76 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:05:40,514 epoch 1 - iter 70/146 - loss 1.34624853 - time (sec): 4.15 - samples/sec: 4705.75 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:05:41,496 epoch 1 - iter 84/146 - loss 1.20137129 - time (sec): 5.13 - samples/sec: 4668.86 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:05:42,434 epoch 1 - iter 98/146 - loss 1.07011610 - time (sec): 6.07 - samples/sec: 4743.31 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:05:43,498 epoch 1 - iter 112/146 - loss 0.97878778 - time (sec): 7.13 - samples/sec: 4737.41 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:05:44,331 epoch 1 - iter 126/146 - loss 0.90038816 - time (sec): 7.96 - samples/sec: 4778.42 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:05:45,275 epoch 1 - iter 140/146 - loss 0.82667252 - time (sec): 8.91 - samples/sec: 4803.31 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:45,697 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:45,697 EPOCH 1 done: loss 0.8075 - lr: 0.000029
2023-10-25 21:05:46,358 DEV : loss 0.17039310932159424 - f1-score (micro avg)  0.5702
2023-10-25 21:05:46,362 saving best model
2023-10-25 21:05:46,879 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:47,843 epoch 2 - iter 14/146 - loss 0.20229098 - time (sec): 0.96 - samples/sec: 4761.19 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:05:48,858 epoch 2 - iter 28/146 - loss 0.18766990 - time (sec): 1.98 - samples/sec: 4776.95 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:49,755 epoch 2 - iter 42/146 - loss 0.18648775 - time (sec): 2.88 - samples/sec: 4775.20 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:50,655 epoch 2 - iter 56/146 - loss 0.19029101 - time (sec): 3.77 - samples/sec: 4741.42 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:51,431 epoch 2 - iter 70/146 - loss 0.18913930 - time (sec): 4.55 - samples/sec: 4763.79 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:05:52,200 epoch 2 - iter 84/146 - loss 0.19230771 - time (sec): 5.32 - samples/sec: 4757.70 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:05:53,030 epoch 2 - iter 98/146 - loss 0.18747787 - time (sec): 6.15 - samples/sec: 4744.20 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:05:54,055 epoch 2 - iter 112/146 - loss 0.18004954 - time (sec): 7.18 - samples/sec: 4732.79 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:05:54,915 epoch 2 - iter 126/146 - loss 0.17380432 - time (sec): 8.03 - samples/sec: 4783.21 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:05:55,761 epoch 2 - iter 140/146 - loss 0.17394433 - time (sec): 8.88 - samples/sec: 4827.33 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:05:56,111 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:56,111 EPOCH 2 done: loss 0.1735 - lr: 0.000027
2023-10-25 21:05:57,015 DEV : loss 0.10457519441843033 - f1-score (micro avg)  0.7177
2023-10-25 21:05:57,019 saving best model
2023-10-25 21:05:57,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:58,621 epoch 3 - iter 14/146 - loss 0.09931197 - time (sec): 0.91 - samples/sec: 4567.22 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:05:59,407 epoch 3 - iter 28/146 - loss 0.09405829 - time (sec): 1.70 - samples/sec: 4431.34 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:06:00,370 epoch 3 - iter 42/146 - loss 0.08962058 - time (sec): 2.66 - samples/sec: 4500.54 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:06:01,259 epoch 3 - iter 56/146 - loss 0.08813612 - time (sec): 3.55 - samples/sec: 4372.04 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:06:02,395 epoch 3 - iter 70/146 - loss 0.09225973 - time (sec): 4.69 - samples/sec: 4529.05 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:06:03,276 epoch 3 - iter 84/146 - loss 0.09278810 - time (sec): 5.57 - samples/sec: 4642.69 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:06:04,165 epoch 3 - iter 98/146 - loss 0.09139854 - time (sec): 6.46 - samples/sec: 4698.69 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:04,901 epoch 3 - iter 112/146 - loss 0.09379452 - time (sec): 7.19 - samples/sec: 4736.93 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:05,723 epoch 3 - iter 126/146 - loss 0.09468401 - time (sec): 8.02 - samples/sec: 4729.48 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:06,622 epoch 3 - iter 140/146 - loss 0.09287278 - time (sec): 8.91 - samples/sec: 4744.25 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:07,059 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:07,060 EPOCH 3 done: loss 0.0934 - lr: 0.000024
2023-10-25 21:06:08,132 DEV : loss 0.09595039486885071 - f1-score (micro avg)  0.7332
2023-10-25 21:06:08,137 saving best model
2023-10-25 21:06:08,810 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:09,790 epoch 4 - iter 14/146 - loss 0.07411911 - time (sec): 0.98 - samples/sec: 5160.10 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:06:10,610 epoch 4 - iter 28/146 - loss 0.06840387 - time (sec): 1.80 - samples/sec: 4874.67 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:06:11,437 epoch 4 - iter 42/146 - loss 0.07292162 - time (sec): 2.62 - samples/sec: 4818.23 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:06:12,396 epoch 4 - iter 56/146 - loss 0.06469956 - time (sec): 3.58 - samples/sec: 4740.65 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:06:13,143 epoch 4 - iter 70/146 - loss 0.06518597 - time (sec): 4.33 - samples/sec: 4699.77 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:06:14,098 epoch 4 - iter 84/146 - loss 0.06696645 - time (sec): 5.29 - samples/sec: 4659.86 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:14,934 epoch 4 - iter 98/146 - loss 0.06661296 - time (sec): 6.12 - samples/sec: 4711.64 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:15,737 epoch 4 - iter 112/146 - loss 0.06367292 - time (sec): 6.92 - samples/sec: 4697.66 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:16,739 epoch 4 - iter 126/146 - loss 0.06160560 - time (sec): 7.93 - samples/sec: 4707.84 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:17,616 epoch 4 - iter 140/146 - loss 0.06032160 - time (sec): 8.80 - samples/sec: 4821.21 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:06:17,933 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:17,933 EPOCH 4 done: loss 0.0601 - lr: 0.000020
2023-10-25 21:06:18,846 DEV : loss 0.10524275153875351 - f1-score (micro avg)  0.7642
2023-10-25 21:06:18,850 saving best model
2023-10-25 21:06:19,534 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:20,415 epoch 5 - iter 14/146 - loss 0.02996552 - time (sec): 0.88 - samples/sec: 5280.96 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:06:21,159 epoch 5 - iter 28/146 - loss 0.03043510 - time (sec): 1.62 - samples/sec: 5157.31 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:06:22,023 epoch 5 - iter 42/146 - loss 0.03731623 - time (sec): 2.48 - samples/sec: 5263.23 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:06:22,930 epoch 5 - iter 56/146 - loss 0.03644991 - time (sec): 3.39 - samples/sec: 5056.90 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:06:23,888 epoch 5 - iter 70/146 - loss 0.03446408 - time (sec): 4.35 - samples/sec: 4852.99 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:24,758 epoch 5 - iter 84/146 - loss 0.03395159 - time (sec): 5.22 - samples/sec: 4784.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:25,718 epoch 5 - iter 98/146 - loss 0.03646640 - time (sec): 6.18 - samples/sec: 4713.73 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:26,607 epoch 5 - iter 112/146 - loss 0.03898602 - time (sec): 7.07 - samples/sec: 4726.24 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:27,672 epoch 5 - iter 126/146 - loss 0.03853003 - time (sec): 8.13 - samples/sec: 4730.88 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:06:28,445 epoch 5 - iter 140/146 - loss 0.03950165 - time (sec): 8.91 - samples/sec: 4785.32 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:06:28,835 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:28,836 EPOCH 5 done: loss 0.0395 - lr: 0.000017
2023-10-25 21:06:29,746 DEV : loss 0.10796511173248291 - f1-score (micro avg)  0.7617
2023-10-25 21:06:29,751 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:30,562 epoch 6 - iter 14/146 - loss 0.02045471 - time (sec): 0.81 - samples/sec: 5141.92 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:06:31,512 epoch 6 - iter 28/146 - loss 0.02431460 - time (sec): 1.76 - samples/sec: 4759.68 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:06:32,460 epoch 6 - iter 42/146 - loss 0.02349163 - time (sec): 2.71 - samples/sec: 4864.11 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:06:33,331 epoch 6 - iter 56/146 - loss 0.02207213 - time (sec): 3.58 - samples/sec: 4826.86 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:34,276 epoch 6 - iter 70/146 - loss 0.02538588 - time (sec): 4.52 - samples/sec: 4762.20 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:35,169 epoch 6 - iter 84/146 - loss 0.02401518 - time (sec): 5.42 - samples/sec: 4691.30 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:36,103 epoch 6 - iter 98/146 - loss 0.02319494 - time (sec): 6.35 - samples/sec: 4741.03 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:37,157 epoch 6 - iter 112/146 - loss 0.02399453 - time (sec): 7.41 - samples/sec: 4643.43 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:06:38,007 epoch 6 - iter 126/146 - loss 0.02529112 - time (sec): 8.26 - samples/sec: 4654.13 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:06:38,896 epoch 6 - iter 140/146 - loss 0.02430354 - time (sec): 9.14 - samples/sec: 4677.55 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:06:39,258 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:39,258 EPOCH 6 done: loss 0.0249 - lr: 0.000014
2023-10-25 21:06:40,322 DEV : loss 0.11456426978111267 - f1-score (micro avg)  0.7849
2023-10-25 21:06:40,327 saving best model
2023-10-25 21:06:40,998 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:41,831 epoch 7 - iter 14/146 - loss 0.01377810 - time (sec): 0.83 - samples/sec: 5025.09 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:06:43,030 epoch 7 - iter 28/146 - loss 0.02000949 - time (sec): 2.03 - samples/sec: 4964.90 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:06:43,818 epoch 7 - iter 42/146 - loss 0.02159133 - time (sec): 2.82 - samples/sec: 4774.67 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:44,680 epoch 7 - iter 56/146 - loss 0.02141280 - time (sec): 3.68 - samples/sec: 4696.42 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:45,530 epoch 7 - iter 70/146 - loss 0.02033340 - time (sec): 4.53 - samples/sec: 4722.77 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:46,479 epoch 7 - iter 84/146 - loss 0.01873330 - time (sec): 5.48 - samples/sec: 4804.21 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:47,374 epoch 7 - iter 98/146 - loss 0.01877254 - time (sec): 6.37 - samples/sec: 4815.74 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:06:48,168 epoch 7 - iter 112/146 - loss 0.01963305 - time (sec): 7.17 - samples/sec: 4765.05 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:06:48,984 epoch 7 - iter 126/146 - loss 0.01898464 - time (sec): 7.98 - samples/sec: 4793.92 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:06:49,883 epoch 7 - iter 140/146 - loss 0.01845447 - time (sec): 8.88 - samples/sec: 4782.28 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:06:50,299 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:50,299 EPOCH 7 done: loss 0.0181 - lr: 0.000010
2023-10-25 21:06:51,208 DEV : loss 0.13842682540416718 - f1-score (micro avg)  0.7788
2023-10-25 21:06:51,213 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:52,098 epoch 8 - iter 14/146 - loss 0.02179360 - time (sec): 0.88 - samples/sec: 4362.26 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:06:53,100 epoch 8 - iter 28/146 - loss 0.01457476 - time (sec): 1.89 - samples/sec: 4502.85 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:54,176 epoch 8 - iter 42/146 - loss 0.01363156 - time (sec): 2.96 - samples/sec: 4560.96 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:55,065 epoch 8 - iter 56/146 - loss 0.01358221 - time (sec): 3.85 - samples/sec: 4529.92 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:55,987 epoch 8 - iter 70/146 - loss 0.01502044 - time (sec): 4.77 - samples/sec: 4552.97 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:56,784 epoch 8 - iter 84/146 - loss 0.01489034 - time (sec): 5.57 - samples/sec: 4658.20 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:06:57,539 epoch 8 - iter 98/146 - loss 0.01449699 - time (sec): 6.32 - samples/sec: 4659.56 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:06:58,527 epoch 8 - iter 112/146 - loss 0.01427531 - time (sec): 7.31 - samples/sec: 4717.80 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:06:59,324 epoch 8 - iter 126/146 - loss 0.01369188 - time (sec): 8.11 - samples/sec: 4719.35 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:07:00,154 epoch 8 - iter 140/146 - loss 0.01269696 - time (sec): 8.94 - samples/sec: 4804.98 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:07:00,462 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:00,462 EPOCH 8 done: loss 0.0128 - lr: 0.000007
2023-10-25 21:07:01,376 DEV : loss 0.15397675335407257 - f1-score (micro avg)  0.7315
2023-10-25 21:07:01,380 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:02,243 epoch 9 - iter 14/146 - loss 0.00725404 - time (sec): 0.86 - samples/sec: 5396.63 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:03,063 epoch 9 - iter 28/146 - loss 0.00858283 - time (sec): 1.68 - samples/sec: 5303.83 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:03,844 epoch 9 - iter 42/146 - loss 0.00697012 - time (sec): 2.46 - samples/sec: 5096.55 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:04,896 epoch 9 - iter 56/146 - loss 0.00943578 - time (sec): 3.51 - samples/sec: 4980.37 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:05,865 epoch 9 - iter 70/146 - loss 0.01141288 - time (sec): 4.48 - samples/sec: 4965.81 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:07:06,742 epoch 9 - iter 84/146 - loss 0.01134343 - time (sec): 5.36 - samples/sec: 4920.44 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:07:07,668 epoch 9 - iter 98/146 - loss 0.01085725 - time (sec): 6.29 - samples/sec: 4919.30 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:07:08,425 epoch 9 - iter 112/146 - loss 0.01140709 - time (sec): 7.04 - samples/sec: 4880.19 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:09,309 epoch 9 - iter 126/146 - loss 0.01114475 - time (sec): 7.93 - samples/sec: 4855.84 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:10,174 epoch 9 - iter 140/146 - loss 0.01073435 - time (sec): 8.79 - samples/sec: 4862.91 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:10,483 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:10,483 EPOCH 9 done: loss 0.0109 - lr: 0.000004
2023-10-25 21:07:11,390 DEV : loss 0.15072251856327057 - f1-score (micro avg)  0.7598
2023-10-25 21:07:11,395 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:12,200 epoch 10 - iter 14/146 - loss 0.00398947 - time (sec): 0.80 - samples/sec: 5186.37 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:07:13,172 epoch 10 - iter 28/146 - loss 0.00875898 - time (sec): 1.78 - samples/sec: 4781.14 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:07:14,022 epoch 10 - iter 42/146 - loss 0.00922190 - time (sec): 2.63 - samples/sec: 4759.26 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:07:14,955 epoch 10 - iter 56/146 - loss 0.01137857 - time (sec): 3.56 - samples/sec: 4758.01 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:07:15,813 epoch 10 - iter 70/146 - loss 0.00974451 - time (sec): 4.42 - samples/sec: 4783.44 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:07:16,574 epoch 10 - iter 84/146 - loss 0.00891124 - time (sec): 5.18 - samples/sec: 4755.93 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:07:17,576 epoch 10 - iter 98/146 - loss 0.01023278 - time (sec): 6.18 - samples/sec: 4727.60 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:07:18,489 epoch 10 - iter 112/146 - loss 0.00961012 - time (sec): 7.09 - samples/sec: 4783.79 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:07:19,586 epoch 10 - iter 126/146 - loss 0.00898474 - time (sec): 8.19 - samples/sec: 4684.43 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:07:20,438 epoch 10 - iter 140/146 - loss 0.00851157 - time (sec): 9.04 - samples/sec: 4730.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:07:20,797 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:20,797 EPOCH 10 done: loss 0.0087 - lr: 0.000000
2023-10-25 21:07:21,711 DEV : loss 0.1518029123544693 - f1-score (micro avg)  0.7588
2023-10-25 21:07:22,231 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:22,233 Loading model from best epoch ...
2023-10-25 21:07:23,959 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:07:25,507 
Results:
- F-score (micro) 0.7581
- F-score (macro) 0.6581
- Accuracy 0.6331

By class:
              precision    recall  f1-score   support

         PER     0.7855    0.8420    0.8128       348
         LOC     0.6943    0.8352    0.7583       261
         ORG     0.4348    0.3846    0.4082        52
   HumanProd     0.5926    0.7273    0.6531        22

   micro avg     0.7197    0.8009    0.7581       683
   macro avg     0.6268    0.6973    0.6581       683
weighted avg     0.7177    0.8009    0.7560       683

2023-10-25 21:07:25,507 ----------------------------------------------------------------------------------------------------
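
Note on the averaged scores: the three averaged F1 rows in the "By class" table can be cross-checked from the other numbers in the same table. The sketch below is not part of the training run; it assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro as the unweighted mean of per-class F1, weighted as the support-weighted mean) and uses only the 4-decimal values printed in the log, so small rounding differences are possible. The names `per_class` and `f1` are illustrative.

```python
# Values copied from the "By class" table in the log (already rounded to 4 decimals).
per_class = {
    "PER":       {"f1": 0.8128, "support": 348},
    "LOC":       {"f1": 0.7583, "support": 261},
    "ORG":       {"f1": 0.4082, "support": 52},
    "HumanProd": {"f1": 0.6531, "support": 22},
}
micro_precision, micro_recall = 0.7197, 0.8009


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1: computed from the pooled (micro-averaged) precision and recall.
micro_f1 = f1(micro_precision, micro_recall)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by the number of gold entities (support).
total_support = sum(c["support"] for c in per_class.values())
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

Under these assumptions the recomputed values agree with the logged micro (0.7581), macro (0.6581) and weighted (0.7560) F1 scores to four decimals, and the per-class supports sum to the logged total of 683.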