stefan-it committed
Commit face203
1 parent: f51ce17

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +241 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b2a46f5dfe07a3828078a6ad589a70c820d9ac64ebd0f0828defb65c273f7e4b
+ size 443311175
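Git LFS stores a small pointer file in place of the 443 MB checkpoint; only `version`, `oid`, and `size` live in the repository itself. A minimal sketch (plain Python, using only the pointer text above) of parsing such a pointer into its fields:

```python
# Parse a Git LFS pointer file into its key/value fields.
# The pointer text below is copied from the best-model.pt entry above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b2a46f5dfe07a3828078a6ad589a70c820d9ac64ebd0f0828defb65c273f7e4b
size 443311175
"""

def parse_lfs_pointer(text: str) -> dict:
    # Each line is "<key> <value>"; the oid value is "<algorithm>:<digest>".
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "oid_algorithm": algo,          # "sha256"
        "oid": digest,                  # digest of the real file contents
        "size_bytes": int(fields["size"]),  # size of the real file, not the pointer
    }

info = parse_lfs_pointer(POINTER)
print(info["oid_algorithm"], info["size_bytes"])
```

The pointer is only an address: fetching the file through the Hub (e.g. with `huggingface_hub.hf_hub_download`) or running `git lfs pull` in a clone yields the actual weights.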
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 23:38:25 0.0000 0.3270 0.0676 0.7342 0.6878 0.7102 0.5621
+ 2 23:39:37 0.0000 0.0833 0.0614 0.7410 0.7848 0.7623 0.6242
+ 3 23:40:49 0.0000 0.0563 0.0841 0.7229 0.7595 0.7407 0.6102
+ 4 23:41:59 0.0000 0.0376 0.0839 0.7158 0.8397 0.7728 0.6482
+ 5 23:43:11 0.0000 0.0278 0.1024 0.7519 0.8312 0.7896 0.6633
+ 6 23:44:22 0.0000 0.0200 0.1068 0.7424 0.8270 0.7824 0.6577
+ 7 23:45:33 0.0000 0.0128 0.1083 0.7654 0.8397 0.8008 0.6792
+ 8 23:46:46 0.0000 0.0088 0.1109 0.7647 0.8228 0.7927 0.6747
+ 9 23:47:58 0.0000 0.0048 0.1213 0.7751 0.8143 0.7942 0.6772
+ 10 23:49:10 0.0000 0.0033 0.1214 0.7795 0.8354 0.8065 0.6899
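The rows above are plain per-epoch TSV, so picking out the best checkpoint is a short script. A minimal sketch (values copied from the table; split on whitespace rather than tabs for readability) that finds the epoch with the highest dev F1:

```python
# loss.tsv content as shown above (the real file is tab-separated).
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 23:38:25 0.0000 0.3270 0.0676 0.7342 0.6878 0.7102 0.5621
2 23:39:37 0.0000 0.0833 0.0614 0.7410 0.7848 0.7623 0.6242
3 23:40:49 0.0000 0.0563 0.0841 0.7229 0.7595 0.7407 0.6102
4 23:41:59 0.0000 0.0376 0.0839 0.7158 0.8397 0.7728 0.6482
5 23:43:11 0.0000 0.0278 0.1024 0.7519 0.8312 0.7896 0.6633
6 23:44:22 0.0000 0.0200 0.1068 0.7424 0.8270 0.7824 0.6577
7 23:45:33 0.0000 0.0128 0.1083 0.7654 0.8397 0.8008 0.6792
8 23:46:46 0.0000 0.0088 0.1109 0.7647 0.8228 0.7927 0.6747
9 23:47:58 0.0000 0.0048 0.1213 0.7751 0.8143 0.7942 0.6772
10 23:49:10 0.0000 0.0033 0.1214 0.7795 0.8354 0.8065 0.6899
"""

rows = [line.split() for line in LOSS_TSV.strip().splitlines()]
header, data = rows[0], rows[1:]
f1_idx = header.index("DEV_F1")

# The checkpoint kept as best-model.pt is the epoch with the highest dev F1.
best = max(data, key=lambda r: float(r[f1_idx]))
print(f"best epoch: {best[0]}  dev F1: {best[f1_idx]}")
```

Note that train loss falls monotonically while dev loss rises after epoch 2; the final "saving best model" happens at epoch 10 because selection is on dev F1, not dev loss.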
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,241 @@
+ 2023-10-16 23:37:15,287 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:37:15,288 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): BertModel(
+ (embeddings): BertEmbeddings(
+ (word_embeddings): Embedding(32001, 768)
+ (position_embeddings): Embedding(512, 768)
+ (token_type_embeddings): Embedding(2, 768)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): BertEncoder(
+ (layer): ModuleList(
+ (0-11): 12 x BertLayer(
+ (attention): BertAttention(
+ (self): BertSelfAttention(
+ (query): Linear(in_features=768, out_features=768, bias=True)
+ (key): Linear(in_features=768, out_features=768, bias=True)
+ (value): Linear(in_features=768, out_features=768, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): BertSelfOutput(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): BertIntermediate(
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): BertOutput(
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ (pooler): BertPooler(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (activation): Tanh()
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=768, out_features=13, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:37:15,288 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
+ - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
+ 2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:37:15,288 Train: 6183 sentences
+ 2023-10-16 23:37:15,288 (train_with_dev=False, train_with_test=False)
+ 2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:37:15,288 Training Params:
+ 2023-10-16 23:37:15,288 - learning_rate: "3e-05"
+ 2023-10-16 23:37:15,288 - mini_batch_size: "4"
+ 2023-10-16 23:37:15,288 - max_epochs: "10"
+ 2023-10-16 23:37:15,288 - shuffle: "True"
+ 2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:37:15,288 Plugins:
+ 2023-10-16 23:37:15,289 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:37:15,289 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-16 23:37:15,289 - metric: "('micro avg', 'f1-score')"
+ 2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:37:15,289 Computation:
+ 2023-10-16 23:37:15,289 - compute on device: cuda:0
+ 2023-10-16 23:37:15,289 - embedding storage: none
+ 2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:37:15,289 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
+ 2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:37:22,073 epoch 1 - iter 154/1546 - loss 1.92565161 - time (sec): 6.78 - samples/sec: 1898.22 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-16 23:37:28,810 epoch 1 - iter 308/1546 - loss 1.09983562 - time (sec): 13.52 - samples/sec: 1886.82 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-16 23:37:35,629 epoch 1 - iter 462/1546 - loss 0.79988559 - time (sec): 20.34 - samples/sec: 1855.62 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-16 23:37:42,402 epoch 1 - iter 616/1546 - loss 0.64149237 - time (sec): 27.11 - samples/sec: 1841.30 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-16 23:37:49,223 epoch 1 - iter 770/1546 - loss 0.53678387 - time (sec): 33.93 - samples/sec: 1842.84 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-16 23:37:56,159 epoch 1 - iter 924/1546 - loss 0.47121079 - time (sec): 40.87 - samples/sec: 1817.10 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-16 23:38:02,949 epoch 1 - iter 1078/1546 - loss 0.42227833 - time (sec): 47.66 - samples/sec: 1810.49 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-16 23:38:09,776 epoch 1 - iter 1232/1546 - loss 0.38574946 - time (sec): 54.49 - samples/sec: 1807.68 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-16 23:38:16,772 epoch 1 - iter 1386/1546 - loss 0.35423814 - time (sec): 61.48 - samples/sec: 1809.07 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-16 23:38:23,585 epoch 1 - iter 1540/1546 - loss 0.32780022 - time (sec): 68.30 - samples/sec: 1814.47 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-16 23:38:23,847 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:38:23,847 EPOCH 1 done: loss 0.3270 - lr: 0.000030
+ 2023-10-16 23:38:25,866 DEV : loss 0.06755758821964264 - f1-score (micro avg) 0.7102
+ 2023-10-16 23:38:25,894 saving best model
+ 2023-10-16 23:38:26,224 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:38:33,203 epoch 2 - iter 154/1546 - loss 0.09083964 - time (sec): 6.98 - samples/sec: 1894.92 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-16 23:38:40,075 epoch 2 - iter 308/1546 - loss 0.08611689 - time (sec): 13.85 - samples/sec: 1864.36 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-16 23:38:46,893 epoch 2 - iter 462/1546 - loss 0.08435768 - time (sec): 20.67 - samples/sec: 1843.68 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-16 23:38:53,775 epoch 2 - iter 616/1546 - loss 0.08829396 - time (sec): 27.55 - samples/sec: 1815.83 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-16 23:39:00,586 epoch 2 - iter 770/1546 - loss 0.08874921 - time (sec): 34.36 - samples/sec: 1790.29 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-16 23:39:07,326 epoch 2 - iter 924/1546 - loss 0.08870459 - time (sec): 41.10 - samples/sec: 1801.42 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-16 23:39:14,168 epoch 2 - iter 1078/1546 - loss 0.08797980 - time (sec): 47.94 - samples/sec: 1809.00 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-16 23:39:21,230 epoch 2 - iter 1232/1546 - loss 0.08446684 - time (sec): 55.00 - samples/sec: 1812.34 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-16 23:39:28,023 epoch 2 - iter 1386/1546 - loss 0.08329303 - time (sec): 61.80 - samples/sec: 1803.61 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-16 23:39:34,880 epoch 2 - iter 1540/1546 - loss 0.08339448 - time (sec): 68.65 - samples/sec: 1805.87 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-16 23:39:35,138 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:39:35,138 EPOCH 2 done: loss 0.0833 - lr: 0.000027
+ 2023-10-16 23:39:37,244 DEV : loss 0.06139129400253296 - f1-score (micro avg) 0.7623
+ 2023-10-16 23:39:37,257 saving best model
+ 2023-10-16 23:39:37,686 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:39:44,558 epoch 3 - iter 154/1546 - loss 0.04155378 - time (sec): 6.87 - samples/sec: 1859.19 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-16 23:39:51,450 epoch 3 - iter 308/1546 - loss 0.05960039 - time (sec): 13.76 - samples/sec: 1877.23 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-16 23:39:58,368 epoch 3 - iter 462/1546 - loss 0.05708094 - time (sec): 20.68 - samples/sec: 1894.00 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-16 23:40:05,183 epoch 3 - iter 616/1546 - loss 0.05432285 - time (sec): 27.50 - samples/sec: 1847.87 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-16 23:40:12,029 epoch 3 - iter 770/1546 - loss 0.05492115 - time (sec): 34.34 - samples/sec: 1830.11 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-16 23:40:18,815 epoch 3 - iter 924/1546 - loss 0.05565481 - time (sec): 41.13 - samples/sec: 1815.96 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-16 23:40:25,763 epoch 3 - iter 1078/1546 - loss 0.05690822 - time (sec): 48.08 - samples/sec: 1824.49 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-16 23:40:32,745 epoch 3 - iter 1232/1546 - loss 0.05538883 - time (sec): 55.06 - samples/sec: 1813.92 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-16 23:40:39,610 epoch 3 - iter 1386/1546 - loss 0.05683208 - time (sec): 61.92 - samples/sec: 1799.53 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-16 23:40:46,465 epoch 3 - iter 1540/1546 - loss 0.05646703 - time (sec): 68.78 - samples/sec: 1801.06 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-16 23:40:46,726 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:40:46,727 EPOCH 3 done: loss 0.0563 - lr: 0.000023
+ 2023-10-16 23:40:49,096 DEV : loss 0.08413656055927277 - f1-score (micro avg) 0.7407
+ 2023-10-16 23:40:49,109 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:40:56,043 epoch 4 - iter 154/1546 - loss 0.03911177 - time (sec): 6.93 - samples/sec: 1680.71 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-16 23:41:03,048 epoch 4 - iter 308/1546 - loss 0.03409785 - time (sec): 13.94 - samples/sec: 1698.01 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-16 23:41:09,927 epoch 4 - iter 462/1546 - loss 0.03599079 - time (sec): 20.82 - samples/sec: 1745.38 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-16 23:41:16,679 epoch 4 - iter 616/1546 - loss 0.03396381 - time (sec): 27.57 - samples/sec: 1762.13 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-16 23:41:23,254 epoch 4 - iter 770/1546 - loss 0.03546867 - time (sec): 34.14 - samples/sec: 1781.08 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-16 23:41:30,011 epoch 4 - iter 924/1546 - loss 0.03515207 - time (sec): 40.90 - samples/sec: 1780.02 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-16 23:41:36,908 epoch 4 - iter 1078/1546 - loss 0.03640270 - time (sec): 47.80 - samples/sec: 1787.74 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-16 23:41:43,799 epoch 4 - iter 1232/1546 - loss 0.03764293 - time (sec): 54.69 - samples/sec: 1786.11 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-16 23:41:50,658 epoch 4 - iter 1386/1546 - loss 0.03780300 - time (sec): 61.55 - samples/sec: 1794.32 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-16 23:41:57,608 epoch 4 - iter 1540/1546 - loss 0.03770330 - time (sec): 68.50 - samples/sec: 1805.56 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-16 23:41:57,872 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:41:57,872 EPOCH 4 done: loss 0.0376 - lr: 0.000020
+ 2023-10-16 23:41:59,933 DEV : loss 0.08385952562093735 - f1-score (micro avg) 0.7728
+ 2023-10-16 23:41:59,945 saving best model
+ 2023-10-16 23:42:00,364 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:42:06,961 epoch 5 - iter 154/1546 - loss 0.01723144 - time (sec): 6.59 - samples/sec: 1877.96 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-16 23:42:13,855 epoch 5 - iter 308/1546 - loss 0.02130077 - time (sec): 13.49 - samples/sec: 1808.30 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-16 23:42:20,643 epoch 5 - iter 462/1546 - loss 0.02384876 - time (sec): 20.28 - samples/sec: 1816.13 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-16 23:42:27,408 epoch 5 - iter 616/1546 - loss 0.02637631 - time (sec): 27.04 - samples/sec: 1818.43 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-16 23:42:34,374 epoch 5 - iter 770/1546 - loss 0.02693432 - time (sec): 34.01 - samples/sec: 1824.58 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-16 23:42:41,338 epoch 5 - iter 924/1546 - loss 0.02773812 - time (sec): 40.97 - samples/sec: 1806.78 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-16 23:42:48,253 epoch 5 - iter 1078/1546 - loss 0.02756016 - time (sec): 47.89 - samples/sec: 1828.54 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-16 23:42:55,133 epoch 5 - iter 1232/1546 - loss 0.02690068 - time (sec): 54.77 - samples/sec: 1815.36 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-16 23:43:01,995 epoch 5 - iter 1386/1546 - loss 0.02760696 - time (sec): 61.63 - samples/sec: 1813.32 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-16 23:43:08,871 epoch 5 - iter 1540/1546 - loss 0.02778396 - time (sec): 68.50 - samples/sec: 1809.87 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-16 23:43:09,124 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:43:09,125 EPOCH 5 done: loss 0.0278 - lr: 0.000017
+ 2023-10-16 23:43:11,166 DEV : loss 0.10244771093130112 - f1-score (micro avg) 0.7896
+ 2023-10-16 23:43:11,178 saving best model
+ 2023-10-16 23:43:11,591 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:43:18,310 epoch 6 - iter 154/1546 - loss 0.01205148 - time (sec): 6.72 - samples/sec: 1874.61 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-16 23:43:25,177 epoch 6 - iter 308/1546 - loss 0.01547760 - time (sec): 13.58 - samples/sec: 1868.15 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-16 23:43:32,027 epoch 6 - iter 462/1546 - loss 0.02030563 - time (sec): 20.44 - samples/sec: 1818.03 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-16 23:43:38,895 epoch 6 - iter 616/1546 - loss 0.02101298 - time (sec): 27.30 - samples/sec: 1818.55 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-16 23:43:45,757 epoch 6 - iter 770/1546 - loss 0.02068268 - time (sec): 34.16 - samples/sec: 1826.89 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-16 23:43:52,699 epoch 6 - iter 924/1546 - loss 0.02111511 - time (sec): 41.11 - samples/sec: 1806.61 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-16 23:43:59,574 epoch 6 - iter 1078/1546 - loss 0.01946876 - time (sec): 47.98 - samples/sec: 1811.25 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-16 23:44:06,428 epoch 6 - iter 1232/1546 - loss 0.01960139 - time (sec): 54.84 - samples/sec: 1782.03 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-16 23:44:13,375 epoch 6 - iter 1386/1546 - loss 0.01983819 - time (sec): 61.78 - samples/sec: 1791.03 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-16 23:44:20,351 epoch 6 - iter 1540/1546 - loss 0.01996004 - time (sec): 68.76 - samples/sec: 1802.90 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-16 23:44:20,625 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:44:20,625 EPOCH 6 done: loss 0.0200 - lr: 0.000013
+ 2023-10-16 23:44:22,734 DEV : loss 0.10681257396936417 - f1-score (micro avg) 0.7824
+ 2023-10-16 23:44:22,747 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:44:29,520 epoch 7 - iter 154/1546 - loss 0.01882747 - time (sec): 6.77 - samples/sec: 1707.23 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-16 23:44:36,364 epoch 7 - iter 308/1546 - loss 0.01786356 - time (sec): 13.62 - samples/sec: 1706.79 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-16 23:44:43,172 epoch 7 - iter 462/1546 - loss 0.01399142 - time (sec): 20.42 - samples/sec: 1720.07 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-16 23:44:50,128 epoch 7 - iter 616/1546 - loss 0.01287695 - time (sec): 27.38 - samples/sec: 1754.57 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-16 23:44:57,096 epoch 7 - iter 770/1546 - loss 0.01298667 - time (sec): 34.35 - samples/sec: 1771.84 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-16 23:45:04,180 epoch 7 - iter 924/1546 - loss 0.01203834 - time (sec): 41.43 - samples/sec: 1779.29 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-16 23:45:11,016 epoch 7 - iter 1078/1546 - loss 0.01282603 - time (sec): 48.27 - samples/sec: 1800.52 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-16 23:45:17,812 epoch 7 - iter 1232/1546 - loss 0.01248353 - time (sec): 55.06 - samples/sec: 1805.09 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-16 23:45:24,596 epoch 7 - iter 1386/1546 - loss 0.01265255 - time (sec): 61.85 - samples/sec: 1798.93 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-16 23:45:31,426 epoch 7 - iter 1540/1546 - loss 0.01287871 - time (sec): 68.68 - samples/sec: 1803.61 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-16 23:45:31,687 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:45:31,687 EPOCH 7 done: loss 0.0128 - lr: 0.000010
+ 2023-10-16 23:45:33,866 DEV : loss 0.10830121487379074 - f1-score (micro avg) 0.8008
+ 2023-10-16 23:45:33,880 saving best model
+ 2023-10-16 23:45:34,323 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:45:41,641 epoch 8 - iter 154/1546 - loss 0.01179072 - time (sec): 7.32 - samples/sec: 1690.40 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-16 23:45:48,999 epoch 8 - iter 308/1546 - loss 0.01016285 - time (sec): 14.67 - samples/sec: 1759.76 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-16 23:45:55,984 epoch 8 - iter 462/1546 - loss 0.01018965 - time (sec): 21.66 - samples/sec: 1748.65 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-16 23:46:02,875 epoch 8 - iter 616/1546 - loss 0.00940054 - time (sec): 28.55 - samples/sec: 1776.30 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-16 23:46:09,813 epoch 8 - iter 770/1546 - loss 0.00901390 - time (sec): 35.49 - samples/sec: 1789.22 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-16 23:46:17,225 epoch 8 - iter 924/1546 - loss 0.00873861 - time (sec): 42.90 - samples/sec: 1782.49 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-16 23:46:24,038 epoch 8 - iter 1078/1546 - loss 0.00860591 - time (sec): 49.71 - samples/sec: 1770.95 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-16 23:46:30,861 epoch 8 - iter 1232/1546 - loss 0.00864071 - time (sec): 56.54 - samples/sec: 1759.08 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-16 23:46:37,745 epoch 8 - iter 1386/1546 - loss 0.00860292 - time (sec): 63.42 - samples/sec: 1768.59 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-16 23:46:44,604 epoch 8 - iter 1540/1546 - loss 0.00880895 - time (sec): 70.28 - samples/sec: 1762.21 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-16 23:46:44,874 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:46:44,875 EPOCH 8 done: loss 0.0088 - lr: 0.000007
+ 2023-10-16 23:46:46,985 DEV : loss 0.11089599132537842 - f1-score (micro avg) 0.7927
+ 2023-10-16 23:46:46,998 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:46:53,836 epoch 9 - iter 154/1546 - loss 0.01019047 - time (sec): 6.84 - samples/sec: 1789.49 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-16 23:47:00,712 epoch 9 - iter 308/1546 - loss 0.00780744 - time (sec): 13.71 - samples/sec: 1843.01 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-16 23:47:07,681 epoch 9 - iter 462/1546 - loss 0.00633308 - time (sec): 20.68 - samples/sec: 1856.35 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-16 23:47:14,504 epoch 9 - iter 616/1546 - loss 0.00589385 - time (sec): 27.51 - samples/sec: 1833.30 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-16 23:47:21,527 epoch 9 - iter 770/1546 - loss 0.00517419 - time (sec): 34.53 - samples/sec: 1816.10 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-16 23:47:28,487 epoch 9 - iter 924/1546 - loss 0.00498572 - time (sec): 41.49 - samples/sec: 1806.16 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-16 23:47:35,321 epoch 9 - iter 1078/1546 - loss 0.00495726 - time (sec): 48.32 - samples/sec: 1788.16 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-16 23:47:42,255 epoch 9 - iter 1232/1546 - loss 0.00485612 - time (sec): 55.26 - samples/sec: 1794.97 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-16 23:47:49,207 epoch 9 - iter 1386/1546 - loss 0.00481666 - time (sec): 62.21 - samples/sec: 1795.14 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-16 23:47:56,096 epoch 9 - iter 1540/1546 - loss 0.00484866 - time (sec): 69.10 - samples/sec: 1790.64 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-16 23:47:56,367 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:47:56,367 EPOCH 9 done: loss 0.0048 - lr: 0.000003
+ 2023-10-16 23:47:58,475 DEV : loss 0.12125992029905319 - f1-score (micro avg) 0.7942
+ 2023-10-16 23:47:58,488 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:48:05,559 epoch 10 - iter 154/1546 - loss 0.00554820 - time (sec): 7.07 - samples/sec: 1779.17 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-16 23:48:12,504 epoch 10 - iter 308/1546 - loss 0.00532913 - time (sec): 14.01 - samples/sec: 1782.70 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-16 23:48:19,380 epoch 10 - iter 462/1546 - loss 0.00538433 - time (sec): 20.89 - samples/sec: 1740.74 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-16 23:48:26,473 epoch 10 - iter 616/1546 - loss 0.00450212 - time (sec): 27.98 - samples/sec: 1758.70 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-16 23:48:33,548 epoch 10 - iter 770/1546 - loss 0.00405722 - time (sec): 35.06 - samples/sec: 1785.83 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-16 23:48:40,558 epoch 10 - iter 924/1546 - loss 0.00351806 - time (sec): 42.07 - samples/sec: 1799.92 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-16 23:48:47,503 epoch 10 - iter 1078/1546 - loss 0.00321229 - time (sec): 49.01 - samples/sec: 1795.61 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-16 23:48:54,315 epoch 10 - iter 1232/1546 - loss 0.00327509 - time (sec): 55.83 - samples/sec: 1784.03 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-16 23:49:01,187 epoch 10 - iter 1386/1546 - loss 0.00338661 - time (sec): 62.70 - samples/sec: 1784.67 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-16 23:49:08,070 epoch 10 - iter 1540/1546 - loss 0.00333392 - time (sec): 69.58 - samples/sec: 1779.95 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-16 23:49:08,338 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:49:08,338 EPOCH 10 done: loss 0.0033 - lr: 0.000000
+ 2023-10-16 23:49:10,368 DEV : loss 0.12140633165836334 - f1-score (micro avg) 0.8065
+ 2023-10-16 23:49:10,380 saving best model
+ 2023-10-16 23:49:11,227 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 23:49:11,228 Loading model from best epoch ...
+ 2023-10-16 23:49:12,839 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
+ 2023-10-16 23:49:18,915
+ Results:
+ - F-score (micro) 0.798
+ - F-score (macro) 0.6998
+ - Accuracy 0.6823
+
+ By class:
+ precision recall f1-score support
+
+ LOC 0.8416 0.8647 0.8530 946
+ BUILDING 0.5440 0.5351 0.5395 185
+ STREET 0.6833 0.7321 0.7069 56
+
+ micro avg 0.7891 0.8071 0.7980 1187
+ macro avg 0.6896 0.7107 0.6998 1187
+ weighted avg 0.7877 0.8071 0.7972 1187
+
+ 2023-10-16 23:49:18,915 ----------------------------------------------------------------------------------------------------
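The micro-averaged row in the final report can be cross-checked from the per-class rows alone: back out each class's true-positive and predicted-span counts from its precision, recall, and support, then pool them across classes. A minimal sketch using the numbers above (rounding to integer counts is an assumption, since the raw counts are not logged):

```python
# Per-class (precision, recall, support) from the final test evaluation above.
per_class = {
    "LOC":      (0.8416, 0.8647, 946),
    "BUILDING": (0.5440, 0.5351, 185),
    "STREET":   (0.6833, 0.7321, 56),
}

tp = pred = gold = 0
for p, r, support in per_class.values():
    c_tp = round(r * support)    # true positives = recall * support
    tp += c_tp
    pred += round(c_tp / p)      # predicted spans = true positives / precision
    gold += support              # gold spans = support

# Micro averaging pools the counts before computing the metrics.
micro_p = tp / pred
micro_r = tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_p, 4), round(micro_r, 4), round(micro_f1, 4))
```

This reproduces the logged micro avg row (0.7891 / 0.8071 / 0.7980), and makes visible why micro F1 sits close to the LOC score: LOC contributes 946 of the 1187 gold spans.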