2023-10-16 23:37:15,287 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,288 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,288 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
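The per-iteration learning rates logged below follow the run's LinearScheduler plugin (warmup_fraction 0.1): a linear warmup to the peak learning rate over the first 10% of optimizer steps, then linear decay to zero. A minimal sketch of that schedule, assuming the logged values of peak lr 3e-05, 1546 iterations per epoch, and 10 epochs; `linear_lr` is a hypothetical helper for illustration, not part of the training code:

```python
def linear_lr(step: int, peak_lr: float = 3e-05,
              steps_per_epoch: int = 1546, max_epochs: int = 10,
              warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero (sketch)."""
    total_steps = steps_per_epoch * max_epochs          # 15460 optimizer steps
    warmup_steps = int(warmup_fraction * total_steps)   # first 10% = 1546 steps
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    # decay linearly from peak_lr at the end of warmup to 0 at the final step
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the log: ~3e-06 at epoch 1 iter 154, ~3e-05 at the end of
# warmup, ~2.7e-05 at the end of epoch 2, and 0 at the end of epoch 10.
```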
2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,288 Train:  6183 sentences
2023-10-16 23:37:15,288 (train_with_dev=False, train_with_test=False)
2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,288 Training Params:
2023-10-16 23:37:15,288  - learning_rate: "3e-05"
2023-10-16 23:37:15,288  - mini_batch_size: "4"
2023-10-16 23:37:15,288  - max_epochs: "10"
2023-10-16 23:37:15,288  - shuffle: "True"
2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,288 Plugins:
2023-10-16 23:37:15,289  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,289 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 23:37:15,289  - metric: "('micro avg', 'f1-score')"
2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,289 Computation:
2023-10-16 23:37:15,289  - compute on device: cuda:0
2023-10-16 23:37:15,289  - embedding storage: none
2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,289 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:22,073 epoch 1 - iter 154/1546 - loss 1.92565161 - time (sec): 6.78 - samples/sec: 1898.22 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:37:28,810 epoch 1 - iter 308/1546 - loss 1.09983562 - time (sec): 13.52 - samples/sec: 1886.82 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:37:35,629 epoch 1 - iter 462/1546 - loss 0.79988559 - time (sec): 20.34 - samples/sec: 1855.62 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:37:42,402 epoch 1 - iter 616/1546 - loss 0.64149237 - time (sec): 27.11 - samples/sec: 1841.30 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:37:49,223 epoch 1 - iter 770/1546 - loss 0.53678387 - time (sec): 33.93 - samples/sec: 1842.84 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:37:56,159 epoch 1 - iter 924/1546 - loss 0.47121079 - time (sec): 40.87 - samples/sec: 1817.10 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:38:02,949 epoch 1 - iter 1078/1546 - loss 0.42227833 - time (sec): 47.66 - samples/sec: 1810.49 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:38:09,776 epoch 1 - iter 1232/1546 - loss 0.38574946 - time (sec): 54.49 - samples/sec: 1807.68 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:38:16,772 epoch 1 - iter 1386/1546 - loss 0.35423814 - time (sec): 61.48 - samples/sec: 1809.07 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:38:23,585 epoch 1 - iter 1540/1546 - loss 0.32780022 - time (sec): 68.30 - samples/sec: 1814.47 - lr: 0.000030 - momentum: 0.000000
2023-10-16 23:38:23,847 ----------------------------------------------------------------------------------------------------
2023-10-16 23:38:23,847 EPOCH 1 done: loss 0.3270 - lr: 0.000030
2023-10-16 23:38:25,866 DEV : loss 0.06755758821964264 - f1-score (micro avg)  0.7102
2023-10-16 23:38:25,894 saving best model
2023-10-16 23:38:26,224 ----------------------------------------------------------------------------------------------------
2023-10-16 23:38:33,203 epoch 2 - iter 154/1546 - loss 0.09083964 - time (sec): 6.98 - samples/sec: 1894.92 - lr: 0.000030 - momentum: 0.000000
2023-10-16 23:38:40,075 epoch 2 - iter 308/1546 - loss 0.08611689 - time (sec): 13.85 - samples/sec: 1864.36 - lr: 0.000029 - momentum: 0.000000
2023-10-16 23:38:46,893 epoch 2 - iter 462/1546 - loss 0.08435768 - time (sec): 20.67 - samples/sec: 1843.68 - lr: 0.000029 - momentum: 0.000000
2023-10-16 23:38:53,775 epoch 2 - iter 616/1546 - loss 0.08829396 - time (sec): 27.55 - samples/sec: 1815.83 - lr: 0.000029 - momentum: 0.000000
2023-10-16 23:39:00,586 epoch 2 - iter 770/1546 - loss 0.08874921 - time (sec): 34.36 - samples/sec: 1790.29 - lr: 0.000028 - momentum: 0.000000
2023-10-16 23:39:07,326 epoch 2 - iter 924/1546 - loss 0.08870459 - time (sec): 41.10 - samples/sec: 1801.42 - lr: 0.000028 - momentum: 0.000000
2023-10-16 23:39:14,168 epoch 2 - iter 1078/1546 - loss 0.08797980 - time (sec): 47.94 - samples/sec: 1809.00 - lr: 0.000028 - momentum: 0.000000
2023-10-16 23:39:21,230 epoch 2 - iter 1232/1546 - loss 0.08446684 - time (sec): 55.00 - samples/sec: 1812.34 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:39:28,023 epoch 2 - iter 1386/1546 - loss 0.08329303 - time (sec): 61.80 - samples/sec: 1803.61 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:39:34,880 epoch 2 - iter 1540/1546 - loss 0.08339448 - time (sec): 68.65 - samples/sec: 1805.87 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:39:35,138 ----------------------------------------------------------------------------------------------------
2023-10-16 23:39:35,138 EPOCH 2 done: loss 0.0833 - lr: 0.000027
2023-10-16 23:39:37,244 DEV : loss 0.06139129400253296 - f1-score (micro avg)  0.7623
2023-10-16 23:39:37,257 saving best model
2023-10-16 23:39:37,686 ----------------------------------------------------------------------------------------------------
2023-10-16 23:39:44,558 epoch 3 - iter 154/1546 - loss 0.04155378 - time (sec): 6.87 - samples/sec: 1859.19 - lr: 0.000026 - momentum: 0.000000
2023-10-16 23:39:51,450 epoch 3 - iter 308/1546 - loss 0.05960039 - time (sec): 13.76 - samples/sec: 1877.23 - lr: 0.000026 - momentum: 0.000000
2023-10-16 23:39:58,368 epoch 3 - iter 462/1546 - loss 0.05708094 - time (sec): 20.68 - samples/sec: 1894.00 - lr: 0.000026 - momentum: 0.000000
2023-10-16 23:40:05,183 epoch 3 - iter 616/1546 - loss 0.05432285 - time (sec): 27.50 - samples/sec: 1847.87 - lr: 0.000025 - momentum: 0.000000
2023-10-16 23:40:12,029 epoch 3 - iter 770/1546 - loss 0.05492115 - time (sec): 34.34 - samples/sec: 1830.11 - lr: 0.000025 - momentum: 0.000000
2023-10-16 23:40:18,815 epoch 3 - iter 924/1546 - loss 0.05565481 - time (sec): 41.13 - samples/sec: 1815.96 - lr: 0.000025 - momentum: 0.000000
2023-10-16 23:40:25,763 epoch 3 - iter 1078/1546 - loss 0.05690822 - time (sec): 48.08 - samples/sec: 1824.49 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:40:32,745 epoch 3 - iter 1232/1546 - loss 0.05538883 - time (sec): 55.06 - samples/sec: 1813.92 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:40:39,610 epoch 3 - iter 1386/1546 - loss 0.05683208 - time (sec): 61.92 - samples/sec: 1799.53 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:40:46,465 epoch 3 - iter 1540/1546 - loss 0.05646703 - time (sec): 68.78 - samples/sec: 1801.06 - lr: 0.000023 - momentum: 0.000000
2023-10-16 23:40:46,726 ----------------------------------------------------------------------------------------------------
2023-10-16 23:40:46,727 EPOCH 3 done: loss 0.0563 - lr: 0.000023
2023-10-16 23:40:49,096 DEV : loss 0.08413656055927277 - f1-score (micro avg)  0.7407
2023-10-16 23:40:49,109 ----------------------------------------------------------------------------------------------------
2023-10-16 23:40:56,043 epoch 4 - iter 154/1546 - loss 0.03911177 - time (sec): 6.93 - samples/sec: 1680.71 - lr: 0.000023 - momentum: 0.000000
2023-10-16 23:41:03,048 epoch 4 - iter 308/1546 - loss 0.03409785 - time (sec): 13.94 - samples/sec: 1698.01 - lr: 0.000023 - momentum: 0.000000
2023-10-16 23:41:09,927 epoch 4 - iter 462/1546 - loss 0.03599079 - time (sec): 20.82 - samples/sec: 1745.38 - lr: 0.000022 - momentum: 0.000000
2023-10-16 23:41:16,679 epoch 4 - iter 616/1546 - loss 0.03396381 - time (sec): 27.57 - samples/sec: 1762.13 - lr: 0.000022 - momentum: 0.000000
2023-10-16 23:41:23,254 epoch 4 - iter 770/1546 - loss 0.03546867 - time (sec): 34.14 - samples/sec: 1781.08 - lr: 0.000022 - momentum: 0.000000
2023-10-16 23:41:30,011 epoch 4 - iter 924/1546 - loss 0.03515207 - time (sec): 40.90 - samples/sec: 1780.02 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:41:36,908 epoch 4 - iter 1078/1546 - loss 0.03640270 - time (sec): 47.80 - samples/sec: 1787.74 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:41:43,799 epoch 4 - iter 1232/1546 - loss 0.03764293 - time (sec): 54.69 - samples/sec: 1786.11 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:41:50,658 epoch 4 - iter 1386/1546 - loss 0.03780300 - time (sec): 61.55 - samples/sec: 1794.32 - lr: 0.000020 - momentum: 0.000000
2023-10-16 23:41:57,608 epoch 4 - iter 1540/1546 - loss 0.03770330 - time (sec): 68.50 - samples/sec: 1805.56 - lr: 0.000020 - momentum: 0.000000
2023-10-16 23:41:57,872 ----------------------------------------------------------------------------------------------------
2023-10-16 23:41:57,872 EPOCH 4 done: loss 0.0376 - lr: 0.000020
2023-10-16 23:41:59,933 DEV : loss 0.08385952562093735 - f1-score (micro avg)  0.7728
2023-10-16 23:41:59,945 saving best model
2023-10-16 23:42:00,364 ----------------------------------------------------------------------------------------------------
2023-10-16 23:42:06,961 epoch 5 - iter 154/1546 - loss 0.01723144 - time (sec): 6.59 - samples/sec: 1877.96 - lr: 0.000020 - momentum: 0.000000
2023-10-16 23:42:13,855 epoch 5 - iter 308/1546 - loss 0.02130077 - time (sec): 13.49 - samples/sec: 1808.30 - lr: 0.000019 - momentum: 0.000000
2023-10-16 23:42:20,643 epoch 5 - iter 462/1546 - loss 0.02384876 - time (sec): 20.28 - samples/sec: 1816.13 - lr: 0.000019 - momentum: 0.000000
2023-10-16 23:42:27,408 epoch 5 - iter 616/1546 - loss 0.02637631 - time (sec): 27.04 - samples/sec: 1818.43 - lr: 0.000019 - momentum: 0.000000
2023-10-16 23:42:34,374 epoch 5 - iter 770/1546 - loss 0.02693432 - time (sec): 34.01 - samples/sec: 1824.58 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:42:41,338 epoch 5 - iter 924/1546 - loss 0.02773812 - time (sec): 40.97 - samples/sec: 1806.78 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:42:48,253 epoch 5 - iter 1078/1546 - loss 0.02756016 - time (sec): 47.89 - samples/sec: 1828.54 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:42:55,133 epoch 5 - iter 1232/1546 - loss 0.02690068 - time (sec): 54.77 - samples/sec: 1815.36 - lr: 0.000017 - momentum: 0.000000
2023-10-16 23:43:01,995 epoch 5 - iter 1386/1546 - loss 0.02760696 - time (sec): 61.63 - samples/sec: 1813.32 - lr: 0.000017 - momentum: 0.000000
2023-10-16 23:43:08,871 epoch 5 - iter 1540/1546 - loss 0.02778396 - time (sec): 68.50 - samples/sec: 1809.87 - lr: 0.000017 - momentum: 0.000000
2023-10-16 23:43:09,124 ----------------------------------------------------------------------------------------------------
2023-10-16 23:43:09,125 EPOCH 5 done: loss 0.0278 - lr: 0.000017
2023-10-16 23:43:11,166 DEV : loss 0.10244771093130112 - f1-score (micro avg)  0.7896
2023-10-16 23:43:11,178 saving best model
2023-10-16 23:43:11,591 ----------------------------------------------------------------------------------------------------
2023-10-16 23:43:18,310 epoch 6 - iter 154/1546 - loss 0.01205148 - time (sec): 6.72 - samples/sec: 1874.61 - lr: 0.000016 - momentum: 0.000000
2023-10-16 23:43:25,177 epoch 6 - iter 308/1546 - loss 0.01547760 - time (sec): 13.58 - samples/sec: 1868.15 - lr: 0.000016 - momentum: 0.000000
2023-10-16 23:43:32,027 epoch 6 - iter 462/1546 - loss 0.02030563 - time (sec): 20.44 - samples/sec: 1818.03 - lr: 0.000016 - momentum: 0.000000
2023-10-16 23:43:38,895 epoch 6 - iter 616/1546 - loss 0.02101298 - time (sec): 27.30 - samples/sec: 1818.55 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:43:45,757 epoch 6 - iter 770/1546 - loss 0.02068268 - time (sec): 34.16 - samples/sec: 1826.89 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:43:52,699 epoch 6 - iter 924/1546 - loss 0.02111511 - time (sec): 41.11 - samples/sec: 1806.61 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:43:59,574 epoch 6 - iter 1078/1546 - loss 0.01946876 - time (sec): 47.98 - samples/sec: 1811.25 - lr: 0.000014 - momentum: 0.000000
2023-10-16 23:44:06,428 epoch 6 - iter 1232/1546 - loss 0.01960139 - time (sec): 54.84 - samples/sec: 1782.03 - lr: 0.000014 - momentum: 0.000000
2023-10-16 23:44:13,375 epoch 6 - iter 1386/1546 - loss 0.01983819 - time (sec): 61.78 - samples/sec: 1791.03 - lr: 0.000014 - momentum: 0.000000
2023-10-16 23:44:20,351 epoch 6 - iter 1540/1546 - loss 0.01996004 - time (sec): 68.76 - samples/sec: 1802.90 - lr: 0.000013 - momentum: 0.000000
2023-10-16 23:44:20,625 ----------------------------------------------------------------------------------------------------
2023-10-16 23:44:20,625 EPOCH 6 done: loss 0.0200 - lr: 0.000013
2023-10-16 23:44:22,734 DEV : loss 0.10681257396936417 - f1-score (micro avg)  0.7824
2023-10-16 23:44:22,747 ----------------------------------------------------------------------------------------------------
2023-10-16 23:44:29,520 epoch 7 - iter 154/1546 - loss 0.01882747 - time (sec): 6.77 - samples/sec: 1707.23 - lr: 0.000013 - momentum: 0.000000
2023-10-16 23:44:36,364 epoch 7 - iter 308/1546 - loss 0.01786356 - time (sec): 13.62 - samples/sec: 1706.79 - lr: 0.000013 - momentum: 0.000000
2023-10-16 23:44:43,172 epoch 7 - iter 462/1546 - loss 0.01399142 - time (sec): 20.42 - samples/sec: 1720.07 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:44:50,128 epoch 7 - iter 616/1546 - loss 0.01287695 - time (sec): 27.38 - samples/sec: 1754.57 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:44:57,096 epoch 7 - iter 770/1546 - loss 0.01298667 - time (sec): 34.35 - samples/sec: 1771.84 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:45:04,180 epoch 7 - iter 924/1546 - loss 0.01203834 - time (sec): 41.43 - samples/sec: 1779.29 - lr: 0.000011 - momentum: 0.000000
2023-10-16 23:45:11,016 epoch 7 - iter 1078/1546 - loss 0.01282603 - time (sec): 48.27 - samples/sec: 1800.52 - lr: 0.000011 - momentum: 0.000000
2023-10-16 23:45:17,812 epoch 7 - iter 1232/1546 - loss 0.01248353 - time (sec): 55.06 - samples/sec: 1805.09 - lr: 0.000011 - momentum: 0.000000
2023-10-16 23:45:24,596 epoch 7 - iter 1386/1546 - loss 0.01265255 - time (sec): 61.85 - samples/sec: 1798.93 - lr: 0.000010 - momentum: 0.000000
2023-10-16 23:45:31,426 epoch 7 - iter 1540/1546 - loss 0.01287871 - time (sec): 68.68 - samples/sec: 1803.61 - lr: 0.000010 - momentum: 0.000000
2023-10-16 23:45:31,687 ----------------------------------------------------------------------------------------------------
2023-10-16 23:45:31,687 EPOCH 7 done: loss 0.0128 - lr: 0.000010
2023-10-16 23:45:33,866 DEV : loss 0.10830121487379074 - f1-score (micro avg)  0.8008
2023-10-16 23:45:33,880 saving best model
2023-10-16 23:45:34,323 ----------------------------------------------------------------------------------------------------
2023-10-16 23:45:41,641 epoch 8 - iter 154/1546 - loss 0.01179072 - time (sec): 7.32 - samples/sec: 1690.40 - lr: 0.000010 - momentum: 0.000000
2023-10-16 23:45:48,999 epoch 8 - iter 308/1546 - loss 0.01016285 - time (sec): 14.67 - samples/sec: 1759.76 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:45:55,984 epoch 8 - iter 462/1546 - loss 0.01018965 - time (sec): 21.66 - samples/sec: 1748.65 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:46:02,875 epoch 8 - iter 616/1546 - loss 0.00940054 - time (sec): 28.55 - samples/sec: 1776.30 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:46:09,813 epoch 8 - iter 770/1546 - loss 0.00901390 - time (sec): 35.49 - samples/sec: 1789.22 - lr: 0.000008 - momentum: 0.000000
2023-10-16 23:46:17,225 epoch 8 - iter 924/1546 - loss 0.00873861 - time (sec): 42.90 - samples/sec: 1782.49 - lr: 0.000008 - momentum: 0.000000
2023-10-16 23:46:24,038 epoch 8 - iter 1078/1546 - loss 0.00860591 - time (sec): 49.71 - samples/sec: 1770.95 - lr: 0.000008 - momentum: 0.000000
2023-10-16 23:46:30,861 epoch 8 - iter 1232/1546 - loss 0.00864071 - time (sec): 56.54 - samples/sec: 1759.08 - lr: 0.000007 - momentum: 0.000000
2023-10-16 23:46:37,745 epoch 8 - iter 1386/1546 - loss 0.00860292 - time (sec): 63.42 - samples/sec: 1768.59 - lr: 0.000007 - momentum: 0.000000
2023-10-16 23:46:44,604 epoch 8 - iter 1540/1546 - loss 0.00880895 - time (sec): 70.28 - samples/sec: 1762.21 - lr: 0.000007 - momentum: 0.000000
2023-10-16 23:46:44,874 ----------------------------------------------------------------------------------------------------
2023-10-16 23:46:44,875 EPOCH 8 done: loss 0.0088 - lr: 0.000007
2023-10-16 23:46:46,985 DEV : loss 0.11089599132537842 - f1-score (micro avg)  0.7927
2023-10-16 23:46:46,998 ----------------------------------------------------------------------------------------------------
2023-10-16 23:46:53,836 epoch 9 - iter 154/1546 - loss 0.01019047 - time (sec): 6.84 - samples/sec: 1789.49 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:47:00,712 epoch 9 - iter 308/1546 - loss 0.00780744 - time (sec): 13.71 - samples/sec: 1843.01 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:47:07,681 epoch 9 - iter 462/1546 - loss 0.00633308 - time (sec): 20.68 - samples/sec: 1856.35 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:47:14,504 epoch 9 - iter 616/1546 - loss 0.00589385 - time (sec): 27.51 - samples/sec: 1833.30 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:47:21,527 epoch 9 - iter 770/1546 - loss 0.00517419 - time (sec): 34.53 - samples/sec: 1816.10 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:47:28,487 epoch 9 - iter 924/1546 - loss 0.00498572 - time (sec): 41.49 - samples/sec: 1806.16 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:47:35,321 epoch 9 - iter 1078/1546 - loss 0.00495726 - time (sec): 48.32 - samples/sec: 1788.16 - lr: 0.000004 - momentum: 0.000000
2023-10-16 23:47:42,255 epoch 9 - iter 1232/1546 - loss 0.00485612 - time (sec): 55.26 - samples/sec: 1794.97 - lr: 0.000004 - momentum: 0.000000
2023-10-16 23:47:49,207 epoch 9 - iter 1386/1546 - loss 0.00481666 - time (sec): 62.21 - samples/sec: 1795.14 - lr: 0.000004 - momentum: 0.000000
2023-10-16 23:47:56,096 epoch 9 - iter 1540/1546 - loss 0.00484866 - time (sec): 69.10 - samples/sec: 1790.64 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:47:56,367 ----------------------------------------------------------------------------------------------------
2023-10-16 23:47:56,367 EPOCH 9 done: loss 0.0048 - lr: 0.000003
2023-10-16 23:47:58,475 DEV : loss 0.12125992029905319 - f1-score (micro avg)  0.7942
2023-10-16 23:47:58,488 ----------------------------------------------------------------------------------------------------
2023-10-16 23:48:05,559 epoch 10 - iter 154/1546 - loss 0.00554820 - time (sec): 7.07 - samples/sec: 1779.17 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:48:12,504 epoch 10 - iter 308/1546 - loss 0.00532913 - time (sec): 14.01 - samples/sec: 1782.70 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:48:19,380 epoch 10 - iter 462/1546 - loss 0.00538433 - time (sec): 20.89 - samples/sec: 1740.74 - lr: 0.000002 - momentum: 0.000000
2023-10-16 23:48:26,473 epoch 10 - iter 616/1546 - loss 0.00450212 - time (sec): 27.98 - samples/sec: 1758.70 - lr: 0.000002 - momentum: 0.000000
2023-10-16 23:48:33,548 epoch 10 - iter 770/1546 - loss 0.00405722 - time (sec): 35.06 - samples/sec: 1785.83 - lr: 0.000002 - momentum: 0.000000
2023-10-16 23:48:40,558 epoch 10 - iter 924/1546 - loss 0.00351806 - time (sec): 42.07 - samples/sec: 1799.92 - lr: 0.000001 - momentum: 0.000000
2023-10-16 23:48:47,503 epoch 10 - iter 1078/1546 - loss 0.00321229 - time (sec): 49.01 - samples/sec: 1795.61 - lr: 0.000001 - momentum: 0.000000
2023-10-16 23:48:54,315 epoch 10 - iter 1232/1546 - loss 0.00327509 - time (sec): 55.83 - samples/sec: 1784.03 - lr: 0.000001 - momentum: 0.000000
2023-10-16 23:49:01,187 epoch 10 - iter 1386/1546 - loss 0.00338661 - time (sec): 62.70 - samples/sec: 1784.67 - lr: 0.000000 - momentum: 0.000000
2023-10-16 23:49:08,070 epoch 10 - iter 1540/1546 - loss 0.00333392 - time (sec): 69.58 - samples/sec: 1779.95 - lr: 0.000000 - momentum: 0.000000
2023-10-16 23:49:08,338 ----------------------------------------------------------------------------------------------------
2023-10-16 23:49:08,338 EPOCH 10 done: loss 0.0033 - lr: 0.000000
2023-10-16 23:49:10,368 DEV : loss 0.12140633165836334 - f1-score (micro avg)  0.8065
2023-10-16 23:49:10,380 saving best model
2023-10-16 23:49:11,227 ----------------------------------------------------------------------------------------------------
2023-10-16 23:49:11,228 Loading model from best epoch ...
2023-10-16 23:49:12,839 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 23:49:18,915 Results:
- F-score (micro) 0.798
- F-score (macro) 0.6998
- Accuracy 0.6823

By class:
              precision    recall  f1-score   support

         LOC     0.8416    0.8647    0.8530       946
    BUILDING     0.5440    0.5351    0.5395       185
      STREET     0.6833    0.7321    0.7069        56

   micro avg     0.7891    0.8071    0.7980      1187
   macro avg     0.6896    0.7107    0.6998      1187
weighted avg     0.7877    0.8071    0.7972      1187

2023-10-16 23:49:18,915 ----------------------------------------------------------------------------------------------------
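The aggregate rows of the final evaluation can be reproduced from the per-class numbers: the macro average is the unweighted mean of per-class F1, the weighted average weights each class by its support, and micro-average F1 is the harmonic mean of the micro precision and recall. A minimal check against the values logged above:

```python
# Per-class (precision, recall, f1, support) from the final evaluation table.
per_class = {
    "LOC":      (0.8416, 0.8647, 0.8530, 946),
    "BUILDING": (0.5440, 0.5351, 0.5395, 185),
    "STREET":   (0.6833, 0.7321, 0.7069, 56),
}

# Macro avg: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted avg: per-class F1 weighted by class support.
total = sum(s for _, _, _, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total

# Micro avg F1: harmonic mean of the logged micro precision and recall.
p, r = 0.7891, 0.8071
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# → 0.6998 0.7972 0.798
```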