2023-10-25 19:56:58,111 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Train: 20847 sentences
2023-10-25 19:56:58,112 (train_with_dev=False, train_with_test=False)
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Training Params:
2023-10-25 19:56:58,112 - learning_rate: "5e-05"
2023-10-25 19:56:58,112 - mini_batch_size: "4"
2023-10-25 19:56:58,112 - max_epochs: "10"
2023-10-25 19:56:58,112 - shuffle: "True"
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Plugins:
2023-10-25 19:56:58,112 - TensorboardLogger
2023-10-25 19:56:58,112 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 19:56:58,112 - metric: "('micro avg', 'f1-score')"
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Computation:
2023-10-25 19:56:58,112 - compute on device: cuda:0
2023-10-25 19:56:58,112 - embedding storage: none
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,113 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,113 Logging anything other than
scalars to TensorBoard is currently not supported.
2023-10-25 19:57:20,386 epoch 1 - iter 521/5212 - loss 1.08348570 - time (sec): 22.27 - samples/sec: 1626.00 - lr: 0.000005 - momentum: 0.000000
2023-10-25 19:57:42,372 epoch 1 - iter 1042/5212 - loss 0.70758840 - time (sec): 44.26 - samples/sec: 1663.56 - lr: 0.000010 - momentum: 0.000000
2023-10-25 19:58:04,781 epoch 1 - iter 1563/5212 - loss 0.55092863 - time (sec): 66.67 - samples/sec: 1681.34 - lr: 0.000015 - momentum: 0.000000
2023-10-25 19:58:26,646 epoch 1 - iter 2084/5212 - loss 0.47203657 - time (sec): 88.53 - samples/sec: 1679.64 - lr: 0.000020 - momentum: 0.000000
2023-10-25 19:58:48,297 epoch 1 - iter 2605/5212 - loss 0.42902220 - time (sec): 110.18 - samples/sec: 1665.66 - lr: 0.000025 - momentum: 0.000000
2023-10-25 19:59:10,349 epoch 1 - iter 3126/5212 - loss 0.39940587 - time (sec): 132.24 - samples/sec: 1653.17 - lr: 0.000030 - momentum: 0.000000
2023-10-25 19:59:32,713 epoch 1 - iter 3647/5212 - loss 0.37156643 - time (sec): 154.60 - samples/sec: 1673.93 - lr: 0.000035 - momentum: 0.000000
2023-10-25 19:59:54,575 epoch 1 - iter 4168/5212 - loss 0.34705201 - time (sec): 176.46 - samples/sec: 1679.91 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:00:16,906 epoch 1 - iter 4689/5212 - loss 0.33147071 - time (sec): 198.79 - samples/sec: 1670.29 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:00:39,372 epoch 1 - iter 5210/5212 - loss 0.32226811 - time (sec): 221.26 - samples/sec: 1659.98 - lr: 0.000050 - momentum: 0.000000
2023-10-25 20:00:39,451 ----------------------------------------------------------------------------------------------------
2023-10-25 20:00:39,451 EPOCH 1 done: loss 0.3222 - lr: 0.000050
2023-10-25 20:00:43,129 DEV : loss 0.11420618742704391 - f1-score (micro avg) 0.2931
2023-10-25 20:00:43,155 saving best model
2023-10-25 20:00:43,631 ----------------------------------------------------------------------------------------------------
2023-10-25 20:01:05,416 epoch 2 - iter
521/5212 - loss 0.18552954 - time (sec): 21.78 - samples/sec: 1681.53 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:01:27,647 epoch 2 - iter 1042/5212 - loss 0.17954476 - time (sec): 44.01 - samples/sec: 1655.99 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:01:49,727 epoch 2 - iter 1563/5212 - loss 0.18022652 - time (sec): 66.09 - samples/sec: 1672.98 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:02:11,870 epoch 2 - iter 2084/5212 - loss 0.18252575 - time (sec): 88.24 - samples/sec: 1672.56 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:02:34,162 epoch 2 - iter 2605/5212 - loss 0.18537995 - time (sec): 110.53 - samples/sec: 1660.49 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:02:56,495 epoch 2 - iter 3126/5212 - loss 0.18574013 - time (sec): 132.86 - samples/sec: 1670.44 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:03:18,676 epoch 2 - iter 3647/5212 - loss 0.21371016 - time (sec): 155.04 - samples/sec: 1663.40 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:03:40,512 epoch 2 - iter 4168/5212 - loss 0.22735133 - time (sec): 176.88 - samples/sec: 1654.56 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:04:02,042 epoch 2 - iter 4689/5212 - loss 0.26232838 - time (sec): 198.41 - samples/sec: 1658.91 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:04:23,713 epoch 2 - iter 5210/5212 - loss 0.29428995 - time (sec): 220.08 - samples/sec: 1669.05 - lr: 0.000044 - momentum: 0.000000
2023-10-25 20:04:23,796 ----------------------------------------------------------------------------------------------------
2023-10-25 20:04:23,796 EPOCH 2 done: loss 0.2942 - lr: 0.000044
2023-10-25 20:04:30,604 DEV : loss 0.21574755012989044 - f1-score (micro avg) 0.0
2023-10-25 20:04:30,629 ----------------------------------------------------------------------------------------------------
2023-10-25 20:04:52,387 epoch 3 - iter 521/5212 - loss 0.61023887 - time (sec): 21.76 - samples/sec: 1676.40 - lr: 0.000044 - momentum: 0.000000
2023-10-25 20:05:14,600 epoch 3
- iter 1042/5212 - loss 0.56119949 - time (sec): 43.97 - samples/sec: 1730.25 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:05:36,372 epoch 3 - iter 1563/5212 - loss 0.56421056 - time (sec): 65.74 - samples/sec: 1717.45 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:05:58,622 epoch 3 - iter 2084/5212 - loss 0.57799536 - time (sec): 87.99 - samples/sec: 1690.37 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:06:20,339 epoch 3 - iter 2605/5212 - loss 0.57725572 - time (sec): 109.71 - samples/sec: 1690.27 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:06:42,574 epoch 3 - iter 3126/5212 - loss 0.57924881 - time (sec): 131.94 - samples/sec: 1666.46 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:07:04,690 epoch 3 - iter 3647/5212 - loss 0.58468851 - time (sec): 154.06 - samples/sec: 1662.21 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:07:27,168 epoch 3 - iter 4168/5212 - loss 0.58359992 - time (sec): 176.54 - samples/sec: 1672.35 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:07:48,934 epoch 3 - iter 4689/5212 - loss 0.58385727 - time (sec): 198.30 - samples/sec: 1661.25 - lr: 0.000039 - momentum: 0.000000
2023-10-25 20:08:10,931 epoch 3 - iter 5210/5212 - loss 0.57967064 - time (sec): 220.30 - samples/sec: 1667.45 - lr: 0.000039 - momentum: 0.000000
2023-10-25 20:08:11,008 ----------------------------------------------------------------------------------------------------
2023-10-25 20:08:11,008 EPOCH 3 done: loss 0.5798 - lr: 0.000039
2023-10-25 20:08:17,809 DEV : loss 0.22797180712223053 - f1-score (micro avg) 0.0
2023-10-25 20:08:17,834 ----------------------------------------------------------------------------------------------------
2023-10-25 20:08:39,704 epoch 4 - iter 521/5212 - loss 0.53509142 - time (sec): 21.87 - samples/sec: 1670.95 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:09:01,224 epoch 4 - iter 1042/5212 - loss 0.57089598 - time (sec): 43.39 - samples/sec: 1769.02 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:09:23,217
epoch 4 - iter 1563/5212 - loss 0.56098897 - time (sec): 65.38 - samples/sec: 1736.26 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:09:45,468 epoch 4 - iter 2084/5212 - loss 0.56792280 - time (sec): 87.63 - samples/sec: 1733.55 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:10:07,654 epoch 4 - iter 2605/5212 - loss 0.55856576 - time (sec): 109.82 - samples/sec: 1732.46 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:10:29,680 epoch 4 - iter 3126/5212 - loss 0.55882546 - time (sec): 131.84 - samples/sec: 1704.50 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:10:51,853 epoch 4 - iter 3647/5212 - loss 0.56106127 - time (sec): 154.02 - samples/sec: 1706.01 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:11:13,676 epoch 4 - iter 4168/5212 - loss 0.55614039 - time (sec): 175.84 - samples/sec: 1698.08 - lr: 0.000034 - momentum: 0.000000
2023-10-25 20:11:35,559 epoch 4 - iter 4689/5212 - loss 0.55568875 - time (sec): 197.72 - samples/sec: 1687.66 - lr: 0.000034 - momentum: 0.000000
2023-10-25 20:11:57,314 epoch 4 - iter 5210/5212 - loss 0.56143141 - time (sec): 219.48 - samples/sec: 1673.89 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:11:57,394 ----------------------------------------------------------------------------------------------------
2023-10-25 20:11:57,394 EPOCH 4 done: loss 0.5614 - lr: 0.000033
2023-10-25 20:12:04,213 DEV : loss 0.22103846073150635 - f1-score (micro avg) 0.0
2023-10-25 20:12:04,238 ----------------------------------------------------------------------------------------------------
2023-10-25 20:12:26,199 epoch 5 - iter 521/5212 - loss 0.59690938 - time (sec): 21.96 - samples/sec: 1651.61 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:12:47,893 epoch 5 - iter 1042/5212 - loss 0.57083926 - time (sec): 43.65 - samples/sec: 1634.39 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:13:09,967 epoch 5 - iter 1563/5212 - loss 0.55181100 - time (sec): 65.73 - samples/sec: 1668.56 - lr: 0.000032 - momentum: 0.000000
2023-10-25
20:13:32,378 epoch 5 - iter 2084/5212 - loss 0.55381084 - time (sec): 88.14 - samples/sec: 1674.65 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:13:54,700 epoch 5 - iter 2605/5212 - loss 0.55370422 - time (sec): 110.46 - samples/sec: 1669.48 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:14:16,931 epoch 5 - iter 3126/5212 - loss 0.55580912 - time (sec): 132.69 - samples/sec: 1670.75 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:14:39,010 epoch 5 - iter 3647/5212 - loss 0.55361204 - time (sec): 154.77 - samples/sec: 1648.30 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:15:01,019 epoch 5 - iter 4168/5212 - loss 0.55662199 - time (sec): 176.78 - samples/sec: 1657.09 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:15:23,088 epoch 5 - iter 4689/5212 - loss 0.54917164 - time (sec): 198.85 - samples/sec: 1654.63 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:15:45,019 epoch 5 - iter 5210/5212 - loss 0.54696535 - time (sec): 220.78 - samples/sec: 1664.02 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:15:45,098 ----------------------------------------------------------------------------------------------------
2023-10-25 20:15:45,098 EPOCH 5 done: loss 0.5469 - lr: 0.000028
2023-10-25 20:15:51,857 DEV : loss 0.2455786168575287 - f1-score (micro avg) 0.0
2023-10-25 20:15:51,883 ----------------------------------------------------------------------------------------------------
2023-10-25 20:16:14,379 epoch 6 - iter 521/5212 - loss 0.57254029 - time (sec): 22.49 - samples/sec: 1691.82 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:16:36,220 epoch 6 - iter 1042/5212 - loss 0.52755461 - time (sec): 44.34 - samples/sec: 1655.77 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:16:58,010 epoch 6 - iter 1563/5212 - loss 0.53602606 - time (sec): 66.13 - samples/sec: 1643.95 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:17:20,136 epoch 6 - iter 2084/5212 - loss 0.53959820 - time (sec): 88.25 - samples/sec: 1666.58 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:17:41,835 epoch 6 - iter 2605/5212 - loss 0.54382641 - time (sec): 109.95 - samples/sec: 1658.66 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:18:04,025 epoch 6 - iter 3126/5212 - loss 0.54769962 - time (sec): 132.14 - samples/sec: 1669.15 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:18:25,736 epoch 6 - iter 3647/5212 - loss 0.54874941 - time (sec): 153.85 - samples/sec: 1658.88 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:18:47,563 epoch 6 - iter 4168/5212 - loss 0.54164387 - time (sec): 175.68 - samples/sec: 1663.19 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:19:09,794 epoch 6 - iter 4689/5212 - loss 0.53460429 - time (sec): 197.91 - samples/sec: 1668.66 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:19:32,171 epoch 6 - iter 5210/5212 - loss 0.53273458 - time (sec): 220.29 - samples/sec: 1667.76 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:19:32,256 ----------------------------------------------------------------------------------------------------
2023-10-25 20:19:32,256 EPOCH 6 done: loss 0.5327 - lr: 0.000022
2023-10-25 20:19:39,076 DEV : loss 0.24423913657665253 - f1-score (micro avg) 0.0
2023-10-25 20:19:39,103 ----------------------------------------------------------------------------------------------------
2023-10-25 20:20:01,010 epoch 7 - iter 521/5212 - loss 0.52293661 - time (sec): 21.91 - samples/sec: 1654.87 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:20:22,981 epoch 7 - iter 1042/5212 - loss 0.55864614 - time (sec): 43.88 - samples/sec: 1685.68 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:20:44,977 epoch 7 - iter 1563/5212 - loss 0.54801286 - time (sec): 65.87 - samples/sec: 1687.51 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:21:07,465 epoch 7 - iter 2084/5212 - loss 0.52281712 - time (sec): 88.36 - samples/sec: 1697.53 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:21:29,412 epoch 7 - iter 2605/5212 - loss 0.52541212 - time (sec): 110.31 - samples/sec: 1686.47 - lr: 0.000019 - momentum:
0.000000
2023-10-25 20:21:50,941 epoch 7 - iter 3126/5212 - loss 0.52460589 - time (sec): 131.84 - samples/sec: 1702.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:22:12,842 epoch 7 - iter 3647/5212 - loss 0.52696389 - time (sec): 153.74 - samples/sec: 1706.10 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:22:34,666 epoch 7 - iter 4168/5212 - loss 0.52978826 - time (sec): 175.56 - samples/sec: 1703.61 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:22:56,241 epoch 7 - iter 4689/5212 - loss 0.53279906 - time (sec): 197.14 - samples/sec: 1687.62 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:23:18,267 epoch 7 - iter 5210/5212 - loss 0.53320841 - time (sec): 219.16 - samples/sec: 1676.23 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:23:18,343 ----------------------------------------------------------------------------------------------------
2023-10-25 20:23:18,343 EPOCH 7 done: loss 0.5332 - lr: 0.000017
2023-10-25 20:23:24,491 DEV : loss 0.24582862854003906 - f1-score (micro avg) 0.0
2023-10-25 20:23:24,518 ----------------------------------------------------------------------------------------------------
2023-10-25 20:23:46,181 epoch 8 - iter 521/5212 - loss 0.51997984 - time (sec): 21.66 - samples/sec: 1675.53 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:24:08,717 epoch 8 - iter 1042/5212 - loss 0.50827131 - time (sec): 44.20 - samples/sec: 1632.62 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:24:30,464 epoch 8 - iter 1563/5212 - loss 0.52442736 - time (sec): 65.94 - samples/sec: 1655.07 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:24:52,677 epoch 8 - iter 2084/5212 - loss 0.53205725 - time (sec): 88.16 - samples/sec: 1627.91 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:25:14,585 epoch 8 - iter 2605/5212 - loss 0.52219100 - time (sec): 110.06 - samples/sec: 1645.61 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:25:36,027 epoch 8 - iter 3126/5212 - loss 0.53222967 - time (sec): 131.51 - samples/sec: 1636.67 - lr: 0.000013 -
momentum: 0.000000
2023-10-25 20:25:58,145 epoch 8 - iter 3647/5212 - loss 0.53929555 - time (sec): 153.63 - samples/sec: 1646.93 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:26:20,446 epoch 8 - iter 4168/5212 - loss 0.54041489 - time (sec): 175.93 - samples/sec: 1649.90 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:26:42,442 epoch 8 - iter 4689/5212 - loss 0.53637281 - time (sec): 197.92 - samples/sec: 1667.54 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:27:04,221 epoch 8 - iter 5210/5212 - loss 0.53009907 - time (sec): 219.70 - samples/sec: 1671.95 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:27:04,313 ----------------------------------------------------------------------------------------------------
2023-10-25 20:27:04,314 EPOCH 8 done: loss 0.5301 - lr: 0.000011
2023-10-25 20:27:10,440 DEV : loss 0.2523580491542816 - f1-score (micro avg) 0.0
2023-10-25 20:27:10,466 ----------------------------------------------------------------------------------------------------
2023-10-25 20:27:32,371 epoch 9 - iter 521/5212 - loss 0.52508937 - time (sec): 21.90 - samples/sec: 1695.81 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:27:54,597 epoch 9 - iter 1042/5212 - loss 0.54085716 - time (sec): 44.13 - samples/sec: 1626.44 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:28:16,737 epoch 9 - iter 1563/5212 - loss 0.54010812 - time (sec): 66.27 - samples/sec: 1656.38 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:28:39,178 epoch 9 - iter 2084/5212 - loss 0.54472189 - time (sec): 88.71 - samples/sec: 1642.16 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:29:01,052 epoch 9 - iter 2605/5212 - loss 0.54518847 - time (sec): 110.58 - samples/sec: 1633.86 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:29:23,296 epoch 9 - iter 3126/5212 - loss 0.54101988 - time (sec): 132.83 - samples/sec: 1638.13 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:29:45,113 epoch 9 - iter 3647/5212 - loss 0.53936286 - time (sec): 154.65 - samples/sec: 1632.42 - lr:
0.000007 - momentum: 0.000000
2023-10-25 20:30:07,429 epoch 9 - iter 4168/5212 - loss 0.53352108 - time (sec): 176.96 - samples/sec: 1649.12 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:30:29,902 epoch 9 - iter 4689/5212 - loss 0.53176607 - time (sec): 199.43 - samples/sec: 1661.49 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:30:52,350 epoch 9 - iter 5210/5212 - loss 0.52897509 - time (sec): 221.88 - samples/sec: 1655.79 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:30:52,435 ----------------------------------------------------------------------------------------------------
2023-10-25 20:30:52,435 EPOCH 9 done: loss 0.5289 - lr: 0.000006
2023-10-25 20:30:58,598 DEV : loss 0.26791492104530334 - f1-score (micro avg) 0.0
2023-10-25 20:30:58,624 ----------------------------------------------------------------------------------------------------
2023-10-25 20:31:20,588 epoch 10 - iter 521/5212 - loss 0.48688069 - time (sec): 21.96 - samples/sec: 1689.43 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:31:42,505 epoch 10 - iter 1042/5212 - loss 0.52394542 - time (sec): 43.88 - samples/sec: 1671.09 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:32:04,555 epoch 10 - iter 1563/5212 - loss 0.54120385 - time (sec): 65.93 - samples/sec: 1662.55 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:32:26,521 epoch 10 - iter 2084/5212 - loss 0.54119260 - time (sec): 87.90 - samples/sec: 1654.95 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:32:48,533 epoch 10 - iter 2605/5212 - loss 0.53504612 - time (sec): 109.91 - samples/sec: 1653.51 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:33:09,946 epoch 10 - iter 3126/5212 - loss 0.54141693 - time (sec): 131.32 - samples/sec: 1638.10 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:33:31,270 epoch 10 - iter 3647/5212 - loss 0.53472820 - time (sec): 152.64 - samples/sec: 1649.26 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:33:53,359 epoch 10 - iter 4168/5212 - loss 0.53331190 - time (sec): 174.73 -
samples/sec: 1649.47 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:34:15,455 epoch 10 - iter 4689/5212 - loss 0.52669984 - time (sec): 196.83 - samples/sec: 1667.97 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:34:37,198 epoch 10 - iter 5210/5212 - loss 0.52741026 - time (sec): 218.57 - samples/sec: 1680.12 - lr: 0.000000 - momentum: 0.000000
2023-10-25 20:34:37,280 ----------------------------------------------------------------------------------------------------
2023-10-25 20:34:37,280 EPOCH 10 done: loss 0.5273 - lr: 0.000000
2023-10-25 20:34:44,070 DEV : loss 0.26033899188041687 - f1-score (micro avg) 0.0
2023-10-25 20:34:44,436 ----------------------------------------------------------------------------------------------------
2023-10-25 20:34:44,437 Loading model from best epoch ...
2023-10-25 20:34:46,052 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 20:34:55,751 
Results:
- F-score (micro) 0.3206
- F-score (macro) 0.1588
- Accuracy 0.1914

By class:
              precision    recall  f1-score   support

         LOC     0.4903    0.4580    0.4736      1214
         PER     0.1940    0.1126    0.1425       808
         ORG     0.0588    0.0113    0.0190       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3896    0.2724    0.3206      2390
   macro avg     0.1858    0.1455    0.1588      2390
weighted avg     0.3233    0.2724    0.2916      2390

2023-10-25 20:34:55,751 ----------------------------------------------------------------------------------------------------