2023-10-17 09:30:01,159 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:01,161 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 09:30:01,161 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:01,161 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 09:30:01,161 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:01,162 Train:  20847 sentences
2023-10-17 09:30:01,162 (train_with_dev=False, train_with_test=False)
2023-10-17 09:30:01,162 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:01,162 Training Params:
2023-10-17 09:30:01,162  - learning_rate: "3e-05"
2023-10-17 09:30:01,162  - mini_batch_size: "8"
2023-10-17 09:30:01,162  - max_epochs: "10"
2023-10-17 09:30:01,162  - shuffle: "True"
2023-10-17 09:30:01,162 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:01,162 Plugins:
2023-10-17 09:30:01,162  - TensorboardLogger
2023-10-17 09:30:01,162  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 09:30:01,162 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:01,162 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 09:30:01,163  - metric: "('micro avg', 'f1-score')"
2023-10-17 09:30:01,163 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:01,163 Computation:
2023-10-17 09:30:01,163  - compute on device: cuda:0
2023-10-17 09:30:01,163  - embedding storage: none
2023-10-17 09:30:01,163 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:01,163 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 09:30:01,163 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:01,163 ----------------------------------------------------------------------------------------------------
2023-10-17 09:30:01,163 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 09:30:28,403 epoch 1 - iter 260/2606 - loss 2.25088289 - time (sec): 27.24 - samples/sec: 1234.86 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:30:54,326 epoch 1 - iter 520/2606 - loss 1.33220100 - time (sec): 53.16 - samples/sec: 1301.86 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:31:22,097 epoch 1 - iter 780/2606 - loss 0.98391208 - time (sec): 80.93 - samples/sec: 1323.01 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:31:49,397 epoch 1 - iter 1040/2606 - loss 0.80391753 - time (sec): 108.23 - samples/sec: 1342.02 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:32:17,795 epoch 1 - iter 1300/2606 - loss 0.68584452 - time (sec): 136.63 - samples/sec: 1347.11 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:32:45,676 epoch 1 - iter 1560/2606 - loss 0.60379202 - time (sec): 164.51 - samples/sec: 1355.73 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:33:12,577 epoch 1 - iter 1820/2606 - loss 0.55271439 - time (sec): 191.41 - samples/sec: 1349.07 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:33:38,828 epoch 1 - iter 2080/2606 - loss 0.51354418 - time (sec): 217.66 - samples/sec: 1344.45 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:34:05,563 epoch 1 - iter 2340/2606 - loss 0.47477753 - time (sec): 244.40 - samples/sec: 1351.70 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:34:32,217 epoch 1 - iter 2600/2606 - loss 0.44664224 - time (sec): 271.05 - samples/sec: 1352.12 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:34:32,817 ----------------------------------------------------------------------------------------------------
2023-10-17 09:34:32,818 EPOCH 1 done: loss 0.4458 - lr: 0.000030
2023-10-17 09:34:40,667 DEV : loss 0.1086118295788765 - f1-score (micro avg)  0.3237
2023-10-17 09:34:40,758 saving best model
2023-10-17 09:34:41,318 ----------------------------------------------------------------------------------------------------
2023-10-17 09:35:09,694 epoch 2 - iter 260/2606 - loss 0.16500413 - time (sec): 28.37 - samples/sec: 1348.55 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:35:36,113 epoch 2 - iter 520/2606 - loss 0.16842197 - time (sec): 54.79 - samples/sec: 1357.95 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:36:02,780 epoch 2 - iter 780/2606 - loss 0.16787525 - time (sec): 81.46 - samples/sec: 1376.37 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:36:28,195 epoch 2 - iter 1040/2606 - loss 0.17031578 - time (sec): 106.87 - samples/sec: 1375.07 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:36:55,432 epoch 2 - iter 1300/2606 - loss 0.16885613 - time (sec): 134.11 - samples/sec: 1364.78 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:37:21,987 epoch 2 - iter 1560/2606 - loss 0.16775941 - time (sec): 160.67 - samples/sec: 1363.53 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:37:50,974 epoch 2 - iter 1820/2606 - loss 0.16332198 - time (sec): 189.65 - samples/sec: 1364.78 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:38:18,648 epoch 2 - iter 2080/2606 - loss 0.16165163 - time (sec): 217.33 - samples/sec: 1359.55 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:38:44,428 epoch 2 - iter 2340/2606 - loss 0.15766630 - time (sec): 243.11 - samples/sec: 1353.04 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:39:10,822 epoch 2 - iter 2600/2606 - loss 0.15452230 - time (sec): 269.50 - samples/sec: 1360.07 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:39:11,536 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:11,537 EPOCH 2 done: loss 0.1543 - lr: 0.000027
2023-10-17 09:39:23,551 DEV : loss 0.15823118388652802 - f1-score (micro avg)  0.3853
2023-10-17 09:39:23,600 saving best model
2023-10-17 09:39:24,968 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:52,144 epoch 3 - iter 260/2606 - loss 0.10975264 - time (sec): 27.17 - samples/sec: 1395.25 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:40:18,361 epoch 3 - iter 520/2606 - loss 0.11499363 - time (sec): 53.39 - samples/sec: 1401.34 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:40:45,126 epoch 3 - iter 780/2606 - loss 0.11675016 - time (sec): 80.15 - samples/sec: 1383.45 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:41:13,561 epoch 3 - iter 1040/2606 - loss 0.11239788 - time (sec): 108.59 - samples/sec: 1348.82 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:41:41,922 epoch 3 - iter 1300/2606 - loss 0.11126414 - time (sec): 136.95 - samples/sec: 1336.41 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:42:10,754 epoch 3 - iter 1560/2606 - loss 0.11419868 - time (sec): 165.78 - samples/sec: 1311.53 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:42:39,575 epoch 3 - iter 1820/2606 - loss 0.11341704 - time (sec): 194.60 - samples/sec: 1302.87 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:43:07,853 epoch 3 - iter 2080/2606 - loss 0.11495451 - time (sec): 222.88 - samples/sec: 1301.91 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:43:37,213 epoch 3 - iter 2340/2606 - loss 0.11395293 - time (sec): 252.24 - samples/sec: 1305.66 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:44:06,911 epoch 3 - iter 2600/2606 - loss 0.11142963 - time (sec): 281.94 - samples/sec: 1299.99 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:44:07,588 ----------------------------------------------------------------------------------------------------
2023-10-17 09:44:07,588 EPOCH 3 done: loss 0.1113 - lr: 0.000023
2023-10-17 09:44:20,477 DEV : loss 0.17116275429725647 - f1-score (micro avg)  0.3633
2023-10-17 09:44:20,552 ----------------------------------------------------------------------------------------------------
2023-10-17 09:44:48,244 epoch 4 - iter 260/2606 - loss 0.06627966 - time (sec): 27.69 - samples/sec: 1345.02 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:45:15,505 epoch 4 - iter 520/2606 - loss 0.07058697 - time (sec): 54.95 - samples/sec: 1338.05 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:45:43,997 epoch 4 - iter 780/2606 - loss 0.07407823 - time (sec): 83.44 - samples/sec: 1299.82 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:46:12,193 epoch 4 - iter 1040/2606 - loss 0.07567564 - time (sec): 111.64 - samples/sec: 1289.62 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:46:39,000 epoch 4 - iter 1300/2606 - loss 0.07897282 - time (sec): 138.45 - samples/sec: 1291.11 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:47:04,612 epoch 4 - iter 1560/2606 - loss 0.07917414 - time (sec): 164.06 - samples/sec: 1299.47 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:47:31,818 epoch 4 - iter 1820/2606 - loss 0.07940009 - time (sec): 191.26 - samples/sec: 1315.20 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:47:59,566 epoch 4 - iter 2080/2606 - loss 0.07834344 - time (sec): 219.01 - samples/sec: 1324.68 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:48:26,560 epoch 4 - iter 2340/2606 - loss 0.07864842 - time (sec): 246.01 - samples/sec: 1337.08 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:48:53,518 epoch 4 - iter 2600/2606 - loss 0.07692990 - time (sec): 272.96 - samples/sec: 1343.43 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:48:54,050 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:54,051 EPOCH 4 done: loss 0.0768 - lr: 0.000020
2023-10-17 09:49:05,685 DEV : loss 0.2451845407485962 - f1-score (micro avg)  0.4014
2023-10-17 09:49:05,753 saving best model
2023-10-17 09:49:07,232 ----------------------------------------------------------------------------------------------------
2023-10-17 09:49:35,311 epoch 5 - iter 260/2606 - loss 0.05314060 - time (sec): 28.07 - samples/sec: 1341.97 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:50:01,569 epoch 5 - iter 520/2606 - loss 0.05044802 - time (sec): 54.33 - samples/sec: 1315.80 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:50:29,907 epoch 5 - iter 780/2606 - loss 0.05326352 - time (sec): 82.67 - samples/sec: 1309.89 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:50:57,502 epoch 5 - iter 1040/2606 - loss 0.05366294 - time (sec): 110.26 - samples/sec: 1291.32 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:51:26,195 epoch 5 - iter 1300/2606 - loss 0.05498199 - time (sec): 138.96 - samples/sec: 1309.81 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:51:54,026 epoch 5 - iter 1560/2606 - loss 0.05426621 - time (sec): 166.79 - samples/sec: 1330.16 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:52:20,601 epoch 5 - iter 1820/2606 - loss 0.05587133 - time (sec): 193.36 - samples/sec: 1336.37 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:52:47,680 epoch 5 - iter 2080/2606 - loss 0.05634824 - time (sec): 220.44 - samples/sec: 1342.08 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:53:13,117 epoch 5 - iter 2340/2606 - loss 0.05651126 - time (sec): 245.88 - samples/sec: 1343.38 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:53:40,513 epoch 5 - iter 2600/2606 - loss 0.05627857 - time (sec): 273.27 - samples/sec: 1342.00 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:53:41,040 ----------------------------------------------------------------------------------------------------
2023-10-17 09:53:41,040 EPOCH 5 done: loss 0.0562 - lr: 0.000017
2023-10-17 09:53:52,676 DEV : loss 0.3224625885486603 - f1-score (micro avg)  0.3969
2023-10-17 09:53:52,727 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:19,235 epoch 6 - iter 260/2606 - loss 0.03924381 - time (sec): 26.51 - samples/sec: 1406.84 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:54:45,336 epoch 6 - iter 520/2606 - loss 0.03874725 - time (sec): 52.61 - samples/sec: 1373.16 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:55:11,344 epoch 6 - iter 780/2606 - loss 0.03742118 - time (sec): 78.62 - samples/sec: 1362.73 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:55:40,097 epoch 6 - iter 1040/2606 - loss 0.03839947 - time (sec): 107.37 - samples/sec: 1364.72 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:56:07,453 epoch 6 - iter 1300/2606 - loss 0.03791276 - time (sec): 134.72 - samples/sec: 1374.77 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:56:35,827 epoch 6 - iter 1560/2606 - loss 0.03744929 - time (sec): 163.10 - samples/sec: 1363.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:57:03,098 epoch 6 - iter 1820/2606 - loss 0.03737686 - time (sec): 190.37 - samples/sec: 1356.77 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:57:29,637 epoch 6 - iter 2080/2606 - loss 0.03795561 - time (sec): 216.91 - samples/sec: 1354.31 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:57:57,734 epoch 6 - iter 2340/2606 - loss 0.03813117 - time (sec): 245.01 - samples/sec: 1349.31 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:58:24,424 epoch 6 - iter 2600/2606 - loss 0.03944101 - time (sec): 271.70 - samples/sec: 1349.32 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:58:24,984 ----------------------------------------------------------------------------------------------------
2023-10-17 09:58:24,985 EPOCH 6 done: loss 0.0394 - lr: 0.000013
2023-10-17 09:58:36,849 DEV : loss 0.3648042380809784 - f1-score (micro avg)  0.3625
2023-10-17 09:58:36,902 ----------------------------------------------------------------------------------------------------
2023-10-17 09:59:05,215 epoch 7 - iter 260/2606 - loss 0.02720017 - time (sec): 28.31 - samples/sec: 1336.12 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:59:32,311 epoch 7 - iter 520/2606 - loss 0.02717709 - time (sec): 55.41 - samples/sec: 1358.85 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:00:00,539 epoch 7 - iter 780/2606 - loss 0.02642744 - time (sec): 83.64 - samples/sec: 1343.73 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:00:28,610 epoch 7 - iter 1040/2606 - loss 0.02623881 - time (sec): 111.71 - samples/sec: 1334.72 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:00:56,928 epoch 7 - iter 1300/2606 - loss 0.02775206 - time (sec): 140.02 - samples/sec: 1317.88 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:01:25,241 epoch 7 - iter 1560/2606 - loss 0.02724799 - time (sec): 168.34 - samples/sec: 1305.30 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:01:52,672 epoch 7 - iter 1820/2606 - loss 0.02777008 - time (sec): 195.77 - samples/sec: 1306.60 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:02:22,308 epoch 7 - iter 2080/2606 - loss 0.02801962 - time (sec): 225.40 - samples/sec: 1309.14 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:02:49,284 epoch 7 - iter 2340/2606 - loss 0.02899872 - time (sec): 252.38 - samples/sec: 1314.34 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:03:16,271 epoch 7 - iter 2600/2606 - loss 0.02876676 - time (sec): 279.37 - samples/sec: 1313.30 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:03:16,830 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:16,830 EPOCH 7 done: loss 0.0289 - lr: 0.000010
2023-10-17 10:03:28,209 DEV : loss 0.45044124126434326 - f1-score (micro avg)  0.3814
2023-10-17 10:03:28,309 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:57,126 epoch 8 - iter 260/2606 - loss 0.02238485 - time (sec): 28.81 - samples/sec: 1244.56 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:04:23,837 epoch 8 - iter 520/2606 - loss 0.01797237 - time (sec): 55.52 - samples/sec: 1286.92 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:04:50,384 epoch 8 - iter 780/2606 - loss 0.02174076 - time (sec): 82.07 - samples/sec: 1293.72 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:05:18,106 epoch 8 - iter 1040/2606 - loss 0.02091817 - time (sec): 109.79 - samples/sec: 1296.58 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:05:45,037 epoch 8 - iter 1300/2606 - loss 0.02109893 - time (sec): 136.72 - samples/sec: 1301.66 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:06:12,476 epoch 8 - iter 1560/2606 - loss 0.02153126 - time (sec): 164.16 - samples/sec: 1313.34 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:06:41,454 epoch 8 - iter 1820/2606 - loss 0.02139044 - time (sec): 193.14 - samples/sec: 1309.13 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:07:10,898 epoch 8 - iter 2080/2606 - loss 0.02121414 - time (sec): 222.59 - samples/sec: 1310.86 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:07:38,754 epoch 8 - iter 2340/2606 - loss 0.02101148 - time (sec): 250.44 - samples/sec: 1317.35 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:08:06,855 epoch 8 - iter 2600/2606 - loss 0.02092631 - time (sec): 278.54 - samples/sec: 1316.74 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:08:07,425 ----------------------------------------------------------------------------------------------------
2023-10-17 10:08:07,426 EPOCH 8 done: loss 0.0209 - lr: 0.000007
2023-10-17 10:08:18,600 DEV : loss 0.42215481400489807 - f1-score (micro avg)  0.407
2023-10-17 10:08:18,660 saving best model
2023-10-17 10:08:20,061 ----------------------------------------------------------------------------------------------------
2023-10-17 10:08:45,721 epoch 9 - iter 260/2606 - loss 0.01312644 - time (sec): 25.66 - samples/sec: 1325.50 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:09:13,633 epoch 9 - iter 520/2606 - loss 0.01574287 - time (sec): 53.57 - samples/sec: 1330.74 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:09:40,184 epoch 9 - iter 780/2606 - loss 0.01573689 - time (sec): 80.12 - samples/sec: 1310.09 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:10:08,355 epoch 9 - iter 1040/2606 - loss 0.01668065 - time (sec): 108.29 - samples/sec: 1289.44 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:10:37,981 epoch 9 - iter 1300/2606 - loss 0.01614498 - time (sec): 137.92 - samples/sec: 1277.89 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:11:07,301 epoch 9 - iter 1560/2606 - loss 0.01604974 - time (sec): 167.24 - samples/sec: 1272.28 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:11:34,772 epoch 9 - iter 1820/2606 - loss 0.01508587 - time (sec): 194.71 - samples/sec: 1285.03 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:12:02,900 epoch 9 - iter 2080/2606 - loss 0.01508610 - time (sec): 222.83 - samples/sec: 1294.49 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:12:31,271 epoch 9 - iter 2340/2606 - loss 0.01501549 - time (sec): 251.21 - samples/sec: 1300.14 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:12:59,474 epoch 9 - iter 2600/2606 - loss 0.01491614 - time (sec): 279.41 - samples/sec: 1312.37 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:13:00,101 ----------------------------------------------------------------------------------------------------
2023-10-17 10:13:00,102 EPOCH 9 done: loss 0.0149 - lr: 0.000003
2023-10-17 10:13:11,180 DEV : loss 0.49753403663635254 - f1-score (micro avg)  0.3888
2023-10-17 10:13:11,232 ----------------------------------------------------------------------------------------------------
2023-10-17 10:13:39,070 epoch 10 - iter 260/2606 - loss 0.00723474 - time (sec): 27.84 - samples/sec: 1334.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:14:05,802 epoch 10 - iter 520/2606 - loss 0.00843281 - time (sec): 54.57 - samples/sec: 1331.10 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:14:32,181 epoch 10 - iter 780/2606 - loss 0.00946028 - time (sec): 80.95 - samples/sec: 1319.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:14:59,920 epoch 10 - iter 1040/2606 - loss 0.01032869 - time (sec): 108.69 - samples/sec: 1325.34 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:15:28,134 epoch 10 - iter 1300/2606 - loss 0.01072025 - time (sec): 136.90 - samples/sec: 1326.39 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:15:55,515 epoch 10 - iter 1560/2606 - loss 0.01059612 - time (sec): 164.28 - samples/sec: 1319.50 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:16:24,905 epoch 10 - iter 1820/2606 - loss 0.01061368 - time (sec): 193.67 - samples/sec: 1306.54 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:16:53,151 epoch 10 - iter 2080/2606 - loss 0.01049307 - time (sec): 221.92 - samples/sec: 1305.30 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:17:21,322 epoch 10 - iter 2340/2606 - loss 0.01055228 - time (sec): 250.09 - samples/sec: 1312.43 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:17:51,087 epoch 10 - iter 2600/2606 - loss 0.01071052 - time (sec): 279.85 - samples/sec: 1310.61 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:17:51,600 ----------------------------------------------------------------------------------------------------
2023-10-17 10:17:51,600 EPOCH 10 done: loss 0.0108 - lr: 0.000000
2023-10-17 10:18:02,989 DEV : loss 0.5493963360786438 - f1-score (micro avg)  0.3928
2023-10-17 10:18:03,675 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:03,677 Loading model from best epoch ...
2023-10-17 10:18:06,329 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 10:18:27,097 Results:
- F-score (micro) 0.48
- F-score (macro) 0.33
- Accuracy 0.3196

By class:
              precision    recall  f1-score   support

         LOC     0.5488    0.5881    0.5678      1214
         PER     0.4133    0.4245    0.4188       808
         ORG     0.3127    0.3569    0.3333       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4659    0.4950    0.4800      2390
   macro avg     0.3187    0.3424    0.3300      2390
weighted avg     0.4647    0.4950    0.4792      2390

2023-10-17 10:18:27,097 ----------------------------------------------------------------------------------------------------
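The module sizes printed in the model repr allow a back-of-envelope parameter count. This is a sketch, not output from the run: it counts only the weight and bias tensors shown in the repr above (12 encoder layers, the embedding tables, the LayerNorms, and the final 17-way linear head), and the helper functions are illustrative, not a Flair or PyTorch API.

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Parameters of a Linear(n_in, n_out, bias=True): weight matrix plus bias."""
    return n_in * n_out + n_out

def layernorm_params(dim: int) -> int:
    """LayerNorm with elementwise_affine=True stores a weight and a bias vector."""
    return 2 * dim

# ElectraEmbeddings: word (32001x768), position (512x768), token type (2x768), LayerNorm
embeddings = 32001 * 768 + 512 * 768 + 2 * 768 + layernorm_params(768)

# One ElectraLayer, following the repr above
per_layer = (3 * linear_params(768, 768)    # query / key / value
             + linear_params(768, 768)      # attention output dense
             + layernorm_params(768)        # attention output LayerNorm
             + linear_params(768, 3072)     # intermediate dense
             + linear_params(3072, 768)     # output dense
             + layernorm_params(768))       # output LayerNorm

classifier = linear_params(768, 17)         # SequenceTagger's linear head

total = embeddings + 12 * per_layer + classifier
print(f"{total:,}")  # roughly 110M, consistent with a base-sized discriminator
```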
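The lr values in the log follow the `LinearScheduler | warmup_fraction: '0.1'` plugin: the rate climbs linearly to the peak of 3e-05 over the first tenth of training (all of epoch 1, since there are 10 epochs), then decays linearly to zero. A minimal sketch of that shape, assuming this warmup/decay behavior; `linear_lr` is an illustrative helper, not a Flair API:

```python
def linear_lr(step: int, total_steps: int, peak_lr: float, warmup_fraction: float) -> float:
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to zero over the remaining steps."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 2606 * 10   # 2606 batches per epoch, 10 epochs
peak = 3e-05

# Early in epoch 1 the log shows lr 0.000003; at the end of epoch 1 it peaks at 0.000030;
# by the last iteration of epoch 10 it has decayed to 0.000000.
print(linear_lr(260, total, peak, 0.1))
print(linear_lr(2606, total, peak, 0.1))
print(linear_lr(total, total, peak, 0.1))
```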
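The headline scores can be cross-checked against the "By class" table: micro F1 is the harmonic mean of the micro-averaged precision and recall, and macro F1 is the unweighted mean of the four per-class F1 scores. A quick check in plain Python, with the values copied from the table above:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row: precision 0.4659, recall 0.4950
micro_f1 = f1(0.4659, 0.4950)

# macro avg: unweighted mean of the per-class F1 scores (LOC, PER, ORG, HumanProd)
macro_f1 = sum([0.5678, 0.4188, 0.3333, 0.0000]) / 4

print(round(micro_f1, 2), round(macro_f1, 2))  # matches the reported 0.48 and 0.33
```

Note that HumanProd (support 15) drags the macro average well below the micro average, which is dominated by the frequent LOC class.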