2023-10-17 18:32:41,025 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:41,027 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:32:41,028 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:41,028 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 18:32:41,028 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:41,028 Train: 3575 sentences
2023-10-17 18:32:41,028 (train_with_dev=False, train_with_test=False)
2023-10-17 18:32:41,028 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:41,028 Training Params:
2023-10-17 18:32:41,028 - learning_rate: "5e-05"
2023-10-17 18:32:41,028 - mini_batch_size: "8"
2023-10-17 18:32:41,028 - max_epochs: "10"
2023-10-17 18:32:41,029 - shuffle: "True"
2023-10-17 18:32:41,029 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:41,029 Plugins:
2023-10-17 18:32:41,029 - TensorboardLogger
2023-10-17 18:32:41,029 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:32:41,029 ----------------------------------------------------------------------------------------------------
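Note on the schedule: the parameters above (learning_rate 5e-05, max_epochs 10, LinearScheduler with warmup_fraction 0.1) together with the 447 iterations per epoch reported in this log imply a warmup over roughly the first 447 updates followed by a linear decay to zero. The sketch below is an assumption about that shape, not the scheduler's actual code; the function name `linear_warmup_lr` is hypothetical, and the exact step at which the trainer reads the rate may be off by one relative to this formula.

```python
def linear_warmup_lr(step, peak_lr=5e-5, total_steps=447 * 10, warmup_fraction=0.1):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero.

    Assumed shape only; defaults are taken from the values printed in this log.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Ramp from 0 up to peak_lr over the warmup steps.
        return peak_lr * step / warmup_steps
    # Decay from peak_lr down to 0 over the remaining steps.
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


if __name__ == "__main__":
    iters_per_epoch = 447
    for epoch, it in [(1, 44), (1, 440), (2, 440), (10, 440)]:
        step = (epoch - 1) * iters_per_epoch + it
        print(f"epoch {epoch} iter {it}: lr {linear_warmup_lr(step):.6f}")
```

Printed to six decimals, as the trainer does, these values can be compared against the `lr:` column of the corresponding iterations in this log.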
2023-10-17 18:32:41,029 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:32:41,029 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:32:41,029 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:41,029 Computation:
2023-10-17 18:32:41,029 - compute on device: cuda:0
2023-10-17 18:32:41,029 - embedding storage: none
2023-10-17 18:32:41,029 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:41,029 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 18:32:41,029 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:41,030 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:41,030 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:32:45,281 epoch 1 - iter 44/447 - loss 3.28319391 - time (sec): 4.25 - samples/sec: 1909.75 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:32:49,847 epoch 1 - iter 88/447 - loss 2.16729802 - time (sec): 8.82 - samples/sec: 1871.58 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:32:54,474 epoch 1 - iter 132/447 - loss 1.58472566 - time (sec): 13.44 - samples/sec: 1867.59 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:32:59,143 epoch 1 - iter 176/447 - loss 1.28290411 - time (sec): 18.11 - samples/sec: 1824.30 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:33:03,625 epoch 1 - iter 220/447 - loss 1.09679775 - time (sec): 22.59 - samples/sec: 1828.99 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:33:07,987 epoch 1 - iter 264/447 - loss 0.98249636 - time (sec): 26.96 - samples/sec: 1845.66 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:33:12,161 epoch 1 - iter 308/447 - loss 0.88418366 - time (sec): 31.13 - samples/sec: 1860.65 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:33:16,473 epoch 1 - iter 352/447 - loss 0.79838079 - time (sec): 35.44 - samples/sec: 1883.13 - lr: 0.000039 - momentum: 0.000000
2023-10-17 18:33:20,636 epoch 1 - iter 396/447 - loss 0.72651329 - time (sec): 39.60 - samples/sec: 1918.58 - lr: 0.000044 - momentum: 0.000000
2023-10-17 18:33:25,125 epoch 1 - iter 440/447 - loss 0.67193639 - time (sec): 44.09 - samples/sec: 1930.31 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:33:25,762 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:25,763 EPOCH 1 done: loss 0.6633 - lr: 0.000049
2023-10-17 18:33:32,386 DEV : loss 0.1744980365037918 - f1-score (micro avg) 0.6035
2023-10-17 18:33:32,445 saving best model
2023-10-17 18:33:33,042 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:37,924 epoch 2 - iter 44/447 - loss 0.17779822 - time (sec): 4.88 - samples/sec: 2036.36 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:33:42,693 epoch 2 - iter 88/447 - loss 0.18081454 - time (sec): 9.65 - samples/sec: 1902.25 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:33:46,869 epoch 2 - iter 132/447 - loss 0.17033044 - time (sec): 13.82 - samples/sec: 1895.20 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:33:51,250 epoch 2 - iter 176/447 - loss 0.16248166 - time (sec): 18.21 - samples/sec: 1910.43 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:33:55,643 epoch 2 - iter 220/447 - loss 0.15837240 - time (sec): 22.60 - samples/sec: 1891.36 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:34:00,068 epoch 2 - iter 264/447 - loss 0.15761608 - time (sec): 27.02 - samples/sec: 1921.87 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:34:04,112 epoch 2 - iter 308/447 - loss 0.15372266 - time (sec): 31.07 - samples/sec: 1939.68 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:34:08,523 epoch 2 - iter 352/447 - loss 0.14866787 - time (sec): 35.48 - samples/sec: 1938.18 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:34:12,905 epoch 2 - iter 396/447 - loss 0.14808787 - time (sec): 39.86 - samples/sec: 1928.10 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:34:17,054 epoch 2 - iter 440/447 - loss 0.14534280 - time (sec): 44.01 - samples/sec: 1937.36 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:34:17,696 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:17,696 EPOCH 2 done: loss 0.1448 - lr: 0.000045
2023-10-17 18:34:29,558 DEV : loss 0.1310628205537796 - f1-score (micro avg) 0.7292
2023-10-17 18:34:29,614 saving best model
2023-10-17 18:34:31,054 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:35,742 epoch 3 - iter 44/447 - loss 0.09348001 - time (sec): 4.68 - samples/sec: 1807.17 - lr: 0.000044 - momentum: 0.000000
2023-10-17 18:34:40,331 epoch 3 - iter 88/447 - loss 0.09261690 - time (sec): 9.27 - samples/sec: 1785.80 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:34:44,948 epoch 3 - iter 132/447 - loss 0.08918888 - time (sec): 13.89 - samples/sec: 1849.68 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:34:49,456 epoch 3 - iter 176/447 - loss 0.08598151 - time (sec): 18.40 - samples/sec: 1851.63 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:34:54,018 epoch 3 - iter 220/447 - loss 0.09004317 - time (sec): 22.96 - samples/sec: 1851.86 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:34:58,752 epoch 3 - iter 264/447 - loss 0.08930716 - time (sec): 27.69 - samples/sec: 1881.64 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:35:02,714 epoch 3 - iter 308/447 - loss 0.08984280 - time (sec): 31.66 - samples/sec: 1900.59 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:35:06,959 epoch 3 - iter 352/447 - loss 0.09017038 - time (sec): 35.90 - samples/sec: 1900.69 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:35:11,651 epoch 3 - iter 396/447 - loss 0.08914909 - time (sec): 40.59 - samples/sec: 1886.73 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:35:16,362 epoch 3 - iter 440/447 - loss 0.09020947 - time (sec): 45.30 - samples/sec: 1879.93 - lr: 0.000039 - momentum: 0.000000
2023-10-17 18:35:17,046 ----------------------------------------------------------------------------------------------------
2023-10-17 18:35:17,046 EPOCH 3 done: loss 0.0896 - lr: 0.000039
2023-10-17 18:35:28,443 DEV : loss 0.13421297073364258 - f1-score (micro avg) 0.7515
2023-10-17 18:35:28,500 saving best model
2023-10-17 18:35:29,985 ----------------------------------------------------------------------------------------------------
2023-10-17 18:35:34,305 epoch 4 - iter 44/447 - loss 0.04624539 - time (sec): 4.31 - samples/sec: 1728.30 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:35:38,772 epoch 4 - iter 88/447 - loss 0.04440189 - time (sec): 8.78 - samples/sec: 1851.77 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:35:43,463 epoch 4 - iter 132/447 - loss 0.05384950 - time (sec): 13.47 - samples/sec: 1910.60 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:35:47,579 epoch 4 - iter 176/447 - loss 0.05588999 - time (sec): 17.58 - samples/sec: 1957.95 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:35:51,671 epoch 4 - iter 220/447 - loss 0.05836835 - time (sec): 21.68 - samples/sec: 1974.46 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:35:56,069 epoch 4 - iter 264/447 - loss 0.05627529 - time (sec): 26.07 - samples/sec: 1969.72 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:36:00,754 epoch 4 - iter 308/447 - loss 0.05990380 - time (sec): 30.76 - samples/sec: 1945.81 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:36:05,256 epoch 4 - iter 352/447 - loss 0.05894391 - time (sec): 35.26 - samples/sec: 1941.97 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:36:09,723 epoch 4 - iter 396/447 - loss 0.05734348 - time (sec): 39.73 - samples/sec: 1941.59 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:36:14,112 epoch 4 - iter 440/447 - loss 0.05645626 - time (sec): 44.12 - samples/sec: 1937.81 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:36:14,760 ----------------------------------------------------------------------------------------------------
2023-10-17 18:36:14,761 EPOCH 4 done: loss 0.0564 - lr: 0.000033
2023-10-17 18:36:26,224 DEV : loss 0.16498109698295593 - f1-score (micro avg) 0.7697
2023-10-17 18:36:26,284 saving best model
2023-10-17 18:36:27,687 ----------------------------------------------------------------------------------------------------
2023-10-17 18:36:31,851 epoch 5 - iter 44/447 - loss 0.02199943 - time (sec): 4.16 - samples/sec: 2052.74 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:36:36,000 epoch 5 - iter 88/447 - loss 0.03388540 - time (sec): 8.31 - samples/sec: 2049.75 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:36:40,377 epoch 5 - iter 132/447 - loss 0.03279965 - time (sec): 12.69 - samples/sec: 2086.55 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:36:44,397 epoch 5 - iter 176/447 - loss 0.03416884 - time (sec): 16.71 - samples/sec: 2081.94 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:36:48,272 epoch 5 - iter 220/447 - loss 0.03457991 - time (sec): 20.58 - samples/sec: 2068.03 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:36:52,516 epoch 5 - iter 264/447 - loss 0.03419096 - time (sec): 24.82 - samples/sec: 2056.72 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:36:56,737 epoch 5 - iter 308/447 - loss 0.03362099 - time (sec): 29.04 - samples/sec: 2050.86 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:37:00,804 epoch 5 - iter 352/447 - loss 0.03318720 - time (sec): 33.11 - samples/sec: 2041.25 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:37:04,979 epoch 5 - iter 396/447 - loss 0.03273012 - time (sec): 37.29 - samples/sec: 2029.72 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:37:09,434 epoch 5 - iter 440/447 - loss 0.03445776 - time (sec): 41.74 - samples/sec: 2021.35 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:37:10,503 ----------------------------------------------------------------------------------------------------
2023-10-17 18:37:10,503 EPOCH 5 done: loss 0.0343 - lr: 0.000028
2023-10-17 18:37:22,287 DEV : loss 0.1768861711025238 - f1-score (micro avg) 0.7794
2023-10-17 18:37:22,350 saving best model
2023-10-17 18:37:23,824 ----------------------------------------------------------------------------------------------------
2023-10-17 18:37:28,237 epoch 6 - iter 44/447 - loss 0.01272213 - time (sec): 4.41 - samples/sec: 2169.06 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:37:32,518 epoch 6 - iter 88/447 - loss 0.01539498 - time (sec): 8.69 - samples/sec: 2037.92 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:37:37,101 epoch 6 - iter 132/447 - loss 0.01851038 - time (sec): 13.27 - samples/sec: 1935.72 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:37:41,284 epoch 6 - iter 176/447 - loss 0.01975938 - time (sec): 17.46 - samples/sec: 1936.02 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:37:45,549 epoch 6 - iter 220/447 - loss 0.02077733 - time (sec): 21.72 - samples/sec: 1930.62 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:37:49,663 epoch 6 - iter 264/447 - loss 0.02148132 - time (sec): 25.84 - samples/sec: 1965.90 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:37:53,750 epoch 6 - iter 308/447 - loss 0.02125641 - time (sec): 29.92 - samples/sec: 1979.97 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:37:58,443 epoch 6 - iter 352/447 - loss 0.02171037 - time (sec): 34.62 - samples/sec: 1966.41 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:38:03,472 epoch 6 - iter 396/447 - loss 0.02173515 - time (sec): 39.64 - samples/sec: 1954.12 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:38:07,761 epoch 6 - iter 440/447 - loss 0.02098498 - time (sec): 43.93 - samples/sec: 1946.46 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:38:08,414 ----------------------------------------------------------------------------------------------------
2023-10-17 18:38:08,415 EPOCH 6 done: loss 0.0209 - lr: 0.000022
2023-10-17 18:38:19,003 DEV : loss 0.21698522567749023 - f1-score (micro avg) 0.7814
2023-10-17 18:38:19,057 saving best model
2023-10-17 18:38:20,481 ----------------------------------------------------------------------------------------------------
2023-10-17 18:38:24,541 epoch 7 - iter 44/447 - loss 0.00663934 - time (sec): 4.06 - samples/sec: 2150.22 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:38:28,549 epoch 7 - iter 88/447 - loss 0.00916382 - time (sec): 8.06 - samples/sec: 2101.54 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:38:32,548 epoch 7 - iter 132/447 - loss 0.00862219 - time (sec): 12.06 - samples/sec: 2083.10 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:38:36,714 epoch 7 - iter 176/447 - loss 0.01076391 - time (sec): 16.23 - samples/sec: 2075.89 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:38:40,916 epoch 7 - iter 220/447 - loss 0.01133687 - time (sec): 20.43 - samples/sec: 2056.94 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:38:45,009 epoch 7 - iter 264/447 - loss 0.01169682 - time (sec): 24.52 - samples/sec: 2045.15 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:38:49,616 epoch 7 - iter 308/447 - loss 0.01149309 - time (sec): 29.13 - samples/sec: 2017.06 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:38:53,734 epoch 7 - iter 352/447 - loss 0.01240281 - time (sec): 33.25 - samples/sec: 2000.31 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:38:58,381 epoch 7 - iter 396/447 - loss 0.01335837 - time (sec): 37.90 - samples/sec: 2020.64 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:39:02,568 epoch 7 - iter 440/447 - loss 0.01332056 - time (sec): 42.08 - samples/sec: 2020.03 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:39:03,195 ----------------------------------------------------------------------------------------------------
2023-10-17 18:39:03,196 EPOCH 7 done: loss 0.0132 - lr: 0.000017
2023-10-17 18:39:13,976 DEV : loss 0.2289014309644699 - f1-score (micro avg) 0.7865
2023-10-17 18:39:14,038 saving best model
2023-10-17 18:39:15,456 ----------------------------------------------------------------------------------------------------
2023-10-17 18:39:19,748 epoch 8 - iter 44/447 - loss 0.00757963 - time (sec): 4.29 - samples/sec: 1852.38 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:39:24,373 epoch 8 - iter 88/447 - loss 0.01028323 - time (sec): 8.91 - samples/sec: 1853.54 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:39:29,308 epoch 8 - iter 132/447 - loss 0.00833315 - time (sec): 13.85 - samples/sec: 1933.16 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:39:33,329 epoch 8 - iter 176/447 - loss 0.00791567 - time (sec): 17.87 - samples/sec: 1929.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:39:37,380 epoch 8 - iter 220/447 - loss 0.00938211 - time (sec): 21.92 - samples/sec: 1952.84 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:39:41,533 epoch 8 - iter 264/447 - loss 0.00986939 - time (sec): 26.07 - samples/sec: 1959.68 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:39:45,823 epoch 8 - iter 308/447 - loss 0.01043189 - time (sec): 30.36 - samples/sec: 1980.94 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:39:50,183 epoch 8 - iter 352/447 - loss 0.01072671 - time (sec): 34.72 - samples/sec: 1972.90 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:39:54,559 epoch 8 - iter 396/447 - loss 0.01039982 - time (sec): 39.10 - samples/sec: 1951.97 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:39:59,113 epoch 8 - iter 440/447 - loss 0.01008795 - time (sec): 43.65 - samples/sec: 1957.71 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:39:59,749 ----------------------------------------------------------------------------------------------------
2023-10-17 18:39:59,749 EPOCH 8 done: loss 0.0100 - lr: 0.000011
2023-10-17 18:40:10,771 DEV : loss 0.24633415043354034 - f1-score (micro avg) 0.7813
2023-10-17 18:40:10,826 ----------------------------------------------------------------------------------------------------
2023-10-17 18:40:14,890 epoch 9 - iter 44/447 - loss 0.00587764 - time (sec): 4.06 - samples/sec: 1932.48 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:40:18,943 epoch 9 - iter 88/447 - loss 0.00574376 - time (sec): 8.11 - samples/sec: 2002.18 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:40:23,087 epoch 9 - iter 132/447 - loss 0.00498408 - time (sec): 12.26 - samples/sec: 2009.48 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:40:27,716 epoch 9 - iter 176/447 - loss 0.00542873 - time (sec): 16.89 - samples/sec: 2019.96 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:40:31,791 epoch 9 - iter 220/447 - loss 0.00462645 - time (sec): 20.96 - samples/sec: 1989.58 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:40:35,880 epoch 9 - iter 264/447 - loss 0.00439777 - time (sec): 25.05 - samples/sec: 1988.43 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:40:40,307 epoch 9 - iter 308/447 - loss 0.00454180 - time (sec): 29.48 - samples/sec: 1972.83 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:40:44,975 epoch 9 - iter 352/447 - loss 0.00448785 - time (sec): 34.15 - samples/sec: 1973.74 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:40:49,421 epoch 9 - iter 396/447 - loss 0.00445712 - time (sec): 38.59 - samples/sec: 1983.69 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:40:53,474 epoch 9 - iter 440/447 - loss 0.00457860 - time (sec): 42.65 - samples/sec: 2001.83 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:40:54,093 ----------------------------------------------------------------------------------------------------
2023-10-17 18:40:54,093 EPOCH 9 done: loss 0.0045 - lr: 0.000006
2023-10-17 18:41:05,718 DEV : loss 0.2427874058485031 - f1-score (micro avg) 0.7886
2023-10-17 18:41:05,781 saving best model
2023-10-17 18:41:07,280 ----------------------------------------------------------------------------------------------------
2023-10-17 18:41:11,725 epoch 10 - iter 44/447 - loss 0.00160514 - time (sec): 4.44 - samples/sec: 2042.33 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:41:16,229 epoch 10 - iter 88/447 - loss 0.00319179 - time (sec): 8.94 - samples/sec: 2118.49 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:41:20,172 epoch 10 - iter 132/447 - loss 0.00275824 - time (sec): 12.89 - samples/sec: 2119.39 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:41:23,939 epoch 10 - iter 176/447 - loss 0.00498494 - time (sec): 16.65 - samples/sec: 2156.62 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:41:27,774 epoch 10 - iter 220/447 - loss 0.00437952 - time (sec): 20.49 - samples/sec: 2147.41 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:41:31,811 epoch 10 - iter 264/447 - loss 0.00435338 - time (sec): 24.53 - samples/sec: 2135.63 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:41:36,331 epoch 10 - iter 308/447 - loss 0.00540662 - time (sec): 29.05 - samples/sec: 2079.00 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:41:40,384 epoch 10 - iter 352/447 - loss 0.00523384 - time (sec): 33.10 - samples/sec: 2059.89 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:41:44,730 epoch 10 - iter 396/447 - loss 0.00489409 - time (sec): 37.45 - samples/sec: 2042.80 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:41:48,878 epoch 10 - iter 440/447 - loss 0.00484791 - time (sec): 41.59 - samples/sec: 2047.20 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:41:49,568 ----------------------------------------------------------------------------------------------------
2023-10-17 18:41:49,568 EPOCH 10 done: loss 0.0048 - lr: 0.000000
2023-10-17 18:42:01,233 DEV : loss 0.24444200098514557 - f1-score (micro avg) 0.794
2023-10-17 18:42:01,295 saving best model
2023-10-17 18:42:03,304 ----------------------------------------------------------------------------------------------------
2023-10-17 18:42:03,306 Loading model from best epoch ...
2023-10-17 18:42:05,494 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-17 18:42:11,511
Results:
- F-score (micro) 0.7688
- F-score (macro) 0.6973
- Accuracy 0.6446
By class:
              precision    recall  f1-score   support

         loc     0.8431    0.8658    0.8543       596
        pers     0.7048    0.7958    0.7475       333
         org     0.5075    0.5152    0.5113       132
        prod     0.6731    0.5303    0.5932        66
        time     0.7647    0.7959    0.7800        49

   micro avg     0.7535    0.7849    0.7688      1176
   macro avg     0.6986    0.7006    0.6973      1176
weighted avg     0.7535    0.7849    0.7678      1176
2023-10-17 18:42:11,511 ----------------------------------------------------------------------------------------------------
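Note on the averages: the aggregate rows of the report can be cross-checked from the per-class rows. The sketch below copies the five class rows from this log and recomputes the macro and support-weighted F1, plus the micro F1 as the harmonic mean of the reported micro precision and recall. The helper names are hypothetical, and because the inputs are already rounded to four decimals the recomputed figures may differ from the logged ones in the last digit.

```python
# Per-class rows copied from the report above: (precision, recall, f1, support)
REPORT = {
    "loc": (0.8431, 0.8658, 0.8543, 596),
    "pers": (0.7048, 0.7958, 0.7475, 333),
    "org": (0.5075, 0.5152, 0.5113, 132),
    "prod": (0.6731, 0.5303, 0.5932, 66),
    "time": (0.7647, 0.7959, 0.7800, 49),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_f1(rows):
    """Unweighted mean of the per-class F1 scores."""
    return sum(row[2] for row in rows.values()) / len(rows)


def weighted_f1(rows):
    """Mean of the per-class F1 scores weighted by class support."""
    support = sum(row[3] for row in rows.values())
    return sum(row[2] * row[3] for row in rows.values()) / support


if __name__ == "__main__":
    print(f"total support: {sum(row[3] for row in REPORT.values())}")
    print(f"macro F1:      {macro_f1(REPORT):.4f}")
    print(f"weighted F1:   {weighted_f1(REPORT):.4f}")
    # Micro F1 from the rounded micro-average precision and recall in the report.
    print(f"micro F1:      {f1(0.7535, 0.7849):.4f}")
```

The low `org` and `prod` scores pull the macro average well below the micro average, since macro averaging gives each class equal weight regardless of its support.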