|
2023-10-16 21:44:09,762 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:44:09,763 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-16 21:44:09,763 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:44:09,764 MultiCorpus: 6183 train + 680 dev + 2113 test sentences |
|
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator |
|
2023-10-16 21:44:09,764 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:44:09,764 Train: 6183 sentences |
|
2023-10-16 21:44:09,764 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 21:44:09,764 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:44:09,764 Training Params: |
|
2023-10-16 21:44:09,764 - learning_rate: "5e-05" |
|
2023-10-16 21:44:09,764 - mini_batch_size: "4" |
|
2023-10-16 21:44:09,764 - max_epochs: "10" |
|
2023-10-16 21:44:09,764 - shuffle: "True" |
|
2023-10-16 21:44:09,764 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:44:09,764 Plugins: |
|
2023-10-16 21:44:09,764 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 21:44:09,764 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:44:09,764 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 21:44:09,764 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 21:44:09,764 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:44:09,764 Computation: |
|
2023-10-16 21:44:09,764 - compute on device: cuda:0 |
|
2023-10-16 21:44:09,764 - embedding storage: none |
|
2023-10-16 21:44:09,764 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:44:09,764 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-16 21:44:09,764 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:44:09,764 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:44:16,668 epoch 1 - iter 154/1546 - loss 1.65007029 - time (sec): 6.90 - samples/sec: 1715.81 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 21:44:23,611 epoch 1 - iter 308/1546 - loss 0.94129662 - time (sec): 13.85 - samples/sec: 1714.44 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 21:44:30,625 epoch 1 - iter 462/1546 - loss 0.64763640 - time (sec): 20.86 - samples/sec: 1777.67 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 21:44:37,494 epoch 1 - iter 616/1546 - loss 0.52517050 - time (sec): 27.73 - samples/sec: 1772.61 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 21:44:44,367 epoch 1 - iter 770/1546 - loss 0.44714948 - time (sec): 34.60 - samples/sec: 1761.60 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 21:44:51,242 epoch 1 - iter 924/1546 - loss 0.39140356 - time (sec): 41.48 - samples/sec: 1762.58 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 21:44:58,147 epoch 1 - iter 1078/1546 - loss 0.35053117 - time (sec): 48.38 - samples/sec: 1776.97 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 21:45:05,162 epoch 1 - iter 1232/1546 - loss 0.32273432 - time (sec): 55.40 - samples/sec: 1793.05 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 21:45:12,237 epoch 1 - iter 1386/1546 - loss 0.29966657 - time (sec): 62.47 - samples/sec: 1779.42 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 21:45:19,081 epoch 1 - iter 1540/1546 - loss 0.28039720 - time (sec): 69.32 - samples/sec: 1787.73 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-16 21:45:19,336 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:45:19,336 EPOCH 1 done: loss 0.2799 - lr: 0.000050 |
|
2023-10-16 21:45:21,072 DEV : loss 0.0895804762840271 - f1-score (micro avg) 0.637 |
|
2023-10-16 21:45:21,095 saving best model |
|
2023-10-16 21:45:21,441 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:45:28,278 epoch 2 - iter 154/1546 - loss 0.09243588 - time (sec): 6.84 - samples/sec: 1772.05 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 21:45:35,087 epoch 2 - iter 308/1546 - loss 0.10563878 - time (sec): 13.65 - samples/sec: 1755.06 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 21:45:42,042 epoch 2 - iter 462/1546 - loss 0.10498928 - time (sec): 20.60 - samples/sec: 1786.65 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 21:45:48,905 epoch 2 - iter 616/1546 - loss 0.10253956 - time (sec): 27.46 - samples/sec: 1779.24 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 21:45:55,770 epoch 2 - iter 770/1546 - loss 0.10307917 - time (sec): 34.33 - samples/sec: 1793.99 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 21:46:02,644 epoch 2 - iter 924/1546 - loss 0.10366383 - time (sec): 41.20 - samples/sec: 1775.94 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 21:46:09,503 epoch 2 - iter 1078/1546 - loss 0.10275984 - time (sec): 48.06 - samples/sec: 1780.26 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 21:46:16,416 epoch 2 - iter 1232/1546 - loss 0.10145230 - time (sec): 54.97 - samples/sec: 1779.20 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 21:46:23,230 epoch 2 - iter 1386/1546 - loss 0.09850958 - time (sec): 61.79 - samples/sec: 1784.50 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 21:46:30,555 epoch 2 - iter 1540/1546 - loss 0.09639554 - time (sec): 69.11 - samples/sec: 1791.82 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 21:46:30,812 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:46:30,812 EPOCH 2 done: loss 0.0961 - lr: 0.000044 |
|
2023-10-16 21:46:32,814 DEV : loss 0.0685592070221901 - f1-score (micro avg) 0.7696 |
|
2023-10-16 21:46:32,827 saving best model |
|
2023-10-16 21:46:33,306 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:46:40,223 epoch 3 - iter 154/1546 - loss 0.06577964 - time (sec): 6.91 - samples/sec: 1853.17 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 21:46:46,999 epoch 3 - iter 308/1546 - loss 0.06885278 - time (sec): 13.69 - samples/sec: 1832.42 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 21:46:53,930 epoch 3 - iter 462/1546 - loss 0.07613517 - time (sec): 20.62 - samples/sec: 1828.52 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 21:47:00,686 epoch 3 - iter 616/1546 - loss 0.07339670 - time (sec): 27.38 - samples/sec: 1822.81 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 21:47:07,536 epoch 3 - iter 770/1546 - loss 0.07279666 - time (sec): 34.23 - samples/sec: 1827.46 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 21:47:14,494 epoch 3 - iter 924/1546 - loss 0.07159335 - time (sec): 41.19 - samples/sec: 1818.28 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 21:47:21,360 epoch 3 - iter 1078/1546 - loss 0.06968918 - time (sec): 48.05 - samples/sec: 1814.87 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 21:47:28,153 epoch 3 - iter 1232/1546 - loss 0.06933368 - time (sec): 54.84 - samples/sec: 1802.51 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 21:47:35,020 epoch 3 - iter 1386/1546 - loss 0.06859729 - time (sec): 61.71 - samples/sec: 1812.58 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-16 21:47:41,777 epoch 3 - iter 1540/1546 - loss 0.06849329 - time (sec): 68.47 - samples/sec: 1810.30 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-16 21:47:42,035 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:47:42,035 EPOCH 3 done: loss 0.0685 - lr: 0.000039 |
|
2023-10-16 21:47:44,051 DEV : loss 0.08517798036336899 - f1-score (micro avg) 0.7447 |
|
2023-10-16 21:47:44,063 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:47:51,034 epoch 4 - iter 154/1546 - loss 0.05538366 - time (sec): 6.97 - samples/sec: 1785.10 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 21:47:57,870 epoch 4 - iter 308/1546 - loss 0.05887323 - time (sec): 13.81 - samples/sec: 1724.84 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 21:48:04,737 epoch 4 - iter 462/1546 - loss 0.05491630 - time (sec): 20.67 - samples/sec: 1741.61 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 21:48:11,492 epoch 4 - iter 616/1546 - loss 0.05243751 - time (sec): 27.43 - samples/sec: 1758.25 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 21:48:18,347 epoch 4 - iter 770/1546 - loss 0.05291705 - time (sec): 34.28 - samples/sec: 1778.29 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 21:48:25,115 epoch 4 - iter 924/1546 - loss 0.05316064 - time (sec): 41.05 - samples/sec: 1782.75 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 21:48:31,936 epoch 4 - iter 1078/1546 - loss 0.05226355 - time (sec): 47.87 - samples/sec: 1787.50 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 21:48:38,834 epoch 4 - iter 1232/1546 - loss 0.05147292 - time (sec): 54.77 - samples/sec: 1793.52 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 21:48:45,749 epoch 4 - iter 1386/1546 - loss 0.04971427 - time (sec): 61.68 - samples/sec: 1797.52 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 21:48:52,720 epoch 4 - iter 1540/1546 - loss 0.04975106 - time (sec): 68.66 - samples/sec: 1804.48 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 21:48:52,978 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:48:52,978 EPOCH 4 done: loss 0.0497 - lr: 0.000033 |
|
2023-10-16 21:48:55,028 DEV : loss 0.09497705101966858 - f1-score (micro avg) 0.7821 |
|
2023-10-16 21:48:55,040 saving best model |
|
2023-10-16 21:48:55,509 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:49:02,349 epoch 5 - iter 154/1546 - loss 0.01734854 - time (sec): 6.83 - samples/sec: 1839.45 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 21:49:09,103 epoch 5 - iter 308/1546 - loss 0.02591889 - time (sec): 13.58 - samples/sec: 1813.31 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 21:49:15,969 epoch 5 - iter 462/1546 - loss 0.03215375 - time (sec): 20.45 - samples/sec: 1762.51 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 21:49:22,877 epoch 5 - iter 616/1546 - loss 0.03561342 - time (sec): 27.36 - samples/sec: 1798.27 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 21:49:29,717 epoch 5 - iter 770/1546 - loss 0.04014613 - time (sec): 34.20 - samples/sec: 1807.31 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 21:49:36,641 epoch 5 - iter 924/1546 - loss 0.04102622 - time (sec): 41.12 - samples/sec: 1818.94 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 21:49:43,561 epoch 5 - iter 1078/1546 - loss 0.03999871 - time (sec): 48.04 - samples/sec: 1820.62 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 21:49:50,438 epoch 5 - iter 1232/1546 - loss 0.04053831 - time (sec): 54.92 - samples/sec: 1816.84 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 21:49:57,266 epoch 5 - iter 1386/1546 - loss 0.04007914 - time (sec): 61.75 - samples/sec: 1811.89 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 21:50:04,082 epoch 5 - iter 1540/1546 - loss 0.03801735 - time (sec): 68.56 - samples/sec: 1807.74 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 21:50:04,338 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:50:04,338 EPOCH 5 done: loss 0.0380 - lr: 0.000028 |
|
2023-10-16 21:50:06,356 DEV : loss 0.09318046271800995 - f1-score (micro avg) 0.7782 |
|
2023-10-16 21:50:06,368 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:50:13,194 epoch 6 - iter 154/1546 - loss 0.02556469 - time (sec): 6.82 - samples/sec: 1809.18 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 21:50:20,094 epoch 6 - iter 308/1546 - loss 0.03097773 - time (sec): 13.72 - samples/sec: 1736.11 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 21:50:26,958 epoch 6 - iter 462/1546 - loss 0.02965652 - time (sec): 20.59 - samples/sec: 1746.62 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 21:50:33,830 epoch 6 - iter 616/1546 - loss 0.02790282 - time (sec): 27.46 - samples/sec: 1771.83 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 21:50:40,624 epoch 6 - iter 770/1546 - loss 0.02749592 - time (sec): 34.25 - samples/sec: 1771.92 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 21:50:47,433 epoch 6 - iter 924/1546 - loss 0.02610977 - time (sec): 41.06 - samples/sec: 1768.73 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 21:50:54,284 epoch 6 - iter 1078/1546 - loss 0.02639060 - time (sec): 47.91 - samples/sec: 1772.33 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 21:51:01,306 epoch 6 - iter 1232/1546 - loss 0.02605220 - time (sec): 54.94 - samples/sec: 1777.47 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 21:51:08,170 epoch 6 - iter 1386/1546 - loss 0.02522412 - time (sec): 61.80 - samples/sec: 1771.19 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 21:51:15,235 epoch 6 - iter 1540/1546 - loss 0.02642731 - time (sec): 68.87 - samples/sec: 1795.68 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 21:51:15,502 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:51:15,502 EPOCH 6 done: loss 0.0266 - lr: 0.000022 |
|
2023-10-16 21:51:17,895 DEV : loss 0.10446853190660477 - f1-score (micro avg) 0.7795 |
|
2023-10-16 21:51:17,908 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:51:24,779 epoch 7 - iter 154/1546 - loss 0.02404472 - time (sec): 6.87 - samples/sec: 1839.81 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 21:51:31,664 epoch 7 - iter 308/1546 - loss 0.02056482 - time (sec): 13.75 - samples/sec: 1861.72 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 21:51:38,496 epoch 7 - iter 462/1546 - loss 0.02046697 - time (sec): 20.59 - samples/sec: 1840.24 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 21:51:45,451 epoch 7 - iter 616/1546 - loss 0.02166477 - time (sec): 27.54 - samples/sec: 1809.86 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 21:51:52,463 epoch 7 - iter 770/1546 - loss 0.02146336 - time (sec): 34.55 - samples/sec: 1805.13 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 21:51:59,383 epoch 7 - iter 924/1546 - loss 0.02253466 - time (sec): 41.47 - samples/sec: 1791.51 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 21:52:06,243 epoch 7 - iter 1078/1546 - loss 0.02170121 - time (sec): 48.33 - samples/sec: 1787.33 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 21:52:13,045 epoch 7 - iter 1232/1546 - loss 0.02115814 - time (sec): 55.14 - samples/sec: 1788.46 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 21:52:19,960 epoch 7 - iter 1386/1546 - loss 0.02075815 - time (sec): 62.05 - samples/sec: 1795.57 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 21:52:26,876 epoch 7 - iter 1540/1546 - loss 0.02048075 - time (sec): 68.97 - samples/sec: 1796.06 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 21:52:27,154 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:52:27,154 EPOCH 7 done: loss 0.0204 - lr: 0.000017 |
|
2023-10-16 21:52:29,150 DEV : loss 0.11760783195495605 - f1-score (micro avg) 0.7724 |
|
2023-10-16 21:52:29,163 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:52:36,119 epoch 8 - iter 154/1546 - loss 0.00904914 - time (sec): 6.96 - samples/sec: 1799.83 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 21:52:43,087 epoch 8 - iter 308/1546 - loss 0.01120019 - time (sec): 13.92 - samples/sec: 1821.61 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 21:52:49,903 epoch 8 - iter 462/1546 - loss 0.01332359 - time (sec): 20.74 - samples/sec: 1822.53 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 21:52:56,846 epoch 8 - iter 616/1546 - loss 0.01192538 - time (sec): 27.68 - samples/sec: 1842.43 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 21:53:03,643 epoch 8 - iter 770/1546 - loss 0.01078426 - time (sec): 34.48 - samples/sec: 1815.73 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 21:53:10,438 epoch 8 - iter 924/1546 - loss 0.01161029 - time (sec): 41.27 - samples/sec: 1813.84 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 21:53:17,145 epoch 8 - iter 1078/1546 - loss 0.01243620 - time (sec): 47.98 - samples/sec: 1810.28 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 21:53:23,967 epoch 8 - iter 1232/1546 - loss 0.01233856 - time (sec): 54.80 - samples/sec: 1804.05 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 21:53:30,776 epoch 8 - iter 1386/1546 - loss 0.01272431 - time (sec): 61.61 - samples/sec: 1800.38 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 21:53:37,834 epoch 8 - iter 1540/1546 - loss 0.01300626 - time (sec): 68.67 - samples/sec: 1804.05 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 21:53:38,096 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:53:38,096 EPOCH 8 done: loss 0.0130 - lr: 0.000011 |
|
2023-10-16 21:53:40,162 DEV : loss 0.11045785248279572 - f1-score (micro avg) 0.7898 |
|
2023-10-16 21:53:40,175 saving best model |
|
2023-10-16 21:53:40,649 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:53:47,486 epoch 9 - iter 154/1546 - loss 0.00775423 - time (sec): 6.83 - samples/sec: 1831.86 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 21:53:54,019 epoch 9 - iter 308/1546 - loss 0.00790082 - time (sec): 13.36 - samples/sec: 1827.72 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 21:54:00,556 epoch 9 - iter 462/1546 - loss 0.00714745 - time (sec): 19.90 - samples/sec: 1837.52 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 21:54:07,065 epoch 9 - iter 616/1546 - loss 0.00761023 - time (sec): 26.41 - samples/sec: 1842.09 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 21:54:13,656 epoch 9 - iter 770/1546 - loss 0.00769417 - time (sec): 33.00 - samples/sec: 1841.10 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 21:54:20,344 epoch 9 - iter 924/1546 - loss 0.00778298 - time (sec): 39.69 - samples/sec: 1855.33 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 21:54:26,945 epoch 9 - iter 1078/1546 - loss 0.00794647 - time (sec): 46.29 - samples/sec: 1864.63 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 21:54:33,540 epoch 9 - iter 1232/1546 - loss 0.00813811 - time (sec): 52.89 - samples/sec: 1864.94 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 21:54:40,134 epoch 9 - iter 1386/1546 - loss 0.00823860 - time (sec): 59.48 - samples/sec: 1871.30 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 21:54:46,726 epoch 9 - iter 1540/1546 - loss 0.00873508 - time (sec): 66.07 - samples/sec: 1876.32 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 21:54:46,975 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:54:46,975 EPOCH 9 done: loss 0.0087 - lr: 0.000006 |
|
2023-10-16 21:54:48,998 DEV : loss 0.11864420771598816 - f1-score (micro avg) 0.7941 |
|
2023-10-16 21:54:49,011 saving best model |
|
2023-10-16 21:54:49,452 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:54:56,289 epoch 10 - iter 154/1546 - loss 0.00586620 - time (sec): 6.83 - samples/sec: 1818.10 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 21:55:03,238 epoch 10 - iter 308/1546 - loss 0.00524542 - time (sec): 13.78 - samples/sec: 1808.72 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 21:55:10,143 epoch 10 - iter 462/1546 - loss 0.00563260 - time (sec): 20.69 - samples/sec: 1780.72 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 21:55:17,083 epoch 10 - iter 616/1546 - loss 0.00560671 - time (sec): 27.63 - samples/sec: 1810.14 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 21:55:23,981 epoch 10 - iter 770/1546 - loss 0.00551268 - time (sec): 34.53 - samples/sec: 1798.02 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 21:55:30,938 epoch 10 - iter 924/1546 - loss 0.00541465 - time (sec): 41.48 - samples/sec: 1806.94 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 21:55:37,821 epoch 10 - iter 1078/1546 - loss 0.00581212 - time (sec): 48.37 - samples/sec: 1798.64 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 21:55:44,637 epoch 10 - iter 1232/1546 - loss 0.00550965 - time (sec): 55.18 - samples/sec: 1802.49 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 21:55:51,547 epoch 10 - iter 1386/1546 - loss 0.00535177 - time (sec): 62.09 - samples/sec: 1793.62 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 21:55:58,452 epoch 10 - iter 1540/1546 - loss 0.00556052 - time (sec): 69.00 - samples/sec: 1791.86 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 21:55:58,718 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:55:58,719 EPOCH 10 done: loss 0.0055 - lr: 0.000000 |
|
2023-10-16 21:56:00,754 DEV : loss 0.1224876418709755 - f1-score (micro avg) 0.7966 |
|
2023-10-16 21:56:00,767 saving best model |
|
2023-10-16 21:56:01,611 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 21:56:01,612 Loading model from best epoch ... |
|
2023-10-16 21:56:03,412 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET |
|
2023-10-16 21:56:09,090 |
|
Results:
- F-score (micro) 0.8125
- F-score (macro) 0.7224
- Accuracy 0.7061

By class:
              precision    recall  f1-score   support

         LOC     0.8716    0.8467    0.8590       946
    BUILDING     0.6347    0.5730    0.6023       185
      STREET     0.6667    0.7500    0.7059        56

   micro avg     0.8259    0.7995    0.8125      1187
   macro avg     0.7243    0.7232    0.7224      1187
weighted avg     0.8250    0.7995    0.8117      1187
|
|
|
2023-10-16 21:56:09,090 ---------------------------------------------------------------------------------------------------- |
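The micro and macro F-scores in the final evaluation can be cross-checked from the per-class rows of the report. The sketch below is illustrative only: it re-enters the rounded values printed in the log by hand (it does not call Flair), and the names `per_class`, `f1`, `micro_f1`, and `macro_f1` are hypothetical helpers introduced here. Because the inputs are already rounded to four decimals, the recomputed values are only expected to agree with the log to about that precision.

```python
# Rounded values copied from the "By class" report in the log:
# label -> (precision, recall, f1-score, support)
per_class = {
    "LOC": (0.8716, 0.8467, 0.8590, 946),
    "BUILDING": (0.6347, 0.5730, 0.6023, 185),
    "STREET": (0.6667, 0.7500, 0.7059, 56),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1: harmonic mean of the micro-averaged precision and recall
# printed in the "micro avg" row.
micro_f1 = f1(0.8259, 0.7995)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Total support: number of gold entities across the three classes.
total_support = sum(row[3] for row in per_class.values())

print(f"micro F1 = {micro_f1:.4f}")
print(f"macro F1 = {macro_f1:.4f}")
print(f"support  = {total_support}")
```

The large gap between micro (0.8125) and macro (0.7224) F1 follows from the class imbalance: LOC accounts for 946 of the 1187 entities and scores well above BUILDING and STREET, so it dominates the micro average.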
|
|
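The `lr` column in the per-iteration lines is consistent with a linear warmup followed by a linear decay to zero. The sketch below reproduces that shape from the logged settings (`learning_rate` 5e-05, `warmup_fraction` 0.1, 1546 iterations per epoch, 10 epochs). It is a minimal approximation under stated assumptions, not Flair's implementation: the log only names the plugin `LinearScheduler`, so the exact formula, and the helper names `linear_schedule_lr` and `lr_at`, are assumptions made for illustration.

```python
ITERS_PER_EPOCH = 1546  # from "iter 154/1546" in the log
MAX_EPOCHS = 10         # from "max_epochs" in the log
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS


def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float = 5e-05,
                       warmup_fraction: float = 0.1) -> float:
    """Assumed schedule: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


def lr_at(epoch: int, iteration: int) -> float:
    """Learning rate at a given (1-based) epoch and iteration of this run."""
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    return linear_schedule_lr(step, TOTAL_STEPS)


for epoch, iteration in [(1, 154), (1, 1540), (2, 1540), (5, 770), (10, 1540)]:
    print(f"epoch {epoch} - iter {iteration} - lr: {lr_at(epoch, iteration):.6f}")
```

Under these assumptions the warmup spans 1546 steps, which is exactly the first epoch; this matches the log, where `lr` climbs to 0.000050 by the end of epoch 1 and decreases afterwards.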