2023-10-17 08:36:58,474 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:58,475 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:36:58,475 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:58,475 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:36:58,475 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:58,475 Train: 1100 sentences
2023-10-17 08:36:58,475 (train_with_dev=False, train_with_test=False)
2023-10-17 08:36:58,475 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:58,475 Training Params:
2023-10-17 08:36:58,475 - learning_rate: "3e-05"
2023-10-17 08:36:58,475 - mini_batch_size: "4"
2023-10-17 08:36:58,475 - max_epochs: "10"
2023-10-17 08:36:58,475 - shuffle: "True"
2023-10-17 08:36:58,476 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:58,476 Plugins:
2023-10-17 08:36:58,476 - TensorboardLogger
2023-10-17 08:36:58,476 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:36:58,476 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:58,476 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:36:58,476 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:36:58,476 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:58,476 Computation:
2023-10-17 08:36:58,476 - compute on device: cuda:0
2023-10-17 08:36:58,476 - embedding storage: none
2023-10-17 08:36:58,476 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:58,476 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 08:36:58,476 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:58,476 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:58,476 Logging anything other than scalars to TensorBoard is
currently not supported. 2023-10-17 08:36:59,775 epoch 1 - iter 27/275 - loss 4.18458578 - time (sec): 1.30 - samples/sec: 1607.98 - lr: 0.000003 - momentum: 0.000000 2023-10-17 08:37:01,092 epoch 1 - iter 54/275 - loss 3.52995056 - time (sec): 2.62 - samples/sec: 1665.60 - lr: 0.000006 - momentum: 0.000000 2023-10-17 08:37:02,364 epoch 1 - iter 81/275 - loss 2.73946158 - time (sec): 3.89 - samples/sec: 1742.61 - lr: 0.000009 - momentum: 0.000000 2023-10-17 08:37:03,575 epoch 1 - iter 108/275 - loss 2.32116704 - time (sec): 5.10 - samples/sec: 1717.25 - lr: 0.000012 - momentum: 0.000000 2023-10-17 08:37:04,802 epoch 1 - iter 135/275 - loss 1.97126016 - time (sec): 6.33 - samples/sec: 1750.19 - lr: 0.000015 - momentum: 0.000000 2023-10-17 08:37:06,020 epoch 1 - iter 162/275 - loss 1.71777691 - time (sec): 7.54 - samples/sec: 1773.32 - lr: 0.000018 - momentum: 0.000000 2023-10-17 08:37:07,229 epoch 1 - iter 189/275 - loss 1.54243129 - time (sec): 8.75 - samples/sec: 1772.43 - lr: 0.000021 - momentum: 0.000000 2023-10-17 08:37:08,460 epoch 1 - iter 216/275 - loss 1.39723452 - time (sec): 9.98 - samples/sec: 1779.60 - lr: 0.000023 - momentum: 0.000000 2023-10-17 08:37:09,699 epoch 1 - iter 243/275 - loss 1.27203012 - time (sec): 11.22 - samples/sec: 1793.76 - lr: 0.000026 - momentum: 0.000000 2023-10-17 08:37:10,955 epoch 1 - iter 270/275 - loss 1.17202515 - time (sec): 12.48 - samples/sec: 1794.93 - lr: 0.000029 - momentum: 0.000000 2023-10-17 08:37:11,175 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:37:11,175 EPOCH 1 done: loss 1.1580 - lr: 0.000029 2023-10-17 08:37:12,109 DEV : loss 0.24835138022899628 - f1-score (micro avg) 0.6724 2023-10-17 08:37:12,118 saving best model 2023-10-17 08:37:12,514 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:37:13,747 epoch 2 - iter 27/275 - loss 0.29820852 - time (sec): 1.23 - samples/sec: 
1819.76 - lr: 0.000030 - momentum: 0.000000 2023-10-17 08:37:15,002 epoch 2 - iter 54/275 - loss 0.23526719 - time (sec): 2.49 - samples/sec: 1815.59 - lr: 0.000029 - momentum: 0.000000 2023-10-17 08:37:16,252 epoch 2 - iter 81/275 - loss 0.21389662 - time (sec): 3.74 - samples/sec: 1840.47 - lr: 0.000029 - momentum: 0.000000 2023-10-17 08:37:17,492 epoch 2 - iter 108/275 - loss 0.20199060 - time (sec): 4.98 - samples/sec: 1829.24 - lr: 0.000029 - momentum: 0.000000 2023-10-17 08:37:18,780 epoch 2 - iter 135/275 - loss 0.20714418 - time (sec): 6.26 - samples/sec: 1849.93 - lr: 0.000028 - momentum: 0.000000 2023-10-17 08:37:20,000 epoch 2 - iter 162/275 - loss 0.19899218 - time (sec): 7.48 - samples/sec: 1839.93 - lr: 0.000028 - momentum: 0.000000 2023-10-17 08:37:21,223 epoch 2 - iter 189/275 - loss 0.18640584 - time (sec): 8.71 - samples/sec: 1826.78 - lr: 0.000028 - momentum: 0.000000 2023-10-17 08:37:22,426 epoch 2 - iter 216/275 - loss 0.17495426 - time (sec): 9.91 - samples/sec: 1823.46 - lr: 0.000027 - momentum: 0.000000 2023-10-17 08:37:23,624 epoch 2 - iter 243/275 - loss 0.17167750 - time (sec): 11.11 - samples/sec: 1813.93 - lr: 0.000027 - momentum: 0.000000 2023-10-17 08:37:24,834 epoch 2 - iter 270/275 - loss 0.17281482 - time (sec): 12.32 - samples/sec: 1821.11 - lr: 0.000027 - momentum: 0.000000 2023-10-17 08:37:25,066 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:37:25,066 EPOCH 2 done: loss 0.1708 - lr: 0.000027 2023-10-17 08:37:25,708 DEV : loss 0.17084632813930511 - f1-score (micro avg) 0.7878 2023-10-17 08:37:25,713 saving best model 2023-10-17 08:37:26,168 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:37:27,514 epoch 3 - iter 27/275 - loss 0.09476884 - time (sec): 1.34 - samples/sec: 1858.77 - lr: 0.000026 - momentum: 0.000000 2023-10-17 08:37:28,804 epoch 3 - iter 54/275 - loss 0.10381617 - time (sec): 2.63 - 
samples/sec: 1816.48 - lr: 0.000026 - momentum: 0.000000 2023-10-17 08:37:30,061 epoch 3 - iter 81/275 - loss 0.08816859 - time (sec): 3.89 - samples/sec: 1809.77 - lr: 0.000026 - momentum: 0.000000 2023-10-17 08:37:31,279 epoch 3 - iter 108/275 - loss 0.09298207 - time (sec): 5.10 - samples/sec: 1858.42 - lr: 0.000025 - momentum: 0.000000 2023-10-17 08:37:32,446 epoch 3 - iter 135/275 - loss 0.08748654 - time (sec): 6.27 - samples/sec: 1829.77 - lr: 0.000025 - momentum: 0.000000 2023-10-17 08:37:33,636 epoch 3 - iter 162/275 - loss 0.08756638 - time (sec): 7.46 - samples/sec: 1832.04 - lr: 0.000025 - momentum: 0.000000 2023-10-17 08:37:34,860 epoch 3 - iter 189/275 - loss 0.09520057 - time (sec): 8.68 - samples/sec: 1832.53 - lr: 0.000024 - momentum: 0.000000 2023-10-17 08:37:36,073 epoch 3 - iter 216/275 - loss 0.09802843 - time (sec): 9.90 - samples/sec: 1837.75 - lr: 0.000024 - momentum: 0.000000 2023-10-17 08:37:37,279 epoch 3 - iter 243/275 - loss 0.10354344 - time (sec): 11.10 - samples/sec: 1832.55 - lr: 0.000024 - momentum: 0.000000 2023-10-17 08:37:38,492 epoch 3 - iter 270/275 - loss 0.10281829 - time (sec): 12.32 - samples/sec: 1822.19 - lr: 0.000023 - momentum: 0.000000 2023-10-17 08:37:38,722 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:37:38,722 EPOCH 3 done: loss 0.1016 - lr: 0.000023 2023-10-17 08:37:39,360 DEV : loss 0.1669451892375946 - f1-score (micro avg) 0.8578 2023-10-17 08:37:39,365 saving best model 2023-10-17 08:37:39,814 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:37:41,065 epoch 4 - iter 27/275 - loss 0.06433406 - time (sec): 1.25 - samples/sec: 1732.40 - lr: 0.000023 - momentum: 0.000000 2023-10-17 08:37:42,286 epoch 4 - iter 54/275 - loss 0.05325618 - time (sec): 2.47 - samples/sec: 1795.30 - lr: 0.000023 - momentum: 0.000000 2023-10-17 08:37:43,493 epoch 4 - iter 81/275 - loss 0.05697675 - time 
(sec): 3.68 - samples/sec: 1803.32 - lr: 0.000022 - momentum: 0.000000 2023-10-17 08:37:44,721 epoch 4 - iter 108/275 - loss 0.05585319 - time (sec): 4.91 - samples/sec: 1757.38 - lr: 0.000022 - momentum: 0.000000 2023-10-17 08:37:45,977 epoch 4 - iter 135/275 - loss 0.06478508 - time (sec): 6.16 - samples/sec: 1771.47 - lr: 0.000022 - momentum: 0.000000 2023-10-17 08:37:47,210 epoch 4 - iter 162/275 - loss 0.06539720 - time (sec): 7.39 - samples/sec: 1786.82 - lr: 0.000021 - momentum: 0.000000 2023-10-17 08:37:48,441 epoch 4 - iter 189/275 - loss 0.07160567 - time (sec): 8.63 - samples/sec: 1767.48 - lr: 0.000021 - momentum: 0.000000 2023-10-17 08:37:49,673 epoch 4 - iter 216/275 - loss 0.07816350 - time (sec): 9.86 - samples/sec: 1798.06 - lr: 0.000021 - momentum: 0.000000 2023-10-17 08:37:50,930 epoch 4 - iter 243/275 - loss 0.08021309 - time (sec): 11.11 - samples/sec: 1793.74 - lr: 0.000020 - momentum: 0.000000 2023-10-17 08:37:52,192 epoch 4 - iter 270/275 - loss 0.08200783 - time (sec): 12.38 - samples/sec: 1803.55 - lr: 0.000020 - momentum: 0.000000 2023-10-17 08:37:52,414 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:37:52,414 EPOCH 4 done: loss 0.0815 - lr: 0.000020 2023-10-17 08:37:53,069 DEV : loss 0.17194409668445587 - f1-score (micro avg) 0.8676 2023-10-17 08:37:53,074 saving best model 2023-10-17 08:37:53,493 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:37:54,661 epoch 5 - iter 27/275 - loss 0.10193073 - time (sec): 1.16 - samples/sec: 1980.50 - lr: 0.000020 - momentum: 0.000000 2023-10-17 08:37:55,826 epoch 5 - iter 54/275 - loss 0.10581887 - time (sec): 2.33 - samples/sec: 1992.98 - lr: 0.000019 - momentum: 0.000000 2023-10-17 08:37:56,994 epoch 5 - iter 81/275 - loss 0.08517821 - time (sec): 3.49 - samples/sec: 1958.62 - lr: 0.000019 - momentum: 0.000000 2023-10-17 08:37:58,283 epoch 5 - iter 108/275 - loss 
0.08332972 - time (sec): 4.78 - samples/sec: 1885.61 - lr: 0.000019 - momentum: 0.000000 2023-10-17 08:37:59,502 epoch 5 - iter 135/275 - loss 0.07897071 - time (sec): 6.00 - samples/sec: 1878.56 - lr: 0.000018 - momentum: 0.000000 2023-10-17 08:38:00,721 epoch 5 - iter 162/275 - loss 0.08025946 - time (sec): 7.22 - samples/sec: 1863.81 - lr: 0.000018 - momentum: 0.000000 2023-10-17 08:38:01,973 epoch 5 - iter 189/275 - loss 0.07544818 - time (sec): 8.47 - samples/sec: 1862.92 - lr: 0.000018 - momentum: 0.000000 2023-10-17 08:38:03,224 epoch 5 - iter 216/275 - loss 0.07174048 - time (sec): 9.72 - samples/sec: 1860.61 - lr: 0.000017 - momentum: 0.000000 2023-10-17 08:38:04,430 epoch 5 - iter 243/275 - loss 0.06753309 - time (sec): 10.93 - samples/sec: 1856.49 - lr: 0.000017 - momentum: 0.000000 2023-10-17 08:38:05,649 epoch 5 - iter 270/275 - loss 0.06679138 - time (sec): 12.15 - samples/sec: 1841.03 - lr: 0.000017 - momentum: 0.000000 2023-10-17 08:38:05,872 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:38:05,872 EPOCH 5 done: loss 0.0656 - lr: 0.000017 2023-10-17 08:38:06,507 DEV : loss 0.18873989582061768 - f1-score (micro avg) 0.8697 2023-10-17 08:38:06,512 saving best model 2023-10-17 08:38:06,950 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:38:08,212 epoch 6 - iter 27/275 - loss 0.05035342 - time (sec): 1.26 - samples/sec: 1817.82 - lr: 0.000016 - momentum: 0.000000 2023-10-17 08:38:09,427 epoch 6 - iter 54/275 - loss 0.05035438 - time (sec): 2.48 - samples/sec: 1802.99 - lr: 0.000016 - momentum: 0.000000 2023-10-17 08:38:10,637 epoch 6 - iter 81/275 - loss 0.05778617 - time (sec): 3.69 - samples/sec: 1773.27 - lr: 0.000016 - momentum: 0.000000 2023-10-17 08:38:11,834 epoch 6 - iter 108/275 - loss 0.06517433 - time (sec): 4.88 - samples/sec: 1764.60 - lr: 0.000015 - momentum: 0.000000 2023-10-17 08:38:13,077 epoch 6 - iter 
135/275 - loss 0.06395069 - time (sec): 6.13 - samples/sec: 1811.73 - lr: 0.000015 - momentum: 0.000000 2023-10-17 08:38:14,284 epoch 6 - iter 162/275 - loss 0.05909466 - time (sec): 7.33 - samples/sec: 1787.73 - lr: 0.000015 - momentum: 0.000000 2023-10-17 08:38:15,483 epoch 6 - iter 189/275 - loss 0.05858764 - time (sec): 8.53 - samples/sec: 1827.95 - lr: 0.000014 - momentum: 0.000000 2023-10-17 08:38:16,675 epoch 6 - iter 216/275 - loss 0.05612642 - time (sec): 9.72 - samples/sec: 1829.87 - lr: 0.000014 - momentum: 0.000000 2023-10-17 08:38:17,867 epoch 6 - iter 243/275 - loss 0.05375890 - time (sec): 10.92 - samples/sec: 1823.26 - lr: 0.000014 - momentum: 0.000000 2023-10-17 08:38:19,068 epoch 6 - iter 270/275 - loss 0.05000841 - time (sec): 12.12 - samples/sec: 1837.48 - lr: 0.000013 - momentum: 0.000000 2023-10-17 08:38:19,295 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:38:19,295 EPOCH 6 done: loss 0.0503 - lr: 0.000013 2023-10-17 08:38:19,926 DEV : loss 0.19087766110897064 - f1-score (micro avg) 0.8643 2023-10-17 08:38:19,931 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:38:21,162 epoch 7 - iter 27/275 - loss 0.04592519 - time (sec): 1.23 - samples/sec: 1883.07 - lr: 0.000013 - momentum: 0.000000 2023-10-17 08:38:22,366 epoch 7 - iter 54/275 - loss 0.03291311 - time (sec): 2.43 - samples/sec: 1838.45 - lr: 0.000013 - momentum: 0.000000 2023-10-17 08:38:23,561 epoch 7 - iter 81/275 - loss 0.02930348 - time (sec): 3.63 - samples/sec: 1820.31 - lr: 0.000012 - momentum: 0.000000 2023-10-17 08:38:24,776 epoch 7 - iter 108/275 - loss 0.04025652 - time (sec): 4.84 - samples/sec: 1783.80 - lr: 0.000012 - momentum: 0.000000 2023-10-17 08:38:26,025 epoch 7 - iter 135/275 - loss 0.03406485 - time (sec): 6.09 - samples/sec: 1832.62 - lr: 0.000012 - momentum: 0.000000 2023-10-17 08:38:27,240 epoch 7 - iter 162/275 - loss 0.04109484 - 
time (sec): 7.31 - samples/sec: 1852.06 - lr: 0.000011 - momentum: 0.000000 2023-10-17 08:38:28,446 epoch 7 - iter 189/275 - loss 0.04358684 - time (sec): 8.51 - samples/sec: 1840.09 - lr: 0.000011 - momentum: 0.000000 2023-10-17 08:38:29,663 epoch 7 - iter 216/275 - loss 0.04150936 - time (sec): 9.73 - samples/sec: 1848.67 - lr: 0.000011 - momentum: 0.000000 2023-10-17 08:38:30,890 epoch 7 - iter 243/275 - loss 0.04573996 - time (sec): 10.96 - samples/sec: 1843.23 - lr: 0.000010 - momentum: 0.000000 2023-10-17 08:38:32,107 epoch 7 - iter 270/275 - loss 0.04355072 - time (sec): 12.18 - samples/sec: 1846.29 - lr: 0.000010 - momentum: 0.000000 2023-10-17 08:38:32,333 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:38:32,333 EPOCH 7 done: loss 0.0451 - lr: 0.000010 2023-10-17 08:38:32,970 DEV : loss 0.18272067606449127 - f1-score (micro avg) 0.8841 2023-10-17 08:38:32,974 saving best model 2023-10-17 08:38:33,402 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:38:34,625 epoch 8 - iter 27/275 - loss 0.03791177 - time (sec): 1.22 - samples/sec: 1818.03 - lr: 0.000010 - momentum: 0.000000 2023-10-17 08:38:35,843 epoch 8 - iter 54/275 - loss 0.03884052 - time (sec): 2.44 - samples/sec: 1798.40 - lr: 0.000009 - momentum: 0.000000 2023-10-17 08:38:37,056 epoch 8 - iter 81/275 - loss 0.03522629 - time (sec): 3.65 - samples/sec: 1876.18 - lr: 0.000009 - momentum: 0.000000 2023-10-17 08:38:38,213 epoch 8 - iter 108/275 - loss 0.03898247 - time (sec): 4.81 - samples/sec: 1900.36 - lr: 0.000009 - momentum: 0.000000 2023-10-17 08:38:39,385 epoch 8 - iter 135/275 - loss 0.03363928 - time (sec): 5.98 - samples/sec: 1897.76 - lr: 0.000008 - momentum: 0.000000 2023-10-17 08:38:40,576 epoch 8 - iter 162/275 - loss 0.02922465 - time (sec): 7.17 - samples/sec: 1852.74 - lr: 0.000008 - momentum: 0.000000 2023-10-17 08:38:41,825 epoch 8 - iter 189/275 - 
loss 0.02981224 - time (sec): 8.42 - samples/sec: 1850.17 - lr: 0.000008 - momentum: 0.000000 2023-10-17 08:38:43,040 epoch 8 - iter 216/275 - loss 0.03064469 - time (sec): 9.64 - samples/sec: 1851.37 - lr: 0.000007 - momentum: 0.000000 2023-10-17 08:38:44,247 epoch 8 - iter 243/275 - loss 0.02957048 - time (sec): 10.84 - samples/sec: 1850.42 - lr: 0.000007 - momentum: 0.000000 2023-10-17 08:38:45,469 epoch 8 - iter 270/275 - loss 0.03328664 - time (sec): 12.06 - samples/sec: 1855.69 - lr: 0.000007 - momentum: 0.000000 2023-10-17 08:38:45,697 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:38:45,697 EPOCH 8 done: loss 0.0331 - lr: 0.000007 2023-10-17 08:38:46,406 DEV : loss 0.17866075038909912 - f1-score (micro avg) 0.8835 2023-10-17 08:38:46,412 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:38:47,639 epoch 9 - iter 27/275 - loss 0.02070222 - time (sec): 1.23 - samples/sec: 1816.68 - lr: 0.000006 - momentum: 0.000000 2023-10-17 08:38:48,857 epoch 9 - iter 54/275 - loss 0.02809420 - time (sec): 2.44 - samples/sec: 1912.52 - lr: 0.000006 - momentum: 0.000000 2023-10-17 08:38:50,059 epoch 9 - iter 81/275 - loss 0.03070519 - time (sec): 3.65 - samples/sec: 1858.56 - lr: 0.000006 - momentum: 0.000000 2023-10-17 08:38:51,254 epoch 9 - iter 108/275 - loss 0.03042463 - time (sec): 4.84 - samples/sec: 1851.92 - lr: 0.000005 - momentum: 0.000000 2023-10-17 08:38:52,474 epoch 9 - iter 135/275 - loss 0.02817818 - time (sec): 6.06 - samples/sec: 1860.00 - lr: 0.000005 - momentum: 0.000000 2023-10-17 08:38:53,696 epoch 9 - iter 162/275 - loss 0.02493376 - time (sec): 7.28 - samples/sec: 1851.21 - lr: 0.000005 - momentum: 0.000000 2023-10-17 08:38:54,935 epoch 9 - iter 189/275 - loss 0.03113342 - time (sec): 8.52 - samples/sec: 1840.47 - lr: 0.000004 - momentum: 0.000000 2023-10-17 08:38:56,177 epoch 9 - iter 216/275 - loss 0.03047445 - time 
(sec): 9.76 - samples/sec: 1839.37 - lr: 0.000004 - momentum: 0.000000 2023-10-17 08:38:57,419 epoch 9 - iter 243/275 - loss 0.02770802 - time (sec): 11.01 - samples/sec: 1817.40 - lr: 0.000004 - momentum: 0.000000 2023-10-17 08:38:58,661 epoch 9 - iter 270/275 - loss 0.02697792 - time (sec): 12.25 - samples/sec: 1822.10 - lr: 0.000003 - momentum: 0.000000 2023-10-17 08:38:58,881 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:38:58,881 EPOCH 9 done: loss 0.0269 - lr: 0.000003 2023-10-17 08:38:59,631 DEV : loss 0.17246393859386444 - f1-score (micro avg) 0.8803 2023-10-17 08:38:59,636 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:39:00,822 epoch 10 - iter 27/275 - loss 0.06560676 - time (sec): 1.19 - samples/sec: 2181.30 - lr: 0.000003 - momentum: 0.000000 2023-10-17 08:39:02,049 epoch 10 - iter 54/275 - loss 0.04650226 - time (sec): 2.41 - samples/sec: 2062.30 - lr: 0.000003 - momentum: 0.000000 2023-10-17 08:39:03,285 epoch 10 - iter 81/275 - loss 0.03455730 - time (sec): 3.65 - samples/sec: 2025.03 - lr: 0.000002 - momentum: 0.000000 2023-10-17 08:39:04,480 epoch 10 - iter 108/275 - loss 0.02764019 - time (sec): 4.84 - samples/sec: 1950.65 - lr: 0.000002 - momentum: 0.000000 2023-10-17 08:39:05,691 epoch 10 - iter 135/275 - loss 0.02670571 - time (sec): 6.05 - samples/sec: 1911.94 - lr: 0.000002 - momentum: 0.000000 2023-10-17 08:39:06,908 epoch 10 - iter 162/275 - loss 0.02432927 - time (sec): 7.27 - samples/sec: 1886.79 - lr: 0.000001 - momentum: 0.000000 2023-10-17 08:39:08,144 epoch 10 - iter 189/275 - loss 0.02362532 - time (sec): 8.51 - samples/sec: 1860.15 - lr: 0.000001 - momentum: 0.000000 2023-10-17 08:39:09,374 epoch 10 - iter 216/275 - loss 0.02263105 - time (sec): 9.74 - samples/sec: 1841.53 - lr: 0.000001 - momentum: 0.000000 2023-10-17 08:39:10,593 epoch 10 - iter 243/275 - loss 0.02183048 - time (sec): 10.96 - 
samples/sec: 1847.62 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:39:11,842 epoch 10 - iter 270/275 - loss 0.02161543 - time (sec): 12.21 - samples/sec: 1832.25 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:39:12,068 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:12,068 EPOCH 10 done: loss 0.0212 - lr: 0.000000
2023-10-17 08:39:12,729 DEV : loss 0.17962802946567535 - f1-score (micro avg) 0.8816
2023-10-17 08:39:13,079 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:13,080 Loading model from best epoch ...
2023-10-17 08:39:14,442 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:39:15,253
Results:
- F-score (micro) 0.8863
- F-score (macro) 0.6622
- Accuracy 0.8091

By class:
              precision    recall  f1-score   support

       scope     0.8708    0.8807    0.8757       176
        pers     0.9603    0.9453    0.9528       128
        work     0.7949    0.8378    0.8158        74
         loc     1.0000    0.5000    0.6667         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.8851    0.8874    0.8863       382
   macro avg     0.7252    0.6328    0.6622       382
weighted avg     0.8822    0.8874    0.8842       382

2023-10-17 08:39:15,253 ----------------------------------------------------------------------------------------------------