2023-10-17 08:18:18,987 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:18,988 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:18:18,988 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:18,989 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:18:18,989 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:18,989 Train: 1100 sentences
2023-10-17 08:18:18,989 (train_with_dev=False, train_with_test=False)
2023-10-17 08:18:18,989 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:18,989 Training Params:
2023-10-17 08:18:18,989 - learning_rate: "3e-05"
2023-10-17 08:18:18,989 - mini_batch_size: "4"
2023-10-17 08:18:18,989 - max_epochs: "10"
2023-10-17 08:18:18,989 - shuffle: "True"
2023-10-17 08:18:18,989 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:18,990 Plugins:
2023-10-17 08:18:18,990 - TensorboardLogger
2023-10-17 08:18:18,990 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:18:18,990 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:18,990 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:18:18,990 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:18:18,990 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:18,990 Computation:
2023-10-17 08:18:18,990 - compute on device: cuda:0
2023-10-17 08:18:18,990 - embedding storage: none
2023-10-17 08:18:18,990 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:18,990 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 08:18:18,990 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:18,990 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:18,990 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:18:21,395 epoch 1 - iter 27/275 - loss 3.52317630 - time (sec): 2.40 - samples/sec: 889.99 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:18:22,640 epoch 1 - iter 54/275 - loss 2.97222654 - time (sec): 3.65 - samples/sec: 1166.58 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:18:23,847 epoch 1 - iter 81/275 - loss 2.46172200 - time (sec): 4.86 - samples/sec: 1322.50 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:18:25,088 epoch 1 - iter 108/275 - loss 1.97206923 - time (sec): 6.10 - samples/sec: 1432.99 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:18:26,313 epoch 1 - iter 135/275 - loss 1.66054933 - time (sec): 7.32 - samples/sec: 1525.26 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:18:27,520 epoch 1 - iter 162/275 - loss 1.46914393 - time (sec): 8.53 - samples/sec: 1586.70 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:18:28,712 epoch 1 - iter 189/275 - loss 1.31842235 - time (sec): 9.72 - samples/sec: 1612.89 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:18:29,926 epoch 1 - iter 216/275 - loss 1.20141186 - time (sec): 10.93 - samples/sec: 1639.21 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:18:31,173 epoch 1 - iter 243/275 - loss 1.10689649 - time (sec): 12.18 - samples/sec: 1648.32 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:18:32,501 epoch 1 - iter 270/275 - loss 1.01950699 - time (sec): 13.51 - samples/sec: 1661.37 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:18:32,713 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:32,713 EPOCH 1 done: loss 1.0079 - lr: 0.000029
2023-10-17 08:18:33,248 DEV : loss 0.20805643498897552 - f1-score (micro avg) 0.6826
2023-10-17 08:18:33,253 saving best model
2023-10-17 08:18:33,620 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:34,872 epoch 2 - iter 27/275 - loss 0.25196567 - time (sec): 1.25 - samples/sec: 1920.87 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:18:36,149 epoch 2 - iter 54/275 - loss 0.23255918 - time (sec): 2.53 - samples/sec: 1806.59 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:18:37,438 epoch 2 - iter 81/275 - loss 0.23733841 - time (sec): 3.82 - samples/sec: 1792.91 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:18:38,697 epoch 2 - iter 108/275 - loss 0.23047487 - time (sec): 5.08 - samples/sec: 1789.25 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:18:39,965 epoch 2 - iter 135/275 - loss 0.20976341 - time (sec): 6.34 - samples/sec: 1772.08 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:18:41,271 epoch 2 - iter 162/275 - loss 0.20039640 - time (sec): 7.65 - samples/sec: 1762.10 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:18:42,533 epoch 2 - iter 189/275 - loss 0.19365150 - time (sec): 8.91 - samples/sec: 1751.79 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:18:43,812 epoch 2 - iter 216/275 - loss 0.18866455 - time (sec): 10.19 - samples/sec: 1736.70 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:18:45,068 epoch 2 - iter 243/275 - loss 0.18395908 - time (sec): 11.45 - samples/sec: 1748.72 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:18:46,313 epoch 2 - iter 270/275 - loss 0.17611786 - time (sec): 12.69 - samples/sec: 1763.41 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:18:46,559 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:46,560 EPOCH 2 done: loss 0.1740 - lr: 0.000027
2023-10-17 08:18:47,242 DEV : loss 0.16901393234729767 - f1-score (micro avg) 0.7929
2023-10-17 08:18:47,246 saving best model
2023-10-17 08:18:47,722 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:48,941 epoch 3 - iter 27/275 - loss 0.10060364 - time (sec): 1.22 - samples/sec: 1741.81 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:18:50,135 epoch 3 - iter 54/275 - loss 0.10019335 - time (sec): 2.41 - samples/sec: 1896.16 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:18:51,322 epoch 3 - iter 81/275 - loss 0.08798651 - time (sec): 3.60 - samples/sec: 1849.18 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:18:52,704 epoch 3 - iter 108/275 - loss 0.09566087 - time (sec): 4.98 - samples/sec: 1745.02 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:18:53,897 epoch 3 - iter 135/275 - loss 0.10177219 - time (sec): 6.17 - samples/sec: 1804.29 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:18:55,069 epoch 3 - iter 162/275 - loss 0.10278867 - time (sec): 7.35 - samples/sec: 1808.06 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:18:56,239 epoch 3 - iter 189/275 - loss 0.10878498 - time (sec): 8.52 - samples/sec: 1855.28 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:18:57,406 epoch 3 - iter 216/275 - loss 0.10765900 - time (sec): 9.68 - samples/sec: 1843.84 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:18:58,582 epoch 3 - iter 243/275 - loss 0.10656534 - time (sec): 10.86 - samples/sec: 1842.94 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:18:59,761 epoch 3 - iter 270/275 - loss 0.11369212 - time (sec): 12.04 - samples/sec: 1858.79 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:18:59,977 ----------------------------------------------------------------------------------------------------
2023-10-17 08:18:59,977 EPOCH 3 done: loss 0.1126 - lr: 0.000023
2023-10-17 08:19:00,681 DEV : loss 0.15869425237178802 - f1-score (micro avg) 0.8309
2023-10-17 08:19:00,686 saving best model
2023-10-17 08:19:01,142 ----------------------------------------------------------------------------------------------------
2023-10-17 08:19:02,386 epoch 4 - iter 27/275 - loss 0.06874097 - time (sec): 1.24 - samples/sec: 1949.41 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:19:03,631 epoch 4 - iter 54/275 - loss 0.08195825 - time (sec): 2.49 - samples/sec: 1826.88 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:19:04,887 epoch 4 - iter 81/275 - loss 0.08181971 - time (sec): 3.74 - samples/sec: 1795.67 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:19:06,127 epoch 4 - iter 108/275 - loss 0.07748029 - time (sec): 4.98 - samples/sec: 1780.79 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:19:07,424 epoch 4 - iter 135/275 - loss 0.08497099 - time (sec): 6.28 - samples/sec: 1797.29 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:19:08,724 epoch 4 - iter 162/275 - loss 0.09504136 - time (sec): 7.58 - samples/sec: 1766.63 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:19:09,956 epoch 4 - iter 189/275 - loss 0.09094486 - time (sec): 8.81 - samples/sec: 1772.94 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:19:11,184 epoch 4 - iter 216/275 - loss 0.08800892 - time (sec): 10.04 - samples/sec: 1784.02 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:19:12,403 epoch 4 - iter 243/275 - loss 0.08680239 - time (sec): 11.26 - samples/sec: 1792.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:19:13,608 epoch 4 - iter 270/275 - loss 0.08449332 - time (sec): 12.46 - samples/sec: 1788.89 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:19:13,845 ----------------------------------------------------------------------------------------------------
2023-10-17 08:19:13,846 EPOCH 4 done: loss 0.0831 - lr: 0.000020
2023-10-17 08:19:14,565 DEV : loss 0.16908997297286987 - f1-score (micro avg) 0.8429
2023-10-17 08:19:14,573 saving best model
2023-10-17 08:19:15,204 ----------------------------------------------------------------------------------------------------
2023-10-17 08:19:16,662 epoch 5 - iter 27/275 - loss 0.08528267 - time (sec): 1.45 - samples/sec: 1566.38 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:19:18,127 epoch 5 - iter 54/275 - loss 0.07615872 - time (sec): 2.92 - samples/sec: 1573.43 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:19:19,459 epoch 5 - iter 81/275 - loss 0.07198106 - time (sec): 4.25 - samples/sec: 1618.80 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:19:20,723 epoch 5 - iter 108/275 - loss 0.07243756 - time (sec): 5.52 - samples/sec: 1637.53 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:19:21,943 epoch 5 - iter 135/275 - loss 0.06687607 - time (sec): 6.74 - samples/sec: 1638.71 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:19:23,173 epoch 5 - iter 162/275 - loss 0.06194550 - time (sec): 7.97 - samples/sec: 1646.70 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:19:24,396 epoch 5 - iter 189/275 - loss 0.06101984 - time (sec): 9.19 - samples/sec: 1675.13 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:19:25,682 epoch 5 - iter 216/275 - loss 0.07343089 - time (sec): 10.47 - samples/sec: 1709.15 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:19:26,912 epoch 5 - iter 243/275 - loss 0.07128583 - time (sec): 11.70 - samples/sec: 1727.91 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:19:28,168 epoch 5 - iter 270/275 - loss 0.06883251 - time (sec): 12.96 - samples/sec: 1723.92 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:19:28,404 ----------------------------------------------------------------------------------------------------
2023-10-17 08:19:28,405 EPOCH 5 done: loss 0.0675 - lr: 0.000017
2023-10-17 08:19:29,050 DEV : loss 0.20913541316986084 - f1-score (micro avg) 0.8408
2023-10-17 08:19:29,055 ----------------------------------------------------------------------------------------------------
2023-10-17 08:19:30,327 epoch 6 - iter 27/275 - loss 0.02455276 - time (sec): 1.27 - samples/sec: 1684.88 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:19:31,573 epoch 6 - iter 54/275 - loss 0.04978974 - time (sec): 2.52 - samples/sec: 1936.57 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:19:32,812 epoch 6 - iter 81/275 - loss 0.05084963 - time (sec): 3.76 - samples/sec: 1861.04 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:19:34,085 epoch 6 - iter 108/275 - loss 0.04648955 - time (sec): 5.03 - samples/sec: 1811.81 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:19:35,338 epoch 6 - iter 135/275 - loss 0.04047721 - time (sec): 6.28 - samples/sec: 1811.90 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:19:36,590 epoch 6 - iter 162/275 - loss 0.04107054 - time (sec): 7.53 - samples/sec: 1820.12 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:19:37,811 epoch 6 - iter 189/275 - loss 0.04430490 - time (sec): 8.75 - samples/sec: 1798.07 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:19:39,078 epoch 6 - iter 216/275 - loss 0.04737745 - time (sec): 10.02 - samples/sec: 1787.07 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:19:40,376 epoch 6 - iter 243/275 - loss 0.04440348 - time (sec): 11.32 - samples/sec: 1771.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:19:41,646 epoch 6 - iter 270/275 - loss 0.04893048 - time (sec): 12.59 - samples/sec: 1771.91 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:19:41,878 ----------------------------------------------------------------------------------------------------
2023-10-17 08:19:41,878 EPOCH 6 done: loss 0.0531 - lr: 0.000013
2023-10-17 08:19:42,537 DEV : loss 0.1726471483707428 - f1-score (micro avg) 0.8558
2023-10-17 08:19:42,542 saving best model
2023-10-17 08:19:43,038 ----------------------------------------------------------------------------------------------------
2023-10-17 08:19:44,248 epoch 7 - iter 27/275 - loss 0.01218684 - time (sec): 1.20 - samples/sec: 1675.75 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:19:45,460 epoch 7 - iter 54/275 - loss 0.02183532 - time (sec): 2.42 - samples/sec: 1748.01 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:19:46,706 epoch 7 - iter 81/275 - loss 0.04935737 - time (sec): 3.66 - samples/sec: 1757.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:19:47,936 epoch 7 - iter 108/275 - loss 0.04925165 - time (sec): 4.89 - samples/sec: 1770.13 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:19:49,176 epoch 7 - iter 135/275 - loss 0.04722796 - time (sec): 6.13 - samples/sec: 1800.64 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:19:50,411 epoch 7 - iter 162/275 - loss 0.04194004 - time (sec): 7.37 - samples/sec: 1786.62 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:19:51,673 epoch 7 - iter 189/275 - loss 0.03932751 - time (sec): 8.63 - samples/sec: 1776.38 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:19:52,925 epoch 7 - iter 216/275 - loss 0.04019275 - time (sec): 9.88 - samples/sec: 1786.05 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:19:54,200 epoch 7 - iter 243/275 - loss 0.03985253 - time (sec): 11.16 - samples/sec: 1797.40 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:19:55,409 epoch 7 - iter 270/275 - loss 0.03718124 - time (sec): 12.36 - samples/sec: 1803.75 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:19:55,645 ----------------------------------------------------------------------------------------------------
2023-10-17 08:19:55,645 EPOCH 7 done: loss 0.0364 - lr: 0.000010
2023-10-17 08:19:56,329 DEV : loss 0.18324865400791168 - f1-score (micro avg) 0.8651
2023-10-17 08:19:56,335 saving best model
2023-10-17 08:19:56,874 ----------------------------------------------------------------------------------------------------
2023-10-17 08:19:58,102 epoch 8 - iter 27/275 - loss 0.03846923 - time (sec): 1.23 - samples/sec: 1788.12 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:19:59,403 epoch 8 - iter 54/275 - loss 0.04835504 - time (sec): 2.53 - samples/sec: 1810.51 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:20:00,655 epoch 8 - iter 81/275 - loss 0.05253058 - time (sec): 3.78 - samples/sec: 1792.79 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:20:01,914 epoch 8 - iter 108/275 - loss 0.04887803 - time (sec): 5.04 - samples/sec: 1802.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:20:03,235 epoch 8 - iter 135/275 - loss 0.03991749 - time (sec): 6.36 - samples/sec: 1794.32 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:20:04,504 epoch 8 - iter 162/275 - loss 0.03528465 - time (sec): 7.63 - samples/sec: 1774.58 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:20:05,743 epoch 8 - iter 189/275 - loss 0.03234484 - time (sec): 8.87 - samples/sec: 1776.78 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:20:06,988 epoch 8 - iter 216/275 - loss 0.03274941 - time (sec): 10.11 - samples/sec: 1769.21 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:20:08,253 epoch 8 - iter 243/275 - loss 0.03518058 - time (sec): 11.38 - samples/sec: 1784.29 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:20:09,546 epoch 8 - iter 270/275 - loss 0.03348453 - time (sec): 12.67 - samples/sec: 1764.84 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:20:09,775 ----------------------------------------------------------------------------------------------------
2023-10-17 08:20:09,775 EPOCH 8 done: loss 0.0329 - lr: 0.000007
2023-10-17 08:20:10,406 DEV : loss 0.18320710957050323 - f1-score (micro avg) 0.866
2023-10-17 08:20:10,411 saving best model
2023-10-17 08:20:10,971 ----------------------------------------------------------------------------------------------------
2023-10-17 08:20:12,277 epoch 9 - iter 27/275 - loss 0.03749169 - time (sec): 1.30 - samples/sec: 1642.16 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:20:13,497 epoch 9 - iter 54/275 - loss 0.04785379 - time (sec): 2.52 - samples/sec: 1651.04 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:20:14,733 epoch 9 - iter 81/275 - loss 0.03406399 - time (sec): 3.76 - samples/sec: 1711.57 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:20:16,057 epoch 9 - iter 108/275 - loss 0.02763835 - time (sec): 5.08 - samples/sec: 1726.18 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:20:17,376 epoch 9 - iter 135/275 - loss 0.02907429 - time (sec): 6.40 - samples/sec: 1717.44 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:20:18,595 epoch 9 - iter 162/275 - loss 0.02739866 - time (sec): 7.62 - samples/sec: 1768.24 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:20:19,821 epoch 9 - iter 189/275 - loss 0.02560578 - time (sec): 8.85 - samples/sec: 1769.08 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:20:21,040 epoch 9 - iter 216/275 - loss 0.02617679 - time (sec): 10.07 - samples/sec: 1753.28 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:20:22,265 epoch 9 - iter 243/275 - loss 0.02508920 - time (sec): 11.29 - samples/sec: 1749.91 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:20:23,500 epoch 9 - iter 270/275 - loss 0.02536099 - time (sec): 12.53 - samples/sec: 1776.21 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:20:23,730 ----------------------------------------------------------------------------------------------------
2023-10-17 08:20:23,730 EPOCH 9 done: loss 0.0259 - lr: 0.000003
2023-10-17 08:20:24,360 DEV : loss 0.18943718075752258 - f1-score (micro avg) 0.8685
2023-10-17 08:20:24,365 saving best model
2023-10-17 08:20:24,836 ----------------------------------------------------------------------------------------------------
2023-10-17 08:20:25,991 epoch 10 - iter 27/275 - loss 0.03037133 - time (sec): 1.15 - samples/sec: 1892.87 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:20:27,258 epoch 10 - iter 54/275 - loss 0.01674465 - time (sec): 2.42 - samples/sec: 1742.46 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:20:28,480 epoch 10 - iter 81/275 - loss 0.01931327 - time (sec): 3.64 - samples/sec: 1772.42 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:20:29,804 epoch 10 - iter 108/275 - loss 0.01642879 - time (sec): 4.96 - samples/sec: 1718.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:20:31,134 epoch 10 - iter 135/275 - loss 0.01363607 - time (sec): 6.29 - samples/sec: 1725.93 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:20:32,428 epoch 10 - iter 162/275 - loss 0.01853421 - time (sec): 7.59 - samples/sec: 1707.95 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:20:33,669 epoch 10 - iter 189/275 - loss 0.02043455 - time (sec): 8.83 - samples/sec: 1753.41 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:20:34,907 epoch 10 - iter 216/275 - loss 0.01890215 - time (sec): 10.07 - samples/sec: 1775.84 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:20:36,129 epoch 10 - iter 243/275 - loss 0.02207666 - time (sec): 11.29 - samples/sec: 1787.67 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:20:37,341 epoch 10 - iter 270/275 - loss 0.02111457 - time (sec): 12.50 - samples/sec: 1789.88 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:20:37,561 ----------------------------------------------------------------------------------------------------
2023-10-17 08:20:37,562 EPOCH 10 done: loss 0.0209 - lr: 0.000000
2023-10-17 08:20:38,287 DEV : loss 0.1890801340341568 - f1-score (micro avg) 0.8798
2023-10-17 08:20:38,294 saving best model
2023-10-17 08:20:39,125 ----------------------------------------------------------------------------------------------------
2023-10-17 08:20:39,126 Loading model from best epoch ...
2023-10-17 08:20:40,778 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:20:41,567
Results:
- F-score (micro) 0.9237
- F-score (macro) 0.7548
- Accuracy 0.8667

By class:
              precision    recall  f1-score   support

       scope     0.9143    0.9091    0.9117       176
        pers     0.9606    0.9531    0.9569       128
        work     0.9054    0.9054    0.9054        74
         loc     1.0000    1.0000    1.0000         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.9286    0.9188    0.9237       382
   macro avg     0.7561    0.7535    0.7548       382
weighted avg     0.9238    0.9188    0.9213       382

2023-10-17 08:20:41,567 ----------------------------------------------------------------------------------------------------