2023-10-18 14:44:22,390 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:22,390 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:44:22,390 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:22,390 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:44:22,390 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:22,390 Train:  1100 sentences
2023-10-18 14:44:22,391         (train_with_dev=False, train_with_test=False)
2023-10-18 14:44:22,391 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:22,391 Training Params:
2023-10-18 14:44:22,391  - learning_rate: "3e-05"
2023-10-18 14:44:22,391  - mini_batch_size: "8"
2023-10-18 14:44:22,391  - max_epochs: "10"
2023-10-18 14:44:22,391  - shuffle: "True"
2023-10-18 14:44:22,391 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:22,391 Plugins:
2023-10-18 14:44:22,391  - TensorboardLogger
2023-10-18 14:44:22,391  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:44:22,391 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:22,391 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:44:22,391  - metric: "('micro avg', 'f1-score')"
2023-10-18 14:44:22,391 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:22,391 Computation:
2023-10-18 14:44:22,391  - compute on device: cuda:0
2023-10-18 14:44:22,391  - embedding storage: none
2023-10-18 14:44:22,391 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:22,391 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 14:44:22,391 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:22,391 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:22,391 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:44:22,663 epoch 1 - iter 13/138 - loss 4.05304806 - time (sec): 0.27 - samples/sec: 6983.51 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:44:22,926 epoch 1 - iter 26/138 - loss 3.93536613 - time (sec): 0.53 - samples/sec: 7890.83 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:44:23,182 epoch 1 - iter 39/138 - loss 3.88212494 - time (sec): 0.79 - samples/sec: 7934.23 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:44:23,450 epoch 1 - iter 52/138 - loss 3.81966631 - time (sec): 1.06 - samples/sec: 7986.74 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:44:23,717 epoch 1 - iter 65/138 - loss 3.74497983 - time (sec): 1.33 - samples/sec: 8041.95 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:44:23,973 epoch 1 - iter 78/138 - loss 3.64562072 - time (sec): 1.58 - samples/sec: 8229.47 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:44:24,222 epoch 1 - iter 91/138 - loss 3.53698690 - time (sec): 1.83 - samples/sec: 8255.14 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:44:24,519 epoch 1 - iter 104/138 - loss 3.39096503 - time (sec): 2.13 - samples/sec: 8271.30 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:44:24,820 epoch 1 - iter 117/138 - loss 3.26119243 - time (sec): 2.43 - samples/sec: 8075.76 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:44:25,111 epoch 1 - iter 130/138 - loss 3.14421769 - time (sec): 2.72 - samples/sec: 7902.22 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:44:25,280 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:25,281 EPOCH 1 done: loss 3.0495 - lr: 0.000028
2023-10-18 14:44:25,529 DEV : loss 1.0529041290283203 - f1-score (micro avg) 0.0
2023-10-18 14:44:25,533 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:25,822 epoch 2 - iter 13/138 - loss 1.28227408 - time (sec): 0.29 - samples/sec: 8334.32 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:44:26,096 epoch 2 - iter 26/138 - loss 1.22376295 - time (sec): 0.56 - samples/sec: 7979.75 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:44:26,393 epoch 2 - iter 39/138 - loss 1.20014775 - time (sec): 0.86 - samples/sec: 7884.22 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:44:26,689 epoch 2 - iter 52/138 - loss 1.19678840 - time (sec): 1.16 - samples/sec: 7485.41 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:44:26,988 epoch 2 - iter 65/138 - loss 1.19264840 - time (sec): 1.45 - samples/sec: 7324.74 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:44:27,296 epoch 2 - iter 78/138 - loss 1.14836302 - time (sec): 1.76 - samples/sec: 7277.00 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:44:27,589 epoch 2 - iter 91/138 - loss 1.14620358 - time (sec): 2.06 - samples/sec: 7181.99 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:44:27,891 epoch 2 - iter 104/138 - loss 1.13012148 - time (sec): 2.36 - samples/sec: 7167.55 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:44:28,187 epoch 2 - iter 117/138 - loss 1.12244059 - time (sec): 2.65 - samples/sec: 7250.54 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:44:28,497 epoch 2 - iter 130/138 - loss 1.08277494 - time (sec): 2.96 - samples/sec: 7276.45 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:44:28,676 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:28,676 EPOCH 2 done: loss 1.0784 - lr: 0.000027
2023-10-18 14:44:29,025 DEV : loss 0.8357104063034058 - f1-score (micro avg) 0.0
2023-10-18 14:44:29,029 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:29,315 epoch 3 - iter 13/138 - loss 0.98820548 - time (sec): 0.29 - samples/sec: 7725.54 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:44:29,613 epoch 3 - iter 26/138 - loss 0.98097005 - time (sec): 0.58 - samples/sec: 7967.30 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:44:29,883 epoch 3 - iter 39/138 - loss 0.98080860 - time (sec): 0.85 - samples/sec: 7972.31 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:44:30,167 epoch 3 - iter 52/138 - loss 0.92333642 - time (sec): 1.14 - samples/sec: 7831.84 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:44:30,447 epoch 3 - iter 65/138 - loss 0.90231843 - time (sec): 1.42 - samples/sec: 7909.51 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:44:30,726 epoch 3 - iter 78/138 - loss 0.88019406 - time (sec): 1.70 - samples/sec: 7847.02 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:44:31,000 epoch 3 - iter 91/138 - loss 0.87301108 - time (sec): 1.97 - samples/sec: 7778.34 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:44:31,298 epoch 3 - iter 104/138 - loss 0.86478251 - time (sec): 2.27 - samples/sec: 7697.32 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:44:31,603 epoch 3 - iter 117/138 - loss 0.86149433 - time (sec): 2.57 - samples/sec: 7616.38 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:44:31,894 epoch 3 - iter 130/138 - loss 0.85947925 - time (sec): 2.86 - samples/sec: 7566.82 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:44:32,067 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:32,068 EPOCH 3 done: loss 0.8529 - lr: 0.000024
2023-10-18 14:44:32,420 DEV : loss 0.6737000942230225 - f1-score (micro avg) 0.0095
2023-10-18 14:44:32,424 saving best model
2023-10-18 14:44:32,458 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:32,736 epoch 4 - iter 13/138 - loss 0.78639711 - time (sec): 0.28 - samples/sec: 7145.29 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:44:33,023 epoch 4 - iter 26/138 - loss 0.80055742 - time (sec): 0.56 - samples/sec: 7048.49 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:44:33,315 epoch 4 - iter 39/138 - loss 0.79948307 - time (sec): 0.86 - samples/sec: 7023.10 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:44:33,614 epoch 4 - iter 52/138 - loss 0.78231556 - time (sec): 1.16 - samples/sec: 7150.35 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:44:33,913 epoch 4 - iter 65/138 - loss 0.77445613 - time (sec): 1.45 - samples/sec: 7149.11 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:44:34,211 epoch 4 - iter 78/138 - loss 0.78036006 - time (sec): 1.75 - samples/sec: 7238.56 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:44:34,509 epoch 4 - iter 91/138 - loss 0.77329601 - time (sec): 2.05 - samples/sec: 7239.84 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:44:34,801 epoch 4 - iter 104/138 - loss 0.76118171 - time (sec): 2.34 - samples/sec: 7269.66 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:44:35,089 epoch 4 - iter 117/138 - loss 0.75320336 - time (sec): 2.63 - samples/sec: 7335.16 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:44:35,377 epoch 4 - iter 130/138 - loss 0.74796027 - time (sec): 2.92 - samples/sec: 7352.55 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:44:35,544 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:35,544 EPOCH 4 done: loss 0.7343 - lr: 0.000020
2023-10-18 14:44:36,023 DEV : loss 0.6080276966094971 - f1-score (micro avg) 0.0625
2023-10-18 14:44:36,027 saving best model
2023-10-18 14:44:36,066 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:36,350 epoch 5 - iter 13/138 - loss 0.72897632 - time (sec): 0.28 - samples/sec: 8223.64 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:44:36,635 epoch 5 - iter 26/138 - loss 0.67419737 - time (sec): 0.57 - samples/sec: 7987.37 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:44:36,921 epoch 5 - iter 39/138 - loss 0.66327275 - time (sec): 0.85 - samples/sec: 7840.77 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:44:37,194 epoch 5 - iter 52/138 - loss 0.67337077 - time (sec): 1.13 - samples/sec: 7867.83 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:44:37,475 epoch 5 - iter 65/138 - loss 0.66087110 - time (sec): 1.41 - samples/sec: 7892.18 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:44:37,749 epoch 5 - iter 78/138 - loss 0.66105552 - time (sec): 1.68 - samples/sec: 7805.59 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:44:38,031 epoch 5 - iter 91/138 - loss 0.67621490 - time (sec): 1.96 - samples/sec: 7822.90 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:44:38,308 epoch 5 - iter 104/138 - loss 0.66536673 - time (sec): 2.24 - samples/sec: 7747.58 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:44:38,581 epoch 5 - iter 117/138 - loss 0.66306599 - time (sec): 2.51 - samples/sec: 7700.03 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:44:38,857 epoch 5 - iter 130/138 - loss 0.65607499 - time (sec): 2.79 - samples/sec: 7638.32 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:44:39,028 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:39,028 EPOCH 5 done: loss 0.6545 - lr: 0.000017
2023-10-18 14:44:39,386 DEV : loss 0.5329661965370178 - f1-score (micro avg) 0.1917
2023-10-18 14:44:39,390 saving best model
2023-10-18 14:44:39,429 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:39,755 epoch 6 - iter 13/138 - loss 0.66846780 - time (sec): 0.33 - samples/sec: 6082.72 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:44:40,072 epoch 6 - iter 26/138 - loss 0.63723474 - time (sec): 0.64 - samples/sec: 6359.30 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:44:40,374 epoch 6 - iter 39/138 - loss 0.63117860 - time (sec): 0.94 - samples/sec: 6279.98 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:44:40,695 epoch 6 - iter 52/138 - loss 0.62869044 - time (sec): 1.27 - samples/sec: 6437.66 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:44:41,002 epoch 6 - iter 65/138 - loss 0.62316862 - time (sec): 1.57 - samples/sec: 6592.29 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:44:41,290 epoch 6 - iter 78/138 - loss 0.62259517 - time (sec): 1.86 - samples/sec: 6855.59 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:44:41,569 epoch 6 - iter 91/138 - loss 0.60961835 - time (sec): 2.14 - samples/sec: 6959.11 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:44:41,857 epoch 6 - iter 104/138 - loss 0.60683274 - time (sec): 2.43 - samples/sec: 6942.48 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:44:42,162 epoch 6 - iter 117/138 - loss 0.62745433 - time (sec): 2.73 - samples/sec: 6955.77 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:44:42,461 epoch 6 - iter 130/138 - loss 0.61365882 - time (sec): 3.03 - samples/sec: 6979.27 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:44:42,638 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:42,639 EPOCH 6 done: loss 0.6094 - lr: 0.000014
2023-10-18 14:44:43,007 DEV : loss 0.49222689867019653 - f1-score (micro avg) 0.2349
2023-10-18 14:44:43,011 saving best model
2023-10-18 14:44:43,044 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:43,337 epoch 7 - iter 13/138 - loss 0.65445718 - time (sec): 0.29 - samples/sec: 6290.59 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:44:43,630 epoch 7 - iter 26/138 - loss 0.59842300 - time (sec): 0.59 - samples/sec: 7091.16 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:44:43,935 epoch 7 - iter 39/138 - loss 0.55656779 - time (sec): 0.89 - samples/sec: 7344.98 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:44:44,237 epoch 7 - iter 52/138 - loss 0.55628281 - time (sec): 1.19 - samples/sec: 7317.87 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:44:44,541 epoch 7 - iter 65/138 - loss 0.56697048 - time (sec): 1.50 - samples/sec: 7192.74 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:44:44,835 epoch 7 - iter 78/138 - loss 0.56499553 - time (sec): 1.79 - samples/sec: 7138.86 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:44:45,124 epoch 7 - iter 91/138 - loss 0.56333720 - time (sec): 2.08 - samples/sec: 7252.73 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:44:45,403 epoch 7 - iter 104/138 - loss 0.56674029 - time (sec): 2.36 - samples/sec: 7305.44 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:44:45,683 epoch 7 - iter 117/138 - loss 0.56763392 - time (sec): 2.64 - samples/sec: 7352.45 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:44:45,971 epoch 7 - iter 130/138 - loss 0.56375929 - time (sec): 2.93 - samples/sec: 7414.88 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:44:46,138 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:46,138 EPOCH 7 done: loss 0.5649 - lr: 0.000010
2023-10-18 14:44:46,514 DEV : loss 0.4581972360610962 - f1-score (micro avg) 0.2721
2023-10-18 14:44:46,519 saving best model
2023-10-18 14:44:46,555 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:46,846 epoch 8 - iter 13/138 - loss 0.57448595 - time (sec): 0.29 - samples/sec: 7901.44 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:44:47,135 epoch 8 - iter 26/138 - loss 0.55483701 - time (sec): 0.58 - samples/sec: 7882.48 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:44:47,422 epoch 8 - iter 39/138 - loss 0.58564271 - time (sec): 0.87 - samples/sec: 8048.32 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:44:47,716 epoch 8 - iter 52/138 - loss 0.56632677 - time (sec): 1.16 - samples/sec: 7845.41 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:44:48,000 epoch 8 - iter 65/138 - loss 0.56724695 - time (sec): 1.45 - samples/sec: 7707.61 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:44:48,278 epoch 8 - iter 78/138 - loss 0.56633922 - time (sec): 1.72 - samples/sec: 7618.43 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:44:48,545 epoch 8 - iter 91/138 - loss 0.56725942 - time (sec): 1.99 - samples/sec: 7505.51 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:44:48,820 epoch 8 - iter 104/138 - loss 0.56258095 - time (sec): 2.27 - samples/sec: 7536.58 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:44:49,119 epoch 8 - iter 117/138 - loss 0.55693521 - time (sec): 2.56 - samples/sec: 7502.09 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:44:49,399 epoch 8 - iter 130/138 - loss 0.55383777 - time (sec): 2.84 - samples/sec: 7574.09 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:44:49,569 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:49,569 EPOCH 8 done: loss 0.5514 - lr: 0.000007
2023-10-18 14:44:49,929 DEV : loss 0.44890934228897095 - f1-score (micro avg) 0.281
2023-10-18 14:44:49,933 saving best model
2023-10-18 14:44:49,969 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:50,248 epoch 9 - iter 13/138 - loss 0.49080680 - time (sec): 0.28 - samples/sec: 7428.82 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:44:50,553 epoch 9 - iter 26/138 - loss 0.52929321 - time (sec): 0.58 - samples/sec: 7240.20 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:44:50,848 epoch 9 - iter 39/138 - loss 0.53596693 - time (sec): 0.88 - samples/sec: 7076.05 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:44:51,126 epoch 9 - iter 52/138 - loss 0.52780705 - time (sec): 1.16 - samples/sec: 7097.04 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:44:51,417 epoch 9 - iter 65/138 - loss 0.53305648 - time (sec): 1.45 - samples/sec: 7060.51 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:44:51,722 epoch 9 - iter 78/138 - loss 0.53386170 - time (sec): 1.75 - samples/sec: 7042.63 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:44:52,011 epoch 9 - iter 91/138 - loss 0.53645774 - time (sec): 2.04 - samples/sec: 7151.08 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:44:52,320 epoch 9 - iter 104/138 - loss 0.53023292 - time (sec): 2.35 - samples/sec: 7203.32 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:44:52,624 epoch 9 - iter 117/138 - loss 0.52468748 - time (sec): 2.65 - samples/sec: 7256.98 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:44:52,925 epoch 9 - iter 130/138 - loss 0.52758927 - time (sec): 2.95 - samples/sec: 7297.91 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:44:53,108 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:53,108 EPOCH 9 done: loss 0.5315 - lr: 0.000004
2023-10-18 14:44:53,473 DEV : loss 0.43327605724334717 - f1-score (micro avg) 0.3104
2023-10-18 14:44:53,477 saving best model
2023-10-18 14:44:53,513 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:53,816 epoch 10 - iter 13/138 - loss 0.53683943 - time (sec): 0.30 - samples/sec: 7126.69 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:44:54,121 epoch 10 - iter 26/138 - loss 0.53833980 - time (sec): 0.61 - samples/sec: 7251.40 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:44:54,429 epoch 10 - iter 39/138 - loss 0.56534209 - time (sec): 0.92 - samples/sec: 7308.20 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:44:54,721 epoch 10 - iter 52/138 - loss 0.55833241 - time (sec): 1.21 - samples/sec: 7229.37 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:44:55,019 epoch 10 - iter 65/138 - loss 0.56393905 - time (sec): 1.51 - samples/sec: 7338.12 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:44:55,317 epoch 10 - iter 78/138 - loss 0.54708651 - time (sec): 1.80 - samples/sec: 7368.53 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:44:55,613 epoch 10 - iter 91/138 - loss 0.54411553 - time (sec): 2.10 - samples/sec: 7290.42 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:44:55,908 epoch 10 - iter 104/138 - loss 0.54488777 - time (sec): 2.39 - samples/sec: 7243.54 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:44:56,208 epoch 10 - iter 117/138 - loss 0.54295386 - time (sec): 2.69 - samples/sec: 7259.38 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:44:56,502 epoch 10 - iter 130/138 - loss 0.54286910 - time (sec): 2.99 - samples/sec: 7176.95 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:44:56,674 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:56,674 EPOCH 10 done: loss 0.5362 - lr: 0.000000
2023-10-18 14:44:57,043 DEV : loss 0.42921262979507446 - f1-score (micro avg) 0.3185
2023-10-18 14:44:57,048 saving best model
2023-10-18 14:44:57,109 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:57,109 Loading model from best epoch ...
2023-10-18 14:44:57,190 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 14:44:57,480 Results:
- F-score (micro) 0.3558
- F-score (macro) 0.1677
- Accuracy 0.2185

By class:
              precision    recall  f1-score   support

       scope     0.5844    0.5114    0.5455       176
        pers     0.3448    0.0781    0.1274       128
        work     0.1864    0.1486    0.1654        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.4587    0.2906    0.3558       382
   macro avg     0.2231    0.1476    0.1677       382
weighted avg     0.4209    0.2906    0.3260       382

2023-10-18 14:44:57,480 ----------------------------------------------------------------------------------------------------
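The per-iteration `lr` values in the log follow the LinearScheduler plugin's shape: with `warmup_fraction: '0.1'`, 10 epochs and 138 iterations per epoch (1380 steps total), the first 138 steps ramp linearly from 0 up to the peak learning rate of 3e-05, after which the rate decays linearly back to 0. The sketch below is an illustrative reimplementation of that schedule (the function name and signature are mine, not Flair's); it reproduces the logged values up to the log's rounding.

```python
def linear_warmup_lr(step, peak_lr=3e-05, total_steps=1380, warmup_fraction=0.1):
    """Linear warmup followed by linear decay (sketch of the schedule shape).

    Parameters mirror the logged run: peak_lr = 3e-05, total_steps = 10 epochs
    x 138 iterations, warmup_fraction = 0.1 (so warmup covers epoch 1).
    """
    warmup_steps = int(total_steps * warmup_fraction)  # 138 steps
    if step < warmup_steps:
        # ramp from 0 to peak_lr over the warmup steps
        return peak_lr * step / warmup_steps
    # decay from peak_lr back to 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_warmup_lr(13))            # close to the logged "lr: 0.000003" (epoch 1, iter 13)
print(linear_warmup_lr(138 + 13))      # close to the logged "lr: 0.000030" (epoch 2, iter 13)
print(linear_warmup_lr(9 * 138 + 130)) # near 0, matching the logged "lr: 0.000000" late in epoch 10
```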
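The aggregate rows of the final "By class" table can be re-derived from the per-class rows, which is a useful sanity check on the reported scores. A minimal sketch (variable names are mine; the numbers are taken verbatim from the table above):

```python
# Per-class rows from the results table: (precision, recall, f1-score, support)
by_class = {
    "scope":  (0.5844, 0.5114, 0.5455, 176),
    "pers":   (0.3448, 0.0781, 0.1274, 128),
    "work":   (0.1864, 0.1486, 0.1654, 74),
    "object": (0.0000, 0.0000, 0.0000, 2),
    "loc":    (0.0000, 0.0000, 0.0000, 2),
}

def f1(p, r):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 2 * p * r / (p + r) if p + r else 0.0

# micro avg F1: harmonic mean of the pooled precision/recall from the table
micro_f1 = f1(0.4587, 0.2906)
# macro avg F1: unweighted mean of the per-class F1 scores
macro_f1 = sum(v[2] for v in by_class.values()) / len(by_class)
# weighted avg F1: support-weighted mean of the per-class F1 scores
total_support = sum(v[3] for v in by_class.values())
weighted_f1 = sum(v[2] * v[3] for v in by_class.values()) / total_support

print(round(micro_f1, 4))  # matches the logged micro avg f1 of 0.3558
print(round(macro_f1, 4))  # matches the logged macro avg f1 of 0.1677
# weighted_f1 agrees with the logged 0.3260 up to the rounding already
# present in the per-class f1 column
print(weighted_f1)
```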
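The 25-tag dictionary the tagger predicts is the BIOES expansion of the six entity types in this corpus (scope, pers, work, loc, object, date) plus the outside tag `O`, which also accounts for `out_features=25` in the model's final Linear layer. A small sketch regenerating it (list construction is mine; the tag inventory is from the log):

```python
# Six entity types, in the order they appear in the logged tag dictionary
entity_types = ["scope", "pers", "work", "loc", "object", "date"]

# "O" plus the four BIOES prefixes (Single, Begin, End, Inside) per type
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in "SBEI"]

print(len(tags))  # 6 types x 4 prefixes + "O" = 25
print(tags[:5])   # ['O', 'S-scope', 'B-scope', 'E-scope', 'I-scope']
```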