2023-10-18 14:48:01,578 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:01,579 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:48:01,579 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:01,579 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:48:01,579 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:01,579 Train: 1100 sentences
2023-10-18 14:48:01,579 (train_with_dev=False, train_with_test=False)
2023-10-18 14:48:01,579 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:01,579 Training Params:
2023-10-18 14:48:01,579  - learning_rate: "5e-05"
2023-10-18 14:48:01,579  - mini_batch_size: "8"
2023-10-18 14:48:01,579  - max_epochs: "10"
2023-10-18 14:48:01,579  - shuffle: "True"
2023-10-18 14:48:01,579 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:01,579 Plugins:
2023-10-18 14:48:01,579  - TensorboardLogger
2023-10-18 14:48:01,579  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:48:01,579 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:01,579 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:48:01,579  - metric: "('micro avg', 'f1-score')"
2023-10-18 14:48:01,579 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:01,579 Computation:
2023-10-18 14:48:01,579  - compute on device: cuda:0
2023-10-18 14:48:01,579  - embedding storage: none
2023-10-18 14:48:01,580 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:01,580 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 14:48:01,580 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:01,580 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:01,580 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:48:01,844 epoch 1 - iter 13/138 - loss 3.61621295 - time (sec): 0.26 - samples/sec: 8642.49 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:48:02,094 epoch 1 - iter 26/138 - loss 3.58833752 - time (sec): 0.51 - samples/sec: 8388.83 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:48:02,333 epoch 1 - iter 39/138 - loss 3.55941603 - time (sec): 0.75 - samples/sec: 8867.54 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:48:02,631 epoch 1 - iter 52/138 - loss 3.45512742 - time (sec): 1.05 - samples/sec: 8333.89 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:48:02,932 epoch 1 - iter 65/138 - loss 3.35066188 - time (sec): 1.35 - samples/sec: 8020.40 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:48:03,239 epoch 1 - iter 78/138 - loss 3.18390454 - time (sec): 1.66 - samples/sec: 7922.90 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:48:03,523 epoch 1 - iter 91/138 - loss 3.00847535 - time (sec): 1.94 - samples/sec: 7872.82 - lr: 0.000033 - momentum: 0.000000
2023-10-18 14:48:03,814 epoch 1 - iter 104/138 - loss 2.80529411 - time (sec): 2.23 - samples/sec: 7963.80 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:48:04,101 epoch 1 - iter 117/138 - loss 2.65693908 - time (sec): 2.52 - samples/sec: 7858.30 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:48:04,379 epoch 1 - iter 130/138 - loss 2.52423147 - time (sec): 2.80 - samples/sec: 7753.28 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:48:04,545 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:04,545 EPOCH 1 done: loss 2.4593 - lr: 0.000047
2023-10-18 14:48:04,801 DEV : loss 0.899878203868866 - f1-score (micro avg) 0.0
2023-10-18 14:48:04,807 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:05,111 epoch 2 - iter 13/138 - loss 0.93950857 - time (sec): 0.30 - samples/sec: 8053.90 - lr: 0.000050 - momentum: 0.000000
2023-10-18 14:48:05,398 epoch 2 - iter 26/138 - loss 0.89565026 - time (sec): 0.59 - samples/sec: 7617.00 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:48:05,672 epoch 2 - iter 39/138 - loss 0.91908086 - time (sec): 0.86 - samples/sec: 7623.51 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:48:05,958 epoch 2 - iter 52/138 - loss 0.91861228 - time (sec): 1.15 - samples/sec: 7638.11 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:48:06,237 epoch 2 - iter 65/138 - loss 0.90086032 - time (sec): 1.43 - samples/sec: 7700.79 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:48:06,516 epoch 2 - iter 78/138 - loss 0.88596343 - time (sec): 1.71 - samples/sec: 7791.87 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:48:06,792 epoch 2 - iter 91/138 - loss 0.86861150 - time (sec): 1.98 - samples/sec: 7710.75 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:48:07,090 epoch 2 - iter 104/138 - loss 0.85751453 - time (sec): 2.28 - samples/sec: 7573.54 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:48:07,364 epoch 2 - iter 117/138 - loss 0.85634203 - time (sec): 2.56 - samples/sec: 7655.64 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:48:07,631 epoch 2 - iter 130/138 - loss 0.84243502 - time (sec): 2.82 - samples/sec: 7703.73 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:48:07,783 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:07,784 EPOCH 2 done: loss 0.8432 - lr: 0.000045
2023-10-18 14:48:08,146 DEV : loss 0.5958214402198792 - f1-score (micro avg) 0.0674
2023-10-18 14:48:08,152 saving best model
2023-10-18 14:48:08,184 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:08,459 epoch 3 - iter 13/138 - loss 0.68333402 - time (sec): 0.27 - samples/sec: 7340.83 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:48:08,746 epoch 3 - iter 26/138 - loss 0.65524953 - time (sec): 0.56 - samples/sec: 7335.59 - lr: 0.000043 - momentum: 0.000000
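The lr column in these entries is produced by the LinearScheduler plugin with warmup_fraction 0.1: the learning rate climbs linearly to the peak of 5e-05 over the first 10% of the 1,380 total steps (138 iterations x 10 epochs), then decays linearly to zero. A minimal sketch of that schedule, assuming the standard linear warmup/decay formula (the function name is ours, not a Flair API):

```python
def linear_schedule_lr(step: int, peak_lr: float = 5e-05,
                       total_steps: int = 1380,
                       warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 138 steps = epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

This reproduces the shape seen above: the lr peaks around the end of epoch 1 (0.000050 at epoch 2, iter 13) and reaches 0.000000 by the last iterations of epoch 10.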
2023-10-18 14:48:09,048 epoch 3 - iter 39/138 - loss 0.67157573 - time (sec): 0.86 - samples/sec: 7437.40 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:48:09,351 epoch 3 - iter 52/138 - loss 0.65471134 - time (sec): 1.17 - samples/sec: 7629.24 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:48:09,627 epoch 3 - iter 65/138 - loss 0.64918371 - time (sec): 1.44 - samples/sec: 7648.44 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:48:09,904 epoch 3 - iter 78/138 - loss 0.63875739 - time (sec): 1.72 - samples/sec: 7618.74 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:48:10,191 epoch 3 - iter 91/138 - loss 0.63806077 - time (sec): 2.01 - samples/sec: 7435.21 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:48:10,480 epoch 3 - iter 104/138 - loss 0.63568487 - time (sec): 2.30 - samples/sec: 7401.69 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:48:10,780 epoch 3 - iter 117/138 - loss 0.63378407 - time (sec): 2.60 - samples/sec: 7444.87 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:48:11,222 epoch 3 - iter 130/138 - loss 0.62922214 - time (sec): 3.04 - samples/sec: 7100.68 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:48:11,394 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:11,394 EPOCH 3 done: loss 0.6271 - lr: 0.000039
2023-10-18 14:48:11,758 DEV : loss 0.46706756949424744 - f1-score (micro avg) 0.3648
2023-10-18 14:48:11,762 saving best model
2023-10-18 14:48:11,794 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:12,101 epoch 4 - iter 13/138 - loss 0.55692653 - time (sec): 0.31 - samples/sec: 7037.01 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:48:12,400 epoch 4 - iter 26/138 - loss 0.49403602 - time (sec): 0.61 - samples/sec: 7438.36 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:48:12,686 epoch 4 - iter 39/138 - loss 0.49928936 - time (sec): 0.89 - samples/sec: 7204.67 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:48:12,959 epoch 4 - iter 52/138 - loss 0.49820707 - time (sec): 1.17 - samples/sec: 7368.47 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:48:13,260 epoch 4 - iter 65/138 - loss 0.51047648 - time (sec): 1.47 - samples/sec: 7441.62 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:48:13,529 epoch 4 - iter 78/138 - loss 0.50746551 - time (sec): 1.74 - samples/sec: 7480.07 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:48:13,806 epoch 4 - iter 91/138 - loss 0.51071273 - time (sec): 2.01 - samples/sec: 7452.95 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:48:14,075 epoch 4 - iter 104/138 - loss 0.51569318 - time (sec): 2.28 - samples/sec: 7410.04 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:48:14,374 epoch 4 - iter 117/138 - loss 0.53407310 - time (sec): 2.58 - samples/sec: 7467.77 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:48:14,661 epoch 4 - iter 130/138 - loss 0.52786420 - time (sec): 2.87 - samples/sec: 7533.07 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:48:14,838 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:14,838 EPOCH 4 done: loss 0.5236 - lr: 0.000034
2023-10-18 14:48:15,201 DEV : loss 0.40362346172332764 - f1-score (micro avg) 0.3931
2023-10-18 14:48:15,205 saving best model
2023-10-18 14:48:15,236 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:15,526 epoch 5 - iter 13/138 - loss 0.45330791 - time (sec): 0.29 - samples/sec: 7302.10 - lr: 0.000033 - momentum: 0.000000
2023-10-18 14:48:15,814 epoch 5 - iter 26/138 - loss 0.46920724 - time (sec): 0.58 - samples/sec: 7094.88 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:48:16,104 epoch 5 - iter 39/138 - loss 0.47445543 - time (sec): 0.87 - samples/sec: 7303.00 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:48:16,404 epoch 5 - iter 52/138 - loss 0.45799012 - time (sec): 1.17 - samples/sec: 7430.86 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:48:16,696 epoch 5 - iter 65/138 - loss 0.46996586 - time (sec): 1.46 - samples/sec: 7217.03 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:48:16,985 epoch 5 - iter 78/138 - loss 0.46781253 - time (sec): 1.75 - samples/sec: 7258.51 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:48:17,268 epoch 5 - iter 91/138 - loss 0.46842812 - time (sec): 2.03 - samples/sec: 7359.80 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:48:17,557 epoch 5 - iter 104/138 - loss 0.45143974 - time (sec): 2.32 - samples/sec: 7315.77 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:48:17,848 epoch 5 - iter 117/138 - loss 0.45760709 - time (sec): 2.61 - samples/sec: 7350.40 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:48:18,141 epoch 5 - iter 130/138 - loss 0.46585937 - time (sec): 2.90 - samples/sec: 7410.27 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:48:18,321 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:18,321 EPOCH 5 done: loss 0.4624 - lr: 0.000028
2023-10-18 14:48:18,687 DEV : loss 0.34635287523269653 - f1-score (micro avg) 0.5339
2023-10-18 14:48:18,691 saving best model
2023-10-18 14:48:18,727 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:19,006 epoch 6 - iter 13/138 - loss 0.42187539 - time (sec): 0.28 - samples/sec: 8028.31 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:48:19,272 epoch 6 - iter 26/138 - loss 0.40492545 - time (sec): 0.54 - samples/sec: 7614.17 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:48:19,514 epoch 6 - iter 39/138 - loss 0.41437159 - time (sec): 0.79 - samples/sec: 7931.45 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:48:19,768 epoch 6 - iter 52/138 - loss 0.42073139 - time (sec): 1.04 - samples/sec: 7897.64 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:48:20,049 epoch 6 - iter 65/138 - loss 0.41337050 - time (sec): 1.32 - samples/sec: 8006.18 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:48:20,323 epoch 6 - iter 78/138 - loss 0.42018390 - time (sec): 1.60 - samples/sec: 8009.43 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:48:20,597 epoch 6 - iter 91/138 - loss 0.41755428 - time (sec): 1.87 - samples/sec: 8043.19 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:48:20,893 epoch 6 - iter 104/138 - loss 0.42023957 - time (sec): 2.16 - samples/sec: 7913.36 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:48:21,186 epoch 6 - iter 117/138 - loss 0.42016911 - time (sec): 2.46 - samples/sec: 7886.65 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:48:21,471 epoch 6 - iter 130/138 - loss 0.42898464 - time (sec): 2.74 - samples/sec: 7867.37 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:48:21,656 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:21,656 EPOCH 6 done: loss 0.4232 - lr: 0.000023
2023-10-18 14:48:22,024 DEV : loss 0.32510215044021606 - f1-score (micro avg) 0.5664
2023-10-18 14:48:22,028 saving best model
2023-10-18 14:48:22,061 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:22,355 epoch 7 - iter 13/138 - loss 0.44408968 - time (sec): 0.29 - samples/sec: 7490.16 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:48:22,629 epoch 7 - iter 26/138 - loss 0.42814280 - time (sec): 0.57 - samples/sec: 7635.90 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:48:22,921 epoch 7 - iter 39/138 - loss 0.41452726 - time (sec): 0.86 - samples/sec: 7531.87 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:48:23,198 epoch 7 - iter 52/138 - loss 0.41293677 - time (sec): 1.14 - samples/sec: 7700.30 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:48:23,468 epoch 7 - iter 65/138 - loss 0.40928714 - time (sec): 1.41 - samples/sec: 7792.32 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:48:23,741 epoch 7 - iter 78/138 - loss 0.39379892 - time (sec): 1.68 - samples/sec: 7824.55 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:48:24,008 epoch 7 - iter 91/138 - loss 0.38796470 - time (sec): 1.95 - samples/sec: 7772.29 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:48:24,278 epoch 7 - iter 104/138 - loss 0.38694101 - time (sec): 2.22 - samples/sec: 7743.28 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:48:24,559 epoch 7 - iter 117/138 - loss 0.38836812 - time (sec): 2.50 - samples/sec: 7798.98 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:48:24,834 epoch 7 - iter 130/138 - loss 0.39151995 - time (sec): 2.77 - samples/sec: 7846.26 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:48:24,989 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:24,989 EPOCH 7 done: loss 0.3911 - lr: 0.000017
2023-10-18 14:48:25,373 DEV : loss 0.32003989815711975 - f1-score (micro avg) 0.5783
2023-10-18 14:48:25,378 saving best model
2023-10-18 14:48:25,411 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:25,702 epoch 8 - iter 13/138 - loss 0.40449463 - time (sec): 0.29 - samples/sec: 6660.96 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:48:25,977 epoch 8 - iter 26/138 - loss 0.38570927 - time (sec): 0.56 - samples/sec: 7201.20 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:48:26,260 epoch 8 - iter 39/138 - loss 0.39921994 - time (sec): 0.85 - samples/sec: 7220.88 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:48:26,549 epoch 8 - iter 52/138 - loss 0.39531419 - time (sec): 1.14 - samples/sec: 7326.59 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:48:26,863 epoch 8 - iter 65/138 - loss 0.38567402 - time (sec): 1.45 - samples/sec: 7453.24 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:48:27,163 epoch 8 - iter 78/138 - loss 0.39325008 - time (sec): 1.75 - samples/sec: 7488.51 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:48:27,463 epoch 8 - iter 91/138 - loss 0.38138117 - time (sec): 2.05 - samples/sec: 7500.51 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:48:27,769 epoch 8 - iter 104/138 - loss 0.37742345 - time (sec): 2.36 - samples/sec: 7416.48 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:48:28,060 epoch 8 - iter 117/138 - loss 0.38459725 - time (sec): 2.65 - samples/sec: 7364.03 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:48:28,361 epoch 8 - iter 130/138 - loss 0.37837693 - time (sec): 2.95 - samples/sec: 7296.06 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:48:28,542 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:28,542 EPOCH 8 done: loss 0.3769 - lr: 0.000012
2023-10-18 14:48:28,920 DEV : loss 0.3068062663078308 - f1-score (micro avg) 0.5714
2023-10-18 14:48:28,923 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:29,197 epoch 9 - iter 13/138 - loss 0.45820679 - time (sec): 0.27 - samples/sec: 7814.57 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:48:29,477 epoch 9 - iter 26/138 - loss 0.39535289 - time (sec): 0.55 - samples/sec: 7761.23 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:48:29,762 epoch 9 - iter 39/138 - loss 0.40314374 - time (sec): 0.84 - samples/sec: 7809.64 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:48:30,061 epoch 9 - iter 52/138 - loss 0.39126971 - time (sec): 1.14 - samples/sec: 7578.68 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:48:30,355 epoch 9 - iter 65/138 - loss 0.39035044 - time (sec): 1.43 - samples/sec: 7518.61 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:48:30,636 epoch 9 - iter 78/138 - loss 0.37836327 - time (sec): 1.71 - samples/sec: 7379.61 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:48:30,933 epoch 9 - iter 91/138 - loss 0.35950535 - time (sec): 2.01 - samples/sec: 7464.77 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:48:31,217 epoch 9 - iter 104/138 - loss 0.35879332 - time (sec): 2.29 - samples/sec: 7421.78 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:48:31,520 epoch 9 - iter 117/138 - loss 0.36086285 - time (sec): 2.60 - samples/sec: 7445.84 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:48:31,828 epoch 9 - iter 130/138 - loss 0.36348692 - time (sec): 2.90 - samples/sec: 7478.02 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:48:32,006 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:32,006 EPOCH 9 done: loss 0.3658 - lr: 0.000006
2023-10-18 14:48:32,383 DEV : loss 0.2985237240791321 - f1-score (micro avg) 0.5749
2023-10-18 14:48:32,387 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:32,676 epoch 10 - iter 13/138 - loss 0.34147069 - time (sec): 0.29 - samples/sec: 7974.56 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:48:32,967 epoch 10 - iter 26/138 - loss 0.33718214 - time (sec): 0.58 - samples/sec: 7484.86 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:48:33,253 epoch 10 - iter 39/138 - loss 0.34787922 - time (sec): 0.87 - samples/sec: 7422.11 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:48:33,531 epoch 10 - iter 52/138 - loss 0.34339969 - time (sec): 1.14 - samples/sec: 7727.53 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:48:33,816 epoch 10 - iter 65/138 - loss 0.34816441 - time (sec): 1.43 - samples/sec: 7720.46 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:48:34,118 epoch 10 - iter 78/138 - loss 0.35112086 - time (sec): 1.73 - samples/sec: 7656.23 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:48:34,403 epoch 10 - iter 91/138 - loss 0.34924704 - time (sec): 2.02 - samples/sec: 7636.50 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:48:34,674 epoch 10 - iter 104/138 - loss 0.36083286 - time (sec): 2.29 - samples/sec: 7629.42 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:48:34,957 epoch 10 - iter 117/138 - loss 0.35958672 - time (sec): 2.57 - samples/sec: 7631.62 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:48:35,237 epoch 10 - iter 130/138 - loss 0.35664338 - time (sec): 2.85 - samples/sec: 7543.36 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:48:35,414 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:35,414 EPOCH 10 done: loss 0.3596 - lr: 0.000000
2023-10-18 14:48:35,790 DEV : loss 0.29692572355270386 - f1-score (micro avg) 0.5848
2023-10-18 14:48:35,794 saving best model
2023-10-18 14:48:35,855 ----------------------------------------------------------------------------------------------------
2023-10-18 14:48:35,855 Loading model from best epoch ...
2023-10-18 14:48:35,928 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 14:48:36,225 Results:
- F-score (micro) 0.615
- F-score (macro) 0.3655
- Accuracy 0.4577

By class:
              precision    recall  f1-score   support

       scope     0.5895    0.6364    0.6120       176
        pers     0.8058    0.6484    0.7186       128
        work     0.4343    0.5811    0.4971        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.6071    0.6230    0.6150       382
   macro avg     0.3659    0.3732    0.3655       382
weighted avg     0.6257    0.6230    0.6191       382

2023-10-18 14:48:36,225 ----------------------------------------------------------------------------------------------------
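Logs in this format are easier to compare across runs once the per-iteration entries are parsed into structured records. A small sketch of such a parser; the regex and field names are our own assumptions matched to the line format above, not part of Flair:

```python
import re

# Matches the "epoch N - iter i/138 - loss L ... - lr: X" entries in the log.
LINE_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<it>\d+)/\d+ - loss (?P<loss>[\d.]+)"
    r".* - lr: (?P<lr>[\d.]+)"
)

def parse_log(lines):
    """Extract (epoch, iter, loss, lr) records from per-iteration log lines."""
    records = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            records.append({
                "epoch": int(m.group("epoch")),
                "iter": int(m.group("it")),
                "loss": float(m.group("loss")),
                "lr": float(m.group("lr")),
            })
    return records

# Two entries copied verbatim from the log above.
sample = [
    "2023-10-18 14:48:01,844 epoch 1 - iter 13/138 - loss 3.61621295 - time (sec): 0.26 - samples/sec: 8642.49 - lr: 0.000004 - momentum: 0.000000",
    "2023-10-18 14:48:35,237 epoch 10 - iter 130/138 - loss 0.35664338 - time (sec): 2.85 - samples/sec: 7543.36 - lr: 0.000000 - momentum: 0.000000",
]
records = parse_log(sample)
```

From such records it is straightforward to plot the loss curve or confirm the trend visible above: training loss falls from ~3.6 in epoch 1 to ~0.36 in epoch 10, while the best dev micro-F1 (0.5848) comes from the final epoch.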