2023-10-18 14:38:57,878 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:57,878 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:38:57,878 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:57,878 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:38:57,878 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:57,878 Train: 1100 sentences
2023-10-18 14:38:57,878 (train_with_dev=False, train_with_test=False)
2023-10-18 14:38:57,878 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:57,878 Training Params:
2023-10-18 14:38:57,878 - learning_rate: "5e-05"
2023-10-18 14:38:57,878 - mini_batch_size: "8"
2023-10-18 14:38:57,878 - max_epochs: "10"
2023-10-18 14:38:57,878 - shuffle: "True"
2023-10-18 14:38:57,878 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:57,878 Plugins:
2023-10-18 14:38:57,878 - TensorboardLogger
2023-10-18 14:38:57,879 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:38:57,879 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:57,879 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:38:57,879 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:38:57,879 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:57,879 Computation:
2023-10-18 14:38:57,879 - compute on device: cuda:0
2023-10-18 14:38:57,879 - embedding storage: none
2023-10-18 14:38:57,879 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:57,879 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 14:38:57,879 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:57,879 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:57,879 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:38:58,190 epoch 1 - iter 13/138 - loss 3.42245675 - time (sec): 0.31 - samples/sec: 6894.88 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:38:58,497 epoch 1 - iter 26/138 - loss 3.45651479 - time (sec): 0.62 - samples/sec: 7001.30 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:38:58,820 epoch 1 - iter 39/138 - loss 3.39669444 - time (sec): 0.94 - samples/sec: 7122.95 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:38:59,109 epoch 1 - iter 52/138 - loss 3.29406571 - time (sec): 1.23 - samples/sec: 7213.47 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:38:59,383 epoch 1 - iter 65/138 - loss 3.17358152 - time (sec): 1.50 - samples/sec: 7188.29 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:38:59,685 epoch 1 - iter 78/138 - loss 3.01361420 - time (sec): 1.81 - samples/sec: 7262.93 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:38:59,983 epoch 1 - iter 91/138 - loss 2.85709471 - time (sec): 2.10 - samples/sec: 7281.39 - lr: 0.000033 - momentum: 0.000000
2023-10-18 14:39:00,279 epoch 1 - iter 104/138 - loss 2.68024097 - time (sec): 2.40 - samples/sec: 7265.33 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:39:00,581 epoch 1 - iter 117/138 - loss 2.50700588 - time (sec): 2.70 - samples/sec: 7245.48 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:39:00,875 epoch 1 - iter 130/138 - loss 2.38275601 - time (sec): 3.00 - samples/sec: 7216.10 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:39:01,049 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:01,050 EPOCH 1 done: loss 2.3098 - lr: 0.000047
2023-10-18 14:39:01,297 DEV : loss 0.9283897876739502 - f1-score (micro avg) 0.0
2023-10-18 14:39:01,304 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:01,607 epoch 2 - iter 13/138 - loss 1.09385578 - time (sec): 0.30 - samples/sec: 7627.74 - lr: 0.000050 - momentum: 0.000000
2023-10-18 14:39:01,903 epoch 2 - iter 26/138 - loss 1.05428928 - time (sec): 0.60 - samples/sec: 7716.24 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:39:02,195 epoch 2 - iter 39/138 - loss 1.01869524 - time (sec): 0.89 - samples/sec: 7577.86 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:39:02,501 epoch 2 - iter 52/138 - loss 1.03347462 - time (sec): 1.20 - samples/sec: 7495.98 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:39:02,807 epoch 2 - iter 65/138 - loss 0.99740698 - time (sec): 1.50 - samples/sec: 7430.15 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:39:03,085 epoch 2 - iter 78/138 - loss 0.94137628 - time (sec): 1.78 - samples/sec: 7409.33 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:39:03,377 epoch 2 - iter 91/138 - loss 0.91957672 - time (sec): 2.07 - samples/sec: 7465.79 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:39:03,672 epoch 2 - iter 104/138 - loss 0.89906565 - time (sec): 2.37 - samples/sec: 7456.30 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:39:03,961 epoch 2 - iter 117/138 - loss 0.87844041 - time (sec): 2.66 - samples/sec: 7358.33 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:39:04,220 epoch 2 - iter 130/138 - loss 0.85969199 - time (sec): 2.92 - samples/sec: 7433.93 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:39:04,376 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:04,376 EPOCH 2 done: loss 0.8502 - lr: 0.000045
2023-10-18 14:39:04,736 DEV : loss 0.6120837330818176 - f1-score (micro avg) 0.1379
2023-10-18 14:39:04,740 saving best model
2023-10-18 14:39:04,773 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:05,053 epoch 3 - iter 13/138 - loss 0.64846285 - time (sec): 0.28 - samples/sec: 7263.66 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:39:05,344 epoch 3 - iter 26/138 - loss 0.66179865 - time (sec): 0.57 - samples/sec: 7581.46 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:39:05,619 epoch 3 - iter 39/138 - loss 0.66324060 - time (sec): 0.84 - samples/sec: 7724.22 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:39:05,898 epoch 3 - iter 52/138 - loss 0.68247883 - time (sec): 1.12 - samples/sec: 7718.13 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:39:06,180 epoch 3 - iter 65/138 - loss 0.68405829 - time (sec): 1.41 - samples/sec: 7665.98 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:39:06,470 epoch 3 - iter 78/138 - loss 0.66042721 - time (sec): 1.70 - samples/sec: 7671.28 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:39:06,774 epoch 3 - iter 91/138 - loss 0.64931781 - time (sec): 2.00 - samples/sec: 7674.19 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:39:07,056 epoch 3 - iter 104/138 - loss 0.64777456 - time (sec): 2.28 - samples/sec: 7564.66 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:39:07,352 epoch 3 - iter 117/138 - loss 0.64605280 - time (sec): 2.58 - samples/sec: 7593.44 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:39:07,645 epoch 3 - iter 130/138 - loss 0.64179322 - time (sec): 2.87 - samples/sec: 7523.07 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:39:07,819 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:07,819 EPOCH 3 done: loss 0.6498 - lr: 0.000039
2023-10-18 14:39:08,306 DEV : loss 0.48717039823532104 - f1-score (micro avg) 0.23
2023-10-18 14:39:08,310 saving best model
2023-10-18 14:39:08,355 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:08,642 epoch 4 - iter 13/138 - loss 0.55535406 - time (sec): 0.29 - samples/sec: 7221.60 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:39:08,914 epoch 4 - iter 26/138 - loss 0.55935695 - time (sec): 0.56 - samples/sec: 7474.97 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:39:09,188 epoch 4 - iter 39/138 - loss 0.59482611 - time (sec): 0.83 - samples/sec: 7440.12 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:39:09,477 epoch 4 - iter 52/138 - loss 0.61949451 - time (sec): 1.12 - samples/sec: 7714.21 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:39:09,749 epoch 4 - iter 65/138 - loss 0.61674220 - time (sec): 1.39 - samples/sec: 7559.71 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:39:10,029 epoch 4 - iter 78/138 - loss 0.60877807 - time (sec): 1.67 - samples/sec: 7540.71 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:39:10,306 epoch 4 - iter 91/138 - loss 0.58509220 - time (sec): 1.95 - samples/sec: 7589.24 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:39:10,595 epoch 4 - iter 104/138 - loss 0.58021061 - time (sec): 2.24 - samples/sec: 7731.05 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:39:10,887 epoch 4 - iter 117/138 - loss 0.56510813 - time (sec): 2.53 - samples/sec: 7684.82 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:39:11,162 epoch 4 - iter 130/138 - loss 0.55566967 - time (sec): 2.81 - samples/sec: 7606.32 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:39:11,336 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:11,336 EPOCH 4 done: loss 0.5596 - lr: 0.000034
2023-10-18 14:39:11,703 DEV : loss 0.4194047152996063 - f1-score (micro avg) 0.3063
2023-10-18 14:39:11,707 saving best model
2023-10-18 14:39:11,741 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:12,021 epoch 5 - iter 13/138 - loss 0.51136840 - time (sec): 0.28 - samples/sec: 7505.62 - lr: 0.000033 - momentum: 0.000000
2023-10-18 14:39:12,302 epoch 5 - iter 26/138 - loss 0.49615124 - time (sec): 0.56 - samples/sec: 7504.70 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:39:12,591 epoch 5 - iter 39/138 - loss 0.50925520 - time (sec): 0.85 - samples/sec: 7591.71 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:39:12,899 epoch 5 - iter 52/138 - loss 0.50981525 - time (sec): 1.16 - samples/sec: 7549.90 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:39:13,177 epoch 5 - iter 65/138 - loss 0.49880562 - time (sec): 1.44 - samples/sec: 7421.76 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:39:13,462 epoch 5 - iter 78/138 - loss 0.50193854 - time (sec): 1.72 - samples/sec: 7480.52 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:39:13,734 epoch 5 - iter 91/138 - loss 0.49885792 - time (sec): 1.99 - samples/sec: 7487.27 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:39:14,023 epoch 5 - iter 104/138 - loss 0.49915038 - time (sec): 2.28 - samples/sec: 7534.36 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:39:14,304 epoch 5 - iter 117/138 - loss 0.50338832 - time (sec): 2.56 - samples/sec: 7617.57 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:39:14,577 epoch 5 - iter 130/138 - loss 0.49760914 - time (sec): 2.84 - samples/sec: 7645.35 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:39:14,753 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:14,754 EPOCH 5 done: loss 0.4908 - lr: 0.000028
2023-10-18 14:39:15,119 DEV : loss 0.3583006262779236 - f1-score (micro avg) 0.4839
2023-10-18 14:39:15,123 saving best model
2023-10-18 14:39:15,156 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:15,431 epoch 6 - iter 13/138 - loss 0.37298682 - time (sec): 0.27 - samples/sec: 8006.39 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:39:15,706 epoch 6 - iter 26/138 - loss 0.42290226 - time (sec): 0.55 - samples/sec: 7622.87 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:39:15,982 epoch 6 - iter 39/138 - loss 0.40459662 - time (sec): 0.83 - samples/sec: 7737.35 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:39:16,261 epoch 6 - iter 52/138 - loss 0.43285839 - time (sec): 1.10 - samples/sec: 7679.09 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:39:16,565 epoch 6 - iter 65/138 - loss 0.42995726 - time (sec): 1.41 - samples/sec: 7717.85 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:39:16,859 epoch 6 - iter 78/138 - loss 0.42298722 - time (sec): 1.70 - samples/sec: 7697.88 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:39:17,139 epoch 6 - iter 91/138 - loss 0.42358084 - time (sec): 1.98 - samples/sec: 7701.83 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:39:17,409 epoch 6 - iter 104/138 - loss 0.42767791 - time (sec): 2.25 - samples/sec: 7689.40 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:39:17,682 epoch 6 - iter 117/138 - loss 0.43686698 - time (sec): 2.53 - samples/sec: 7759.09 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:39:17,971 epoch 6 - iter 130/138 - loss 0.43974471 - time (sec): 2.81 - samples/sec: 7723.83 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:39:18,132 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:18,132 EPOCH 6 done: loss 0.4414 - lr: 0.000023
2023-10-18 14:39:18,496 DEV : loss 0.33042845129966736 - f1-score (micro avg) 0.5341
2023-10-18 14:39:18,500 saving best model
2023-10-18 14:39:18,533 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:18,819 epoch 7 - iter 13/138 - loss 0.52247254 - time (sec): 0.29 - samples/sec: 8259.93 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:39:19,098 epoch 7 - iter 26/138 - loss 0.45463455 - time (sec): 0.56 - samples/sec: 7738.39 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:39:19,378 epoch 7 - iter 39/138 - loss 0.45637732 - time (sec): 0.84 - samples/sec: 7668.93 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:39:19,653 epoch 7 - iter 52/138 - loss 0.44819056 - time (sec): 1.12 - samples/sec: 7715.57 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:39:19,931 epoch 7 - iter 65/138 - loss 0.44589353 - time (sec): 1.40 - samples/sec: 7734.39 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:39:20,198 epoch 7 - iter 78/138 - loss 0.45413109 - time (sec): 1.66 - samples/sec: 7714.72 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:39:20,481 epoch 7 - iter 91/138 - loss 0.45309809 - time (sec): 1.95 - samples/sec: 7762.07 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:39:20,751 epoch 7 - iter 104/138 - loss 0.43800951 - time (sec): 2.22 - samples/sec: 7638.51 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:39:21,059 epoch 7 - iter 117/138 - loss 0.43199440 - time (sec): 2.53 - samples/sec: 7601.69 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:39:21,362 epoch 7 - iter 130/138 - loss 0.42818864 - time (sec): 2.83 - samples/sec: 7602.35 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:39:21,537 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:21,537 EPOCH 7 done: loss 0.4193 - lr: 0.000017
2023-10-18 14:39:21,907 DEV : loss 0.32160937786102295 - f1-score (micro avg) 0.5539
2023-10-18 14:39:21,912 saving best model
2023-10-18 14:39:21,944 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:22,183 epoch 8 - iter 13/138 - loss 0.41481649 - time (sec): 0.24 - samples/sec: 9225.48 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:39:22,424 epoch 8 - iter 26/138 - loss 0.39847745 - time (sec): 0.48 - samples/sec: 9408.31 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:39:22,703 epoch 8 - iter 39/138 - loss 0.40199344 - time (sec): 0.76 - samples/sec: 8710.48 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:39:23,007 epoch 8 - iter 52/138 - loss 0.40121457 - time (sec): 1.06 - samples/sec: 8503.64 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:39:23,284 epoch 8 - iter 65/138 - loss 0.38705500 - time (sec): 1.34 - samples/sec: 8405.00 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:39:23,572 epoch 8 - iter 78/138 - loss 0.37921949 - time (sec): 1.63 - samples/sec: 8261.88 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:39:23,832 epoch 8 - iter 91/138 - loss 0.39259301 - time (sec): 1.89 - samples/sec: 8138.72 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:39:24,108 epoch 8 - iter 104/138 - loss 0.39176225 - time (sec): 2.16 - samples/sec: 7980.64 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:39:24,386 epoch 8 - iter 117/138 - loss 0.39795567 - time (sec): 2.44 - samples/sec: 7968.03 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:39:24,665 epoch 8 - iter 130/138 - loss 0.39112207 - time (sec): 2.72 - samples/sec: 7951.38 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:39:24,828 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:24,828 EPOCH 8 done: loss 0.3900 - lr: 0.000012
2023-10-18 14:39:25,195 DEV : loss 0.311394602060318 - f1-score (micro avg) 0.5645
2023-10-18 14:39:25,199 saving best model
2023-10-18 14:39:25,238 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:25,517 epoch 9 - iter 13/138 - loss 0.40021116 - time (sec): 0.28 - samples/sec: 7192.63 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:39:25,807 epoch 9 - iter 26/138 - loss 0.41025763 - time (sec): 0.57 - samples/sec: 7444.13 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:39:26,091 epoch 9 - iter 39/138 - loss 0.39251032 - time (sec): 0.85 - samples/sec: 7450.17 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:39:26,363 epoch 9 - iter 52/138 - loss 0.42319775 - time (sec): 1.12 - samples/sec: 7464.97 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:39:26,637 epoch 9 - iter 65/138 - loss 0.41296078 - time (sec): 1.40 - samples/sec: 7673.43 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:39:26,920 epoch 9 - iter 78/138 - loss 0.42416776 - time (sec): 1.68 - samples/sec: 7665.01 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:39:27,197 epoch 9 - iter 91/138 - loss 0.40835513 - time (sec): 1.96 - samples/sec: 7724.18 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:39:27,469 epoch 9 - iter 104/138 - loss 0.40688293 - time (sec): 2.23 - samples/sec: 7727.59 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:39:27,777 epoch 9 - iter 117/138 - loss 0.39452867 - time (sec): 2.54 - samples/sec: 7707.69 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:39:28,070 epoch 9 - iter 130/138 - loss 0.39162340 - time (sec): 2.83 - samples/sec: 7651.09 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:39:28,256 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:28,257 EPOCH 9 done: loss 0.3905 - lr: 0.000006
2023-10-18 14:39:28,627 DEV : loss 0.3063945472240448 - f1-score (micro avg) 0.5707
2023-10-18 14:39:28,631 saving best model
2023-10-18 14:39:28,669 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:28,966 epoch 10 - iter 13/138 - loss 0.32500790 - time (sec): 0.30 - samples/sec: 7275.01 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:39:29,280 epoch 10 - iter 26/138 - loss 0.35209130 - time (sec): 0.61 - samples/sec: 7560.42 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:39:29,558 epoch 10 - iter 39/138 - loss 0.36398068 - time (sec): 0.89 - samples/sec: 7398.94 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:39:29,839 epoch 10 - iter 52/138 - loss 0.36562887 - time (sec): 1.17 - samples/sec: 7321.56 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:39:30,130 epoch 10 - iter 65/138 - loss 0.36881734 - time (sec): 1.46 - samples/sec: 7271.07 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:39:30,407 epoch 10 - iter 78/138 - loss 0.36648456 - time (sec): 1.74 - samples/sec: 7328.41 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:39:30,701 epoch 10 - iter 91/138 - loss 0.36321895 - time (sec): 2.03 - samples/sec: 7450.28 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:39:30,991 epoch 10 - iter 104/138 - loss 0.37424437 - time (sec): 2.32 - samples/sec: 7464.30 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:39:31,265 epoch 10 - iter 117/138 - loss 0.37687308 - time (sec): 2.60 - samples/sec: 7469.61 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:39:31,563 epoch 10 - iter 130/138 - loss 0.38173002 - time (sec): 2.89 - samples/sec: 7426.40 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:39:31,739 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:31,740 EPOCH 10 done: loss 0.3802 - lr: 0.000000
2023-10-18 14:39:32,107 DEV : loss 0.30349376797676086 - f1-score (micro avg) 0.5666
2023-10-18 14:39:32,140 ----------------------------------------------------------------------------------------------------
2023-10-18 14:39:32,141 Loading model from best epoch ...
2023-10-18 14:39:32,222 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 14:39:32,520 
Results:
- F-score (micro) 0.5987
- F-score (macro) 0.3472
- Accuracy 0.4328

By class:
              precision    recall  f1-score   support

       scope     0.5938    0.6477    0.6196       176
        pers     0.8350    0.6719    0.7446       128
        work     0.3265    0.4324    0.3721        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.5903    0.6073    0.5987       382
   macro avg     0.3510    0.3504    0.3472       382
weighted avg     0.6166    0.6073    0.6070       382

2023-10-18 14:39:32,520 ----------------------------------------------------------------------------------------------------