2023-10-17 10:20:53,547 ----------------------------------------------------------------------------------------------------
2023-10-17 10:20:53,548 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:20:53,548 ----------------------------------------------------------------------------------------------------
2023-10-17 10:20:53,548 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-17 10:20:53,548 ----------------------------------------------------------------------------------------------------
2023-10-17 10:20:53,548 Train: 1214 sentences
2023-10-17 10:20:53,548 (train_with_dev=False, train_with_test=False)
2023-10-17 10:20:53,548 ----------------------------------------------------------------------------------------------------
2023-10-17 10:20:53,548 Training Params:
2023-10-17 10:20:53,548 - learning_rate: "5e-05"
2023-10-17 10:20:53,548 - mini_batch_size: "8"
2023-10-17 10:20:53,548 - max_epochs: "10"
2023-10-17 10:20:53,548 - shuffle: "True"
2023-10-17 10:20:53,548 ----------------------------------------------------------------------------------------------------
2023-10-17 10:20:53,549 Plugins:
2023-10-17 10:20:53,549 - TensorboardLogger
2023-10-17 10:20:53,549 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:20:53,549 ----------------------------------------------------------------------------------------------------
2023-10-17 10:20:53,549 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:20:53,549 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:20:53,549 ----------------------------------------------------------------------------------------------------
2023-10-17 10:20:53,549 Computation:
2023-10-17 10:20:53,549 - compute on device: cuda:0
2023-10-17 10:20:53,549 - embedding storage: none
2023-10-17 10:20:53,549 ----------------------------------------------------------------------------------------------------
2023-10-17 10:20:53,549 Model training base path: "hmbench-ajmc/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 10:20:53,549 ----------------------------------------------------------------------------------------------------
2023-10-17 10:20:53,549 ----------------------------------------------------------------------------------------------------
2023-10-17 10:20:53,549 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:20:54,403 epoch 1 - iter 15/152 - loss 4.13996917 - time (sec): 0.85 - samples/sec: 3458.89 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:20:55,272 epoch 1 - iter 30/152 - loss 3.46761275 - time (sec): 1.72 - samples/sec: 3492.85 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:20:56,122 epoch 1 - iter 45/152 - loss 2.64766035 - time (sec): 2.57 - samples/sec: 3479.13 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:20:56,977 epoch 1 - iter 60/152 - loss 2.13419318 - time (sec): 3.43 - samples/sec: 3495.65 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:20:57,839 epoch 1 - iter 75/152 - loss 1.79758948 - time (sec): 4.29 - samples/sec: 3513.85 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:20:58,724 epoch 1 - iter 90/152 - loss 1.57046363 - time (sec): 5.17 - samples/sec: 3526.54 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:20:59,528 epoch 1 - iter 105/152 - loss 1.40630831 - time (sec): 5.98 - samples/sec: 3512.91 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:21:00,430 epoch 1 - iter 120/152 - loss 1.25564505 - time (sec): 6.88 - samples/sec: 3529.55 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:21:01,275 epoch 1 - iter 135/152 - loss 1.13636163 - time (sec): 7.72 - samples/sec: 3561.51 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:21:02,143 epoch 1 - iter 150/152 - loss 1.04828158 - time (sec): 8.59 - samples/sec: 3560.30 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:21:02,237 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:02,238 EPOCH 1 done: loss 1.0381 - lr: 0.000049
2023-10-17 10:21:03,295 DEV : loss 0.206589475274086 - f1-score (micro avg) 0.5536
2023-10-17 10:21:03,303 saving best model
2023-10-17 10:21:03,652 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:04,497 epoch 2 - iter 15/152 - loss 0.19093756 - time (sec): 0.84 - samples/sec: 3526.74 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:21:05,341 epoch 2 - iter 30/152 - loss 0.18722834 - time (sec): 1.69 - samples/sec: 3625.20 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:21:06,177 epoch 2 - iter 45/152 - loss 0.17371351 - time (sec): 2.52 - samples/sec: 3635.58 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:21:07,006 epoch 2 - iter 60/152 - loss 0.17155827 - time (sec): 3.35 - samples/sec: 3691.85 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:21:07,864 epoch 2 - iter 75/152 - loss 0.16587935 - time (sec): 4.21 - samples/sec: 3652.64 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:21:08,695 epoch 2 - iter 90/152 - loss 0.16197003 - time (sec): 5.04 - samples/sec: 3629.79 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:21:09,573 epoch 2 - iter 105/152 - loss 0.15785632 - time (sec): 5.92 - samples/sec: 3611.33 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:21:10,450 epoch 2 - iter 120/152 - loss 0.15542643 - time (sec): 6.80 - samples/sec: 3602.49 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:21:11,315 epoch 2 - iter 135/152 - loss 0.15433622 - time (sec): 7.66 - samples/sec: 3574.95 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:21:12,142 epoch 2 - iter 150/152 - loss 0.14553977 - time (sec): 8.49 - samples/sec: 3593.94 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:21:12,262 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:12,262 EPOCH 2 done: loss 0.1442 - lr: 0.000045
2023-10-17 10:21:13,279 DEV : loss 0.14839822053909302 - f1-score (micro avg) 0.8005
2023-10-17 10:21:13,288 saving best model
2023-10-17 10:21:13,730 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:14,648 epoch 3 - iter 15/152 - loss 0.06014581 - time (sec): 0.91 - samples/sec: 3132.29 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:21:15,545 epoch 3 - iter 30/152 - loss 0.06888070 - time (sec): 1.81 - samples/sec: 3338.50 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:21:16,425 epoch 3 - iter 45/152 - loss 0.07869788 - time (sec): 2.69 - samples/sec: 3382.02 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:21:17,276 epoch 3 - iter 60/152 - loss 0.08240161 - time (sec): 3.54 - samples/sec: 3494.34 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:21:18,132 epoch 3 - iter 75/152 - loss 0.08746841 - time (sec): 4.40 - samples/sec: 3556.94 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:21:18,945 epoch 3 - iter 90/152 - loss 0.08907252 - time (sec): 5.21 - samples/sec: 3597.66 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:21:19,763 epoch 3 - iter 105/152 - loss 0.08457069 - time (sec): 6.03 - samples/sec: 3639.69 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:21:20,577 epoch 3 - iter 120/152 - loss 0.08116355 - time (sec): 6.84 - samples/sec: 3623.40 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:21:21,413 epoch 3 - iter 135/152 - loss 0.07990992 - time (sec): 7.68 - samples/sec: 3626.36 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:21:22,243 epoch 3 - iter 150/152 - loss 0.07924719 - time (sec): 8.51 - samples/sec: 3596.11 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:21:22,363 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:22,363 EPOCH 3 done: loss 0.0786 - lr: 0.000039
2023-10-17 10:21:23,336 DEV : loss 0.1418256163597107 - f1-score (micro avg) 0.8343
2023-10-17 10:21:23,343 saving best model
2023-10-17 10:21:23,806 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:24,697 epoch 4 - iter 15/152 - loss 0.07257197 - time (sec): 0.89 - samples/sec: 3376.00 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:21:25,532 epoch 4 - iter 30/152 - loss 0.06426303 - time (sec): 1.72 - samples/sec: 3390.36 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:21:26,414 epoch 4 - iter 45/152 - loss 0.06319758 - time (sec): 2.61 - samples/sec: 3531.50 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:21:27,252 epoch 4 - iter 60/152 - loss 0.05695866 - time (sec): 3.44 - samples/sec: 3487.01 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:21:28,138 epoch 4 - iter 75/152 - loss 0.05408339 - time (sec): 4.33 - samples/sec: 3488.48 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:21:28,989 epoch 4 - iter 90/152 - loss 0.05746597 - time (sec): 5.18 - samples/sec: 3485.36 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:21:29,810 epoch 4 - iter 105/152 - loss 0.05758814 - time (sec): 6.00 - samples/sec: 3494.33 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:21:30,653 epoch 4 - iter 120/152 - loss 0.05995760 - time (sec): 6.85 - samples/sec: 3503.91 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:21:31,537 epoch 4 - iter 135/152 - loss 0.05793896 - time (sec): 7.73 - samples/sec: 3539.60 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:21:32,424 epoch 4 - iter 150/152 - loss 0.06155153 - time (sec): 8.62 - samples/sec: 3552.03 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:21:32,527 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:32,527 EPOCH 4 done: loss 0.0608 - lr: 0.000034
2023-10-17 10:21:33,505 DEV : loss 0.1754484623670578 - f1-score (micro avg) 0.8461
2023-10-17 10:21:33,512 saving best model
2023-10-17 10:21:33,958 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:34,813 epoch 5 - iter 15/152 - loss 0.04832152 - time (sec): 0.85 - samples/sec: 3505.28 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:21:35,678 epoch 5 - iter 30/152 - loss 0.05065053 - time (sec): 1.72 - samples/sec: 3519.13 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:21:36,518 epoch 5 - iter 45/152 - loss 0.05033563 - time (sec): 2.56 - samples/sec: 3625.13 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:21:37,368 epoch 5 - iter 60/152 - loss 0.05565228 - time (sec): 3.41 - samples/sec: 3577.44 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:21:38,205 epoch 5 - iter 75/152 - loss 0.05171994 - time (sec): 4.25 - samples/sec: 3569.20 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:21:39,044 epoch 5 - iter 90/152 - loss 0.04675863 - time (sec): 5.08 - samples/sec: 3526.43 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:21:39,896 epoch 5 - iter 105/152 - loss 0.04425283 - time (sec): 5.94 - samples/sec: 3585.72 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:21:40,791 epoch 5 - iter 120/152 - loss 0.05117550 - time (sec): 6.83 - samples/sec: 3589.06 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:21:41,624 epoch 5 - iter 135/152 - loss 0.05185766 - time (sec): 7.66 - samples/sec: 3554.42 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:21:42,478 epoch 5 - iter 150/152 - loss 0.04832041 - time (sec): 8.52 - samples/sec: 3587.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:21:42,607 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:42,607 EPOCH 5 done: loss 0.0477 - lr: 0.000028
2023-10-17 10:21:43,550 DEV : loss 0.17468248307704926 - f1-score (micro avg) 0.8494
2023-10-17 10:21:43,557 saving best model
2023-10-17 10:21:43,978 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:44,838 epoch 6 - iter 15/152 - loss 0.01944411 - time (sec): 0.85 - samples/sec: 3608.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:21:45,724 epoch 6 - iter 30/152 - loss 0.03490156 - time (sec): 1.74 - samples/sec: 3587.01 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:21:46,566 epoch 6 - iter 45/152 - loss 0.03279006 - time (sec): 2.58 - samples/sec: 3585.76 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:21:47,368 epoch 6 - iter 60/152 - loss 0.03050177 - time (sec): 3.38 - samples/sec: 3577.10 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:21:48,264 epoch 6 - iter 75/152 - loss 0.03267751 - time (sec): 4.28 - samples/sec: 3598.23 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:21:49,140 epoch 6 - iter 90/152 - loss 0.02936129 - time (sec): 5.16 - samples/sec: 3553.68 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:21:50,002 epoch 6 - iter 105/152 - loss 0.02858257 - time (sec): 6.02 - samples/sec: 3566.28 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:21:50,867 epoch 6 - iter 120/152 - loss 0.02879724 - time (sec): 6.88 - samples/sec: 3558.64 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:21:51,772 epoch 6 - iter 135/152 - loss 0.02969643 - time (sec): 7.79 - samples/sec: 3536.58 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:21:52,631 epoch 6 - iter 150/152 - loss 0.03245760 - time (sec): 8.65 - samples/sec: 3547.49 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:21:52,731 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:52,731 EPOCH 6 done: loss 0.0326 - lr: 0.000022
2023-10-17 10:21:53,728 DEV : loss 0.1806900054216385 - f1-score (micro avg) 0.838
2023-10-17 10:21:53,735 ----------------------------------------------------------------------------------------------------
2023-10-17 10:21:54,611 epoch 7 - iter 15/152 - loss 0.03557474 - time (sec): 0.87 - samples/sec: 3578.87 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:21:55,520 epoch 7 - iter 30/152 - loss 0.03126429 - time (sec): 1.78 - samples/sec: 3713.56 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:21:56,393 epoch 7 - iter 45/152 - loss 0.02335225 - time (sec): 2.66 - samples/sec: 3705.13 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:21:57,226 epoch 7 - iter 60/152 - loss 0.02327013 - time (sec): 3.49 - samples/sec: 3698.49 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:21:58,088 epoch 7 - iter 75/152 - loss 0.02472991 - time (sec): 4.35 - samples/sec: 3638.49 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:21:58,937 epoch 7 - iter 90/152 - loss 0.02372267 - time (sec): 5.20 - samples/sec: 3605.89 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:21:59,836 epoch 7 - iter 105/152 - loss 0.02498016 - time (sec): 6.10 - samples/sec: 3580.56 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:22:00,650 epoch 7 - iter 120/152 - loss 0.02545515 - time (sec): 6.91 - samples/sec: 3567.97 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:22:01,513 epoch 7 - iter 135/152 - loss 0.02532419 - time (sec): 7.78 - samples/sec: 3564.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:22:02,357 epoch 7 - iter 150/152 - loss 0.02715959 - time (sec): 8.62 - samples/sec: 3541.35 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:22:02,464 ----------------------------------------------------------------------------------------------------
2023-10-17 10:22:02,464 EPOCH 7 done: loss 0.0268 - lr: 0.000017
2023-10-17 10:22:03,434 DEV : loss 0.18513934314250946 - f1-score (micro avg) 0.8551
2023-10-17 10:22:03,441 saving best model
2023-10-17 10:22:03,890 ----------------------------------------------------------------------------------------------------
2023-10-17 10:22:04,699 epoch 8 - iter 15/152 - loss 0.03913568 - time (sec): 0.81 - samples/sec: 3222.58 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:22:05,513 epoch 8 - iter 30/152 - loss 0.02400707 - time (sec): 1.62 - samples/sec: 3284.97 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:22:06,412 epoch 8 - iter 45/152 - loss 0.02011310 - time (sec): 2.52 - samples/sec: 3458.27 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:22:07,331 epoch 8 - iter 60/152 - loss 0.02112743 - time (sec): 3.44 - samples/sec: 3432.67 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:22:08,148 epoch 8 - iter 75/152 - loss 0.01872410 - time (sec): 4.26 - samples/sec: 3487.92 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:22:08,987 epoch 8 - iter 90/152 - loss 0.01698178 - time (sec): 5.10 - samples/sec: 3533.80 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:22:09,861 epoch 8 - iter 105/152 - loss 0.01596958 - time (sec): 5.97 - samples/sec: 3528.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:22:10,702 epoch 8 - iter 120/152 - loss 0.01617424 - time (sec): 6.81 - samples/sec: 3519.65 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:22:11,584 epoch 8 - iter 135/152 - loss 0.01808541 - time (sec): 7.69 - samples/sec: 3529.87 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:22:12,428 epoch 8 - iter 150/152 - loss 0.01849872 - time (sec): 8.54 - samples/sec: 3572.54 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:22:12,562 ----------------------------------------------------------------------------------------------------
2023-10-17 10:22:12,563 EPOCH 8 done: loss 0.0182 - lr: 0.000011
2023-10-17 10:22:13,573 DEV : loss 0.18587149679660797 - f1-score (micro avg) 0.8571
2023-10-17 10:22:13,580 saving best model
2023-10-17 10:22:14,014 ----------------------------------------------------------------------------------------------------
2023-10-17 10:22:14,879 epoch 9 - iter 15/152 - loss 0.03171371 - time (sec): 0.86 - samples/sec: 3398.58 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:22:15,766 epoch 9 - iter 30/152 - loss 0.01991696 - time (sec): 1.75 - samples/sec: 3604.69 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:22:16,649 epoch 9 - iter 45/152 - loss 0.01991665 - time (sec): 2.63 - samples/sec: 3566.38 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:22:17,489 epoch 9 - iter 60/152 - loss 0.01708208 - time (sec): 3.47 - samples/sec: 3590.79 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:22:18,404 epoch 9 - iter 75/152 - loss 0.01483370 - time (sec): 4.39 - samples/sec: 3540.77 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:22:19,459 epoch 9 - iter 90/152 - loss 0.01589577 - time (sec): 5.44 - samples/sec: 3410.71 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:22:20,387 epoch 9 - iter 105/152 - loss 0.01460746 - time (sec): 6.37 - samples/sec: 3371.20 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:22:21,260 epoch 9 - iter 120/152 - loss 0.01391634 - time (sec): 7.24 - samples/sec: 3378.32 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:22:22,133 epoch 9 - iter 135/152 - loss 0.01493864 - time (sec): 8.12 - samples/sec: 3391.42 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:22:23,022 epoch 9 - iter 150/152 - loss 0.01394751 - time (sec): 9.00 - samples/sec: 3407.70 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:22:23,116 ----------------------------------------------------------------------------------------------------
2023-10-17 10:22:23,116 EPOCH 9 done: loss 0.0140 - lr: 0.000006
2023-10-17 10:22:24,086 DEV : loss 0.19769912958145142 - f1-score (micro avg) 0.8622
2023-10-17 10:22:24,095 saving best model
2023-10-17 10:22:24,530 ----------------------------------------------------------------------------------------------------
2023-10-17 10:22:25,448 epoch 10 - iter 15/152 - loss 0.02304944 - time (sec): 0.92 - samples/sec: 3495.37 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:22:26,282 epoch 10 - iter 30/152 - loss 0.01518266 - time (sec): 1.75 - samples/sec: 3572.54 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:22:27,172 epoch 10 - iter 45/152 - loss 0.01286667 - time (sec): 2.64 - samples/sec: 3633.68 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:22:28,019 epoch 10 - iter 60/152 - loss 0.01970669 - time (sec): 3.49 - samples/sec: 3565.22 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:22:28,931 epoch 10 - iter 75/152 - loss 0.01609411 - time (sec): 4.40 - samples/sec: 3525.39 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:22:29,734 epoch 10 - iter 90/152 - loss 0.01450987 - time (sec): 5.20 - samples/sec: 3535.91 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:22:30,615 epoch 10 - iter 105/152 - loss 0.01304704 - time (sec): 6.08 - samples/sec: 3536.79 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:22:31,476 epoch 10 - iter 120/152 - loss 0.01164193 - time (sec): 6.94 - samples/sec: 3554.01 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:22:32,316 epoch 10 - iter 135/152 - loss 0.01101577 - time (sec): 7.78 - samples/sec: 3548.53 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:22:33,204 epoch 10 - iter 150/152 - loss 0.01034020 - time (sec): 8.67 - samples/sec: 3545.66 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:22:33,305 ----------------------------------------------------------------------------------------------------
2023-10-17 10:22:33,305 EPOCH 10 done: loss 0.0103 - lr: 0.000000
2023-10-17 10:22:34,301 DEV : loss 0.19832785427570343 - f1-score (micro avg) 0.8633
2023-10-17 10:22:34,309 saving best model
2023-10-17 10:22:35,065 ----------------------------------------------------------------------------------------------------
2023-10-17 10:22:35,066 Loading model from best epoch ...
2023-10-17 10:22:36,479 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-17 10:22:37,243 Results:
- F-score (micro) 0.8204
- F-score (macro) 0.6609
- Accuracy 0.6988

By class:
              precision    recall  f1-score   support

       scope     0.7722    0.8079    0.7896       151
        work     0.7477    0.8737    0.8058        95
        pers     0.8824    0.9375    0.9091        96
        date     0.0000    0.0000    0.0000         3
         loc     1.0000    0.6667    0.8000         3

   micro avg     0.7899    0.8534    0.8204       348
   macro avg     0.6805    0.6572    0.6609       348
weighted avg     0.7912    0.8534    0.8203       348

2023-10-17 10:22:37,243 ----------------------------------------------------------------------------------------------------
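The dev F1 in the log climbs from 0.5536 after epoch 1 to 0.8633 after epoch 10, and "saving best model" fires whenever it improves. Because Flair emits these `DEV :` records in a fixed textual pattern, they can be pulled out of the raw log mechanically; a minimal sketch (the regex and helper name are mine, not part of Flair):

```python
import re

# Matches records like:
# "DEV : loss 0.206589475274086 - f1-score (micro avg) 0.5536"
DEV_RE = re.compile(r"DEV : loss ([0-9.]+) - f1-score \(micro avg\)\s+([0-9.]+)")

def dev_scores(log_text: str):
    """Return [(dev_loss, dev_f1), ...] in the order the epochs appear."""
    return [(float(loss), float(f1)) for loss, f1 in DEV_RE.findall(log_text)]

sample = """\
2023-10-17 10:21:03,295 DEV : loss 0.206589475274086 - f1-score (micro avg) 0.5536
2023-10-17 10:22:34,301 DEV : loss 0.19832785427570343 - f1-score (micro avg) 0.8633
"""
scores = dev_scores(sample)
# Epochs are numbered from 1, so the best epoch is argmax + 1.
best_epoch = max(range(len(scores)), key=lambda i: scores[i][1]) + 1
```

Run over the full log, the same helper recovers the whole dev curve, including the epoch-6 dip to 0.838 where no checkpoint was saved.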
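As a sanity check, the micro-averaged row of the final test table is reproducible from the per-class numbers: precision = TP / predicted and recall = TP / support fix the counts per class. The only inferred quantity is the three false-positive "date" predictions, since a precision of 0.0000 pins down TP = 0 but not the predicted count.

```python
# Per-class (precision, recall, support) from the final evaluation table.
per_class = {
    "scope": (0.7722, 0.8079, 151),
    "work":  (0.7477, 0.8737, 95),
    "pers":  (0.8824, 0.9375, 96),
    "date":  (0.0000, 0.0000, 3),
    "loc":   (1.0000, 0.6667, 3),
}

tp = pred = gold = 0
for p, r, s in per_class.values():
    c_tp = round(r * s)            # true positives: recall * support
    tp += c_tp
    gold += s
    # predicted count: TP / precision; for 'date' (p == 0) the count is
    # inferred as 3 so that the micro precision matches the table.
    pred += round(c_tp / p) if p > 0 else 3

micro_p = tp / pred                          # 297 / 376
micro_r = tp / gold                          # 297 / 348
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
```

Rounded to four places this gives 0.7899 / 0.8534 / 0.8204, matching the logged micro avg row exactly.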
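The lr column also matches the `LinearScheduler | warmup_fraction: '0.1'` plugin: with 152 mini-batches per epoch over 10 epochs (1520 steps), the first 10% of steps warm up linearly to the peak 5e-05 (reached at the end of epoch 1), after which the rate decays linearly to 0. A minimal sketch of that schedule as a pure function (my own reconstruction from the logged values; Flair's internal implementation may differ in details):

```python
def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr over warmup_fraction of training,
    then linear decay to 0 over the remaining steps."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

TOTAL = 152 * 10        # 152 mini-batches per epoch, 10 epochs
lr_iter15 = linear_schedule_lr(15, TOTAL, 5e-5)    # epoch 1, iter 15 -> ~0.000005
lr_iter150 = linear_schedule_lr(150, TOTAL, 5e-5)  # epoch 1, iter 150 -> ~0.000049
lr_final = linear_schedule_lr(TOTAL, TOTAL, 5e-5)  # last step -> 0.000000
```

These values reproduce the logged lr at iter 15 (0.000005), iter 150 of epoch 1 (0.000049), and the final step (0.000000) to the precision the log prints.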