2023-10-17 09:38:34,141 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:34,142 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 09:38:34,142 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:34,143 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-17 09:38:34,143 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:34,143 Train: 1214 sentences
2023-10-17 09:38:34,143 (train_with_dev=False, train_with_test=False)
2023-10-17 09:38:34,143 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:34,143 Training Params:
2023-10-17 09:38:34,143 - learning_rate: "5e-05"
2023-10-17 09:38:34,143 - mini_batch_size: "8"
2023-10-17 09:38:34,143 - max_epochs: "10"
2023-10-17 09:38:34,143 - shuffle: "True"
2023-10-17 09:38:34,143 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:34,143 Plugins:
2023-10-17 09:38:34,143 - TensorboardLogger
2023-10-17 09:38:34,143 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 09:38:34,143 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:34,143 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 09:38:34,143 - metric: "('micro avg', 'f1-score')"
2023-10-17 09:38:34,143 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:34,143 Computation:
2023-10-17 09:38:34,143 - compute on device: cuda:0
2023-10-17 09:38:34,143 - embedding storage: none
2023-10-17 09:38:34,143 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:34,143 Model training base path: "hmbench-ajmc/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 09:38:34,143 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:34,143 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:34,143 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 09:38:34,950 epoch 1 - iter 15/152 - loss 3.27859425 - time (sec): 0.81 - samples/sec: 3732.51 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:38:35,826 epoch 1 - iter 30/152 - loss 2.72599269 - time (sec): 1.68 - samples/sec: 3539.20 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:38:36,686 epoch 1 - iter 45/152 - loss 2.08274849 - time (sec): 2.54 - samples/sec: 3696.16 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:38:37,548 epoch 1 - iter 60/152 - loss 1.69026109 - time (sec): 3.40 - samples/sec: 3728.36 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:38:38,379 epoch 1 - iter 75/152 - loss 1.46877182 - time (sec): 4.24 - samples/sec: 3650.48 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:38:39,218 epoch 1 - iter 90/152 - loss 1.28998562 - time (sec): 5.07 - samples/sec: 3599.89 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:38:40,116 epoch 1 - iter 105/152 - loss 1.15167153 - time (sec): 5.97 - samples/sec: 3585.63 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:38:40,972 epoch 1 - iter 120/152 - loss 1.05035412 - time (sec): 6.83 - samples/sec: 3580.80 - lr: 0.000039 - momentum: 0.000000
2023-10-17 09:38:41,843 epoch 1 - iter 135/152 - loss 0.96613510 - time (sec): 7.70 - samples/sec: 3566.35 - lr: 0.000044 - momentum: 0.000000
2023-10-17 09:38:42,715 epoch 1 - iter 150/152 - loss 0.89078976 - time (sec): 8.57 - samples/sec: 3576.29 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:38:42,823 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:42,823 EPOCH 1 done: loss 0.8833 - lr: 0.000049
2023-10-17 09:38:43,599 DEV : loss 0.1987733244895935 - f1-score (micro avg) 0.6076
2023-10-17 09:38:43,606 saving best model
2023-10-17 09:38:43,940 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:44,818 epoch 2 - iter 15/152 - loss 0.20082465 - time (sec): 0.88 - samples/sec: 3433.40 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:38:45,655 epoch 2 - iter 30/152 - loss 0.19478102 - time (sec): 1.71 - samples/sec: 3565.02 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:38:46,493 epoch 2 - iter 45/152 - loss 0.17303971 - time (sec): 2.55 - samples/sec: 3611.14 - lr: 0.000048 - momentum: 0.000000
2023-10-17 09:38:47,364 epoch 2 - iter 60/152 - loss 0.16161793 - time (sec): 3.42 - samples/sec: 3578.45 - lr: 0.000048 - momentum: 0.000000
2023-10-17 09:38:48,245 epoch 2 - iter 75/152 - loss 0.15733415 - time (sec): 4.30 - samples/sec: 3583.42 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:38:49,039 epoch 2 - iter 90/152 - loss 0.14995035 - time (sec): 5.10 - samples/sec: 3592.35 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:38:49,872 epoch 2 - iter 105/152 - loss 0.14810154 - time (sec): 5.93 - samples/sec: 3569.23 - lr: 0.000046 - momentum: 0.000000
2023-10-17 09:38:50,735 epoch 2 - iter 120/152 - loss 0.14985129 - time (sec): 6.79 - samples/sec: 3609.32 - lr: 0.000046 - momentum: 0.000000
2023-10-17 09:38:51,593 epoch 2 - iter 135/152 - loss 0.14474154 - time (sec): 7.65 - samples/sec: 3625.84 - lr: 0.000045 - momentum: 0.000000
2023-10-17 09:38:52,480 epoch 2 - iter 150/152 - loss 0.14450170 - time (sec): 8.54 - samples/sec: 3594.70 - lr: 0.000045 - momentum: 0.000000
2023-10-17 09:38:52,583 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:52,583 EPOCH 2 done: loss 0.1435 - lr: 0.000045
2023-10-17 09:38:53,538 DEV : loss 0.13812772929668427 - f1-score (micro avg) 0.8167
2023-10-17 09:38:53,544 saving best model
2023-10-17 09:38:54,193 ----------------------------------------------------------------------------------------------------
2023-10-17 09:38:55,055 epoch 3 - iter 15/152 - loss 0.08406498 - time (sec): 0.86 - samples/sec: 3394.23 - lr: 0.000044 - momentum: 0.000000
2023-10-17 09:38:55,855 epoch 3 - iter 30/152 - loss 0.08216989 - time (sec): 1.66 - samples/sec: 3549.93 - lr: 0.000043 - momentum: 0.000000
2023-10-17 09:38:56,689 epoch 3 - iter 45/152 - loss 0.08400782 - time (sec): 2.49 - samples/sec: 3517.04 - lr: 0.000043 - momentum: 0.000000
2023-10-17 09:38:57,506 epoch 3 - iter 60/152 - loss 0.07951715 - time (sec): 3.31 - samples/sec: 3543.39 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:38:58,410 epoch 3 - iter 75/152 - loss 0.07134940 - time (sec): 4.21 - samples/sec: 3594.73 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:38:59,296 epoch 3 - iter 90/152 - loss 0.08497999 - time (sec): 5.10 - samples/sec: 3594.94 - lr: 0.000041 - momentum: 0.000000
2023-10-17 09:39:00,149 epoch 3 - iter 105/152 - loss 0.08940396 - time (sec): 5.95 - samples/sec: 3603.39 - lr: 0.000041 - momentum: 0.000000
2023-10-17 09:39:01,011 epoch 3 - iter 120/152 - loss 0.08746023 - time (sec): 6.81 - samples/sec: 3592.93 - lr: 0.000040 - momentum: 0.000000
2023-10-17 09:39:01,849 epoch 3 - iter 135/152 - loss 0.08383760 - time (sec): 7.65 - samples/sec: 3575.69 - lr: 0.000040 - momentum: 0.000000
2023-10-17 09:39:02,739 epoch 3 - iter 150/152 - loss 0.08213378 - time (sec): 8.54 - samples/sec: 3589.37 - lr: 0.000039 - momentum: 0.000000
2023-10-17 09:39:02,841 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:02,842 EPOCH 3 done: loss 0.0825 - lr: 0.000039
2023-10-17 09:39:03,799 DEV : loss 0.1335965245962143 - f1-score (micro avg) 0.8278
2023-10-17 09:39:03,805 saving best model
2023-10-17 09:39:04,237 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:05,055 epoch 4 - iter 15/152 - loss 0.03908169 - time (sec): 0.81 - samples/sec: 3830.29 - lr: 0.000038 - momentum: 0.000000
2023-10-17 09:39:05,898 epoch 4 - iter 30/152 - loss 0.04629587 - time (sec): 1.65 - samples/sec: 3707.14 - lr: 0.000038 - momentum: 0.000000
2023-10-17 09:39:06,753 epoch 4 - iter 45/152 - loss 0.05984097 - time (sec): 2.51 - samples/sec: 3578.75 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:39:07,691 epoch 4 - iter 60/152 - loss 0.05594081 - time (sec): 3.45 - samples/sec: 3589.48 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:39:08,547 epoch 4 - iter 75/152 - loss 0.05336373 - time (sec): 4.30 - samples/sec: 3554.56 - lr: 0.000036 - momentum: 0.000000
2023-10-17 09:39:09,403 epoch 4 - iter 90/152 - loss 0.05316291 - time (sec): 5.16 - samples/sec: 3529.17 - lr: 0.000036 - momentum: 0.000000
2023-10-17 09:39:10,283 epoch 4 - iter 105/152 - loss 0.05437028 - time (sec): 6.04 - samples/sec: 3544.89 - lr: 0.000035 - momentum: 0.000000
2023-10-17 09:39:11,123 epoch 4 - iter 120/152 - loss 0.05389946 - time (sec): 6.88 - samples/sec: 3571.08 - lr: 0.000035 - momentum: 0.000000
2023-10-17 09:39:12,003 epoch 4 - iter 135/152 - loss 0.05636939 - time (sec): 7.76 - samples/sec: 3563.84 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:39:12,831 epoch 4 - iter 150/152 - loss 0.05731052 - time (sec): 8.59 - samples/sec: 3575.72 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:39:12,936 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:12,936 EPOCH 4 done: loss 0.0574 - lr: 0.000034
2023-10-17 09:39:13,907 DEV : loss 0.14786171913146973 - f1-score (micro avg) 0.83
2023-10-17 09:39:13,915 saving best model
2023-10-17 09:39:14,340 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:15,265 epoch 5 - iter 15/152 - loss 0.05904708 - time (sec): 0.92 - samples/sec: 3658.66 - lr: 0.000033 - momentum: 0.000000
2023-10-17 09:39:16,172 epoch 5 - iter 30/152 - loss 0.04451243 - time (sec): 1.83 - samples/sec: 3466.96 - lr: 0.000032 - momentum: 0.000000
2023-10-17 09:39:17,059 epoch 5 - iter 45/152 - loss 0.04383666 - time (sec): 2.72 - samples/sec: 3563.23 - lr: 0.000032 - momentum: 0.000000
2023-10-17 09:39:17,903 epoch 5 - iter 60/152 - loss 0.03689764 - time (sec): 3.56 - samples/sec: 3569.59 - lr: 0.000031 - momentum: 0.000000
2023-10-17 09:39:18,733 epoch 5 - iter 75/152 - loss 0.03291934 - time (sec): 4.39 - samples/sec: 3594.00 - lr: 0.000031 - momentum: 0.000000
2023-10-17 09:39:19,588 epoch 5 - iter 90/152 - loss 0.03570598 - time (sec): 5.25 - samples/sec: 3561.01 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:39:20,460 epoch 5 - iter 105/152 - loss 0.03541183 - time (sec): 6.12 - samples/sec: 3544.28 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:39:21,266 epoch 5 - iter 120/152 - loss 0.04052611 - time (sec): 6.92 - samples/sec: 3528.99 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:39:22,140 epoch 5 - iter 135/152 - loss 0.03986727 - time (sec): 7.80 - samples/sec: 3536.23 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:39:22,995 epoch 5 - iter 150/152 - loss 0.04344923 - time (sec): 8.65 - samples/sec: 3548.42 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:39:23,099 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:23,099 EPOCH 5 done: loss 0.0432 - lr: 0.000028
2023-10-17 09:39:24,037 DEV : loss 0.18125084042549133 - f1-score (micro avg) 0.8386
2023-10-17 09:39:24,044 saving best model
2023-10-17 09:39:24,518 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:25,402 epoch 6 - iter 15/152 - loss 0.03603697 - time (sec): 0.88 - samples/sec: 3513.16 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:39:26,246 epoch 6 - iter 30/152 - loss 0.02829961 - time (sec): 1.73 - samples/sec: 3425.55 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:39:27,132 epoch 6 - iter 45/152 - loss 0.02713591 - time (sec): 2.61 - samples/sec: 3404.97 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:39:28,057 epoch 6 - iter 60/152 - loss 0.02652119 - time (sec): 3.54 - samples/sec: 3373.27 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:39:28,955 epoch 6 - iter 75/152 - loss 0.02681934 - time (sec): 4.44 - samples/sec: 3411.40 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:39:29,812 epoch 6 - iter 90/152 - loss 0.02547718 - time (sec): 5.29 - samples/sec: 3423.75 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:39:30,715 epoch 6 - iter 105/152 - loss 0.02868459 - time (sec): 6.20 - samples/sec: 3439.26 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:39:31,560 epoch 6 - iter 120/152 - loss 0.02636207 - time (sec): 7.04 - samples/sec: 3447.80 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:39:32,381 epoch 6 - iter 135/152 - loss 0.02747651 - time (sec): 7.86 - samples/sec: 3483.43 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:39:33,286 epoch 6 - iter 150/152 - loss 0.02916397 - time (sec): 8.77 - samples/sec: 3497.20 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:39:33,389 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:33,389 EPOCH 6 done: loss 0.0290 - lr: 0.000022
2023-10-17 09:39:34,346 DEV : loss 0.17775358259677887 - f1-score (micro avg) 0.8392
2023-10-17 09:39:34,354 saving best model
2023-10-17 09:39:34,847 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:35,727 epoch 7 - iter 15/152 - loss 0.01054120 - time (sec): 0.88 - samples/sec: 3191.99 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:39:36,613 epoch 7 - iter 30/152 - loss 0.00586221 - time (sec): 1.76 - samples/sec: 3306.50 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:39:37,476 epoch 7 - iter 45/152 - loss 0.01323046 - time (sec): 2.63 - samples/sec: 3394.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:39:38,352 epoch 7 - iter 60/152 - loss 0.01434518 - time (sec): 3.50 - samples/sec: 3381.57 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:39:39,198 epoch 7 - iter 75/152 - loss 0.01412553 - time (sec): 4.35 - samples/sec: 3445.80 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:39:40,033 epoch 7 - iter 90/152 - loss 0.01286259 - time (sec): 5.18 - samples/sec: 3473.05 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:39:40,921 epoch 7 - iter 105/152 - loss 0.01426731 - time (sec): 6.07 - samples/sec: 3466.78 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:39:41,876 epoch 7 - iter 120/152 - loss 0.01566369 - time (sec): 7.03 - samples/sec: 3476.59 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:39:42,805 epoch 7 - iter 135/152 - loss 0.01808615 - time (sec): 7.96 - samples/sec: 3486.38 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:39:43,656 epoch 7 - iter 150/152 - loss 0.02135121 - time (sec): 8.81 - samples/sec: 3478.81 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:39:43,764 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:43,764 EPOCH 7 done: loss 0.0214 - lr: 0.000017
2023-10-17 09:39:44,763 DEV : loss 0.18729975819587708 - f1-score (micro avg) 0.8547
2023-10-17 09:39:44,771 saving best model
2023-10-17 09:39:45,212 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:46,080 epoch 8 - iter 15/152 - loss 0.00906167 - time (sec): 0.87 - samples/sec: 3887.75 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:39:46,926 epoch 8 - iter 30/152 - loss 0.00839837 - time (sec): 1.71 - samples/sec: 3729.41 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:39:47,820 epoch 8 - iter 45/152 - loss 0.01422912 - time (sec): 2.60 - samples/sec: 3645.42 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:39:48,644 epoch 8 - iter 60/152 - loss 0.01285881 - time (sec): 3.43 - samples/sec: 3550.60 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:39:49,523 epoch 8 - iter 75/152 - loss 0.01186707 - time (sec): 4.31 - samples/sec: 3603.65 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:39:50,290 epoch 8 - iter 90/152 - loss 0.01303695 - time (sec): 5.08 - samples/sec: 3630.21 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:39:51,145 epoch 8 - iter 105/152 - loss 0.01545938 - time (sec): 5.93 - samples/sec: 3607.92 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:39:51,988 epoch 8 - iter 120/152 - loss 0.01686442 - time (sec): 6.77 - samples/sec: 3619.78 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:39:52,840 epoch 8 - iter 135/152 - loss 0.01607652 - time (sec): 7.63 - samples/sec: 3621.61 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:39:53,684 epoch 8 - iter 150/152 - loss 0.01592888 - time (sec): 8.47 - samples/sec: 3615.38 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:39:53,792 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:53,792 EPOCH 8 done: loss 0.0157 - lr: 0.000011
2023-10-17 09:39:54,730 DEV : loss 0.19675733149051666 - f1-score (micro avg) 0.8516
2023-10-17 09:39:54,737 ----------------------------------------------------------------------------------------------------
2023-10-17 09:39:55,580 epoch 9 - iter 15/152 - loss 0.01308460 - time (sec): 0.84 - samples/sec: 3652.30 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:39:56,439 epoch 9 - iter 30/152 - loss 0.00896246 - time (sec): 1.70 - samples/sec: 3516.26 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:39:57,331 epoch 9 - iter 45/152 - loss 0.01312645 - time (sec): 2.59 - samples/sec: 3534.32 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:39:58,174 epoch 9 - iter 60/152 - loss 0.01130915 - time (sec): 3.44 - samples/sec: 3505.52 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:39:59,043 epoch 9 - iter 75/152 - loss 0.01077197 - time (sec): 4.30 - samples/sec: 3539.88 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:39:59,895 epoch 9 - iter 90/152 - loss 0.01047289 - time (sec): 5.16 - samples/sec: 3559.78 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:40:00,767 epoch 9 - iter 105/152 - loss 0.00916831 - time (sec): 6.03 - samples/sec: 3536.58 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:40:01,576 epoch 9 - iter 120/152 - loss 0.01012104 - time (sec): 6.84 - samples/sec: 3548.84 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:40:02,455 epoch 9 - iter 135/152 - loss 0.01105105 - time (sec): 7.72 - samples/sec: 3555.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:40:03,345 epoch 9 - iter 150/152 - loss 0.01143143 - time (sec): 8.61 - samples/sec: 3561.34 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:40:03,449 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:03,449 EPOCH 9 done: loss 0.0113 - lr: 0.000006
2023-10-17 09:40:04,397 DEV : loss 0.200613334774971 - f1-score (micro avg) 0.8503
2023-10-17 09:40:04,405 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:05,304 epoch 10 - iter 15/152 - loss 0.00389795 - time (sec): 0.90 - samples/sec: 3276.13 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:40:06,241 epoch 10 - iter 30/152 - loss 0.00586938 - time (sec): 1.83 - samples/sec: 3256.88 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:40:07,204 epoch 10 - iter 45/152 - loss 0.00698098 - time (sec): 2.80 - samples/sec: 3313.32 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:40:08,154 epoch 10 - iter 60/152 - loss 0.01104798 - time (sec): 3.75 - samples/sec: 3254.03 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:40:09,020 epoch 10 - iter 75/152 - loss 0.00971413 - time (sec): 4.61 - samples/sec: 3281.76 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:40:09,935 epoch 10 - iter 90/152 - loss 0.00975448 - time (sec): 5.53 - samples/sec: 3280.82 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:40:10,824 epoch 10 - iter 105/152 - loss 0.00833967 - time (sec): 6.42 - samples/sec: 3327.69 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:40:11,689 epoch 10 - iter 120/152 - loss 0.00759956 - time (sec): 7.28 - samples/sec: 3341.10 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:40:12,574 epoch 10 - iter 135/152 - loss 0.00829980 - time (sec): 8.17 - samples/sec: 3354.76 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:40:13,508 epoch 10 - iter 150/152 - loss 0.00935136 - time (sec): 9.10 - samples/sec: 3363.67 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:40:13,628 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:13,628 EPOCH 10 done: loss 0.0092 - lr: 0.000000
2023-10-17 09:40:14,600 DEV : loss 0.19964995980262756 - f1-score (micro avg) 0.8544
2023-10-17 09:40:14,972 ----------------------------------------------------------------------------------------------------
2023-10-17 09:40:14,974 Loading model from best epoch ...
2023-10-17 09:40:16,456 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-17 09:40:17,392 
Results:
- F-score (micro) 0.8113
- F-score (macro) 0.6298
- Accuracy 0.6923

By class:
              precision    recall  f1-score   support

       scope     0.7677    0.7881    0.7778       151
        work     0.7843    0.8421    0.8122        95
        pers     0.8788    0.9062    0.8923        96
         loc     0.6667    0.6667    0.6667         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7956    0.8276    0.8113       348
   macro avg     0.6195    0.6406    0.6298       348
weighted avg     0.7954    0.8276    0.8111       348

2023-10-17 09:40:17,392 ----------------------------------------------------------------------------------------------------