2023-10-18 15:58:37,050 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:37,050 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 15:58:37,050 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:37,050 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-18 15:58:37,050 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:37,050 Train: 1214 sentences
2023-10-18 15:58:37,050 (train_with_dev=False, train_with_test=False)
2023-10-18 15:58:37,050 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:37,050 Training Params:
2023-10-18 15:58:37,050  - learning_rate: "3e-05"
2023-10-18 15:58:37,050  - mini_batch_size: "4"
2023-10-18 15:58:37,050  - max_epochs: "10"
2023-10-18 15:58:37,050  - shuffle: "True"
2023-10-18 15:58:37,050 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:37,050 Plugins:
2023-10-18 15:58:37,050  - TensorboardLogger
2023-10-18 15:58:37,050  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 15:58:37,051 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:37,051 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 15:58:37,051  - metric: "('micro avg', 'f1-score')"
2023-10-18 15:58:37,051 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:37,051 Computation:
2023-10-18 15:58:37,051  - compute on device: cuda:0
2023-10-18 15:58:37,051  - embedding storage: none
2023-10-18 15:58:37,051 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:37,051 Model training base path: "hmbench-ajmc/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 15:58:37,051 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:37,051 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:37,051 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 15:58:37,899 epoch 1 - iter 30/304 - loss 4.03850600 - time (sec): 0.85 - samples/sec: 3648.32 - lr: 0.000003 - momentum: 0.000000
2023-10-18 15:58:38,361 epoch 1 - iter 60/304 - loss 4.00273567 - time (sec): 1.31 - samples/sec: 4586.16 - lr: 0.000006 - momentum: 0.000000
2023-10-18 15:58:38,852 epoch 1 - iter 90/304 - loss 3.89665102 - time (sec): 1.80 - samples/sec: 4991.65 - lr: 0.000009 - momentum: 0.000000
2023-10-18 15:58:39,318 epoch 1 - iter 120/304 - loss 3.73569002 - time (sec): 2.27 - samples/sec: 5251.25 - lr: 0.000012 - momentum: 0.000000
2023-10-18 15:58:39,761 epoch 1 - iter 150/304 - loss 3.54778952 - time (sec): 2.71 - samples/sec: 5460.90 - lr: 0.000015 - momentum: 0.000000
2023-10-18 15:58:40,203 epoch 1 - iter 180/304 - loss 3.33140639 - time (sec): 3.15 - samples/sec: 5575.79 - lr: 0.000018 - momentum: 0.000000
2023-10-18 15:58:40,663 epoch 1 - iter 210/304 - loss 3.05550293 - time (sec): 3.61 - samples/sec: 5784.51 - lr: 0.000021 - momentum: 0.000000
2023-10-18 15:58:41,110 epoch 1 - iter 240/304 - loss 2.80847888 - time (sec): 4.06 - samples/sec: 5925.56 - lr: 0.000024 - momentum: 0.000000
2023-10-18 15:58:41,592 epoch 1 - iter 270/304 - loss 2.57313831 - time (sec): 4.54 - samples/sec: 6028.64 - lr: 0.000027 - momentum: 0.000000
2023-10-18 15:58:42,063 epoch 1 - iter 300/304 - loss 2.38975156 - time (sec): 5.01 - samples/sec: 6124.33 - lr: 0.000030 - momentum: 0.000000
2023-10-18 15:58:42,116 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:42,116 EPOCH 1 done: loss 2.3726 - lr: 0.000030
2023-10-18 15:58:42,560 DEV : loss 0.7924676537513733 - f1-score (micro avg) 0.0
2023-10-18 15:58:42,565 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:43,034 epoch 2 - iter 30/304 - loss 0.73969190 - time (sec): 0.47 - samples/sec: 6972.68 - lr: 0.000030 - momentum: 0.000000
2023-10-18 15:58:43,501 epoch 2 - iter 60/304 - loss 0.75251064 - time (sec): 0.94 - samples/sec: 6662.39 - lr: 0.000029 - momentum: 0.000000
2023-10-18 15:58:43,958 epoch 2 - iter 90/304 - loss 0.77850753 - time (sec): 1.39 - samples/sec: 6714.80 - lr: 0.000029 - momentum: 0.000000
2023-10-18 15:58:44,423 epoch 2 - iter 120/304 - loss 0.78712700 - time (sec): 1.86 - samples/sec: 6814.59 - lr: 0.000029 - momentum: 0.000000
2023-10-18 15:58:44,861 epoch 2 - iter 150/304 - loss 0.75228939 - time (sec): 2.30 - samples/sec: 6711.55 - lr: 0.000028 - momentum: 0.000000
2023-10-18 15:58:45,309 epoch 2 - iter 180/304 - loss 0.74921235 - time (sec): 2.74 - samples/sec: 6668.69 - lr: 0.000028 - momentum: 0.000000
2023-10-18 15:58:45,756 epoch 2 - iter 210/304 - loss 0.71675831 - time (sec): 3.19 - samples/sec: 6775.90 - lr: 0.000028 - momentum: 0.000000
2023-10-18 15:58:46,200 epoch 2 - iter 240/304 - loss 0.70448184 - time (sec): 3.63 - samples/sec: 6749.94 - lr: 0.000027 - momentum: 0.000000
2023-10-18 15:58:46,648 epoch 2 - iter 270/304 - loss 0.70003918 - time (sec): 4.08 - samples/sec: 6788.89 - lr: 0.000027 - momentum: 0.000000
2023-10-18 15:58:47,096 epoch 2 - iter 300/304 - loss 0.70620534 - time (sec): 4.53 - samples/sec: 6767.90 - lr: 0.000027 - momentum: 0.000000
2023-10-18 15:58:47,154 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:47,154 EPOCH 2 done: loss 0.7045 - lr: 0.000027
2023-10-18 15:58:47,650 DEV : loss 0.5252388119697571 - f1-score (micro avg) 0.0
2023-10-18 15:58:47,654 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:48,102 epoch 3 - iter 30/304 - loss 0.55182917 - time (sec): 0.45 - samples/sec: 6556.91 - lr: 0.000026 - momentum: 0.000000
2023-10-18 15:58:48,570 epoch 3 - iter 60/304 - loss 0.56782352 - time (sec): 0.91 - samples/sec: 6923.54 - lr: 0.000026 - momentum: 0.000000
2023-10-18 15:58:49,033 epoch 3 - iter 90/304 - loss 0.56221581 - time (sec): 1.38 - samples/sec: 6804.70 - lr: 0.000026 - momentum: 0.000000
2023-10-18 15:58:49,492 epoch 3 - iter 120/304 - loss 0.55137019 - time (sec): 1.84 - samples/sec: 7001.36 - lr: 0.000025 - momentum: 0.000000
2023-10-18 15:58:49,957 epoch 3 - iter 150/304 - loss 0.54290759 - time (sec): 2.30 - samples/sec: 6934.28 - lr: 0.000025 - momentum: 0.000000
2023-10-18 15:58:50,408 epoch 3 - iter 180/304 - loss 0.55414283 - time (sec): 2.75 - samples/sec: 6880.44 - lr: 0.000025 - momentum: 0.000000
2023-10-18 15:58:50,865 epoch 3 - iter 210/304 - loss 0.55558892 - time (sec): 3.21 - samples/sec: 6796.11 - lr: 0.000024 - momentum: 0.000000
2023-10-18 15:58:51,312 epoch 3 - iter 240/304 - loss 0.53805602 - time (sec): 3.66 - samples/sec: 6734.10 - lr: 0.000024 - momentum: 0.000000
2023-10-18 15:58:51,792 epoch 3 - iter 270/304 - loss 0.53847435 - time (sec): 4.14 - samples/sec: 6680.27 - lr: 0.000024 - momentum: 0.000000
2023-10-18 15:58:52,278 epoch 3 - iter 300/304 - loss 0.53237949 - time (sec): 4.62 - samples/sec: 6620.46 - lr: 0.000023 - momentum: 0.000000
2023-10-18 15:58:52,334 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:52,334 EPOCH 3 done: loss 0.5305 - lr: 0.000023
2023-10-18 15:58:52,844 DEV : loss 0.40204209089279175 - f1-score (micro avg) 0.2089
2023-10-18 15:58:52,850 saving best model
2023-10-18 15:58:52,883 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:53,390 epoch 4 - iter 30/304 - loss 0.45252594 - time (sec): 0.51 - samples/sec: 6070.52 - lr: 0.000023 - momentum: 0.000000
2023-10-18 15:58:53,883 epoch 4 - iter 60/304 - loss 0.50647720 - time (sec): 1.00 - samples/sec: 6170.79 - lr: 0.000023 - momentum: 0.000000
2023-10-18 15:58:54,347 epoch 4 - iter 90/304 - loss 0.49377967 - time (sec): 1.46 - samples/sec: 6374.31 - lr: 0.000022 - momentum: 0.000000
2023-10-18 15:58:54,815 epoch 4 - iter 120/304 - loss 0.48707188 - time (sec): 1.93 - samples/sec: 6379.39 - lr: 0.000022 - momentum: 0.000000
2023-10-18 15:58:55,275 epoch 4 - iter 150/304 - loss 0.47134754 - time (sec): 2.39 - samples/sec: 6388.36 - lr: 0.000022 - momentum: 0.000000
2023-10-18 15:58:55,747 epoch 4 - iter 180/304 - loss 0.46182963 - time (sec): 2.86 - samples/sec: 6348.99 - lr: 0.000021 - momentum: 0.000000
2023-10-18 15:58:56,210 epoch 4 - iter 210/304 - loss 0.45484006 - time (sec): 3.33 - samples/sec: 6421.44 - lr: 0.000021 - momentum: 0.000000
2023-10-18 15:58:56,678 epoch 4 - iter 240/304 - loss 0.44238293 - time (sec): 3.79 - samples/sec: 6472.14 - lr: 0.000021 - momentum: 0.000000
2023-10-18 15:58:57,140 epoch 4 - iter 270/304 - loss 0.43397338 - time (sec): 4.26 - samples/sec: 6463.75 - lr: 0.000020 - momentum: 0.000000
2023-10-18 15:58:57,609 epoch 4 - iter 300/304 - loss 0.43128089 - time (sec): 4.73 - samples/sec: 6480.76 - lr: 0.000020 - momentum: 0.000000
2023-10-18 15:58:57,668 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:57,668 EPOCH 4 done: loss 0.4304 - lr: 0.000020
2023-10-18 15:58:58,167 DEV : loss 0.3549251854419708 - f1-score (micro avg) 0.3758
2023-10-18 15:58:58,172 saving best model
2023-10-18 15:58:58,207 ----------------------------------------------------------------------------------------------------
2023-10-18 15:58:58,671 epoch 5 - iter 30/304 - loss 0.39439532 - time (sec): 0.46 - samples/sec: 6814.25 - lr: 0.000020 - momentum: 0.000000
2023-10-18 15:58:59,139 epoch 5 - iter 60/304 - loss 0.41855782 - time (sec): 0.93 - samples/sec: 6885.48 - lr: 0.000019 - momentum: 0.000000
2023-10-18 15:58:59,592 epoch 5 - iter 90/304 - loss 0.38978642 - time (sec): 1.38 - samples/sec: 6962.29 - lr: 0.000019 - momentum: 0.000000
2023-10-18 15:59:00,043 epoch 5 - iter 120/304 - loss 0.37286955 - time (sec): 1.84 - samples/sec: 6834.48 - lr: 0.000019 - momentum: 0.000000
2023-10-18 15:59:00,511 epoch 5 - iter 150/304 - loss 0.36375578 - time (sec): 2.30 - samples/sec: 6784.57 - lr: 0.000018 - momentum: 0.000000
2023-10-18 15:59:00,962 epoch 5 - iter 180/304 - loss 0.36976221 - time (sec): 2.75 - samples/sec: 6713.78 - lr: 0.000018 - momentum: 0.000000
2023-10-18 15:59:01,435 epoch 5 - iter 210/304 - loss 0.37905154 - time (sec): 3.23 - samples/sec: 6740.31 - lr: 0.000018 - momentum: 0.000000
2023-10-18 15:59:01,891 epoch 5 - iter 240/304 - loss 0.38358745 - time (sec): 3.68 - samples/sec: 6680.55 - lr: 0.000017 - momentum: 0.000000
2023-10-18 15:59:02,343 epoch 5 - iter 270/304 - loss 0.37909180 - time (sec): 4.14 - samples/sec: 6699.68 - lr: 0.000017 - momentum: 0.000000
2023-10-18 15:59:02,794 epoch 5 - iter 300/304 - loss 0.37788989 - time (sec): 4.59 - samples/sec: 6702.79 - lr: 0.000017 - momentum: 0.000000
2023-10-18 15:59:02,848 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:02,848 EPOCH 5 done: loss 0.3771 - lr: 0.000017
2023-10-18 15:59:03,369 DEV : loss 0.320213258266449 - f1-score (micro avg) 0.459
2023-10-18 15:59:03,374 saving best model
2023-10-18 15:59:03,409 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:03,867 epoch 6 - iter 30/304 - loss 0.37048301 - time (sec): 0.46 - samples/sec: 6169.89 - lr: 0.000016 - momentum: 0.000000
2023-10-18 15:59:04,336 epoch 6 - iter 60/304 - loss 0.36174320 - time (sec): 0.93 - samples/sec: 6157.14 - lr: 0.000016 - momentum: 0.000000
2023-10-18 15:59:04,800 epoch 6 - iter 90/304 - loss 0.35495927 - time (sec): 1.39 - samples/sec: 6509.22 - lr: 0.000016 - momentum: 0.000000
2023-10-18 15:59:05,255 epoch 6 - iter 120/304 - loss 0.37478142 - time (sec): 1.85 - samples/sec: 6544.62 - lr: 0.000015 - momentum: 0.000000
2023-10-18 15:59:05,706 epoch 6 - iter 150/304 - loss 0.35939768 - time (sec): 2.30 - samples/sec: 6629.69 - lr: 0.000015 - momentum: 0.000000
2023-10-18 15:59:06,173 epoch 6 - iter 180/304 - loss 0.35925061 - time (sec): 2.76 - samples/sec: 6596.48 - lr: 0.000015 - momentum: 0.000000
2023-10-18 15:59:06,623 epoch 6 - iter 210/304 - loss 0.35501061 - time (sec): 3.21 - samples/sec: 6633.04 - lr: 0.000014 - momentum: 0.000000
2023-10-18 15:59:07,071 epoch 6 - iter 240/304 - loss 0.34696615 - time (sec): 3.66 - samples/sec: 6625.04 - lr: 0.000014 - momentum: 0.000000
2023-10-18 15:59:07,526 epoch 6 - iter 270/304 - loss 0.35132371 - time (sec): 4.12 - samples/sec: 6598.30 - lr: 0.000014 - momentum: 0.000000
2023-10-18 15:59:08,008 epoch 6 - iter 300/304 - loss 0.34576817 - time (sec): 4.60 - samples/sec: 6652.53 - lr: 0.000013 - momentum: 0.000000
2023-10-18 15:59:08,069 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:08,069 EPOCH 6 done: loss 0.3460 - lr: 0.000013
2023-10-18 15:59:08,572 DEV : loss 0.30403241515159607 - f1-score (micro avg) 0.4773
2023-10-18 15:59:08,577 saving best model
2023-10-18 15:59:08,614 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:09,079 epoch 7 - iter 30/304 - loss 0.32985688 - time (sec): 0.46 - samples/sec: 6358.05 - lr: 0.000013 - momentum: 0.000000
2023-10-18 15:59:09,536 epoch 7 - iter 60/304 - loss 0.33590119 - time (sec): 0.92 - samples/sec: 6551.55 - lr: 0.000013 - momentum: 0.000000
2023-10-18 15:59:09,997 epoch 7 - iter 90/304 - loss 0.32853347 - time (sec): 1.38 - samples/sec: 6519.56 - lr: 0.000012 - momentum: 0.000000
2023-10-18 15:59:10,454 epoch 7 - iter 120/304 - loss 0.33896971 - time (sec): 1.84 - samples/sec: 6575.40 - lr: 0.000012 - momentum: 0.000000
2023-10-18 15:59:10,904 epoch 7 - iter 150/304 - loss 0.33852375 - time (sec): 2.29 - samples/sec: 6713.42 - lr: 0.000012 - momentum: 0.000000
2023-10-18 15:59:11,361 epoch 7 - iter 180/304 - loss 0.33988547 - time (sec): 2.75 - samples/sec: 6671.25 - lr: 0.000011 - momentum: 0.000000
2023-10-18 15:59:11,813 epoch 7 - iter 210/304 - loss 0.32997448 - time (sec): 3.20 - samples/sec: 6652.43 - lr: 0.000011 - momentum: 0.000000
2023-10-18 15:59:12,273 epoch 7 - iter 240/304 - loss 0.32820568 - time (sec): 3.66 - samples/sec: 6646.35 - lr: 0.000011 - momentum: 0.000000
2023-10-18 15:59:12,734 epoch 7 - iter 270/304 - loss 0.32081326 - time (sec): 4.12 - samples/sec: 6669.68 - lr: 0.000010 - momentum: 0.000000
2023-10-18 15:59:13,185 epoch 7 - iter 300/304 - loss 0.32612648 - time (sec): 4.57 - samples/sec: 6710.89 - lr: 0.000010 - momentum: 0.000000
2023-10-18 15:59:13,239 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:13,239 EPOCH 7 done: loss 0.3251 - lr: 0.000010
2023-10-18 15:59:13,746 DEV : loss 0.29145151376724243 - f1-score (micro avg) 0.4799
2023-10-18 15:59:13,751 saving best model
2023-10-18 15:59:13,785 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:14,237 epoch 8 - iter 30/304 - loss 0.30285066 - time (sec): 0.45 - samples/sec: 5877.49 - lr: 0.000010 - momentum: 0.000000
2023-10-18 15:59:14,696 epoch 8 - iter 60/304 - loss 0.31396965 - time (sec): 0.91 - samples/sec: 6247.51 - lr: 0.000009 - momentum: 0.000000
2023-10-18 15:59:15,154 epoch 8 - iter 90/304 - loss 0.28638459 - time (sec): 1.37 - samples/sec: 6311.78 - lr: 0.000009 - momentum: 0.000000
2023-10-18 15:59:15,594 epoch 8 - iter 120/304 - loss 0.30745593 - time (sec): 1.81 - samples/sec: 6525.06 - lr: 0.000009 - momentum: 0.000000
2023-10-18 15:59:16,061 epoch 8 - iter 150/304 - loss 0.30920520 - time (sec): 2.27 - samples/sec: 6665.05 - lr: 0.000008 - momentum: 0.000000
2023-10-18 15:59:16,516 epoch 8 - iter 180/304 - loss 0.30945030 - time (sec): 2.73 - samples/sec: 6585.89 - lr: 0.000008 - momentum: 0.000000
2023-10-18 15:59:16,974 epoch 8 - iter 210/304 - loss 0.30839110 - time (sec): 3.19 - samples/sec: 6675.65 - lr: 0.000008 - momentum: 0.000000
2023-10-18 15:59:17,432 epoch 8 - iter 240/304 - loss 0.30879628 - time (sec): 3.65 - samples/sec: 6720.29 - lr: 0.000007 - momentum: 0.000000
2023-10-18 15:59:17,886 epoch 8 - iter 270/304 - loss 0.31612129 - time (sec): 4.10 - samples/sec: 6736.14 - lr: 0.000007 - momentum: 0.000000
2023-10-18 15:59:18,340 epoch 8 - iter 300/304 - loss 0.31489604 - time (sec): 4.55 - samples/sec: 6734.01 - lr: 0.000007 - momentum: 0.000000
2023-10-18 15:59:18,398 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:18,398 EPOCH 8 done: loss 0.3150 - lr: 0.000007
2023-10-18 15:59:18,912 DEV : loss 0.28551721572875977 - f1-score (micro avg) 0.4837
2023-10-18 15:59:18,917 saving best model
2023-10-18 15:59:18,950 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:19,416 epoch 9 - iter 30/304 - loss 0.28078911 - time (sec): 0.46 - samples/sec: 6426.25 - lr: 0.000006 - momentum: 0.000000
2023-10-18 15:59:19,859 epoch 9 - iter 60/304 - loss 0.31270793 - time (sec): 0.91 - samples/sec: 6741.42 - lr: 0.000006 - momentum: 0.000000
2023-10-18 15:59:20,329 epoch 9 - iter 90/304 - loss 0.32929336 - time (sec): 1.38 - samples/sec: 6720.48 - lr: 0.000006 - momentum: 0.000000
2023-10-18 15:59:20,789 epoch 9 - iter 120/304 - loss 0.31423175 - time (sec): 1.84 - samples/sec: 6770.94 - lr: 0.000005 - momentum: 0.000000
2023-10-18 15:59:21,257 epoch 9 - iter 150/304 - loss 0.30916725 - time (sec): 2.31 - samples/sec: 6746.51 - lr: 0.000005 - momentum: 0.000000
2023-10-18 15:59:21,691 epoch 9 - iter 180/304 - loss 0.30962789 - time (sec): 2.74 - samples/sec: 6691.23 - lr: 0.000005 - momentum: 0.000000
2023-10-18 15:59:22,146 epoch 9 - iter 210/304 - loss 0.30835230 - time (sec): 3.19 - samples/sec: 6779.09 - lr: 0.000004 - momentum: 0.000000
2023-10-18 15:59:22,600 epoch 9 - iter 240/304 - loss 0.31168621 - time (sec): 3.65 - samples/sec: 6717.51 - lr: 0.000004 - momentum: 0.000000
2023-10-18 15:59:23,037 epoch 9 - iter 270/304 - loss 0.31088202 - time (sec): 4.09 - samples/sec: 6693.78 - lr: 0.000004 - momentum: 0.000000
2023-10-18 15:59:23,506 epoch 9 - iter 300/304 - loss 0.30699286 - time (sec): 4.56 - samples/sec: 6721.21 - lr: 0.000003 - momentum: 0.000000
2023-10-18 15:59:23,561 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:23,561 EPOCH 9 done: loss 0.3077 - lr: 0.000003
2023-10-18 15:59:24,070 DEV : loss 0.2837096154689789 - f1-score (micro avg) 0.4972
2023-10-18 15:59:24,075 saving best model
2023-10-18 15:59:24,109 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:24,575 epoch 10 - iter 30/304 - loss 0.27159700 - time (sec): 0.47 - samples/sec: 6626.72 - lr: 0.000003 - momentum: 0.000000
2023-10-18 15:59:25,027 epoch 10 - iter 60/304 - loss 0.24714464 - time (sec): 0.92 - samples/sec: 6802.35 - lr: 0.000003 - momentum: 0.000000
2023-10-18 15:59:25,489 epoch 10 - iter 90/304 - loss 0.27521157 - time (sec): 1.38 - samples/sec: 6731.20 - lr: 0.000002 - momentum: 0.000000
2023-10-18 15:59:25,944 epoch 10 - iter 120/304 - loss 0.27513884 - time (sec): 1.83 - samples/sec: 6576.63 - lr: 0.000002 - momentum: 0.000000
2023-10-18 15:59:26,390 epoch 10 - iter 150/304 - loss 0.27161433 - time (sec): 2.28 - samples/sec: 6546.91 - lr: 0.000002 - momentum: 0.000000
2023-10-18 15:59:26,860 epoch 10 - iter 180/304 - loss 0.28174389 - time (sec): 2.75 - samples/sec: 6579.13 - lr: 0.000001 - momentum: 0.000000
2023-10-18 15:59:27,323 epoch 10 - iter 210/304 - loss 0.28762007 - time (sec): 3.21 - samples/sec: 6588.38 - lr: 0.000001 - momentum: 0.000000
2023-10-18 15:59:27,772 epoch 10 - iter 240/304 - loss 0.29678191 - time (sec): 3.66 - samples/sec: 6617.42 - lr: 0.000001 - momentum: 0.000000
2023-10-18 15:59:28,233 epoch 10 - iter 270/304 - loss 0.29971860 - time (sec): 4.12 - samples/sec: 6611.91 - lr: 0.000000 - momentum: 0.000000
2023-10-18 15:59:28,691 epoch 10 - iter 300/304 - loss 0.30190093 - time (sec): 4.58 - samples/sec: 6676.94 - lr: 0.000000 - momentum: 0.000000
2023-10-18 15:59:28,749 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:28,749 EPOCH 10 done: loss 0.2995 - lr: 0.000000
2023-10-18 15:59:29,260 DEV : loss 0.28129124641418457 - f1-score (micro avg) 0.4944
2023-10-18 15:59:29,294 ----------------------------------------------------------------------------------------------------
2023-10-18 15:59:29,294 Loading model from best epoch ...
2023-10-18 15:59:29,373 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-18 15:59:29,853
Results:
- F-score (micro) 0.4822
- F-score (macro) 0.2957
- Accuracy 0.331

By class:
              precision    recall  f1-score   support

       scope     0.4337    0.5629    0.4899       151
        work     0.3333    0.5368    0.4113        95
        pers     0.5934    0.5625    0.5775        96
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4318    0.5460    0.4822       348
   macro avg     0.2721    0.3325    0.2957       348
weighted avg     0.4429    0.5460    0.4842       348

2023-10-18 15:59:29,853 ----------------------------------------------------------------------------------------------------