flair-hipe-2022-ajmc-fr / training.log
2023-10-18 16:39:12,200 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:12,200 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:39:12,200 ----------------------------------------------------------------------------------------------------
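The layer shapes printed in the model summary are enough to estimate the number of trainable parameters. The following is a minimal sketch that only does the arithmetic from those shapes; it assumes every `Linear` has a bias (as printed) and that each `LayerNorm` holds one weight and one bias vector. The helper names are hypothetical and not part of Flair or Transformers.

```python
# Parameter count derived purely from the shapes in the model summary.
# Helper names are illustrative, not library APIs.

def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus bias of a Linear(n_in, n_out, bias=True) layer."""
    return n_in * n_out + n_out

def layer_norm_params(dim: int) -> int:
    """Scale and shift vectors of a LayerNorm with elementwise_affine=True."""
    return 2 * dim

HIDDEN, FFN = 128, 512                    # hidden size and intermediate size
VOCAB, MAX_POS, TYPES = 32001, 512, 2     # the three embedding tables
LAYERS, TAGS = 2, 25                      # encoder layers and output tags

# Word, position and token-type embeddings, followed by one LayerNorm.
embeddings = (VOCAB + MAX_POS + TYPES) * HIDDEN + layer_norm_params(HIDDEN)

# One BertLayer: query/key/value/attention-output projections, a LayerNorm,
# the two feed-forward projections, and a second LayerNorm.
per_layer = (
    4 * linear_params(HIDDEN, HIDDEN)
    + layer_norm_params(HIDDEN)
    + linear_params(HIDDEN, FFN)
    + linear_params(FFN, HIDDEN)
    + layer_norm_params(HIDDEN)
)

pooler = linear_params(HIDDEN, HIDDEN)
encoder_total = embeddings + LAYERS * per_layer + pooler
tagger_head = linear_params(HIDDEN, TAGS)  # the final (linear) layer
total = encoder_total + tagger_head

print(f"embeddings: {embeddings:,}")
print(f"per encoder layer: {per_layer:,}")
print(f"total: {total:,}")
```

Under these assumptions most of the parameters sit in the word-embedding table, which is typical for very small BERT variants with a large vocabulary.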
2023-10-18 16:39:12,200 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-18 16:39:12,200 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:12,200 Train: 966 sentences
2023-10-18 16:39:12,200 (train_with_dev=False, train_with_test=False)
2023-10-18 16:39:12,200 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:12,200 Training Params:
2023-10-18 16:39:12,200 - learning_rate: "5e-05"
2023-10-18 16:39:12,200 - mini_batch_size: "4"
2023-10-18 16:39:12,201 - max_epochs: "10"
2023-10-18 16:39:12,201 - shuffle: "True"
2023-10-18 16:39:12,201 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:12,201 Plugins:
2023-10-18 16:39:12,201 - TensorboardLogger
2023-10-18 16:39:12,201 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:39:12,201 ----------------------------------------------------------------------------------------------------
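The training parameters and the `LinearScheduler` plugin together determine the `lr` column in the per-iteration lines. A minimal sketch of that schedule follows, assuming the common shape of a linear warmup to the peak rate followed by a linear decay to zero; the function `linear_schedule_lr` is a hypothetical helper, not Flair's implementation, and the exact step at which Flair logs the rate may differ by one.

```python
import math

def linear_schedule_lr(step: int, peak_lr: float, total_steps: int,
                       warmup_fraction: float) -> float:
    """Linear warmup to peak_lr, then linear decay to zero (assumed shape)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

# Values taken from the log: 966 training sentences, mini_batch_size 4,
# max_epochs 10, learning_rate 5e-05, warmup_fraction 0.1.
steps_per_epoch = math.ceil(966 / 4)
total_steps = steps_per_epoch * 10

for step in (24, 121, 242, 1210, 2418):
    lr = linear_schedule_lr(step, 5e-05, total_steps, 0.1)
    print(f"step {step:4d}: lr {lr:.6f}")
```

With these numbers the warmup spans the first epoch, which is consistent with the learning rate rising throughout epoch 1 and falling from epoch 2 onwards.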
2023-10-18 16:39:12,201 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:39:12,201 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:39:12,201 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:12,201 Computation:
2023-10-18 16:39:12,201 - compute on device: cuda:0
2023-10-18 16:39:12,201 - embedding storage: none
2023-10-18 16:39:12,201 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:12,201 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 16:39:12,201 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:12,201 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:12,201 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:39:12,666 epoch 1 - iter 24/242 - loss 3.34072656 - time (sec): 0.46 - samples/sec: 5198.93 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:39:13,105 epoch 1 - iter 48/242 - loss 3.26591753 - time (sec): 0.90 - samples/sec: 5222.64 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:39:13,577 epoch 1 - iter 72/242 - loss 3.15898740 - time (sec): 1.38 - samples/sec: 5196.43 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:39:13,988 epoch 1 - iter 96/242 - loss 2.98050313 - time (sec): 1.79 - samples/sec: 5272.31 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:39:14,398 epoch 1 - iter 120/242 - loss 2.73215890 - time (sec): 2.20 - samples/sec: 5626.62 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:39:14,785 epoch 1 - iter 144/242 - loss 2.47482205 - time (sec): 2.58 - samples/sec: 5885.59 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:39:15,168 epoch 1 - iter 168/242 - loss 2.27234750 - time (sec): 2.97 - samples/sec: 5932.36 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:39:15,546 epoch 1 - iter 192/242 - loss 2.08230281 - time (sec): 3.34 - samples/sec: 6041.97 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:39:15,914 epoch 1 - iter 216/242 - loss 1.93805523 - time (sec): 3.71 - samples/sec: 6037.75 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:39:16,289 epoch 1 - iter 240/242 - loss 1.83328931 - time (sec): 4.09 - samples/sec: 6023.32 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:39:16,316 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:16,316 EPOCH 1 done: loss 1.8276 - lr: 0.000049
2023-10-18 16:39:16,587 DEV : loss 0.6149806380271912 - f1-score (micro avg) 0.0
2023-10-18 16:39:16,591 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:16,960 epoch 2 - iter 24/242 - loss 0.63637611 - time (sec): 0.37 - samples/sec: 6301.48 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:39:17,332 epoch 2 - iter 48/242 - loss 0.63515292 - time (sec): 0.74 - samples/sec: 6569.69 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:39:17,705 epoch 2 - iter 72/242 - loss 0.60643911 - time (sec): 1.11 - samples/sec: 6697.20 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:39:18,097 epoch 2 - iter 96/242 - loss 0.63024893 - time (sec): 1.50 - samples/sec: 6649.95 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:39:18,489 epoch 2 - iter 120/242 - loss 0.62899928 - time (sec): 1.90 - samples/sec: 6615.47 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:39:18,855 epoch 2 - iter 144/242 - loss 0.61009483 - time (sec): 2.26 - samples/sec: 6651.46 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:39:19,223 epoch 2 - iter 168/242 - loss 0.59249093 - time (sec): 2.63 - samples/sec: 6624.33 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:39:19,591 epoch 2 - iter 192/242 - loss 0.57656336 - time (sec): 3.00 - samples/sec: 6541.42 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:39:19,962 epoch 2 - iter 216/242 - loss 0.56723500 - time (sec): 3.37 - samples/sec: 6543.31 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:39:20,343 epoch 2 - iter 240/242 - loss 0.56412615 - time (sec): 3.75 - samples/sec: 6561.27 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:39:20,370 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:20,370 EPOCH 2 done: loss 0.5645 - lr: 0.000045
2023-10-18 16:39:20,800 DEV : loss 0.4076858460903168 - f1-score (micro avg) 0.4069
2023-10-18 16:39:20,804 saving best model
2023-10-18 16:39:20,839 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:21,219 epoch 3 - iter 24/242 - loss 0.44950477 - time (sec): 0.38 - samples/sec: 6957.07 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:39:21,590 epoch 3 - iter 48/242 - loss 0.46441966 - time (sec): 0.75 - samples/sec: 6757.35 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:39:21,986 epoch 3 - iter 72/242 - loss 0.48355561 - time (sec): 1.15 - samples/sec: 6834.81 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:39:22,344 epoch 3 - iter 96/242 - loss 0.48599707 - time (sec): 1.50 - samples/sec: 6618.72 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:39:22,700 epoch 3 - iter 120/242 - loss 0.48366048 - time (sec): 1.86 - samples/sec: 6597.14 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:39:23,032 epoch 3 - iter 144/242 - loss 0.46809598 - time (sec): 2.19 - samples/sec: 6783.65 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:39:23,379 epoch 3 - iter 168/242 - loss 0.46574184 - time (sec): 2.54 - samples/sec: 6809.11 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:39:23,763 epoch 3 - iter 192/242 - loss 0.44190262 - time (sec): 2.92 - samples/sec: 6824.99 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:39:24,141 epoch 3 - iter 216/242 - loss 0.43496200 - time (sec): 3.30 - samples/sec: 6751.17 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:39:24,503 epoch 3 - iter 240/242 - loss 0.42851370 - time (sec): 3.66 - samples/sec: 6729.96 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:39:24,529 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:24,529 EPOCH 3 done: loss 0.4294 - lr: 0.000039
2023-10-18 16:39:25,096 DEV : loss 0.32929080724716187 - f1-score (micro avg) 0.4931
2023-10-18 16:39:25,100 saving best model
2023-10-18 16:39:25,136 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:25,519 epoch 4 - iter 24/242 - loss 0.49055937 - time (sec): 0.38 - samples/sec: 6751.48 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:39:25,890 epoch 4 - iter 48/242 - loss 0.41222060 - time (sec): 0.75 - samples/sec: 6915.77 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:39:26,278 epoch 4 - iter 72/242 - loss 0.40643787 - time (sec): 1.14 - samples/sec: 6695.47 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:39:26,651 epoch 4 - iter 96/242 - loss 0.38427024 - time (sec): 1.51 - samples/sec: 6482.55 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:39:27,044 epoch 4 - iter 120/242 - loss 0.38029072 - time (sec): 1.91 - samples/sec: 6525.11 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:39:27,411 epoch 4 - iter 144/242 - loss 0.36912080 - time (sec): 2.27 - samples/sec: 6502.47 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:39:27,785 epoch 4 - iter 168/242 - loss 0.36034886 - time (sec): 2.65 - samples/sec: 6431.46 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:39:28,163 epoch 4 - iter 192/242 - loss 0.36068588 - time (sec): 3.03 - samples/sec: 6403.71 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:39:28,539 epoch 4 - iter 216/242 - loss 0.36250961 - time (sec): 3.40 - samples/sec: 6473.68 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:39:28,915 epoch 4 - iter 240/242 - loss 0.36148176 - time (sec): 3.78 - samples/sec: 6516.27 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:39:28,942 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:28,942 EPOCH 4 done: loss 0.3605 - lr: 0.000033
2023-10-18 16:39:29,376 DEV : loss 0.29638388752937317 - f1-score (micro avg) 0.4934
2023-10-18 16:39:29,380 saving best model
2023-10-18 16:39:29,415 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:29,790 epoch 5 - iter 24/242 - loss 0.33638521 - time (sec): 0.37 - samples/sec: 6839.32 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:39:30,161 epoch 5 - iter 48/242 - loss 0.32890594 - time (sec): 0.75 - samples/sec: 6941.91 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:39:30,526 epoch 5 - iter 72/242 - loss 0.30964898 - time (sec): 1.11 - samples/sec: 6746.25 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:39:30,897 epoch 5 - iter 96/242 - loss 0.31626763 - time (sec): 1.48 - samples/sec: 6659.49 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:39:31,267 epoch 5 - iter 120/242 - loss 0.33297265 - time (sec): 1.85 - samples/sec: 6735.79 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:39:31,634 epoch 5 - iter 144/242 - loss 0.33145979 - time (sec): 2.22 - samples/sec: 6650.67 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:39:32,007 epoch 5 - iter 168/242 - loss 0.33311663 - time (sec): 2.59 - samples/sec: 6659.52 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:39:32,382 epoch 5 - iter 192/242 - loss 0.32948726 - time (sec): 2.97 - samples/sec: 6574.73 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:39:32,767 epoch 5 - iter 216/242 - loss 0.32772837 - time (sec): 3.35 - samples/sec: 6557.20 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:39:33,116 epoch 5 - iter 240/242 - loss 0.32800025 - time (sec): 3.70 - samples/sec: 6633.02 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:39:33,140 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:33,140 EPOCH 5 done: loss 0.3270 - lr: 0.000028
2023-10-18 16:39:33,590 DEV : loss 0.2624404728412628 - f1-score (micro avg) 0.5279
2023-10-18 16:39:33,596 saving best model
2023-10-18 16:39:33,628 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:33,950 epoch 6 - iter 24/242 - loss 0.34206753 - time (sec): 0.32 - samples/sec: 6258.48 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:39:34,240 epoch 6 - iter 48/242 - loss 0.34319052 - time (sec): 0.61 - samples/sec: 7246.90 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:39:34,530 epoch 6 - iter 72/242 - loss 0.31813508 - time (sec): 0.90 - samples/sec: 7666.92 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:39:34,822 epoch 6 - iter 96/242 - loss 0.30359291 - time (sec): 1.19 - samples/sec: 8008.47 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:39:35,119 epoch 6 - iter 120/242 - loss 0.32660567 - time (sec): 1.49 - samples/sec: 8158.46 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:39:35,412 epoch 6 - iter 144/242 - loss 0.33032778 - time (sec): 1.78 - samples/sec: 8169.00 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:39:35,719 epoch 6 - iter 168/242 - loss 0.32862954 - time (sec): 2.09 - samples/sec: 8156.09 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:39:36,094 epoch 6 - iter 192/242 - loss 0.32217550 - time (sec): 2.47 - samples/sec: 8004.08 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:39:36,470 epoch 6 - iter 216/242 - loss 0.31474353 - time (sec): 2.84 - samples/sec: 7773.42 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:39:36,838 epoch 6 - iter 240/242 - loss 0.31348762 - time (sec): 3.21 - samples/sec: 7666.77 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:39:36,865 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:36,865 EPOCH 6 done: loss 0.3122 - lr: 0.000022
2023-10-18 16:39:37,300 DEV : loss 0.25362712144851685 - f1-score (micro avg) 0.5309
2023-10-18 16:39:37,304 saving best model
2023-10-18 16:39:37,339 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:37,704 epoch 7 - iter 24/242 - loss 0.28891833 - time (sec): 0.36 - samples/sec: 6040.40 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:39:38,065 epoch 7 - iter 48/242 - loss 0.27193590 - time (sec): 0.73 - samples/sec: 6258.79 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:39:38,449 epoch 7 - iter 72/242 - loss 0.27996064 - time (sec): 1.11 - samples/sec: 6339.45 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:39:38,848 epoch 7 - iter 96/242 - loss 0.27139693 - time (sec): 1.51 - samples/sec: 6410.66 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:39:39,213 epoch 7 - iter 120/242 - loss 0.27737023 - time (sec): 1.87 - samples/sec: 6395.49 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:39:39,587 epoch 7 - iter 144/242 - loss 0.27279140 - time (sec): 2.25 - samples/sec: 6460.96 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:39:39,964 epoch 7 - iter 168/242 - loss 0.27081089 - time (sec): 2.62 - samples/sec: 6418.60 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:39:40,337 epoch 7 - iter 192/242 - loss 0.26738266 - time (sec): 3.00 - samples/sec: 6459.44 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:39:40,711 epoch 7 - iter 216/242 - loss 0.27349357 - time (sec): 3.37 - samples/sec: 6563.41 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:39:41,045 epoch 7 - iter 240/242 - loss 0.28316936 - time (sec): 3.71 - samples/sec: 6647.90 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:39:41,064 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:41,064 EPOCH 7 done: loss 0.2828 - lr: 0.000017
2023-10-18 16:39:41,492 DEV : loss 0.23814086616039276 - f1-score (micro avg) 0.5738
2023-10-18 16:39:41,497 saving best model
2023-10-18 16:39:41,530 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:41,903 epoch 8 - iter 24/242 - loss 0.31547182 - time (sec): 0.37 - samples/sec: 7024.26 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:39:42,260 epoch 8 - iter 48/242 - loss 0.30247072 - time (sec): 0.73 - samples/sec: 6520.83 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:39:42,616 epoch 8 - iter 72/242 - loss 0.28965626 - time (sec): 1.09 - samples/sec: 6685.88 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:39:42,994 epoch 8 - iter 96/242 - loss 0.28917354 - time (sec): 1.46 - samples/sec: 6478.82 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:39:43,366 epoch 8 - iter 120/242 - loss 0.28649000 - time (sec): 1.84 - samples/sec: 6390.07 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:39:43,727 epoch 8 - iter 144/242 - loss 0.28862576 - time (sec): 2.20 - samples/sec: 6586.60 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:39:44,131 epoch 8 - iter 168/242 - loss 0.28779853 - time (sec): 2.60 - samples/sec: 6585.48 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:39:44,518 epoch 8 - iter 192/242 - loss 0.28278129 - time (sec): 2.99 - samples/sec: 6495.38 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:39:44,891 epoch 8 - iter 216/242 - loss 0.27617550 - time (sec): 3.36 - samples/sec: 6504.31 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:39:45,271 epoch 8 - iter 240/242 - loss 0.28359245 - time (sec): 3.74 - samples/sec: 6589.64 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:39:45,297 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:45,297 EPOCH 8 done: loss 0.2832 - lr: 0.000011
2023-10-18 16:39:45,733 DEV : loss 0.24089400470256805 - f1-score (micro avg) 0.5721
2023-10-18 16:39:45,738 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:46,116 epoch 9 - iter 24/242 - loss 0.21235456 - time (sec): 0.38 - samples/sec: 6467.11 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:39:46,502 epoch 9 - iter 48/242 - loss 0.21511467 - time (sec): 0.76 - samples/sec: 6572.69 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:39:46,896 epoch 9 - iter 72/242 - loss 0.24755941 - time (sec): 1.16 - samples/sec: 6696.29 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:39:47,253 epoch 9 - iter 96/242 - loss 0.26974009 - time (sec): 1.51 - samples/sec: 6649.04 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:39:47,638 epoch 9 - iter 120/242 - loss 0.26561447 - time (sec): 1.90 - samples/sec: 6657.26 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:39:48,026 epoch 9 - iter 144/242 - loss 0.27419224 - time (sec): 2.29 - samples/sec: 6636.15 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:39:48,386 epoch 9 - iter 168/242 - loss 0.26908420 - time (sec): 2.65 - samples/sec: 6620.82 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:39:48,778 epoch 9 - iter 192/242 - loss 0.27675038 - time (sec): 3.04 - samples/sec: 6521.79 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:39:49,160 epoch 9 - iter 216/242 - loss 0.27189560 - time (sec): 3.42 - samples/sec: 6519.32 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:39:49,524 epoch 9 - iter 240/242 - loss 0.26927925 - time (sec): 3.78 - samples/sec: 6503.00 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:39:49,553 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:49,553 EPOCH 9 done: loss 0.2698 - lr: 0.000006
2023-10-18 16:39:49,988 DEV : loss 0.2282724827528 - f1-score (micro avg) 0.5742
2023-10-18 16:39:49,993 saving best model
2023-10-18 16:39:50,028 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:50,419 epoch 10 - iter 24/242 - loss 0.20255313 - time (sec): 0.39 - samples/sec: 6123.98 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:39:50,786 epoch 10 - iter 48/242 - loss 0.23425561 - time (sec): 0.76 - samples/sec: 6310.31 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:39:51,158 epoch 10 - iter 72/242 - loss 0.25213496 - time (sec): 1.13 - samples/sec: 6444.70 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:39:51,531 epoch 10 - iter 96/242 - loss 0.26020205 - time (sec): 1.50 - samples/sec: 6443.11 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:39:51,907 epoch 10 - iter 120/242 - loss 0.27241365 - time (sec): 1.88 - samples/sec: 6380.29 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:39:52,269 epoch 10 - iter 144/242 - loss 0.27332284 - time (sec): 2.24 - samples/sec: 6330.34 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:39:52,650 epoch 10 - iter 168/242 - loss 0.27164285 - time (sec): 2.62 - samples/sec: 6442.64 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:39:53,023 epoch 10 - iter 192/242 - loss 0.26523221 - time (sec): 3.00 - samples/sec: 6484.70 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:39:53,394 epoch 10 - iter 216/242 - loss 0.26808745 - time (sec): 3.37 - samples/sec: 6500.64 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:39:53,756 epoch 10 - iter 240/242 - loss 0.26508205 - time (sec): 3.73 - samples/sec: 6581.85 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:39:53,778 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:53,779 EPOCH 10 done: loss 0.2666 - lr: 0.000000
2023-10-18 16:39:54,231 DEV : loss 0.22958189249038696 - f1-score (micro avg) 0.5752
2023-10-18 16:39:54,236 saving best model
2023-10-18 16:39:54,301 ----------------------------------------------------------------------------------------------------
2023-10-18 16:39:54,301 Loading model from best epoch ...
2023-10-18 16:39:54,370 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
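The 25-entry tag dictionary follows from six entity types, each with four positional prefixes (`S`, `B`, `E`, `I`), plus the outside tag `O`. This also matches `out_features=25` in the final linear layer. A small sketch that rebuilds the list in the order shown above; `build_bioes_tags` is a hypothetical helper, not a Flair function.

```python
# Entity types and prefix order as they appear in the logged tag dictionary.
ENTITY_TYPES = ["scope", "pers", "work", "loc", "object", "date"]
PREFIXES = ["S", "B", "E", "I"]  # single, begin, end, inside

def build_bioes_tags(entity_types):
    """Return 'O' followed by every prefix-type combination."""
    tags = ["O"]
    for entity_type in entity_types:
        tags.extend(f"{prefix}-{entity_type}" for prefix in PREFIXES)
    return tags

tags = build_bioes_tags(ENTITY_TYPES)
print(len(tags))
print(", ".join(tags))
```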
2023-10-18 16:39:54,801
Results:
- F-score (micro) 0.5168
- F-score (macro) 0.2892
- Accuracy 0.3656
By class:
              precision    recall  f1-score   support

       scope     0.3858    0.5891    0.4663       129
        pers     0.6069    0.7554    0.6731       139
        work     0.4318    0.2375    0.3065        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4831    0.5556    0.5168       360
   macro avg     0.2849    0.3164    0.2892       360
weighted avg     0.4685    0.5556    0.4951       360
2023-10-18 16:39:54,801 ----------------------------------------------------------------------------------------------------
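The averaged rows of the final report can be cross-checked against the per-class rows. The sketch below copies the numbers from the table above and recomputes the three F1 averages. It assumes the usual definitions: micro F1 is the harmonic mean of micro precision and recall, macro F1 is the unweighted mean of the class scores, and weighted F1 is the support-weighted mean. Small deviations are expected because the inputs are already rounded to four decimals.

```python
# Per-class rows from the report: (precision, recall, f1, support).
report = {
    "scope": (0.3858, 0.5891, 0.4663, 129),
    "pers":  (0.6069, 0.7554, 0.6731, 139),
    "work":  (0.4318, 0.2375, 0.3065, 80),
    "loc":   (0.0000, 0.0000, 0.0000, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}
micro_precision, micro_recall = 0.4831, 0.5556  # "micro avg" row

# Micro F1: harmonic mean of the pooled precision and recall.
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

# Macro F1: plain mean over classes, so rare classes count as much as frequent ones.
macro_f1 = sum(f1 for _, _, f1, _ in report.values()) / len(report)

# Weighted F1: mean over classes weighted by the number of gold entities.
total_support = sum(support for *_, support in report.values())
weighted_f1 = sum(f1 * support for _, _, f1, support in report.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
```

The gap between micro and macro F1 comes from `loc` and `date`: with 9 and 3 gold entities and no correct predictions, they pull the unweighted mean down while barely affecting the pooled score.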