2023-10-25 21:05:36,363 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Train: 1166 sentences
2023-10-25 21:05:36,364 (train_with_dev=False, train_with_test=False)
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Training Params:
2023-10-25 21:05:36,364 - learning_rate: "3e-05"
2023-10-25 21:05:36,364 - mini_batch_size: "8"
2023-10-25 21:05:36,364 - max_epochs: "10"
2023-10-25 21:05:36,364 - shuffle: "True"
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Plugins:
2023-10-25 21:05:36,364 - TensorboardLogger
2023-10-25 21:05:36,364 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
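Note on the schedule: the parameters above (1166 training sentences, mini_batch_size 8, max_epochs 10, learning_rate 3e-05, warmup_fraction 0.1) are enough to sketch the learning-rate curve that the per-iteration "lr:" values below trace out. The following is a minimal sketch, assuming the common "linear warmup, then linear decay to zero" formula; the function name linear_schedule_lr is hypothetical and this is not taken from the LinearScheduler plugin's source.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps.

    Assumed shape: linear warmup from 0 to peak_lr over the first
    warmup_fraction of steps, then linear decay back to 0.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log header.
batches_per_epoch = math.ceil(1166 / 8)  # iterations per epoch
total_steps = batches_per_epoch * 10     # max_epochs = 10

# Compare a few points against the "lr:" column of the log.
for epoch, iteration in [(1, 14), (1, 140), (2, 14), (10, 140)]:
    step = (epoch - 1) * batches_per_epoch + iteration
    lr = linear_schedule_lr(step, total_steps)
    print(f"epoch {epoch} iter {iteration}/{batches_per_epoch} lr: {lr:.6f}")
```

Under these assumptions the warmup covers exactly the first epoch (146 of 1460 steps), which is consistent with the logged lr rising during epoch 1 and falling afterwards.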
2023-10-25 21:05:36,364 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:05:36,364 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 Computation:
2023-10-25 21:05:36,365 - compute on device: cuda:0
2023-10-25 21:05:36,365 - embedding storage: none
2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:05:37,164 epoch 1 - iter 14/146 - loss 2.83025878 - time (sec): 0.80 - samples/sec: 4267.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:05:37,955 epoch 1 - iter 28/146 - loss 2.46691922 - time (sec): 1.59 - samples/sec: 4479.04 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:05:38,902 epoch 1 - iter 42/146 - loss 1.86829011 - time (sec): 2.54 - samples/sec: 4669.08 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:05:39,724 epoch 1 - iter 56/146 - loss 1.54384758 - time (sec): 3.36 - samples/sec: 4663.76 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:05:40,514 epoch 1 - iter 70/146 - loss 1.34624853 - time (sec): 4.15 - samples/sec: 4705.75 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:05:41,496 epoch 1 - iter 84/146 - loss 1.20137129 - time (sec): 5.13 - samples/sec: 4668.86 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:05:42,434 epoch 1 - iter 98/146 - loss 1.07011610 - time (sec): 6.07 - samples/sec: 4743.31 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:05:43,498 epoch 1 - iter 112/146 - loss 0.97878778 - time (sec): 7.13 - samples/sec: 4737.41 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:05:44,331 epoch 1 - iter 126/146 - loss 0.90038816 - time (sec): 7.96 - samples/sec: 4778.42 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:05:45,275 epoch 1 - iter 140/146 - loss 0.82667252 - time (sec): 8.91 - samples/sec: 4803.31 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:45,697 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:45,697 EPOCH 1 done: loss 0.8075 - lr: 0.000029
2023-10-25 21:05:46,358 DEV : loss 0.17039310932159424 - f1-score (micro avg) 0.5702
2023-10-25 21:05:46,362 saving best model
2023-10-25 21:05:46,879 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:47,843 epoch 2 - iter 14/146 - loss 0.20229098 - time (sec): 0.96 - samples/sec: 4761.19 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:05:48,858 epoch 2 - iter 28/146 - loss 0.18766990 - time (sec): 1.98 - samples/sec: 4776.95 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:49,755 epoch 2 - iter 42/146 - loss 0.18648775 - time (sec): 2.88 - samples/sec: 4775.20 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:50,655 epoch 2 - iter 56/146 - loss 0.19029101 - time (sec): 3.77 - samples/sec: 4741.42 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:51,431 epoch 2 - iter 70/146 - loss 0.18913930 - time (sec): 4.55 - samples/sec: 4763.79 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:05:52,200 epoch 2 - iter 84/146 - loss 0.19230771 - time (sec): 5.32 - samples/sec: 4757.70 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:05:53,030 epoch 2 - iter 98/146 - loss 0.18747787 - time (sec): 6.15 - samples/sec: 4744.20 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:05:54,055 epoch 2 - iter 112/146 - loss 0.18004954 - time (sec): 7.18 - samples/sec: 4732.79 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:05:54,915 epoch 2 - iter 126/146 - loss 0.17380432 - time (sec): 8.03 - samples/sec: 4783.21 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:05:55,761 epoch 2 - iter 140/146 - loss 0.17394433 - time (sec): 8.88 - samples/sec: 4827.33 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:05:56,111 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:56,111 EPOCH 2 done: loss 0.1735 - lr: 0.000027
2023-10-25 21:05:57,015 DEV : loss 0.10457519441843033 - f1-score (micro avg) 0.7177
2023-10-25 21:05:57,019 saving best model
2023-10-25 21:05:57,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:58,621 epoch 3 - iter 14/146 - loss 0.09931197 - time (sec): 0.91 - samples/sec: 4567.22 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:05:59,407 epoch 3 - iter 28/146 - loss 0.09405829 - time (sec): 1.70 - samples/sec: 4431.34 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:06:00,370 epoch 3 - iter 42/146 - loss 0.08962058 - time (sec): 2.66 - samples/sec: 4500.54 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:06:01,259 epoch 3 - iter 56/146 - loss 0.08813612 - time (sec): 3.55 - samples/sec: 4372.04 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:06:02,395 epoch 3 - iter 70/146 - loss 0.09225973 - time (sec): 4.69 - samples/sec: 4529.05 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:06:03,276 epoch 3 - iter 84/146 - loss 0.09278810 - time (sec): 5.57 - samples/sec: 4642.69 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:06:04,165 epoch 3 - iter 98/146 - loss 0.09139854 - time (sec): 6.46 - samples/sec: 4698.69 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:04,901 epoch 3 - iter 112/146 - loss 0.09379452 - time (sec): 7.19 - samples/sec: 4736.93 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:05,723 epoch 3 - iter 126/146 - loss 0.09468401 - time (sec): 8.02 - samples/sec: 4729.48 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:06,622 epoch 3 - iter 140/146 - loss 0.09287278 - time (sec): 8.91 - samples/sec: 4744.25 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:07,059 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:07,060 EPOCH 3 done: loss 0.0934 - lr: 0.000024
2023-10-25 21:06:08,132 DEV : loss 0.09595039486885071 - f1-score (micro avg) 0.7332
2023-10-25 21:06:08,137 saving best model
2023-10-25 21:06:08,810 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:09,790 epoch 4 - iter 14/146 - loss 0.07411911 - time (sec): 0.98 - samples/sec: 5160.10 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:06:10,610 epoch 4 - iter 28/146 - loss 0.06840387 - time (sec): 1.80 - samples/sec: 4874.67 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:06:11,437 epoch 4 - iter 42/146 - loss 0.07292162 - time (sec): 2.62 - samples/sec: 4818.23 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:06:12,396 epoch 4 - iter 56/146 - loss 0.06469956 - time (sec): 3.58 - samples/sec: 4740.65 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:06:13,143 epoch 4 - iter 70/146 - loss 0.06518597 - time (sec): 4.33 - samples/sec: 4699.77 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:06:14,098 epoch 4 - iter 84/146 - loss 0.06696645 - time (sec): 5.29 - samples/sec: 4659.86 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:14,934 epoch 4 - iter 98/146 - loss 0.06661296 - time (sec): 6.12 - samples/sec: 4711.64 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:15,737 epoch 4 - iter 112/146 - loss 0.06367292 - time (sec): 6.92 - samples/sec: 4697.66 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:16,739 epoch 4 - iter 126/146 - loss 0.06160560 - time (sec): 7.93 - samples/sec: 4707.84 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:17,616 epoch 4 - iter 140/146 - loss 0.06032160 - time (sec): 8.80 - samples/sec: 4821.21 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:06:17,933 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:17,933 EPOCH 4 done: loss 0.0601 - lr: 0.000020
2023-10-25 21:06:18,846 DEV : loss 0.10524275153875351 - f1-score (micro avg) 0.7642
2023-10-25 21:06:18,850 saving best model
2023-10-25 21:06:19,534 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:20,415 epoch 5 - iter 14/146 - loss 0.02996552 - time (sec): 0.88 - samples/sec: 5280.96 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:06:21,159 epoch 5 - iter 28/146 - loss 0.03043510 - time (sec): 1.62 - samples/sec: 5157.31 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:06:22,023 epoch 5 - iter 42/146 - loss 0.03731623 - time (sec): 2.48 - samples/sec: 5263.23 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:06:22,930 epoch 5 - iter 56/146 - loss 0.03644991 - time (sec): 3.39 - samples/sec: 5056.90 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:06:23,888 epoch 5 - iter 70/146 - loss 0.03446408 - time (sec): 4.35 - samples/sec: 4852.99 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:24,758 epoch 5 - iter 84/146 - loss 0.03395159 - time (sec): 5.22 - samples/sec: 4784.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:25,718 epoch 5 - iter 98/146 - loss 0.03646640 - time (sec): 6.18 - samples/sec: 4713.73 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:26,607 epoch 5 - iter 112/146 - loss 0.03898602 - time (sec): 7.07 - samples/sec: 4726.24 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:27,672 epoch 5 - iter 126/146 - loss 0.03853003 - time (sec): 8.13 - samples/sec: 4730.88 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:06:28,445 epoch 5 - iter 140/146 - loss 0.03950165 - time (sec): 8.91 - samples/sec: 4785.32 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:06:28,835 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:28,836 EPOCH 5 done: loss 0.0395 - lr: 0.000017
2023-10-25 21:06:29,746 DEV : loss 0.10796511173248291 - f1-score (micro avg) 0.7617
2023-10-25 21:06:29,751 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:30,562 epoch 6 - iter 14/146 - loss 0.02045471 - time (sec): 0.81 - samples/sec: 5141.92 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:06:31,512 epoch 6 - iter 28/146 - loss 0.02431460 - time (sec): 1.76 - samples/sec: 4759.68 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:06:32,460 epoch 6 - iter 42/146 - loss 0.02349163 - time (sec): 2.71 - samples/sec: 4864.11 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:06:33,331 epoch 6 - iter 56/146 - loss 0.02207213 - time (sec): 3.58 - samples/sec: 4826.86 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:34,276 epoch 6 - iter 70/146 - loss 0.02538588 - time (sec): 4.52 - samples/sec: 4762.20 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:35,169 epoch 6 - iter 84/146 - loss 0.02401518 - time (sec): 5.42 - samples/sec: 4691.30 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:36,103 epoch 6 - iter 98/146 - loss 0.02319494 - time (sec): 6.35 - samples/sec: 4741.03 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:37,157 epoch 6 - iter 112/146 - loss 0.02399453 - time (sec): 7.41 - samples/sec: 4643.43 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:06:38,007 epoch 6 - iter 126/146 - loss 0.02529112 - time (sec): 8.26 - samples/sec: 4654.13 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:06:38,896 epoch 6 - iter 140/146 - loss 0.02430354 - time (sec): 9.14 - samples/sec: 4677.55 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:06:39,258 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:39,258 EPOCH 6 done: loss 0.0249 - lr: 0.000014
2023-10-25 21:06:40,322 DEV : loss 0.11456426978111267 - f1-score (micro avg) 0.7849
2023-10-25 21:06:40,327 saving best model
2023-10-25 21:06:40,998 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:41,831 epoch 7 - iter 14/146 - loss 0.01377810 - time (sec): 0.83 - samples/sec: 5025.09 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:06:43,030 epoch 7 - iter 28/146 - loss 0.02000949 - time (sec): 2.03 - samples/sec: 4964.90 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:06:43,818 epoch 7 - iter 42/146 - loss 0.02159133 - time (sec): 2.82 - samples/sec: 4774.67 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:44,680 epoch 7 - iter 56/146 - loss 0.02141280 - time (sec): 3.68 - samples/sec: 4696.42 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:45,530 epoch 7 - iter 70/146 - loss 0.02033340 - time (sec): 4.53 - samples/sec: 4722.77 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:46,479 epoch 7 - iter 84/146 - loss 0.01873330 - time (sec): 5.48 - samples/sec: 4804.21 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:47,374 epoch 7 - iter 98/146 - loss 0.01877254 - time (sec): 6.37 - samples/sec: 4815.74 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:06:48,168 epoch 7 - iter 112/146 - loss 0.01963305 - time (sec): 7.17 - samples/sec: 4765.05 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:06:48,984 epoch 7 - iter 126/146 - loss 0.01898464 - time (sec): 7.98 - samples/sec: 4793.92 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:06:49,883 epoch 7 - iter 140/146 - loss 0.01845447 - time (sec): 8.88 - samples/sec: 4782.28 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:06:50,299 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:50,299 EPOCH 7 done: loss 0.0181 - lr: 0.000010
2023-10-25 21:06:51,208 DEV : loss 0.13842682540416718 - f1-score (micro avg) 0.7788
2023-10-25 21:06:51,213 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:52,098 epoch 8 - iter 14/146 - loss 0.02179360 - time (sec): 0.88 - samples/sec: 4362.26 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:06:53,100 epoch 8 - iter 28/146 - loss 0.01457476 - time (sec): 1.89 - samples/sec: 4502.85 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:54,176 epoch 8 - iter 42/146 - loss 0.01363156 - time (sec): 2.96 - samples/sec: 4560.96 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:55,065 epoch 8 - iter 56/146 - loss 0.01358221 - time (sec): 3.85 - samples/sec: 4529.92 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:55,987 epoch 8 - iter 70/146 - loss 0.01502044 - time (sec): 4.77 - samples/sec: 4552.97 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:56,784 epoch 8 - iter 84/146 - loss 0.01489034 - time (sec): 5.57 - samples/sec: 4658.20 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:06:57,539 epoch 8 - iter 98/146 - loss 0.01449699 - time (sec): 6.32 - samples/sec: 4659.56 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:06:58,527 epoch 8 - iter 112/146 - loss 0.01427531 - time (sec): 7.31 - samples/sec: 4717.80 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:06:59,324 epoch 8 - iter 126/146 - loss 0.01369188 - time (sec): 8.11 - samples/sec: 4719.35 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:07:00,154 epoch 8 - iter 140/146 - loss 0.01269696 - time (sec): 8.94 - samples/sec: 4804.98 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:07:00,462 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:00,462 EPOCH 8 done: loss 0.0128 - lr: 0.000007
2023-10-25 21:07:01,376 DEV : loss 0.15397675335407257 - f1-score (micro avg) 0.7315
2023-10-25 21:07:01,380 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:02,243 epoch 9 - iter 14/146 - loss 0.00725404 - time (sec): 0.86 - samples/sec: 5396.63 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:03,063 epoch 9 - iter 28/146 - loss 0.00858283 - time (sec): 1.68 - samples/sec: 5303.83 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:03,844 epoch 9 - iter 42/146 - loss 0.00697012 - time (sec): 2.46 - samples/sec: 5096.55 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:04,896 epoch 9 - iter 56/146 - loss 0.00943578 - time (sec): 3.51 - samples/sec: 4980.37 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:05,865 epoch 9 - iter 70/146 - loss 0.01141288 - time (sec): 4.48 - samples/sec: 4965.81 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:07:06,742 epoch 9 - iter 84/146 - loss 0.01134343 - time (sec): 5.36 - samples/sec: 4920.44 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:07:07,668 epoch 9 - iter 98/146 - loss 0.01085725 - time (sec): 6.29 - samples/sec: 4919.30 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:07:08,425 epoch 9 - iter 112/146 - loss 0.01140709 - time (sec): 7.04 - samples/sec: 4880.19 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:09,309 epoch 9 - iter 126/146 - loss 0.01114475 - time (sec): 7.93 - samples/sec: 4855.84 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:10,174 epoch 9 - iter 140/146 - loss 0.01073435 - time (sec): 8.79 - samples/sec: 4862.91 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:10,483 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:10,483 EPOCH 9 done: loss 0.0109 - lr: 0.000004
2023-10-25 21:07:11,390 DEV : loss 0.15072251856327057 - f1-score (micro avg) 0.7598
2023-10-25 21:07:11,395 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:12,200 epoch 10 - iter 14/146 - loss 0.00398947 - time (sec): 0.80 - samples/sec: 5186.37 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:07:13,172 epoch 10 - iter 28/146 - loss 0.00875898 - time (sec): 1.78 - samples/sec: 4781.14 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:07:14,022 epoch 10 - iter 42/146 - loss 0.00922190 - time (sec): 2.63 - samples/sec: 4759.26 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:07:14,955 epoch 10 - iter 56/146 - loss 0.01137857 - time (sec): 3.56 - samples/sec: 4758.01 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:07:15,813 epoch 10 - iter 70/146 - loss 0.00974451 - time (sec): 4.42 - samples/sec: 4783.44 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:07:16,574 epoch 10 - iter 84/146 - loss 0.00891124 - time (sec): 5.18 - samples/sec: 4755.93 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:07:17,576 epoch 10 - iter 98/146 - loss 0.01023278 - time (sec): 6.18 - samples/sec: 4727.60 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:07:18,489 epoch 10 - iter 112/146 - loss 0.00961012 - time (sec): 7.09 - samples/sec: 4783.79 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:07:19,586 epoch 10 - iter 126/146 - loss 0.00898474 - time (sec): 8.19 - samples/sec: 4684.43 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:07:20,438 epoch 10 - iter 140/146 - loss 0.00851157 - time (sec): 9.04 - samples/sec: 4730.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:07:20,797 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:20,797 EPOCH 10 done: loss 0.0087 - lr: 0.000000
2023-10-25 21:07:21,711 DEV : loss 0.1518029123544693 - f1-score (micro avg) 0.7588
2023-10-25 21:07:22,231 ----------------------------------------------------------------------------------------------------
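Note on best-model selection: the "saving best model" messages above appear after epochs 1, 2, 3, 4 and 6 only. The sketch below replays the ten DEV micro-F1 scores from this log and assumes a checkpoint is written whenever the dev score strictly improves on the best seen so far; that rule is an assumption inferred from the log, not taken from the trainer's code.

```python
# DEV f1-score (micro avg) per epoch, copied from the log above.
dev_f1 = [0.5702, 0.7177, 0.7332, 0.7642, 0.7617,
          0.7849, 0.7788, 0.7315, 0.7598, 0.7588]

best_score = float("-inf")
best_epoch = None
saved_at = []  # epochs after which a checkpoint would be written

for epoch, score in enumerate(dev_f1, start=1):
    # Assumed rule: save only on a strict improvement of the dev score.
    if score > best_score:
        best_score, best_epoch = score, epoch
        saved_at.append(epoch)

print("checkpoints written after epochs:", saved_at)
print("best epoch:", best_epoch, "dev micro-F1:", best_score)
```

With these scores the replay selects epoch 6 (dev micro-F1 0.7849), so the final test evaluation would use the epoch 6 checkpoint rather than the weights from epoch 10.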
2023-10-25 21:07:22,233 Loading model from best epoch ...
2023-10-25 21:07:23,959 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:07:25,507
Results:
- F-score (micro) 0.7581
- F-score (macro) 0.6581
- Accuracy 0.6331
By class:
              precision    recall  f1-score   support

         PER     0.7855    0.8420    0.8128       348
         LOC     0.6943    0.8352    0.7583       261
         ORG     0.4348    0.3846    0.4082        52
   HumanProd     0.5926    0.7273    0.6531        22

   micro avg     0.7197    0.8009    0.7581       683
   macro avg     0.6268    0.6973    0.6581       683
weighted avg     0.7177    0.8009    0.7560       683
2023-10-25 21:07:25,507 ----------------------------------------------------------------------------------------------------
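Note on the averages: the three summary F-scores in the report can be recomputed from the other numbers it prints. The sketch below uses only values from the table above (rounded to four decimals as logged) and the standard definitions: micro F1 as the harmonic mean of micro precision and recall, macro F1 as the unweighted mean of the class F1 scores, and weighted F1 as the support-weighted mean. Variable names are illustrative.

```python
# (precision, recall, f1, support) per class, copied from the report.
per_class = {
    "PER": (0.7855, 0.8420, 0.8128, 348),
    "LOC": (0.6943, 0.8352, 0.7583, 261),
    "ORG": (0.4348, 0.3846, 0.4082, 52),
    "HumanProd": (0.5926, 0.7273, 0.6531, 22),
}

# Micro average: harmonic mean of the pooled precision and recall.
micro_p, micro_r = 0.7197, 0.8009
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro average: every class counts equally, regardless of size.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: each class F1 is weighted by its support.
total_support = sum(n for _, _, _, n in per_class.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in per_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

The gap between micro and macro F1 comes mostly from ORG: it has only 52 test entities and the lowest F1, so it pulls the unweighted mean down much more than the pooled score.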