2023-10-25 20:57:20,509 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:20,511 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 20:57:20,511 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:20,511 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 20:57:20,511 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:20,511 Train: 1085 sentences
2023-10-25 20:57:20,511 (train_with_dev=False, train_with_test=False)
2023-10-25 20:57:20,511 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:20,511 Training Params:
2023-10-25 20:57:20,511 - learning_rate: "3e-05"
2023-10-25 20:57:20,512 - mini_batch_size: "8"
2023-10-25 20:57:20,512 - max_epochs: "10"
2023-10-25 20:57:20,512 - shuffle: "True"
2023-10-25 20:57:20,512 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:20,512 Plugins:
2023-10-25 20:57:20,512 - TensorboardLogger
2023-10-25 20:57:20,512 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 20:57:20,512 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:20,512 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 20:57:20,512 - metric: "('micro avg', 'f1-score')"
2023-10-25 20:57:20,512 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:20,512 Computation:
2023-10-25 20:57:20,512 - compute on device: cuda:0
2023-10-25 20:57:20,512 - embedding storage: none
2023-10-25 20:57:20,512 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:20,512 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 20:57:20,512 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:20,513 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:20,513 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 20:57:21,498 epoch 1 - iter 13/136 - loss 2.92990884 - time (sec): 0.98 - samples/sec: 5127.53 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:57:22,541 epoch 1 - iter 26/136 - loss 2.55165335 - time (sec): 2.03 - samples/sec: 5026.31 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:57:23,554 epoch 1 - iter 39/136 - loss 2.04744423 - time (sec): 3.04 - samples/sec: 4980.68 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:57:24,655 epoch 1 - iter 52/136 - loss 1.61339578 - time (sec): 4.14 - samples/sec: 5099.28 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:57:25,631 epoch 1 - iter 65/136 - loss 1.42349176 - time (sec): 5.12 - samples/sec: 5012.82 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:57:26,638 epoch 1 - iter 78/136 - loss 1.27186349 - time (sec): 6.12 - samples/sec: 4920.72 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:57:27,850 epoch 1 - iter 91/136 - loss 1.11113718 - time (sec): 7.34 - samples/sec: 4941.45 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:57:28,832 epoch 1 - iter 104/136 - loss 1.02098309 - time (sec): 8.32 - samples/sec: 4908.84 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:57:29,872 epoch 1 - iter 117/136 - loss 0.93707203 - time (sec): 9.36 - samples/sec: 4859.80 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:57:30,903 epoch 1 - iter 130/136 - loss 0.87769480 - time (sec): 10.39 - samples/sec: 4840.00 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:57:31,283 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:31,284 EPOCH 1 done: loss 0.8541 - lr: 0.000028
2023-10-25 20:57:32,387 DEV : loss 0.16994501650333405 - f1-score (micro avg) 0.613
2023-10-25 20:57:32,398 saving best model
2023-10-25 20:57:32,931 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:33,928 epoch 2 - iter 13/136 - loss 0.20415905 - time (sec): 0.99 - samples/sec: 4751.30 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:57:34,874 epoch 2 - iter 26/136 - loss 0.18777881 - time (sec): 1.94 - samples/sec: 4792.72 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:57:35,900 epoch 2 - iter 39/136 - loss 0.17721837 - time (sec): 2.97 - samples/sec: 4904.96 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:57:37,005 epoch 2 - iter 52/136 - loss 0.16931145 - time (sec): 4.07 - samples/sec: 4744.31 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:57:38,017 epoch 2 - iter 65/136 - loss 0.17210883 - time (sec): 5.08 - samples/sec: 4810.67 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:57:39,068 epoch 2 - iter 78/136 - loss 0.15995899 - time (sec): 6.13 - samples/sec: 4901.43 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:57:40,018 epoch 2 - iter 91/136 - loss 0.15946327 - time (sec): 7.08 - samples/sec: 4902.65 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:57:40,921 epoch 2 - iter 104/136 - loss 0.15402756 - time (sec): 7.99 - samples/sec: 4948.05 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:57:42,007 epoch 2 - iter 117/136 - loss 0.14965945 - time (sec): 9.07 - samples/sec: 4961.57 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:57:42,936 epoch 2 - iter 130/136 - loss 0.14655700 - time (sec): 10.00 - samples/sec: 5015.06 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:57:43,313 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:43,313 EPOCH 2 done: loss 0.1453 - lr: 0.000027
2023-10-25 20:57:44,548 DEV : loss 0.10967841744422913 - f1-score (micro avg) 0.7458
2023-10-25 20:57:44,554 saving best model
2023-10-25 20:57:45,287 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:46,242 epoch 3 - iter 13/136 - loss 0.07746010 - time (sec): 0.95 - samples/sec: 5501.51 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:57:47,192 epoch 3 - iter 26/136 - loss 0.07658585 - time (sec): 1.90 - samples/sec: 5289.55 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:57:48,252 epoch 3 - iter 39/136 - loss 0.07212515 - time (sec): 2.96 - samples/sec: 5179.41 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:57:49,294 epoch 3 - iter 52/136 - loss 0.07599664 - time (sec): 4.00 - samples/sec: 5037.24 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:57:50,176 epoch 3 - iter 65/136 - loss 0.07909179 - time (sec): 4.89 - samples/sec: 5020.73 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:57:51,247 epoch 3 - iter 78/136 - loss 0.08230846 - time (sec): 5.96 - samples/sec: 4937.25 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:57:52,192 epoch 3 - iter 91/136 - loss 0.08233746 - time (sec): 6.90 - samples/sec: 5016.99 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:57:53,266 epoch 3 - iter 104/136 - loss 0.08114831 - time (sec): 7.98 - samples/sec: 4936.38 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:57:54,376 epoch 3 - iter 117/136 - loss 0.08185953 - time (sec): 9.09 - samples/sec: 4904.42 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:57:55,366 epoch 3 - iter 130/136 - loss 0.07910712 - time (sec): 10.08 - samples/sec: 4914.93 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:57:55,881 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:55,882 EPOCH 3 done: loss 0.0776 - lr: 0.000024
2023-10-25 20:57:57,053 DEV : loss 0.0935666635632515 - f1-score (micro avg) 0.8015
2023-10-25 20:57:57,059 saving best model
2023-10-25 20:57:58,210 ----------------------------------------------------------------------------------------------------
2023-10-25 20:57:59,243 epoch 4 - iter 13/136 - loss 0.06733069 - time (sec): 1.03 - samples/sec: 5627.57 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:58:00,107 epoch 4 - iter 26/136 - loss 0.05352415 - time (sec): 1.90 - samples/sec: 5440.88 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:58:01,167 epoch 4 - iter 39/136 - loss 0.04727758 - time (sec): 2.96 - samples/sec: 5014.87 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:58:02,233 epoch 4 - iter 52/136 - loss 0.04394289 - time (sec): 4.02 - samples/sec: 5086.01 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:58:03,417 epoch 4 - iter 65/136 - loss 0.04532334 - time (sec): 5.21 - samples/sec: 4877.50 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:58:04,380 epoch 4 - iter 78/136 - loss 0.04726676 - time (sec): 6.17 - samples/sec: 4857.29 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:58:05,436 epoch 4 - iter 91/136 - loss 0.04783662 - time (sec): 7.22 - samples/sec: 4788.09 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:58:06,326 epoch 4 - iter 104/136 - loss 0.04641077 - time (sec): 8.11 - samples/sec: 4836.17 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:58:07,338 epoch 4 - iter 117/136 - loss 0.04569068 - time (sec): 9.13 - samples/sec: 4885.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:58:08,264 epoch 4 - iter 130/136 - loss 0.04531045 - time (sec): 10.05 - samples/sec: 4925.63 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:58:08,714 ----------------------------------------------------------------------------------------------------
2023-10-25 20:58:08,714 EPOCH 4 done: loss 0.0448 - lr: 0.000020
2023-10-25 20:58:09,861 DEV : loss 0.1180381178855896 - f1-score (micro avg) 0.8165
2023-10-25 20:58:09,867 saving best model
2023-10-25 20:58:10,587 ----------------------------------------------------------------------------------------------------
2023-10-25 20:58:11,495 epoch 5 - iter 13/136 - loss 0.03999052 - time (sec): 0.91 - samples/sec: 4746.95 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:58:12,564 epoch 5 - iter 26/136 - loss 0.03877379 - time (sec): 1.98 - samples/sec: 5143.91 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:58:13,366 epoch 5 - iter 39/136 - loss 0.04104911 - time (sec): 2.78 - samples/sec: 4969.15 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:58:14,396 epoch 5 - iter 52/136 - loss 0.03582605 - time (sec): 3.81 - samples/sec: 4918.58 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:58:15,284 epoch 5 - iter 65/136 - loss 0.03311757 - time (sec): 4.70 - samples/sec: 4947.49 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:58:16,438 epoch 5 - iter 78/136 - loss 0.03365822 - time (sec): 5.85 - samples/sec: 4889.46 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:58:17,429 epoch 5 - iter 91/136 - loss 0.03140306 - time (sec): 6.84 - samples/sec: 4859.76 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:58:18,539 epoch 5 - iter 104/136 - loss 0.03087682 - time (sec): 7.95 - samples/sec: 4846.31 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:58:19,430 epoch 5 - iter 117/136 - loss 0.02923008 - time (sec): 8.84 - samples/sec: 4879.53 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:58:20,456 epoch 5 - iter 130/136 - loss 0.03015711 - time (sec): 9.87 - samples/sec: 4978.43 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:58:20,973 ----------------------------------------------------------------------------------------------------
2023-10-25 20:58:20,974 EPOCH 5 done: loss 0.0295 - lr: 0.000017
2023-10-25 20:58:22,150 DEV : loss 0.11727390438318253 - f1-score (micro avg) 0.8088
2023-10-25 20:58:22,156 ----------------------------------------------------------------------------------------------------
2023-10-25 20:58:23,485 epoch 6 - iter 13/136 - loss 0.01807270 - time (sec): 1.33 - samples/sec: 4027.12 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:58:24,422 epoch 6 - iter 26/136 - loss 0.01886995 - time (sec): 2.26 - samples/sec: 4455.37 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:58:25,465 epoch 6 - iter 39/136 - loss 0.02451372 - time (sec): 3.31 - samples/sec: 4665.73 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:58:26,479 epoch 6 - iter 52/136 - loss 0.02065873 - time (sec): 4.32 - samples/sec: 4746.12 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:58:27,522 epoch 6 - iter 65/136 - loss 0.02021702 - time (sec): 5.36 - samples/sec: 4808.08 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:58:28,600 epoch 6 - iter 78/136 - loss 0.01893068 - time (sec): 6.44 - samples/sec: 4809.68 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:58:29,580 epoch 6 - iter 91/136 - loss 0.01805364 - time (sec): 7.42 - samples/sec: 4847.49 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:58:30,512 epoch 6 - iter 104/136 - loss 0.01718406 - time (sec): 8.35 - samples/sec: 4876.08 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:58:31,575 epoch 6 - iter 117/136 - loss 0.01675969 - time (sec): 9.42 - samples/sec: 4846.20 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:58:32,611 epoch 6 - iter 130/136 - loss 0.01889550 - time (sec): 10.45 - samples/sec: 4803.09 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:58:32,983 ----------------------------------------------------------------------------------------------------
2023-10-25 20:58:32,984 EPOCH 6 done: loss 0.0187 - lr: 0.000014
2023-10-25 20:58:34,198 DEV : loss 0.1490204632282257 - f1-score (micro avg) 0.803
2023-10-25 20:58:34,204 ----------------------------------------------------------------------------------------------------
2023-10-25 20:58:35,206 epoch 7 - iter 13/136 - loss 0.01586148 - time (sec): 1.00 - samples/sec: 4925.95 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:58:36,177 epoch 7 - iter 26/136 - loss 0.01681382 - time (sec): 1.97 - samples/sec: 4906.10 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:58:37,190 epoch 7 - iter 39/136 - loss 0.01778000 - time (sec): 2.99 - samples/sec: 5193.78 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:58:38,194 epoch 7 - iter 52/136 - loss 0.01659602 - time (sec): 3.99 - samples/sec: 5214.42 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:58:39,216 epoch 7 - iter 65/136 - loss 0.01555708 - time (sec): 5.01 - samples/sec: 5276.45 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:58:40,258 epoch 7 - iter 78/136 - loss 0.01558607 - time (sec): 6.05 - samples/sec: 5305.58 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:58:41,101 epoch 7 - iter 91/136 - loss 0.01578396 - time (sec): 6.90 - samples/sec: 5234.84 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:58:42,219 epoch 7 - iter 104/136 - loss 0.01483323 - time (sec): 8.01 - samples/sec: 5201.77 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:58:43,132 epoch 7 - iter 117/136 - loss 0.01536701 - time (sec): 8.93 - samples/sec: 5193.89 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:58:44,007 epoch 7 - iter 130/136 - loss 0.01644397 - time (sec): 9.80 - samples/sec: 5080.17 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:58:44,444 ----------------------------------------------------------------------------------------------------
2023-10-25 20:58:44,445 EPOCH 7 done: loss 0.0162 - lr: 0.000010
2023-10-25 20:58:45,626 DEV : loss 0.14709986746311188 - f1-score (micro avg) 0.8096
2023-10-25 20:58:45,632 ----------------------------------------------------------------------------------------------------
2023-10-25 20:58:46,673 epoch 8 - iter 13/136 - loss 0.01048661 - time (sec): 1.04 - samples/sec: 5216.80 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:58:47,884 epoch 8 - iter 26/136 - loss 0.00918910 - time (sec): 2.25 - samples/sec: 4798.99 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:58:48,876 epoch 8 - iter 39/136 - loss 0.00801765 - time (sec): 3.24 - samples/sec: 4835.38 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:58:49,811 epoch 8 - iter 52/136 - loss 0.00951333 - time (sec): 4.18 - samples/sec: 4941.81 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:58:50,861 epoch 8 - iter 65/136 - loss 0.00958210 - time (sec): 5.23 - samples/sec: 5002.81 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:58:51,749 epoch 8 - iter 78/136 - loss 0.01103231 - time (sec): 6.12 - samples/sec: 4905.51 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:58:52,774 epoch 8 - iter 91/136 - loss 0.01146889 - time (sec): 7.14 - samples/sec: 4902.71 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:58:53,756 epoch 8 - iter 104/136 - loss 0.01148418 - time (sec): 8.12 - samples/sec: 4894.10 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:58:54,740 epoch 8 - iter 117/136 - loss 0.01250022 - time (sec): 9.11 - samples/sec: 4937.33 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:58:55,742 epoch 8 - iter 130/136 - loss 0.01246305 - time (sec): 10.11 - samples/sec: 4930.91 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:58:56,172 ----------------------------------------------------------------------------------------------------
2023-10-25 20:58:56,172 EPOCH 8 done: loss 0.0130 - lr: 0.000007
2023-10-25 20:58:57,358 DEV : loss 0.16205309331417084 - f1-score (micro avg) 0.8183
2023-10-25 20:58:57,364 saving best model
2023-10-25 20:58:58,100 ----------------------------------------------------------------------------------------------------
2023-10-25 20:58:59,189 epoch 9 - iter 13/136 - loss 0.00483657 - time (sec): 1.09 - samples/sec: 5339.49 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:59:00,097 epoch 9 - iter 26/136 - loss 0.00526744 - time (sec): 1.99 - samples/sec: 4947.19 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:59:01,077 epoch 9 - iter 39/136 - loss 0.00787094 - time (sec): 2.97 - samples/sec: 5056.61 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:59:02,123 epoch 9 - iter 52/136 - loss 0.01159497 - time (sec): 4.02 - samples/sec: 5193.06 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:59:03,041 epoch 9 - iter 65/136 - loss 0.01128801 - time (sec): 4.94 - samples/sec: 5167.89 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:59:04,110 epoch 9 - iter 78/136 - loss 0.01095419 - time (sec): 6.01 - samples/sec: 5110.67 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:59:05,047 epoch 9 - iter 91/136 - loss 0.01033969 - time (sec): 6.94 - samples/sec: 5006.06 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:59:05,926 epoch 9 - iter 104/136 - loss 0.01030358 - time (sec): 7.82 - samples/sec: 5022.70 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:59:06,884 epoch 9 - iter 117/136 - loss 0.01015395 - time (sec): 8.78 - samples/sec: 5070.38 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:59:07,853 epoch 9 - iter 130/136 - loss 0.00984376 - time (sec): 9.75 - samples/sec: 5102.10 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:59:08,267 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:08,267 EPOCH 9 done: loss 0.0099 - lr: 0.000004
2023-10-25 20:59:09,517 DEV : loss 0.17475153505802155 - f1-score (micro avg) 0.8117
2023-10-25 20:59:09,524 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:10,421 epoch 10 - iter 13/136 - loss 0.01947312 - time (sec): 0.90 - samples/sec: 4741.61 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:59:11,667 epoch 10 - iter 26/136 - loss 0.01091648 - time (sec): 2.14 - samples/sec: 4142.73 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:59:12,604 epoch 10 - iter 39/136 - loss 0.00996893 - time (sec): 3.08 - samples/sec: 4426.47 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:59:13,600 epoch 10 - iter 52/136 - loss 0.00856799 - time (sec): 4.07 - samples/sec: 4501.11 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:59:14,760 epoch 10 - iter 65/136 - loss 0.00788452 - time (sec): 5.23 - samples/sec: 4662.51 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:59:15,743 epoch 10 - iter 78/136 - loss 0.00803522 - time (sec): 6.22 - samples/sec: 4773.14 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:59:16,609 epoch 10 - iter 91/136 - loss 0.00759162 - time (sec): 7.08 - samples/sec: 4782.21 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:59:17,630 epoch 10 - iter 104/136 - loss 0.00729881 - time (sec): 8.10 - samples/sec: 4884.75 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:59:18,560 epoch 10 - iter 117/136 - loss 0.00781195 - time (sec): 9.03 - samples/sec: 4889.68 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:59:19,677 epoch 10 - iter 130/136 - loss 0.00830425 - time (sec): 10.15 - samples/sec: 4910.15 - lr: 0.000000 - momentum: 0.000000
2023-10-25 20:59:20,151 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:20,152 EPOCH 10 done: loss 0.0080 - lr: 0.000000
2023-10-25 20:59:21,449 DEV : loss 0.17531758546829224 - f1-score (micro avg) 0.8147
2023-10-25 20:59:21,950 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:21,951 Loading model from best epoch ...
2023-10-25 20:59:23,863 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-25 20:59:25,930
Results:
- F-score (micro) 0.7987
- F-score (macro) 0.7473
- Accuracy 0.6811
By class:
              precision    recall  f1-score   support

         LOC     0.7989    0.8910    0.8424       312
         PER     0.7328    0.8702    0.7956       208
         ORG     0.5556    0.4545    0.5000        55
   HumanProd     0.8000    0.9091    0.8511        22

   micro avg     0.7579    0.8442    0.7987       597
   macro avg     0.7218    0.7812    0.7473       597
weighted avg     0.7535    0.8442    0.7949       597
2023-10-25 20:59:25,930 ----------------------------------------------------------------------------------------------------