|
2023-10-25 21:31:24,935 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:24,936 Model: "SequenceTagger( |
|
  (embeddings): TransformerWordEmbeddings( |
|
    (model): BertModel( |
|
      (embeddings): BertEmbeddings( |
|
        (word_embeddings): Embedding(64001, 768) |
|
        (position_embeddings): Embedding(512, 768) |
|
        (token_type_embeddings): Embedding(2, 768) |
|
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
        (dropout): Dropout(p=0.1, inplace=False) |
|
      ) |
|
      (encoder): BertEncoder( |
|
        (layer): ModuleList( |
|
          (0-11): 12 x BertLayer( |
|
            (attention): BertAttention( |
|
              (self): BertSelfAttention( |
|
                (query): Linear(in_features=768, out_features=768, bias=True) |
|
                (key): Linear(in_features=768, out_features=768, bias=True) |
|
                (value): Linear(in_features=768, out_features=768, bias=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
              (output): BertSelfOutput( |
|
                (dense): Linear(in_features=768, out_features=768, bias=True) |
|
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
            ) |
|
            (intermediate): BertIntermediate( |
|
              (dense): Linear(in_features=768, out_features=3072, bias=True) |
|
              (intermediate_act_fn): GELUActivation() |
|
            ) |
|
            (output): BertOutput( |
|
              (dense): Linear(in_features=3072, out_features=768, bias=True) |
|
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
              (dropout): Dropout(p=0.1, inplace=False) |
|
            ) |
|
          ) |
|
        ) |
|
      ) |
|
      (pooler): BertPooler( |
|
        (dense): Linear(in_features=768, out_features=768, bias=True) |
|
        (activation): Tanh() |
|
      ) |
|
    ) |
|
  ) |
|
  (locked_dropout): LockedDropout(p=0.5) |
|
  (linear): Linear(in_features=768, out_features=17, bias=True) |
|
  (loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-25 21:31:24,936 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:24,936 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-25 21:31:24,936 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:24,937 Train: 1085 sentences |
|
2023-10-25 21:31:24,937 (train_with_dev=False, train_with_test=False) |
|
2023-10-25 21:31:24,937 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:24,937 Training Params: |
|
2023-10-25 21:31:24,937 - learning_rate: "3e-05" |
|
2023-10-25 21:31:24,937 - mini_batch_size: "8" |
|
2023-10-25 21:31:24,937 - max_epochs: "10" |
|
2023-10-25 21:31:24,937 - shuffle: "True" |
|
2023-10-25 21:31:24,937 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:24,937 Plugins: |
|
2023-10-25 21:31:24,937 - TensorboardLogger |
|
2023-10-25 21:31:24,937 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-25 21:31:24,937 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:24,937 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-25 21:31:24,937 - metric: "('micro avg', 'f1-score')" |
|
2023-10-25 21:31:24,937 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:24,937 Computation: |
|
2023-10-25 21:31:24,937 - compute on device: cuda:0 |
|
2023-10-25 21:31:24,937 - embedding storage: none |
|
2023-10-25 21:31:24,937 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:24,937 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-25 21:31:24,937 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:24,937 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:24,937 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-25 21:31:25,855 epoch 1 - iter 13/136 - loss 2.67862976 - time (sec): 0.92 - samples/sec: 5455.64 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 21:31:26,870 epoch 1 - iter 26/136 - loss 2.31862643 - time (sec): 1.93 - samples/sec: 5241.39 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 21:31:27,857 epoch 1 - iter 39/136 - loss 1.79659135 - time (sec): 2.92 - samples/sec: 5245.24 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 21:31:28,918 epoch 1 - iter 52/136 - loss 1.46309717 - time (sec): 3.98 - samples/sec: 5273.26 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 21:31:29,964 epoch 1 - iter 65/136 - loss 1.28129944 - time (sec): 5.03 - samples/sec: 5140.43 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 21:31:31,013 epoch 1 - iter 78/136 - loss 1.14224358 - time (sec): 6.07 - samples/sec: 5077.33 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 21:31:32,039 epoch 1 - iter 91/136 - loss 1.02123046 - time (sec): 7.10 - samples/sec: 5116.90 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 21:31:33,016 epoch 1 - iter 104/136 - loss 0.93704629 - time (sec): 8.08 - samples/sec: 5082.36 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 21:31:34,048 epoch 1 - iter 117/136 - loss 0.86134920 - time (sec): 9.11 - samples/sec: 5049.79 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 21:31:34,972 epoch 1 - iter 130/136 - loss 0.81315635 - time (sec): 10.03 - samples/sec: 4978.28 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 21:31:35,395 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:35,395 EPOCH 1 done: loss 0.7886 - lr: 0.000028 |
|
2023-10-25 21:31:36,496 DEV : loss 0.15381869673728943 - f1-score (micro avg) 0.6475 |
|
2023-10-25 21:31:36,503 saving best model |
|
2023-10-25 21:31:37,020 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:38,009 epoch 2 - iter 13/136 - loss 0.14729279 - time (sec): 0.99 - samples/sec: 5174.73 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 21:31:38,993 epoch 2 - iter 26/136 - loss 0.17121608 - time (sec): 1.97 - samples/sec: 5363.99 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 21:31:40,025 epoch 2 - iter 39/136 - loss 0.16370443 - time (sec): 3.00 - samples/sec: 4965.75 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 21:31:41,014 epoch 2 - iter 52/136 - loss 0.15904671 - time (sec): 3.99 - samples/sec: 4935.00 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 21:31:41,969 epoch 2 - iter 65/136 - loss 0.15169814 - time (sec): 4.95 - samples/sec: 4959.65 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 21:31:42,925 epoch 2 - iter 78/136 - loss 0.15515706 - time (sec): 5.90 - samples/sec: 5047.51 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 21:31:43,922 epoch 2 - iter 91/136 - loss 0.15141408 - time (sec): 6.90 - samples/sec: 5069.46 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 21:31:44,959 epoch 2 - iter 104/136 - loss 0.15001351 - time (sec): 7.94 - samples/sec: 4974.03 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 21:31:45,937 epoch 2 - iter 117/136 - loss 0.14937968 - time (sec): 8.92 - samples/sec: 5053.65 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 21:31:46,900 epoch 2 - iter 130/136 - loss 0.14671523 - time (sec): 9.88 - samples/sec: 5019.57 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 21:31:47,358 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:47,359 EPOCH 2 done: loss 0.1457 - lr: 0.000027 |
|
2023-10-25 21:31:48,659 DEV : loss 0.10822859406471252 - f1-score (micro avg) 0.7601 |
|
2023-10-25 21:31:48,666 saving best model |
|
2023-10-25 21:31:49,411 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:50,343 epoch 3 - iter 13/136 - loss 0.09159492 - time (sec): 0.93 - samples/sec: 4581.24 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 21:31:51,270 epoch 3 - iter 26/136 - loss 0.09188390 - time (sec): 1.86 - samples/sec: 4829.89 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 21:31:52,317 epoch 3 - iter 39/136 - loss 0.07817653 - time (sec): 2.90 - samples/sec: 4913.41 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 21:31:53,237 epoch 3 - iter 52/136 - loss 0.07930223 - time (sec): 3.82 - samples/sec: 4985.05 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 21:31:54,301 epoch 3 - iter 65/136 - loss 0.07711853 - time (sec): 4.89 - samples/sec: 4930.78 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 21:31:55,353 epoch 3 - iter 78/136 - loss 0.07404006 - time (sec): 5.94 - samples/sec: 5104.97 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 21:31:56,460 epoch 3 - iter 91/136 - loss 0.07528584 - time (sec): 7.05 - samples/sec: 5062.43 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 21:31:57,382 epoch 3 - iter 104/136 - loss 0.07545838 - time (sec): 7.97 - samples/sec: 5024.83 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 21:31:58,355 epoch 3 - iter 117/136 - loss 0.07590779 - time (sec): 8.94 - samples/sec: 4957.33 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 21:31:59,312 epoch 3 - iter 130/136 - loss 0.07606308 - time (sec): 9.90 - samples/sec: 4978.39 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 21:31:59,829 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:31:59,830 EPOCH 3 done: loss 0.0758 - lr: 0.000024 |
|
2023-10-25 21:32:01,578 DEV : loss 0.10140043497085571 - f1-score (micro avg) 0.7633 |
|
2023-10-25 21:32:01,585 saving best model |
|
2023-10-25 21:32:02,308 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:32:03,306 epoch 4 - iter 13/136 - loss 0.04720147 - time (sec): 1.00 - samples/sec: 5398.05 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 21:32:04,364 epoch 4 - iter 26/136 - loss 0.04723812 - time (sec): 2.05 - samples/sec: 5475.58 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 21:32:05,451 epoch 4 - iter 39/136 - loss 0.04375497 - time (sec): 3.14 - samples/sec: 5265.04 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 21:32:06,345 epoch 4 - iter 52/136 - loss 0.04366552 - time (sec): 4.04 - samples/sec: 5215.14 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 21:32:07,242 epoch 4 - iter 65/136 - loss 0.04276438 - time (sec): 4.93 - samples/sec: 5140.59 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 21:32:08,243 epoch 4 - iter 78/136 - loss 0.04392016 - time (sec): 5.93 - samples/sec: 5061.38 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 21:32:09,338 epoch 4 - iter 91/136 - loss 0.04355417 - time (sec): 7.03 - samples/sec: 5010.34 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 21:32:10,401 epoch 4 - iter 104/136 - loss 0.04553449 - time (sec): 8.09 - samples/sec: 5030.91 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 21:32:11,297 epoch 4 - iter 117/136 - loss 0.04613281 - time (sec): 8.99 - samples/sec: 5012.27 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 21:32:12,347 epoch 4 - iter 130/136 - loss 0.04531020 - time (sec): 10.04 - samples/sec: 4970.93 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 21:32:12,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:32:12,767 EPOCH 4 done: loss 0.0456 - lr: 0.000020 |
|
2023-10-25 21:32:14,146 DEV : loss 0.10330618172883987 - f1-score (micro avg) 0.8 |
|
2023-10-25 21:32:14,152 saving best model |
|
2023-10-25 21:32:14,883 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:32:15,913 epoch 5 - iter 13/136 - loss 0.02499789 - time (sec): 1.03 - samples/sec: 4912.36 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 21:32:16,874 epoch 5 - iter 26/136 - loss 0.01992048 - time (sec): 1.99 - samples/sec: 4736.67 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 21:32:17,823 epoch 5 - iter 39/136 - loss 0.02764478 - time (sec): 2.94 - samples/sec: 4782.33 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 21:32:18,798 epoch 5 - iter 52/136 - loss 0.02728486 - time (sec): 3.91 - samples/sec: 4825.41 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 21:32:19,695 epoch 5 - iter 65/136 - loss 0.03034797 - time (sec): 4.81 - samples/sec: 4828.95 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 21:32:20,826 epoch 5 - iter 78/136 - loss 0.03063707 - time (sec): 5.94 - samples/sec: 4901.53 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 21:32:22,099 epoch 5 - iter 91/136 - loss 0.03029661 - time (sec): 7.21 - samples/sec: 4867.24 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 21:32:23,087 epoch 5 - iter 104/136 - loss 0.03057002 - time (sec): 8.20 - samples/sec: 4899.67 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 21:32:23,977 epoch 5 - iter 117/136 - loss 0.03238794 - time (sec): 9.09 - samples/sec: 4905.19 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 21:32:24,939 epoch 5 - iter 130/136 - loss 0.03113197 - time (sec): 10.05 - samples/sec: 4943.76 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 21:32:25,365 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:32:25,366 EPOCH 5 done: loss 0.0304 - lr: 0.000017 |
|
2023-10-25 21:32:27,025 DEV : loss 0.1195770800113678 - f1-score (micro avg) 0.8037 |
|
2023-10-25 21:32:27,032 saving best model |
|
2023-10-25 21:32:27,749 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:32:28,803 epoch 6 - iter 13/136 - loss 0.01756530 - time (sec): 1.05 - samples/sec: 5366.79 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 21:32:29,850 epoch 6 - iter 26/136 - loss 0.02348632 - time (sec): 2.10 - samples/sec: 5058.44 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 21:32:30,796 epoch 6 - iter 39/136 - loss 0.02052357 - time (sec): 3.05 - samples/sec: 5110.69 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 21:32:31,849 epoch 6 - iter 52/136 - loss 0.02138897 - time (sec): 4.10 - samples/sec: 4940.68 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 21:32:32,811 epoch 6 - iter 65/136 - loss 0.02488717 - time (sec): 5.06 - samples/sec: 4853.29 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 21:32:33,815 epoch 6 - iter 78/136 - loss 0.02260196 - time (sec): 6.06 - samples/sec: 4945.45 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 21:32:34,799 epoch 6 - iter 91/136 - loss 0.02378811 - time (sec): 7.05 - samples/sec: 4971.81 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 21:32:35,887 epoch 6 - iter 104/136 - loss 0.02407465 - time (sec): 8.14 - samples/sec: 5025.23 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 21:32:36,916 epoch 6 - iter 117/136 - loss 0.02377918 - time (sec): 9.17 - samples/sec: 4953.62 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 21:32:37,826 epoch 6 - iter 130/136 - loss 0.02268322 - time (sec): 10.08 - samples/sec: 5001.50 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 21:32:38,197 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:32:38,197 EPOCH 6 done: loss 0.0222 - lr: 0.000014 |
|
2023-10-25 21:32:39,383 DEV : loss 0.1351010650396347 - f1-score (micro avg) 0.7927 |
|
2023-10-25 21:32:39,390 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:32:40,439 epoch 7 - iter 13/136 - loss 0.01667780 - time (sec): 1.05 - samples/sec: 4199.34 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 21:32:41,346 epoch 7 - iter 26/136 - loss 0.01388388 - time (sec): 1.95 - samples/sec: 4535.34 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 21:32:42,346 epoch 7 - iter 39/136 - loss 0.01450612 - time (sec): 2.95 - samples/sec: 4460.74 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 21:32:43,416 epoch 7 - iter 52/136 - loss 0.01658526 - time (sec): 4.03 - samples/sec: 4680.42 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 21:32:44,330 epoch 7 - iter 65/136 - loss 0.01539127 - time (sec): 4.94 - samples/sec: 4724.57 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 21:32:45,264 epoch 7 - iter 78/136 - loss 0.01885348 - time (sec): 5.87 - samples/sec: 4879.61 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 21:32:46,287 epoch 7 - iter 91/136 - loss 0.01876729 - time (sec): 6.90 - samples/sec: 4938.69 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 21:32:47,232 epoch 7 - iter 104/136 - loss 0.01819473 - time (sec): 7.84 - samples/sec: 4961.39 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 21:32:48,233 epoch 7 - iter 117/136 - loss 0.01693316 - time (sec): 8.84 - samples/sec: 5000.54 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 21:32:49,163 epoch 7 - iter 130/136 - loss 0.01647128 - time (sec): 9.77 - samples/sec: 5032.42 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 21:32:49,680 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:32:49,681 EPOCH 7 done: loss 0.0166 - lr: 0.000010 |
|
2023-10-25 21:32:50,858 DEV : loss 0.1340150088071823 - f1-score (micro avg) 0.8 |
|
2023-10-25 21:32:50,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:32:52,199 epoch 8 - iter 13/136 - loss 0.00560534 - time (sec): 1.33 - samples/sec: 3356.32 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 21:32:53,265 epoch 8 - iter 26/136 - loss 0.00976307 - time (sec): 2.40 - samples/sec: 4207.42 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 21:32:54,304 epoch 8 - iter 39/136 - loss 0.01070634 - time (sec): 3.44 - samples/sec: 4583.64 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 21:32:55,296 epoch 8 - iter 52/136 - loss 0.01208915 - time (sec): 4.43 - samples/sec: 4643.84 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 21:32:56,323 epoch 8 - iter 65/136 - loss 0.01109595 - time (sec): 5.46 - samples/sec: 4568.73 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 21:32:57,330 epoch 8 - iter 78/136 - loss 0.01173631 - time (sec): 6.46 - samples/sec: 4733.52 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 21:32:58,296 epoch 8 - iter 91/136 - loss 0.01105965 - time (sec): 7.43 - samples/sec: 4775.36 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 21:32:59,316 epoch 8 - iter 104/136 - loss 0.01148285 - time (sec): 8.45 - samples/sec: 4777.31 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 21:33:00,234 epoch 8 - iter 117/136 - loss 0.01174659 - time (sec): 9.37 - samples/sec: 4800.88 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 21:33:01,265 epoch 8 - iter 130/136 - loss 0.01071709 - time (sec): 10.40 - samples/sec: 4801.33 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 21:33:01,692 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:33:01,693 EPOCH 8 done: loss 0.0121 - lr: 0.000007 |
|
2023-10-25 21:33:02,845 DEV : loss 0.16605901718139648 - f1-score (micro avg) 0.8051 |
|
2023-10-25 21:33:02,852 saving best model |
|
2023-10-25 21:33:03,579 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:33:04,564 epoch 9 - iter 13/136 - loss 0.00519371 - time (sec): 0.98 - samples/sec: 4935.82 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 21:33:05,403 epoch 9 - iter 26/136 - loss 0.00741030 - time (sec): 1.81 - samples/sec: 4742.54 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 21:33:06,394 epoch 9 - iter 39/136 - loss 0.00860202 - time (sec): 2.81 - samples/sec: 4879.48 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 21:33:07,354 epoch 9 - iter 52/136 - loss 0.00994720 - time (sec): 3.76 - samples/sec: 4840.00 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 21:33:08,435 epoch 9 - iter 65/136 - loss 0.00914803 - time (sec): 4.85 - samples/sec: 4892.40 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 21:33:09,545 epoch 9 - iter 78/136 - loss 0.00869169 - time (sec): 5.96 - samples/sec: 4949.71 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 21:33:10,573 epoch 9 - iter 91/136 - loss 0.00816170 - time (sec): 6.98 - samples/sec: 5004.09 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 21:33:11,706 epoch 9 - iter 104/136 - loss 0.00799395 - time (sec): 8.12 - samples/sec: 5036.45 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 21:33:12,646 epoch 9 - iter 117/136 - loss 0.00901235 - time (sec): 9.06 - samples/sec: 5069.49 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 21:33:13,534 epoch 9 - iter 130/136 - loss 0.00901908 - time (sec): 9.95 - samples/sec: 5056.08 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 21:33:13,901 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:33:13,901 EPOCH 9 done: loss 0.0090 - lr: 0.000004 |
|
2023-10-25 21:33:15,092 DEV : loss 0.17188507318496704 - f1-score (micro avg) 0.8124 |
|
2023-10-25 21:33:15,098 saving best model |
|
2023-10-25 21:33:15,802 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:33:16,783 epoch 10 - iter 13/136 - loss 0.00872606 - time (sec): 0.97 - samples/sec: 4629.98 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 21:33:17,714 epoch 10 - iter 26/136 - loss 0.01279988 - time (sec): 1.91 - samples/sec: 4849.49 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 21:33:19,030 epoch 10 - iter 39/136 - loss 0.00977020 - time (sec): 3.22 - samples/sec: 4685.56 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 21:33:19,887 epoch 10 - iter 52/136 - loss 0.00929420 - time (sec): 4.08 - samples/sec: 4725.22 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 21:33:20,819 epoch 10 - iter 65/136 - loss 0.00906255 - time (sec): 5.01 - samples/sec: 4785.44 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 21:33:21,900 epoch 10 - iter 78/136 - loss 0.00778176 - time (sec): 6.09 - samples/sec: 4778.21 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 21:33:22,977 epoch 10 - iter 91/136 - loss 0.00714978 - time (sec): 7.17 - samples/sec: 4762.80 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 21:33:23,940 epoch 10 - iter 104/136 - loss 0.00657170 - time (sec): 8.13 - samples/sec: 4838.79 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 21:33:24,878 epoch 10 - iter 117/136 - loss 0.00701785 - time (sec): 9.07 - samples/sec: 4910.58 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 21:33:25,928 epoch 10 - iter 130/136 - loss 0.00745080 - time (sec): 10.12 - samples/sec: 4919.23 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-25 21:33:26,456 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:33:26,457 EPOCH 10 done: loss 0.0077 - lr: 0.000000 |
|
2023-10-25 21:33:27,624 DEV : loss 0.16700904071331024 - f1-score (micro avg) 0.8145 |
|
2023-10-25 21:33:27,630 saving best model |
|
2023-10-25 21:33:28,840 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 21:33:28,841 Loading model from best epoch ... |
|
2023-10-25 21:33:30,739 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-25 21:33:32,729 |
|
Results: |
|
- F-score (micro) 0.7818 |
|
- F-score (macro) 0.7411 |
|
- Accuracy 0.657 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.7959    0.8622    0.8277       312 |
|
         PER     0.6894    0.8750    0.7712       208 |
|
         ORG     0.5400    0.4909    0.5143        55 |
|
   HumanProd     0.8000    0.9091    0.8511        22 |
|
|
|
   micro avg     0.7356    0.8342    0.7818       597 |
|
   macro avg     0.7063    0.7843    0.7411       597 |
|
weighted avg     0.7353    0.8342    0.7800       597 |
|
|
|
2023-10-25 21:33:32,729 ---------------------------------------------------------------------------------------------------- |
|
|