2023-10-25 21:28:09,101 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:09,102 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:28:09,102 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:09,102 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 21:28:09,102 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:09,103 Train: 1085 sentences
2023-10-25 21:28:09,103 (train_with_dev=False, train_with_test=False)
2023-10-25 21:28:09,103 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:09,103 Training Params:
2023-10-25 21:28:09,103 - learning_rate: "5e-05"
2023-10-25 21:28:09,103 - mini_batch_size: "4"
2023-10-25 21:28:09,103 - max_epochs: "10"
2023-10-25 21:28:09,103 - shuffle: "True"
2023-10-25 21:28:09,103 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:09,103 Plugins:
2023-10-25 21:28:09,103 - TensorboardLogger
2023-10-25 21:28:09,103 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:28:09,103 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:09,103 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:28:09,103 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:28:09,103 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:09,103 Computation:
2023-10-25 21:28:09,103 - compute on device: cuda:0
2023-10-25 21:28:09,103 - embedding storage: none
2023-10-25 21:28:09,103 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:09,103 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 21:28:09,103 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:09,103 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:09,103 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:28:10,568 epoch 1 - iter 27/272 - loss 2.78765850 - time (sec): 1.46 - samples/sec: 3749.30 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:28:12,027 epoch 1 - iter 54/272 - loss 2.00101582 - time (sec): 2.92 - samples/sec: 3713.94 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:28:13,451 epoch 1 - iter 81/272 - loss 1.57394260 - time (sec): 4.35 - samples/sec: 3560.78 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:28:14,978 epoch 1 - iter 108/272 - loss 1.25429045 - time (sec): 5.87 - samples/sec: 3555.28 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:28:16,446 epoch 1 - iter 135/272 - loss 1.07013551 - time (sec): 7.34 - samples/sec: 3469.41 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:28:17,961 epoch 1 - iter 162/272 - loss 0.92511254 - time (sec): 8.86 - samples/sec: 3510.43 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:28:19,538 epoch 1 - iter 189/272 - loss 0.82909206 - time (sec): 10.43 - samples/sec: 3458.10 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:28:21,053 epoch 1 - iter 216/272 - loss 0.74558155 - time (sec): 11.95 - samples/sec: 3467.28 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:28:22,540 epoch 1 - iter 243/272 - loss 0.68899771 - time (sec): 13.44 - samples/sec: 3437.71 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:28:24,103 epoch 1 - iter 270/272 - loss 0.64040727 - time (sec): 15.00 - samples/sec: 3450.74 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:28:24,209 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:24,209 EPOCH 1 done: loss 0.6380 - lr: 0.000049
2023-10-25 21:28:24,928 DEV : loss 0.14035949110984802 - f1-score (micro avg) 0.6881
2023-10-25 21:28:24,934 saving best model
2023-10-25 21:28:25,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:26,882 epoch 2 - iter 27/272 - loss 0.09840230 - time (sec): 1.52 - samples/sec: 3106.26 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:28:28,448 epoch 2 - iter 54/272 - loss 0.09235375 - time (sec): 3.08 - samples/sec: 3434.25 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:28:30,007 epoch 2 - iter 81/272 - loss 0.11428755 - time (sec): 4.64 - samples/sec: 3233.28 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:28:31,513 epoch 2 - iter 108/272 - loss 0.11632033 - time (sec): 6.15 - samples/sec: 3451.06 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:28:33,039 epoch 2 - iter 135/272 - loss 0.11619477 - time (sec): 7.67 - samples/sec: 3453.70 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:28:34,571 epoch 2 - iter 162/272 - loss 0.11865776 - time (sec): 9.21 - samples/sec: 3407.72 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:28:36,059 epoch 2 - iter 189/272 - loss 0.11920523 - time (sec): 10.69 - samples/sec: 3470.34 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:28:37,574 epoch 2 - iter 216/272 - loss 0.11666654 - time (sec): 12.21 - samples/sec: 3540.98 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:28:39,053 epoch 2 - iter 243/272 - loss 0.11787206 - time (sec): 13.69 - samples/sec: 3477.79 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:28:40,560 epoch 2 - iter 270/272 - loss 0.12009223 - time (sec): 15.19 - samples/sec: 3412.65 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:28:40,657 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:40,657 EPOCH 2 done: loss 0.1202 - lr: 0.000045
2023-10-25 21:28:41,905 DEV : loss 0.12892909348011017 - f1-score (micro avg) 0.7395
2023-10-25 21:28:41,911 saving best model
2023-10-25 21:28:42,606 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:44,437 epoch 3 - iter 27/272 - loss 0.09342546 - time (sec): 1.82 - samples/sec: 2742.45 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:28:45,908 epoch 3 - iter 54/272 - loss 0.07117048 - time (sec): 3.29 - samples/sec: 3278.77 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:28:47,432 epoch 3 - iter 81/272 - loss 0.06750136 - time (sec): 4.81 - samples/sec: 3238.86 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:28:48,965 epoch 3 - iter 108/272 - loss 0.06927074 - time (sec): 6.34 - samples/sec: 3317.73 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:28:50,533 epoch 3 - iter 135/272 - loss 0.06639235 - time (sec): 7.91 - samples/sec: 3380.15 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:28:52,011 epoch 3 - iter 162/272 - loss 0.06610577 - time (sec): 9.39 - samples/sec: 3350.69 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:28:53,559 epoch 3 - iter 189/272 - loss 0.07007724 - time (sec): 10.94 - samples/sec: 3314.61 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:28:55,123 epoch 3 - iter 216/272 - loss 0.06922837 - time (sec): 12.50 - samples/sec: 3315.90 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:28:56,720 epoch 3 - iter 243/272 - loss 0.06963662 - time (sec): 14.10 - samples/sec: 3284.26 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:28:58,265 epoch 3 - iter 270/272 - loss 0.06919947 - time (sec): 15.64 - samples/sec: 3310.51 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:28:58,371 ----------------------------------------------------------------------------------------------------
2023-10-25 21:28:58,371 EPOCH 3 done: loss 0.0690 - lr: 0.000039
2023-10-25 21:28:59,614 DEV : loss 0.1450655460357666 - f1-score (micro avg) 0.7698
2023-10-25 21:28:59,620 saving best model
2023-10-25 21:29:00,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:29:01,797 epoch 4 - iter 27/272 - loss 0.03839164 - time (sec): 1.52 - samples/sec: 3389.43 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:29:03,246 epoch 4 - iter 54/272 - loss 0.05836842 - time (sec): 2.97 - samples/sec: 3375.16 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:29:04,837 epoch 4 - iter 81/272 - loss 0.05467419 - time (sec): 4.56 - samples/sec: 3543.96 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:29:06,369 epoch 4 - iter 108/272 - loss 0.05545031 - time (sec): 6.09 - samples/sec: 3456.12 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:29:07,934 epoch 4 - iter 135/272 - loss 0.05172334 - time (sec): 7.66 - samples/sec: 3437.62 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:29:09,467 epoch 4 - iter 162/272 - loss 0.05002860 - time (sec): 9.19 - samples/sec: 3467.60 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:29:10,944 epoch 4 - iter 189/272 - loss 0.04822871 - time (sec): 10.67 - samples/sec: 3465.66 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:29:12,476 epoch 4 - iter 216/272 - loss 0.04524283 - time (sec): 12.20 - samples/sec: 3457.22 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:29:13,983 epoch 4 - iter 243/272 - loss 0.04533348 - time (sec): 13.71 - samples/sec: 3443.64 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:29:15,536 epoch 4 - iter 270/272 - loss 0.04712815 - time (sec): 15.26 - samples/sec: 3395.72 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:29:15,635 ----------------------------------------------------------------------------------------------------
2023-10-25 21:29:15,636 EPOCH 4 done: loss 0.0470 - lr: 0.000033
2023-10-25 21:29:16,808 DEV : loss 0.14436790347099304 - f1-score (micro avg) 0.7925
2023-10-25 21:29:16,814 saving best model
2023-10-25 21:29:17,460 ----------------------------------------------------------------------------------------------------
2023-10-25 21:29:19,028 epoch 5 - iter 27/272 - loss 0.02507648 - time (sec): 1.56 - samples/sec: 3238.49 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:29:20,566 epoch 5 - iter 54/272 - loss 0.03155307 - time (sec): 3.10 - samples/sec: 3166.66 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:29:22,132 epoch 5 - iter 81/272 - loss 0.02739434 - time (sec): 4.67 - samples/sec: 3166.84 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:29:23,726 epoch 5 - iter 108/272 - loss 0.02941651 - time (sec): 6.26 - samples/sec: 3191.64 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:29:25,199 epoch 5 - iter 135/272 - loss 0.02739614 - time (sec): 7.74 - samples/sec: 3135.22 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:29:26,698 epoch 5 - iter 162/272 - loss 0.03066702 - time (sec): 9.23 - samples/sec: 3237.55 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:29:28,219 epoch 5 - iter 189/272 - loss 0.03167286 - time (sec): 10.76 - samples/sec: 3268.30 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:29:29,737 epoch 5 - iter 216/272 - loss 0.03130840 - time (sec): 12.27 - samples/sec: 3274.44 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:29:31,274 epoch 5 - iter 243/272 - loss 0.03201179 - time (sec): 13.81 - samples/sec: 3346.11 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:29:32,794 epoch 5 - iter 270/272 - loss 0.03426102 - time (sec): 15.33 - samples/sec: 3380.00 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:29:32,902 ----------------------------------------------------------------------------------------------------
2023-10-25 21:29:32,902 EPOCH 5 done: loss 0.0342 - lr: 0.000028
2023-10-25 21:29:34,118 DEV : loss 0.1501101553440094 - f1-score (micro avg) 0.7964
2023-10-25 21:29:34,124 saving best model
2023-10-25 21:29:34,805 ----------------------------------------------------------------------------------------------------
2023-10-25 21:29:36,368 epoch 6 - iter 27/272 - loss 0.01909779 - time (sec): 1.56 - samples/sec: 3493.85 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:29:37,843 epoch 6 - iter 54/272 - loss 0.01727977 - time (sec): 3.04 - samples/sec: 3533.53 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:29:39,322 epoch 6 - iter 81/272 - loss 0.01553708 - time (sec): 4.51 - samples/sec: 3427.14 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:29:40,841 epoch 6 - iter 108/272 - loss 0.01998247 - time (sec): 6.03 - samples/sec: 3439.13 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:29:42,736 epoch 6 - iter 135/272 - loss 0.01744104 - time (sec): 7.93 - samples/sec: 3234.22 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:29:44,174 epoch 6 - iter 162/272 - loss 0.01847424 - time (sec): 9.37 - samples/sec: 3320.47 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:29:45,642 epoch 6 - iter 189/272 - loss 0.01935988 - time (sec): 10.83 - samples/sec: 3383.82 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:29:47,102 epoch 6 - iter 216/272 - loss 0.01931684 - time (sec): 12.29 - samples/sec: 3326.04 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:29:48,602 epoch 6 - iter 243/272 - loss 0.01934257 - time (sec): 13.79 - samples/sec: 3372.67 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:29:50,066 epoch 6 - iter 270/272 - loss 0.02070140 - time (sec): 15.26 - samples/sec: 3379.89 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:29:50,171 ----------------------------------------------------------------------------------------------------
2023-10-25 21:29:50,172 EPOCH 6 done: loss 0.0205 - lr: 0.000022
2023-10-25 21:29:51,433 DEV : loss 0.165546253323555 - f1-score (micro avg) 0.8066
2023-10-25 21:29:51,440 saving best model
2023-10-25 21:29:54,714 ----------------------------------------------------------------------------------------------------
2023-10-25 21:29:56,247 epoch 7 - iter 27/272 - loss 0.00697830 - time (sec): 1.53 - samples/sec: 3623.25 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:29:57,777 epoch 7 - iter 54/272 - loss 0.00914022 - time (sec): 3.06 - samples/sec: 3451.66 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:29:59,278 epoch 7 - iter 81/272 - loss 0.00936467 - time (sec): 4.56 - samples/sec: 3464.62 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:30:00,780 epoch 7 - iter 108/272 - loss 0.00900487 - time (sec): 6.06 - samples/sec: 3543.39 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:30:02,352 epoch 7 - iter 135/272 - loss 0.01188333 - time (sec): 7.64 - samples/sec: 3443.14 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:30:03,855 epoch 7 - iter 162/272 - loss 0.01348404 - time (sec): 9.14 - samples/sec: 3432.95 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:30:05,413 epoch 7 - iter 189/272 - loss 0.01577447 - time (sec): 10.70 - samples/sec: 3464.67 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:30:06,948 epoch 7 - iter 216/272 - loss 0.01509764 - time (sec): 12.23 - samples/sec: 3455.51 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:30:08,406 epoch 7 - iter 243/272 - loss 0.01702924 - time (sec): 13.69 - samples/sec: 3423.92 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:30:09,929 epoch 7 - iter 270/272 - loss 0.01657090 - time (sec): 15.21 - samples/sec: 3393.89 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:30:10,042 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:10,042 EPOCH 7 done: loss 0.0165 - lr: 0.000017
2023-10-25 21:30:11,380 DEV : loss 0.18749994039535522 - f1-score (micro avg) 0.8036
2023-10-25 21:30:11,386 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:12,890 epoch 8 - iter 27/272 - loss 0.01739788 - time (sec): 1.50 - samples/sec: 4044.12 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:30:14,357 epoch 8 - iter 54/272 - loss 0.01871958 - time (sec): 2.97 - samples/sec: 3716.48 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:30:15,894 epoch 8 - iter 81/272 - loss 0.01462857 - time (sec): 4.51 - samples/sec: 3692.83 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:30:17,387 epoch 8 - iter 108/272 - loss 0.01387024 - time (sec): 6.00 - samples/sec: 3631.22 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:30:18,930 epoch 8 - iter 135/272 - loss 0.01409383 - time (sec): 7.54 - samples/sec: 3613.71 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:30:20,417 epoch 8 - iter 162/272 - loss 0.01425509 - time (sec): 9.03 - samples/sec: 3556.26 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:30:21,950 epoch 8 - iter 189/272 - loss 0.01373470 - time (sec): 10.56 - samples/sec: 3529.86 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:30:23,498 epoch 8 - iter 216/272 - loss 0.01224350 - time (sec): 12.11 - samples/sec: 3512.96 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:30:25,052 epoch 8 - iter 243/272 - loss 0.01131915 - time (sec): 13.67 - samples/sec: 3470.03 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:30:26,502 epoch 8 - iter 270/272 - loss 0.01157460 - time (sec): 15.12 - samples/sec: 3431.48 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:30:26,605 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:26,605 EPOCH 8 done: loss 0.0115 - lr: 0.000011
2023-10-25 21:30:27,835 DEV : loss 0.17698289453983307 - f1-score (micro avg) 0.8266
2023-10-25 21:30:27,841 saving best model
2023-10-25 21:30:28,465 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:30,036 epoch 9 - iter 27/272 - loss 0.00058588 - time (sec): 1.57 - samples/sec: 3749.18 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:30:31,531 epoch 9 - iter 54/272 - loss 0.00623629 - time (sec): 3.06 - samples/sec: 3569.82 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:30:33,037 epoch 9 - iter 81/272 - loss 0.00539403 - time (sec): 4.57 - samples/sec: 3400.69 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:30:34,548 epoch 9 - iter 108/272 - loss 0.00741251 - time (sec): 6.08 - samples/sec: 3341.22 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:30:36,038 epoch 9 - iter 135/272 - loss 0.00926729 - time (sec): 7.57 - samples/sec: 3391.09 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:30:37,554 epoch 9 - iter 162/272 - loss 0.00852067 - time (sec): 9.09 - samples/sec: 3414.77 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:30:39,074 epoch 9 - iter 189/272 - loss 0.00788998 - time (sec): 10.61 - samples/sec: 3466.92 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:30:40,588 epoch 9 - iter 216/272 - loss 0.00711701 - time (sec): 12.12 - samples/sec: 3418.25 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:30:42,061 epoch 9 - iter 243/272 - loss 0.00659543 - time (sec): 13.59 - samples/sec: 3439.13 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:30:43,592 epoch 9 - iter 270/272 - loss 0.00744715 - time (sec): 15.13 - samples/sec: 3427.52 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:30:43,695 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:43,696 EPOCH 9 done: loss 0.0074 - lr: 0.000006
2023-10-25 21:30:45,471 DEV : loss 0.1907862275838852 - f1-score (micro avg) 0.822
2023-10-25 21:30:45,478 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:46,944 epoch 10 - iter 27/272 - loss 0.00585267 - time (sec): 1.47 - samples/sec: 3723.63 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:30:48,393 epoch 10 - iter 54/272 - loss 0.00495959 - time (sec): 2.91 - samples/sec: 3508.61 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:30:49,885 epoch 10 - iter 81/272 - loss 0.00450007 - time (sec): 4.41 - samples/sec: 3559.91 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:30:51,343 epoch 10 - iter 108/272 - loss 0.00425761 - time (sec): 5.86 - samples/sec: 3633.67 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:30:52,829 epoch 10 - iter 135/272 - loss 0.00470265 - time (sec): 7.35 - samples/sec: 3550.95 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:30:54,275 epoch 10 - iter 162/272 - loss 0.00419178 - time (sec): 8.80 - samples/sec: 3540.66 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:30:55,718 epoch 10 - iter 189/272 - loss 0.00450719 - time (sec): 10.24 - samples/sec: 3586.16 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:30:57,167 epoch 10 - iter 216/272 - loss 0.00419873 - time (sec): 11.69 - samples/sec: 3564.16 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:30:58,628 epoch 10 - iter 243/272 - loss 0.00473732 - time (sec): 13.15 - samples/sec: 3537.51 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:31:00,035 epoch 10 - iter 270/272 - loss 0.00519675 - time (sec): 14.56 - samples/sec: 3555.18 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:31:00,128 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:00,128 EPOCH 10 done: loss 0.0052 - lr: 0.000000
2023-10-25 21:31:01,313 DEV : loss 0.19103476405143738 - f1-score (micro avg) 0.8205
2023-10-25 21:31:01,785 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:01,786 Loading model from best epoch ...
2023-10-25 21:31:03,678 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-25 21:31:05,500
Results:
- F-score (micro) 0.7823
- F-score (macro) 0.7373
- Accuracy 0.664
By class:
              precision    recall  f1-score   support
         LOC     0.8000    0.8718    0.8344       312
         PER     0.6947    0.8750    0.7745       208
         ORG     0.4545    0.3636    0.4040        55
   HumanProd     0.8800    1.0000    0.9362        22
   micro avg     0.7392    0.8308    0.7823       597
   macro avg     0.7073    0.7776    0.7373       597
weighted avg     0.7344    0.8308    0.7776       597
2023-10-25 21:31:05,500 ----------------------------------------------------------------------------------------------------