2023-10-12 18:46:41,769 ----------------------------------------------------------------------------------------------------
2023-10-12 18:46:41,771 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-12 18:46:41,772 ----------------------------------------------------------------------------------------------------
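The dimensions in the module dump above pin down the encoder's parameter count. A quick back-of-the-envelope check (this script is not part of the log; it assumes the `shared` and `embed_tokens` weights are tied, as is standard for T5):

```python
# Parameter count implied by the module dump above
# (assumes `shared` and `embed_tokens` are tied, standard for T5).
d_model, d_kv_total, d_ff, vocab = 1472, 384, 3584, 384

embedding = vocab * d_model                          # (shared): Embedding(384, 1472)
attention = 3 * d_model * d_kv_total \
          + d_kv_total * d_model                     # q, k, v projections + output o
feed_forward = 2 * d_model * d_ff + d_ff * d_model   # wi_0, wi_1, wo (gated GELU)
layer_norms = 2 * d_model                            # one RMSNorm per sub-layer
block = attention + feed_forward + layer_norms

encoder = (embedding
           + 12 * block                              # block 0 plus the 11 blocks (1-11)
           + 32 * 6                                  # relative_attention_bias (block 0 only)
           + d_model)                                # final_layer_norm
head = d_model * 17 + 17                             # (linear): 1472 -> 17, with bias

print(f"encoder: {encoder:,}  with head: {encoder + head:,}")
```

That comes to roughly 218M encoder parameters, consistent with a ByT5-small-sized encoder.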
2023-10-12 18:46:41,772 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-12 18:46:41,772 ----------------------------------------------------------------------------------------------------
2023-10-12 18:46:41,772 Train: 20847 sentences
2023-10-12 18:46:41,772 (train_with_dev=False, train_with_test=False)
2023-10-12 18:46:41,772 ----------------------------------------------------------------------------------------------------
2023-10-12 18:46:41,772 Training Params:
2023-10-12 18:46:41,772 - learning_rate: "0.00015"
2023-10-12 18:46:41,772 - mini_batch_size: "4"
2023-10-12 18:46:41,773 - max_epochs: "10"
2023-10-12 18:46:41,773 - shuffle: "True"
2023-10-12 18:46:41,773 ----------------------------------------------------------------------------------------------------
2023-10-12 18:46:41,773 Plugins:
2023-10-12 18:46:41,773 - TensorboardLogger
2023-10-12 18:46:41,773 - LinearScheduler | warmup_fraction: '0.1'
2023-10-12 18:46:41,773 ----------------------------------------------------------------------------------------------------
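The lr column in the epoch logs below follows the LinearScheduler with warmup_fraction 0.1: the learning rate ramps linearly from 0 to the peak 0.00015 over the first 10% of all steps (here, exactly epoch 1 of 10), then decays linearly back to 0. A minimal sketch of that schedule (the function is illustrative, not Flair's API):

```python
def linear_schedule(step, total_steps, peak_lr=0.00015, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 5212  # 10 epochs x 5212 batches (20847 sentences / mini_batch_size 4)
# End of epoch 1 (step 5212, end of warmup): lr = 0.000150, as in the log.
# End of epoch 2 (step 10424): lr ~ 0.000133, as in the log.
```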
2023-10-12 18:46:41,773 Final evaluation on model from best epoch (best-model.pt)
2023-10-12 18:46:41,773 - metric: "('micro avg', 'f1-score')"
2023-10-12 18:46:41,773 ----------------------------------------------------------------------------------------------------
2023-10-12 18:46:41,773 Computation:
2023-10-12 18:46:41,773 - compute on device: cuda:0
2023-10-12 18:46:41,773 - embedding storage: none
2023-10-12 18:46:41,774 ----------------------------------------------------------------------------------------------------
2023-10-12 18:46:41,774 Model training base path: "hmbench-newseye/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5"
2023-10-12 18:46:41,774 ----------------------------------------------------------------------------------------------------
2023-10-12 18:46:41,774 ----------------------------------------------------------------------------------------------------
2023-10-12 18:46:41,774 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-12 18:49:15,237 epoch 1 - iter 521/5212 - loss 2.76292849 - time (sec): 153.46 - samples/sec: 265.22 - lr: 0.000015 - momentum: 0.000000
2023-10-12 18:51:48,941 epoch 1 - iter 1042/5212 - loss 2.34035816 - time (sec): 307.16 - samples/sec: 258.13 - lr: 0.000030 - momentum: 0.000000
2023-10-12 18:54:18,340 epoch 1 - iter 1563/5212 - loss 1.85312724 - time (sec): 456.56 - samples/sec: 252.44 - lr: 0.000045 - momentum: 0.000000
2023-10-12 18:56:50,681 epoch 1 - iter 2084/5212 - loss 1.51172426 - time (sec): 608.90 - samples/sec: 249.98 - lr: 0.000060 - momentum: 0.000000
2023-10-12 18:59:20,445 epoch 1 - iter 2605/5212 - loss 1.31079938 - time (sec): 758.67 - samples/sec: 249.93 - lr: 0.000075 - momentum: 0.000000
2023-10-12 19:01:53,736 epoch 1 - iter 3126/5212 - loss 1.15481567 - time (sec): 911.96 - samples/sec: 249.68 - lr: 0.000090 - momentum: 0.000000
2023-10-12 19:04:28,717 epoch 1 - iter 3647/5212 - loss 1.03999737 - time (sec): 1066.94 - samples/sec: 247.05 - lr: 0.000105 - momentum: 0.000000
2023-10-12 19:07:12,068 epoch 1 - iter 4168/5212 - loss 0.94975549 - time (sec): 1230.29 - samples/sec: 243.75 - lr: 0.000120 - momentum: 0.000000
2023-10-12 19:09:41,939 epoch 1 - iter 4689/5212 - loss 0.88001240 - time (sec): 1380.16 - samples/sec: 242.03 - lr: 0.000135 - momentum: 0.000000
2023-10-12 19:12:09,089 epoch 1 - iter 5210/5212 - loss 0.82123342 - time (sec): 1527.31 - samples/sec: 240.48 - lr: 0.000150 - momentum: 0.000000
2023-10-12 19:12:09,617 ----------------------------------------------------------------------------------------------------
2023-10-12 19:12:09,617 EPOCH 1 done: loss 0.8210 - lr: 0.000150
2023-10-12 19:12:47,892 DEV : loss 0.1282375454902649 - f1-score (micro avg) 0.1789
2023-10-12 19:12:47,951 saving best model
2023-10-12 19:12:48,879 ----------------------------------------------------------------------------------------------------
2023-10-12 19:15:16,200 epoch 2 - iter 521/5212 - loss 0.21112691 - time (sec): 147.32 - samples/sec: 242.92 - lr: 0.000148 - momentum: 0.000000
2023-10-12 19:17:45,002 epoch 2 - iter 1042/5212 - loss 0.18824513 - time (sec): 296.12 - samples/sec: 242.99 - lr: 0.000147 - momentum: 0.000000
2023-10-12 19:20:21,035 epoch 2 - iter 1563/5212 - loss 0.17597690 - time (sec): 452.15 - samples/sec: 236.05 - lr: 0.000145 - momentum: 0.000000
2023-10-12 19:23:00,731 epoch 2 - iter 2084/5212 - loss 0.17106880 - time (sec): 611.85 - samples/sec: 234.34 - lr: 0.000143 - momentum: 0.000000
2023-10-12 19:25:39,395 epoch 2 - iter 2605/5212 - loss 0.16970840 - time (sec): 770.51 - samples/sec: 233.26 - lr: 0.000142 - momentum: 0.000000
2023-10-12 19:28:16,729 epoch 2 - iter 3126/5212 - loss 0.16401569 - time (sec): 927.85 - samples/sec: 235.26 - lr: 0.000140 - momentum: 0.000000
2023-10-12 19:30:55,046 epoch 2 - iter 3647/5212 - loss 0.16203738 - time (sec): 1086.16 - samples/sec: 233.86 - lr: 0.000138 - momentum: 0.000000
2023-10-12 19:33:34,826 epoch 2 - iter 4168/5212 - loss 0.16114061 - time (sec): 1245.94 - samples/sec: 234.36 - lr: 0.000137 - momentum: 0.000000
2023-10-12 19:36:12,855 epoch 2 - iter 4689/5212 - loss 0.15881510 - time (sec): 1403.97 - samples/sec: 234.89 - lr: 0.000135 - momentum: 0.000000
2023-10-12 19:38:56,123 epoch 2 - iter 5210/5212 - loss 0.15670979 - time (sec): 1567.24 - samples/sec: 234.31 - lr: 0.000133 - momentum: 0.000000
2023-10-12 19:38:56,741 ----------------------------------------------------------------------------------------------------
2023-10-12 19:38:56,742 EPOCH 2 done: loss 0.1567 - lr: 0.000133
2023-10-12 19:39:39,874 DEV : loss 0.16946110129356384 - f1-score (micro avg) 0.3451
2023-10-12 19:39:39,943 saving best model
2023-10-12 19:39:42,763 ----------------------------------------------------------------------------------------------------
2023-10-12 19:42:21,014 epoch 3 - iter 521/5212 - loss 0.10074178 - time (sec): 158.25 - samples/sec: 224.62 - lr: 0.000132 - momentum: 0.000000
2023-10-12 19:44:59,477 epoch 3 - iter 1042/5212 - loss 0.10972804 - time (sec): 316.71 - samples/sec: 217.74 - lr: 0.000130 - momentum: 0.000000
2023-10-12 19:47:35,591 epoch 3 - iter 1563/5212 - loss 0.10493572 - time (sec): 472.82 - samples/sec: 225.90 - lr: 0.000128 - momentum: 0.000000
2023-10-12 19:50:16,866 epoch 3 - iter 2084/5212 - loss 0.10489412 - time (sec): 634.10 - samples/sec: 225.16 - lr: 0.000127 - momentum: 0.000000
2023-10-12 19:52:51,660 epoch 3 - iter 2605/5212 - loss 0.10195802 - time (sec): 788.89 - samples/sec: 227.66 - lr: 0.000125 - momentum: 0.000000
2023-10-12 19:55:27,756 epoch 3 - iter 3126/5212 - loss 0.10181064 - time (sec): 944.99 - samples/sec: 229.18 - lr: 0.000123 - momentum: 0.000000
2023-10-12 19:58:01,865 epoch 3 - iter 3647/5212 - loss 0.10418318 - time (sec): 1099.10 - samples/sec: 229.25 - lr: 0.000122 - momentum: 0.000000
2023-10-12 20:00:37,316 epoch 3 - iter 4168/5212 - loss 0.10492334 - time (sec): 1254.55 - samples/sec: 231.05 - lr: 0.000120 - momentum: 0.000000
2023-10-12 20:03:14,249 epoch 3 - iter 4689/5212 - loss 0.10268963 - time (sec): 1411.48 - samples/sec: 233.37 - lr: 0.000118 - momentum: 0.000000
2023-10-12 20:05:50,322 epoch 3 - iter 5210/5212 - loss 0.10507983 - time (sec): 1567.56 - samples/sec: 234.33 - lr: 0.000117 - momentum: 0.000000
2023-10-12 20:05:50,832 ----------------------------------------------------------------------------------------------------
2023-10-12 20:05:50,832 EPOCH 3 done: loss 0.1051 - lr: 0.000117
2023-10-12 20:06:34,155 DEV : loss 0.2250872701406479 - f1-score (micro avg) 0.3354
2023-10-12 20:06:34,209 ----------------------------------------------------------------------------------------------------
2023-10-12 20:09:06,488 epoch 4 - iter 521/5212 - loss 0.08085581 - time (sec): 152.28 - samples/sec: 234.78 - lr: 0.000115 - momentum: 0.000000
2023-10-12 20:11:41,459 epoch 4 - iter 1042/5212 - loss 0.07403465 - time (sec): 307.25 - samples/sec: 237.94 - lr: 0.000113 - momentum: 0.000000
2023-10-12 20:14:15,463 epoch 4 - iter 1563/5212 - loss 0.07375911 - time (sec): 461.25 - samples/sec: 237.25 - lr: 0.000112 - momentum: 0.000000
2023-10-12 20:16:48,933 epoch 4 - iter 2084/5212 - loss 0.07382402 - time (sec): 614.72 - samples/sec: 234.88 - lr: 0.000110 - momentum: 0.000000
2023-10-12 20:19:25,310 epoch 4 - iter 2605/5212 - loss 0.07632121 - time (sec): 771.10 - samples/sec: 236.86 - lr: 0.000108 - momentum: 0.000000
2023-10-12 20:22:01,623 epoch 4 - iter 3126/5212 - loss 0.07620217 - time (sec): 927.41 - samples/sec: 238.49 - lr: 0.000107 - momentum: 0.000000
2023-10-12 20:24:35,917 epoch 4 - iter 3647/5212 - loss 0.07340067 - time (sec): 1081.71 - samples/sec: 238.77 - lr: 0.000105 - momentum: 0.000000
2023-10-12 20:27:09,657 epoch 4 - iter 4168/5212 - loss 0.07375064 - time (sec): 1235.45 - samples/sec: 238.55 - lr: 0.000103 - momentum: 0.000000
2023-10-12 20:29:43,452 epoch 4 - iter 4689/5212 - loss 0.07293438 - time (sec): 1389.24 - samples/sec: 238.15 - lr: 0.000102 - momentum: 0.000000
2023-10-12 20:32:16,851 epoch 4 - iter 5210/5212 - loss 0.07341752 - time (sec): 1542.64 - samples/sec: 238.08 - lr: 0.000100 - momentum: 0.000000
2023-10-12 20:32:17,431 ----------------------------------------------------------------------------------------------------
2023-10-12 20:32:17,431 EPOCH 4 done: loss 0.0735 - lr: 0.000100
2023-10-12 20:32:59,595 DEV : loss 0.23921194672584534 - f1-score (micro avg) 0.3737
2023-10-12 20:32:59,649 saving best model
2023-10-12 20:33:02,488 ----------------------------------------------------------------------------------------------------
2023-10-12 20:35:36,727 epoch 5 - iter 521/5212 - loss 0.03913308 - time (sec): 154.23 - samples/sec: 233.83 - lr: 0.000098 - momentum: 0.000000
2023-10-12 20:38:11,970 epoch 5 - iter 1042/5212 - loss 0.05004518 - time (sec): 309.48 - samples/sec: 226.85 - lr: 0.000097 - momentum: 0.000000
2023-10-12 20:40:54,356 epoch 5 - iter 1563/5212 - loss 0.05132965 - time (sec): 471.86 - samples/sec: 228.10 - lr: 0.000095 - momentum: 0.000000
2023-10-12 20:43:33,992 epoch 5 - iter 2084/5212 - loss 0.05244491 - time (sec): 631.50 - samples/sec: 227.65 - lr: 0.000093 - momentum: 0.000000
2023-10-12 20:46:05,541 epoch 5 - iter 2605/5212 - loss 0.05103574 - time (sec): 783.05 - samples/sec: 231.43 - lr: 0.000092 - momentum: 0.000000
2023-10-12 20:48:38,344 epoch 5 - iter 3126/5212 - loss 0.04987451 - time (sec): 935.85 - samples/sec: 236.61 - lr: 0.000090 - momentum: 0.000000
2023-10-12 20:51:13,890 epoch 5 - iter 3647/5212 - loss 0.05008071 - time (sec): 1091.40 - samples/sec: 238.11 - lr: 0.000088 - momentum: 0.000000
2023-10-12 20:53:47,576 epoch 5 - iter 4168/5212 - loss 0.05030602 - time (sec): 1245.08 - samples/sec: 235.30 - lr: 0.000087 - momentum: 0.000000
2023-10-12 20:56:25,717 epoch 5 - iter 4689/5212 - loss 0.04910738 - time (sec): 1403.22 - samples/sec: 235.57 - lr: 0.000085 - momentum: 0.000000
2023-10-12 20:59:02,470 epoch 5 - iter 5210/5212 - loss 0.04976601 - time (sec): 1559.98 - samples/sec: 235.43 - lr: 0.000083 - momentum: 0.000000
2023-10-12 20:59:03,023 ----------------------------------------------------------------------------------------------------
2023-10-12 20:59:03,024 EPOCH 5 done: loss 0.0497 - lr: 0.000083
2023-10-12 20:59:45,820 DEV : loss 0.3263615667819977 - f1-score (micro avg) 0.384
2023-10-12 20:59:45,890 saving best model
2023-10-12 20:59:48,757 ----------------------------------------------------------------------------------------------------
2023-10-12 21:02:23,308 epoch 6 - iter 521/5212 - loss 0.02791178 - time (sec): 154.55 - samples/sec: 228.86 - lr: 0.000082 - momentum: 0.000000
2023-10-12 21:05:03,072 epoch 6 - iter 1042/5212 - loss 0.03190034 - time (sec): 314.31 - samples/sec: 236.91 - lr: 0.000080 - momentum: 0.000000
2023-10-12 21:07:39,800 epoch 6 - iter 1563/5212 - loss 0.03265756 - time (sec): 471.04 - samples/sec: 237.82 - lr: 0.000078 - momentum: 0.000000
2023-10-12 21:10:13,089 epoch 6 - iter 2084/5212 - loss 0.03451907 - time (sec): 624.33 - samples/sec: 235.96 - lr: 0.000077 - momentum: 0.000000
2023-10-12 21:12:46,251 epoch 6 - iter 2605/5212 - loss 0.03548737 - time (sec): 777.49 - samples/sec: 233.74 - lr: 0.000075 - momentum: 0.000000
2023-10-12 21:15:24,589 epoch 6 - iter 3126/5212 - loss 0.03533261 - time (sec): 935.83 - samples/sec: 236.24 - lr: 0.000073 - momentum: 0.000000
2023-10-12 21:18:01,460 epoch 6 - iter 3647/5212 - loss 0.03553896 - time (sec): 1092.70 - samples/sec: 237.24 - lr: 0.000072 - momentum: 0.000000
2023-10-12 21:20:34,172 epoch 6 - iter 4168/5212 - loss 0.03549980 - time (sec): 1245.41 - samples/sec: 235.23 - lr: 0.000070 - momentum: 0.000000
2023-10-12 21:23:07,241 epoch 6 - iter 4689/5212 - loss 0.03556410 - time (sec): 1398.48 - samples/sec: 235.14 - lr: 0.000068 - momentum: 0.000000
2023-10-12 21:25:43,273 epoch 6 - iter 5210/5212 - loss 0.03566000 - time (sec): 1554.51 - samples/sec: 236.10 - lr: 0.000067 - momentum: 0.000000
2023-10-12 21:25:44,091 ----------------------------------------------------------------------------------------------------
2023-10-12 21:25:44,091 EPOCH 6 done: loss 0.0356 - lr: 0.000067
2023-10-12 21:26:26,238 DEV : loss 0.4169439971446991 - f1-score (micro avg) 0.3567
2023-10-12 21:26:26,291 ----------------------------------------------------------------------------------------------------
2023-10-12 21:29:00,654 epoch 7 - iter 521/5212 - loss 0.02119835 - time (sec): 154.36 - samples/sec: 239.30 - lr: 0.000065 - momentum: 0.000000
2023-10-12 21:31:33,480 epoch 7 - iter 1042/5212 - loss 0.02655194 - time (sec): 307.19 - samples/sec: 238.23 - lr: 0.000063 - momentum: 0.000000
2023-10-12 21:34:06,511 epoch 7 - iter 1563/5212 - loss 0.02488243 - time (sec): 460.22 - samples/sec: 238.09 - lr: 0.000062 - momentum: 0.000000
2023-10-12 21:36:40,328 epoch 7 - iter 2084/5212 - loss 0.02403086 - time (sec): 614.03 - samples/sec: 238.85 - lr: 0.000060 - momentum: 0.000000
2023-10-12 21:39:12,655 epoch 7 - iter 2605/5212 - loss 0.02533732 - time (sec): 766.36 - samples/sec: 240.07 - lr: 0.000058 - momentum: 0.000000
2023-10-12 21:41:48,693 epoch 7 - iter 3126/5212 - loss 0.02466719 - time (sec): 922.40 - samples/sec: 244.66 - lr: 0.000057 - momentum: 0.000000
2023-10-12 21:44:21,354 epoch 7 - iter 3647/5212 - loss 0.02434422 - time (sec): 1075.06 - samples/sec: 243.70 - lr: 0.000055 - momentum: 0.000000
2023-10-12 21:46:50,093 epoch 7 - iter 4168/5212 - loss 0.02517306 - time (sec): 1223.80 - samples/sec: 241.32 - lr: 0.000053 - momentum: 0.000000
2023-10-12 21:49:21,621 epoch 7 - iter 4689/5212 - loss 0.02542309 - time (sec): 1375.33 - samples/sec: 240.61 - lr: 0.000052 - momentum: 0.000000
2023-10-12 21:51:50,963 epoch 7 - iter 5210/5212 - loss 0.02492844 - time (sec): 1524.67 - samples/sec: 240.92 - lr: 0.000050 - momentum: 0.000000
2023-10-12 21:51:51,473 ----------------------------------------------------------------------------------------------------
2023-10-12 21:51:51,474 EPOCH 7 done: loss 0.0249 - lr: 0.000050
2023-10-12 21:52:34,668 DEV : loss 0.44921109080314636 - f1-score (micro avg) 0.3588
2023-10-12 21:52:34,724 ----------------------------------------------------------------------------------------------------
2023-10-12 21:55:06,252 epoch 8 - iter 521/5212 - loss 0.01723823 - time (sec): 151.53 - samples/sec: 248.05 - lr: 0.000048 - momentum: 0.000000
2023-10-12 21:57:37,292 epoch 8 - iter 1042/5212 - loss 0.01658125 - time (sec): 302.57 - samples/sec: 242.31 - lr: 0.000047 - momentum: 0.000000
2023-10-12 22:00:10,298 epoch 8 - iter 1563/5212 - loss 0.01573445 - time (sec): 455.57 - samples/sec: 242.90 - lr: 0.000045 - momentum: 0.000000
2023-10-12 22:02:43,368 epoch 8 - iter 2084/5212 - loss 0.01677614 - time (sec): 608.64 - samples/sec: 243.81 - lr: 0.000043 - momentum: 0.000000
2023-10-12 22:05:14,154 epoch 8 - iter 2605/5212 - loss 0.01684349 - time (sec): 759.43 - samples/sec: 240.27 - lr: 0.000042 - momentum: 0.000000
2023-10-12 22:07:45,939 epoch 8 - iter 3126/5212 - loss 0.01664639 - time (sec): 911.21 - samples/sec: 240.75 - lr: 0.000040 - momentum: 0.000000
2023-10-12 22:10:15,255 epoch 8 - iter 3647/5212 - loss 0.01741545 - time (sec): 1060.53 - samples/sec: 239.06 - lr: 0.000038 - momentum: 0.000000
2023-10-12 22:12:44,856 epoch 8 - iter 4168/5212 - loss 0.01704525 - time (sec): 1210.13 - samples/sec: 239.20 - lr: 0.000037 - momentum: 0.000000
2023-10-12 22:15:17,516 epoch 8 - iter 4689/5212 - loss 0.01702684 - time (sec): 1362.79 - samples/sec: 242.11 - lr: 0.000035 - momentum: 0.000000
2023-10-12 22:17:46,344 epoch 8 - iter 5210/5212 - loss 0.01674797 - time (sec): 1511.62 - samples/sec: 243.02 - lr: 0.000033 - momentum: 0.000000
2023-10-12 22:17:46,804 ----------------------------------------------------------------------------------------------------
2023-10-12 22:17:46,804 EPOCH 8 done: loss 0.0167 - lr: 0.000033
2023-10-12 22:18:27,780 DEV : loss 0.4403105676174164 - f1-score (micro avg) 0.3928
2023-10-12 22:18:27,855 saving best model
2023-10-12 22:18:31,457 ----------------------------------------------------------------------------------------------------
2023-10-12 22:20:59,841 epoch 9 - iter 521/5212 - loss 0.01491704 - time (sec): 148.38 - samples/sec: 252.05 - lr: 0.000032 - momentum: 0.000000
2023-10-12 22:23:25,151 epoch 9 - iter 1042/5212 - loss 0.01413508 - time (sec): 293.69 - samples/sec: 239.66 - lr: 0.000030 - momentum: 0.000000
2023-10-12 22:25:53,399 epoch 9 - iter 1563/5212 - loss 0.01315106 - time (sec): 441.94 - samples/sec: 243.38 - lr: 0.000028 - momentum: 0.000000
2023-10-12 22:28:21,805 epoch 9 - iter 2084/5212 - loss 0.01196193 - time (sec): 590.34 - samples/sec: 245.16 - lr: 0.000027 - momentum: 0.000000
2023-10-12 22:30:52,285 epoch 9 - iter 2605/5212 - loss 0.01228176 - time (sec): 740.82 - samples/sec: 247.75 - lr: 0.000025 - momentum: 0.000000
2023-10-12 22:33:20,379 epoch 9 - iter 3126/5212 - loss 0.01164398 - time (sec): 888.92 - samples/sec: 246.57 - lr: 0.000023 - momentum: 0.000000
2023-10-12 22:35:51,301 epoch 9 - iter 3647/5212 - loss 0.01149750 - time (sec): 1039.84 - samples/sec: 247.54 - lr: 0.000022 - momentum: 0.000000
2023-10-12 22:38:21,783 epoch 9 - iter 4168/5212 - loss 0.01196632 - time (sec): 1190.32 - samples/sec: 246.94 - lr: 0.000020 - momentum: 0.000000
2023-10-12 22:40:52,663 epoch 9 - iter 4689/5212 - loss 0.01189000 - time (sec): 1341.20 - samples/sec: 246.35 - lr: 0.000018 - momentum: 0.000000
2023-10-12 22:43:19,842 epoch 9 - iter 5210/5212 - loss 0.01156761 - time (sec): 1488.38 - samples/sec: 246.82 - lr: 0.000017 - momentum: 0.000000
2023-10-12 22:43:20,283 ----------------------------------------------------------------------------------------------------
2023-10-12 22:43:20,284 EPOCH 9 done: loss 0.0116 - lr: 0.000017
2023-10-12 22:44:01,953 DEV : loss 0.44486090540885925 - f1-score (micro avg) 0.4028
2023-10-12 22:44:02,009 saving best model
2023-10-12 22:44:04,674 ----------------------------------------------------------------------------------------------------
2023-10-12 22:46:35,093 epoch 10 - iter 521/5212 - loss 0.01030731 - time (sec): 150.41 - samples/sec: 243.36 - lr: 0.000015 - momentum: 0.000000
2023-10-12 22:49:05,792 epoch 10 - iter 1042/5212 - loss 0.01158820 - time (sec): 301.11 - samples/sec: 241.13 - lr: 0.000013 - momentum: 0.000000
2023-10-12 22:51:50,787 epoch 10 - iter 1563/5212 - loss 0.00996155 - time (sec): 466.11 - samples/sec: 235.84 - lr: 0.000012 - momentum: 0.000000
2023-10-12 22:54:31,236 epoch 10 - iter 2084/5212 - loss 0.00898869 - time (sec): 626.56 - samples/sec: 231.92 - lr: 0.000010 - momentum: 0.000000
2023-10-12 22:57:06,304 epoch 10 - iter 2605/5212 - loss 0.00842857 - time (sec): 781.62 - samples/sec: 232.94 - lr: 0.000008 - momentum: 0.000000
2023-10-12 22:59:38,073 epoch 10 - iter 3126/5212 - loss 0.00869311 - time (sec): 933.39 - samples/sec: 234.97 - lr: 0.000007 - momentum: 0.000000
2023-10-12 23:02:07,453 epoch 10 - iter 3647/5212 - loss 0.00833985 - time (sec): 1082.77 - samples/sec: 234.54 - lr: 0.000005 - momentum: 0.000000
2023-10-12 23:04:44,311 epoch 10 - iter 4168/5212 - loss 0.00850446 - time (sec): 1239.63 - samples/sec: 234.93 - lr: 0.000003 - momentum: 0.000000
2023-10-12 23:07:20,219 epoch 10 - iter 4689/5212 - loss 0.00844827 - time (sec): 1395.54 - samples/sec: 235.44 - lr: 0.000002 - momentum: 0.000000
2023-10-12 23:09:57,735 epoch 10 - iter 5210/5212 - loss 0.00809271 - time (sec): 1553.06 - samples/sec: 236.51 - lr: 0.000000 - momentum: 0.000000
2023-10-12 23:09:58,253 ----------------------------------------------------------------------------------------------------
2023-10-12 23:09:58,253 EPOCH 10 done: loss 0.0081 - lr: 0.000000
2023-10-12 23:10:41,281 DEV : loss 0.502132773399353 - f1-score (micro avg) 0.4
2023-10-12 23:10:42,276 ----------------------------------------------------------------------------------------------------
2023-10-12 23:10:42,278 Loading model from best epoch ...
2023-10-12 23:10:46,633 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
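The 17 tags form a BIOES scheme over the four entity types (LOC, PER, ORG, HumanProd): 4 types x 4 positional prefixes (S-ingle, B-egin, E-nd, I-nside) + O = 17. A minimal decoder for such tag sequences (a hypothetical helper for illustration, not Flair's internal implementation):

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = label = None
            continue
        prefix, typ = tag.split("-", 1)
        if prefix == "S":                    # single-token entity
            spans.append((typ, i, i + 1))
            start = label = None
        elif prefix == "B":                  # open a multi-token entity
            start, label = i, typ
        elif prefix == "E" and label == typ: # close it if the type matches
            spans.append((typ, start, i + 1))
            start = label = None
        # "I" just continues an open span; malformed sequences are dropped
    return spans

# e.g. ["B-LOC", "I-LOC", "E-LOC", "O", "S-PER"] -> [("LOC", 0, 3), ("PER", 4, 5)]
```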
2023-10-12 23:12:31,935
Results:
- F-score (micro) 0.4616
- F-score (macro) 0.3197
- Accuracy 0.3054
By class:
              precision    recall  f1-score   support

         LOC     0.4873    0.5387    0.5117      1214
         PER     0.4096    0.5186    0.4577       808
         ORG     0.2997    0.3201    0.3096       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4314    0.4962    0.4616      2390
   macro avg     0.2992    0.3443    0.3197      2390
weighted avg     0.4303    0.4962    0.4604      2390
2023-10-12 23:12:31,936 ----------------------------------------------------------------------------------------------------
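The macro and weighted averages in the final table follow directly from the per-class rows, and each class F1 is the harmonic mean of its precision and recall. A quick consistency check (values copied from the table above; last-digit deviations come from the rounded inputs):

```python
# Per-class (precision, recall, f1, support) from the final evaluation table.
by_class = {
    "LOC":       (0.4873, 0.5387, 0.5117, 1214),
    "PER":       (0.4096, 0.5186, 0.4577,  808),
    "ORG":       (0.2997, 0.3201, 0.3096,  353),
    "HumanProd": (0.0000, 0.0000, 0.0000,   15),
}

def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

total = sum(s for *_, s in by_class.values())                       # 2390
macro_f1 = sum(f for _, _, f, _ in by_class.values()) / len(by_class)
weighted_f1 = sum(f * s for _, _, f, s in by_class.values()) / total

# Both agree with the "macro avg" and "weighted avg" rows up to rounding.
print(macro_f1, weighted_f1)
```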