2023-10-11 22:21:01,967 ----------------------------------------------------------------------------------------------------
2023-10-11 22:21:01,969 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
      )
      (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
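Note: the dump above is a ByT5-small-style encoder (12 T5 blocks, d_model 1472) feeding a linear 17-tag head with no CRF. ByT5Embeddings is a custom wrapper from the hmBench training code, not stock Flair; the following is a minimal sketch of a roughly equivalent setup using standard Flair classes, assuming the checkpoint id inferred from the training base path further down in the log and constructor parameter names that may differ between Flair versions.

# Hedged sketch: an approximately equivalent tagger built from stock Flair classes.
# The checkpoint id and the NER_HIPE_2022 arguments are assumptions inferred from
# this log, not taken from the original hmBench script.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Corpus as described below in the log (newseye, French).
corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",  # inferred from base path
    layers="-1",               # "layers-1" in the base path: last layer only
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_rnn=False,             # linear head directly on the embeddings, as in the dump
    use_crf=False,             # "crfFalse" in the base path
)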
|
2023-10-11 22:21:01,969 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 22:21:01,969 MultiCorpus: 7142 train + 698 dev + 2570 test sentences |
|
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator |
|
2023-10-11 22:21:01,969 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 22:21:01,969 Train: 7142 sentences |
|
2023-10-11 22:21:01,969 (train_with_dev=False, train_with_test=False) |
|
2023-10-11 22:21:01,970 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 22:21:01,970 Training Params: |
|
2023-10-11 22:21:01,970 - learning_rate: "0.00016" |
|
2023-10-11 22:21:01,970 - mini_batch_size: "8" |
|
2023-10-11 22:21:01,970 - max_epochs: "10" |
|
2023-10-11 22:21:01,970 - shuffle: "True" |
|
2023-10-11 22:21:01,970 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 22:21:01,970 Plugins: |
|
2023-10-11 22:21:01,970 - TensorboardLogger |
|
2023-10-11 22:21:01,970 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-11 22:21:01,970 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 22:21:01,970 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-11 22:21:01,970 - metric: "('micro avg', 'f1-score')" |
|
2023-10-11 22:21:01,970 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 22:21:01,970 Computation: |
|
2023-10-11 22:21:01,970 - compute on device: cuda:0 |
|
2023-10-11 22:21:01,971 - embedding storage: none |
|
2023-10-11 22:21:01,971 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 22:21:01,971 Model training base path: "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-11 22:21:01,971 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 22:21:01,971 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 22:21:01,971 Logging anything other than scalars to TensorBoard is currently not supported. |
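Note: the training parameters above map directly onto Flair's fine-tuning entry point. Continuing the sketch after the model dump, a minimal reproduction could look as follows; this assumes ModelTrainer.fine_tune (AdamW with a linear warmup/decay schedule, 10% warmup by default in recent Flair releases, matching the LinearScheduler plugin logged above) rather than the original hmBench script.

# Hedged sketch, reusing `corpus` and `tagger` from the sketch after the model dump.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5",
    learning_rate=0.00016,   # Training Params above
    mini_batch_size=8,
    max_epochs=10,
    # shuffle defaults to True; the linear scheduler with warmup_fraction 0.1
    # is the fine_tune default in recent Flair versions.
)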
2023-10-11 22:21:54,369 epoch 1 - iter 89/893 - loss 2.81524244 - time (sec): 52.40 - samples/sec: 516.95 - lr: 0.000016 - momentum: 0.000000
2023-10-11 22:22:46,282 epoch 1 - iter 178/893 - loss 2.73176101 - time (sec): 104.31 - samples/sec: 506.78 - lr: 0.000032 - momentum: 0.000000
2023-10-11 22:23:38,432 epoch 1 - iter 267/893 - loss 2.52603415 - time (sec): 156.46 - samples/sec: 509.33 - lr: 0.000048 - momentum: 0.000000
2023-10-11 22:24:27,279 epoch 1 - iter 356/893 - loss 2.31615221 - time (sec): 205.31 - samples/sec: 510.58 - lr: 0.000064 - momentum: 0.000000
2023-10-11 22:25:16,525 epoch 1 - iter 445/893 - loss 2.08800093 - time (sec): 254.55 - samples/sec: 508.25 - lr: 0.000080 - momentum: 0.000000
2023-10-11 22:26:05,675 epoch 1 - iter 534/893 - loss 1.87576125 - time (sec): 303.70 - samples/sec: 503.88 - lr: 0.000095 - momentum: 0.000000
2023-10-11 22:26:54,470 epoch 1 - iter 623/893 - loss 1.70576459 - time (sec): 352.50 - samples/sec: 502.24 - lr: 0.000111 - momentum: 0.000000
2023-10-11 22:27:42,644 epoch 1 - iter 712/893 - loss 1.56835597 - time (sec): 400.67 - samples/sec: 497.83 - lr: 0.000127 - momentum: 0.000000
2023-10-11 22:28:31,216 epoch 1 - iter 801/893 - loss 1.44155521 - time (sec): 449.24 - samples/sec: 497.76 - lr: 0.000143 - momentum: 0.000000
2023-10-11 22:29:19,353 epoch 1 - iter 890/893 - loss 1.33419276 - time (sec): 497.38 - samples/sec: 498.80 - lr: 0.000159 - momentum: 0.000000
2023-10-11 22:29:20,732 ----------------------------------------------------------------------------------------------------
2023-10-11 22:29:20,732 EPOCH 1 done: loss 1.3312 - lr: 0.000159
2023-10-11 22:29:40,984 DEV : loss 0.24347300827503204 - f1-score (micro avg) 0.4712
2023-10-11 22:29:41,014 saving best model
2023-10-11 22:29:41,873 ----------------------------------------------------------------------------------------------------
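Aside: the lr column is consistent with linear warmup over the first 10% of the 8,930 total batches (893 per epoch x 10 epochs) followed by linear decay to zero. A minimal sketch that reproduces the logged values, assuming the scheduler steps once per mini-batch:

# Linear warmup + linear decay schedule implied by the logged lr values.
PEAK_LR = 0.00016
STEPS_PER_EPOCH = 893
TOTAL_STEPS = STEPS_PER_EPOCH * 10
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # 893, i.e. warmup_fraction 0.1

def lr_at(step: int) -> float:
    if step < WARMUP_STEPS:
        # linear warmup towards the peak learning rate
        return PEAK_LR * step / WARMUP_STEPS
    # linear decay to zero over the remaining steps
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(f"{lr_at(89):.6f}")                          # ~0.000016, epoch 1, iter 89
print(f"{lr_at(890):.6f}")                         # ~0.000159, epoch 1, iter 890
print(f"{lr_at(STEPS_PER_EPOCH + 890):.6f}")       # ~0.000142, epoch 2, iter 890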
2023-10-11 22:30:32,952 epoch 2 - iter 89/893 - loss 0.28954077 - time (sec): 51.08 - samples/sec: 488.58 - lr: 0.000158 - momentum: 0.000000
2023-10-11 22:31:23,495 epoch 2 - iter 178/893 - loss 0.26838336 - time (sec): 101.62 - samples/sec: 495.29 - lr: 0.000156 - momentum: 0.000000
2023-10-11 22:32:12,958 epoch 2 - iter 267/893 - loss 0.24433660 - time (sec): 151.08 - samples/sec: 499.09 - lr: 0.000155 - momentum: 0.000000
2023-10-11 22:33:01,075 epoch 2 - iter 356/893 - loss 0.22640434 - time (sec): 199.20 - samples/sec: 501.23 - lr: 0.000153 - momentum: 0.000000
2023-10-11 22:33:50,431 epoch 2 - iter 445/893 - loss 0.20805597 - time (sec): 248.56 - samples/sec: 507.68 - lr: 0.000151 - momentum: 0.000000
2023-10-11 22:34:37,653 epoch 2 - iter 534/893 - loss 0.19855291 - time (sec): 295.78 - samples/sec: 505.35 - lr: 0.000149 - momentum: 0.000000
2023-10-11 22:35:25,559 epoch 2 - iter 623/893 - loss 0.18874509 - time (sec): 343.68 - samples/sec: 504.29 - lr: 0.000148 - momentum: 0.000000
2023-10-11 22:36:14,187 epoch 2 - iter 712/893 - loss 0.18046880 - time (sec): 392.31 - samples/sec: 506.71 - lr: 0.000146 - momentum: 0.000000
2023-10-11 22:37:02,132 epoch 2 - iter 801/893 - loss 0.17469471 - time (sec): 440.26 - samples/sec: 506.29 - lr: 0.000144 - momentum: 0.000000
2023-10-11 22:37:50,288 epoch 2 - iter 890/893 - loss 0.16713569 - time (sec): 488.41 - samples/sec: 506.95 - lr: 0.000142 - momentum: 0.000000
2023-10-11 22:37:52,010 ----------------------------------------------------------------------------------------------------
2023-10-11 22:37:52,010 EPOCH 2 done: loss 0.1669 - lr: 0.000142
2023-10-11 22:38:12,850 DEV : loss 0.09539955109357834 - f1-score (micro avg) 0.7653
2023-10-11 22:38:12,880 saving best model
2023-10-11 22:38:15,886 ----------------------------------------------------------------------------------------------------
2023-10-11 22:39:03,890 epoch 3 - iter 89/893 - loss 0.07439450 - time (sec): 48.00 - samples/sec: 511.31 - lr: 0.000140 - momentum: 0.000000
2023-10-11 22:39:52,230 epoch 3 - iter 178/893 - loss 0.07435914 - time (sec): 96.34 - samples/sec: 519.96 - lr: 0.000139 - momentum: 0.000000
2023-10-11 22:40:39,097 epoch 3 - iter 267/893 - loss 0.07407314 - time (sec): 143.21 - samples/sec: 514.84 - lr: 0.000137 - momentum: 0.000000
2023-10-11 22:41:26,901 epoch 3 - iter 356/893 - loss 0.07395440 - time (sec): 191.01 - samples/sec: 512.44 - lr: 0.000135 - momentum: 0.000000
2023-10-11 22:42:15,822 epoch 3 - iter 445/893 - loss 0.07186459 - time (sec): 239.93 - samples/sec: 515.32 - lr: 0.000133 - momentum: 0.000000
2023-10-11 22:43:06,799 epoch 3 - iter 534/893 - loss 0.07219677 - time (sec): 290.91 - samples/sec: 513.25 - lr: 0.000132 - momentum: 0.000000
2023-10-11 22:43:56,818 epoch 3 - iter 623/893 - loss 0.07111615 - time (sec): 340.93 - samples/sec: 510.62 - lr: 0.000130 - momentum: 0.000000
2023-10-11 22:44:45,025 epoch 3 - iter 712/893 - loss 0.07090444 - time (sec): 389.13 - samples/sec: 506.99 - lr: 0.000128 - momentum: 0.000000
2023-10-11 22:45:33,782 epoch 3 - iter 801/893 - loss 0.07231914 - time (sec): 437.89 - samples/sec: 506.23 - lr: 0.000126 - momentum: 0.000000
2023-10-11 22:46:23,881 epoch 3 - iter 890/893 - loss 0.07073149 - time (sec): 487.99 - samples/sec: 508.09 - lr: 0.000125 - momentum: 0.000000
2023-10-11 22:46:25,367 ----------------------------------------------------------------------------------------------------
2023-10-11 22:46:25,367 EPOCH 3 done: loss 0.0708 - lr: 0.000125
2023-10-11 22:46:46,567 DEV : loss 0.10698171705007553 - f1-score (micro avg) 0.7863
2023-10-11 22:46:46,596 saving best model
2023-10-11 22:46:49,112 ----------------------------------------------------------------------------------------------------
2023-10-11 22:47:39,142 epoch 4 - iter 89/893 - loss 0.04988314 - time (sec): 50.03 - samples/sec: 536.16 - lr: 0.000123 - momentum: 0.000000
2023-10-11 22:48:27,553 epoch 4 - iter 178/893 - loss 0.04846943 - time (sec): 98.44 - samples/sec: 516.18 - lr: 0.000121 - momentum: 0.000000
2023-10-11 22:49:16,600 epoch 4 - iter 267/893 - loss 0.04681982 - time (sec): 147.48 - samples/sec: 514.66 - lr: 0.000119 - momentum: 0.000000
2023-10-11 22:50:05,132 epoch 4 - iter 356/893 - loss 0.04623150 - time (sec): 196.02 - samples/sec: 512.55 - lr: 0.000117 - momentum: 0.000000
2023-10-11 22:50:53,786 epoch 4 - iter 445/893 - loss 0.04787124 - time (sec): 244.67 - samples/sec: 507.06 - lr: 0.000116 - momentum: 0.000000
2023-10-11 22:51:42,456 epoch 4 - iter 534/893 - loss 0.04769015 - time (sec): 293.34 - samples/sec: 509.14 - lr: 0.000114 - momentum: 0.000000
2023-10-11 22:52:29,965 epoch 4 - iter 623/893 - loss 0.04761173 - time (sec): 340.85 - samples/sec: 507.97 - lr: 0.000112 - momentum: 0.000000
2023-10-11 22:53:17,603 epoch 4 - iter 712/893 - loss 0.04741060 - time (sec): 388.49 - samples/sec: 507.58 - lr: 0.000110 - momentum: 0.000000
2023-10-11 22:54:06,700 epoch 4 - iter 801/893 - loss 0.04736835 - time (sec): 437.58 - samples/sec: 511.68 - lr: 0.000109 - momentum: 0.000000
2023-10-11 22:54:55,040 epoch 4 - iter 890/893 - loss 0.04709479 - time (sec): 485.92 - samples/sec: 510.57 - lr: 0.000107 - momentum: 0.000000
2023-10-11 22:54:56,494 ----------------------------------------------------------------------------------------------------
2023-10-11 22:54:56,495 EPOCH 4 done: loss 0.0471 - lr: 0.000107
2023-10-11 22:55:18,057 DEV : loss 0.12400590628385544 - f1-score (micro avg) 0.7966
2023-10-11 22:55:18,087 saving best model
2023-10-11 22:55:20,719 ----------------------------------------------------------------------------------------------------
2023-10-11 22:56:09,865 epoch 5 - iter 89/893 - loss 0.03397882 - time (sec): 49.14 - samples/sec: 499.21 - lr: 0.000105 - momentum: 0.000000
2023-10-11 22:56:58,565 epoch 5 - iter 178/893 - loss 0.03522845 - time (sec): 97.84 - samples/sec: 500.50 - lr: 0.000103 - momentum: 0.000000
2023-10-11 22:57:48,347 epoch 5 - iter 267/893 - loss 0.03521031 - time (sec): 147.62 - samples/sec: 503.01 - lr: 0.000101 - momentum: 0.000000
2023-10-11 22:58:35,076 epoch 5 - iter 356/893 - loss 0.03460673 - time (sec): 194.35 - samples/sec: 502.96 - lr: 0.000100 - momentum: 0.000000
2023-10-11 22:59:22,704 epoch 5 - iter 445/893 - loss 0.03523870 - time (sec): 241.98 - samples/sec: 502.99 - lr: 0.000098 - momentum: 0.000000
2023-10-11 23:00:10,809 epoch 5 - iter 534/893 - loss 0.03478198 - time (sec): 290.09 - samples/sec: 504.56 - lr: 0.000096 - momentum: 0.000000
2023-10-11 23:01:00,615 epoch 5 - iter 623/893 - loss 0.03495198 - time (sec): 339.89 - samples/sec: 510.85 - lr: 0.000094 - momentum: 0.000000
2023-10-11 23:01:50,082 epoch 5 - iter 712/893 - loss 0.03581308 - time (sec): 389.36 - samples/sec: 509.69 - lr: 0.000093 - momentum: 0.000000
2023-10-11 23:02:39,996 epoch 5 - iter 801/893 - loss 0.03648619 - time (sec): 439.27 - samples/sec: 508.22 - lr: 0.000091 - momentum: 0.000000
2023-10-11 23:03:30,163 epoch 5 - iter 890/893 - loss 0.03607845 - time (sec): 489.44 - samples/sec: 506.91 - lr: 0.000089 - momentum: 0.000000
2023-10-11 23:03:31,620 ----------------------------------------------------------------------------------------------------
2023-10-11 23:03:31,620 EPOCH 5 done: loss 0.0361 - lr: 0.000089
2023-10-11 23:03:52,415 DEV : loss 0.14019542932510376 - f1-score (micro avg) 0.8003
2023-10-11 23:03:52,447 saving best model
2023-10-11 23:03:54,985 ----------------------------------------------------------------------------------------------------
2023-10-11 23:04:45,324 epoch 6 - iter 89/893 - loss 0.02545772 - time (sec): 50.33 - samples/sec: 510.51 - lr: 0.000087 - momentum: 0.000000
2023-10-11 23:05:34,136 epoch 6 - iter 178/893 - loss 0.02647733 - time (sec): 99.15 - samples/sec: 502.13 - lr: 0.000085 - momentum: 0.000000
2023-10-11 23:06:25,652 epoch 6 - iter 267/893 - loss 0.02542927 - time (sec): 150.66 - samples/sec: 512.02 - lr: 0.000084 - momentum: 0.000000
2023-10-11 23:07:14,673 epoch 6 - iter 356/893 - loss 0.02606427 - time (sec): 199.68 - samples/sec: 507.41 - lr: 0.000082 - momentum: 0.000000
2023-10-11 23:08:04,660 epoch 6 - iter 445/893 - loss 0.02787874 - time (sec): 249.67 - samples/sec: 510.08 - lr: 0.000080 - momentum: 0.000000
2023-10-11 23:08:53,284 epoch 6 - iter 534/893 - loss 0.02723140 - time (sec): 298.29 - samples/sec: 508.54 - lr: 0.000078 - momentum: 0.000000
2023-10-11 23:09:42,484 epoch 6 - iter 623/893 - loss 0.02737784 - time (sec): 347.49 - samples/sec: 505.88 - lr: 0.000077 - momentum: 0.000000
2023-10-11 23:10:34,252 epoch 6 - iter 712/893 - loss 0.02676267 - time (sec): 399.26 - samples/sec: 503.29 - lr: 0.000075 - momentum: 0.000000
2023-10-11 23:11:22,438 epoch 6 - iter 801/893 - loss 0.02657678 - time (sec): 447.45 - samples/sec: 501.14 - lr: 0.000073 - momentum: 0.000000
2023-10-11 23:12:11,166 epoch 6 - iter 890/893 - loss 0.02720868 - time (sec): 496.18 - samples/sec: 499.17 - lr: 0.000071 - momentum: 0.000000
2023-10-11 23:12:12,922 ----------------------------------------------------------------------------------------------------
2023-10-11 23:12:12,922 EPOCH 6 done: loss 0.0273 - lr: 0.000071
2023-10-11 23:12:34,497 DEV : loss 0.15041321516036987 - f1-score (micro avg) 0.8117
2023-10-11 23:12:34,527 saving best model
2023-10-11 23:12:37,084 ----------------------------------------------------------------------------------------------------
2023-10-11 23:13:26,120 epoch 7 - iter 89/893 - loss 0.02780661 - time (sec): 49.03 - samples/sec: 490.86 - lr: 0.000069 - momentum: 0.000000
2023-10-11 23:14:16,338 epoch 7 - iter 178/893 - loss 0.02358968 - time (sec): 99.25 - samples/sec: 501.77 - lr: 0.000068 - momentum: 0.000000
2023-10-11 23:15:05,061 epoch 7 - iter 267/893 - loss 0.02258577 - time (sec): 147.97 - samples/sec: 497.61 - lr: 0.000066 - momentum: 0.000000
2023-10-11 23:15:55,092 epoch 7 - iter 356/893 - loss 0.02043234 - time (sec): 198.00 - samples/sec: 502.39 - lr: 0.000064 - momentum: 0.000000
2023-10-11 23:16:43,870 epoch 7 - iter 445/893 - loss 0.02126625 - time (sec): 246.78 - samples/sec: 504.49 - lr: 0.000062 - momentum: 0.000000
2023-10-11 23:17:34,601 epoch 7 - iter 534/893 - loss 0.02028243 - time (sec): 297.51 - samples/sec: 501.86 - lr: 0.000061 - momentum: 0.000000
2023-10-11 23:18:23,631 epoch 7 - iter 623/893 - loss 0.02053878 - time (sec): 346.54 - samples/sec: 500.97 - lr: 0.000059 - momentum: 0.000000
2023-10-11 23:19:12,400 epoch 7 - iter 712/893 - loss 0.02128759 - time (sec): 395.31 - samples/sec: 500.79 - lr: 0.000057 - momentum: 0.000000
2023-10-11 23:20:01,270 epoch 7 - iter 801/893 - loss 0.02189113 - time (sec): 444.18 - samples/sec: 502.62 - lr: 0.000055 - momentum: 0.000000
2023-10-11 23:20:50,098 epoch 7 - iter 890/893 - loss 0.02197161 - time (sec): 493.01 - samples/sec: 502.77 - lr: 0.000053 - momentum: 0.000000
2023-10-11 23:20:51,632 ----------------------------------------------------------------------------------------------------
2023-10-11 23:20:51,632 EPOCH 7 done: loss 0.0219 - lr: 0.000053
2023-10-11 23:21:13,195 DEV : loss 0.1641770303249359 - f1-score (micro avg) 0.8043
2023-10-11 23:21:13,225 ----------------------------------------------------------------------------------------------------
2023-10-11 23:22:01,599 epoch 8 - iter 89/893 - loss 0.01437145 - time (sec): 48.37 - samples/sec: 517.79 - lr: 0.000052 - momentum: 0.000000
2023-10-11 23:22:50,880 epoch 8 - iter 178/893 - loss 0.01549440 - time (sec): 97.65 - samples/sec: 515.13 - lr: 0.000050 - momentum: 0.000000
2023-10-11 23:23:39,221 epoch 8 - iter 267/893 - loss 0.01587140 - time (sec): 145.99 - samples/sec: 514.78 - lr: 0.000048 - momentum: 0.000000
2023-10-11 23:24:27,855 epoch 8 - iter 356/893 - loss 0.01530621 - time (sec): 194.63 - samples/sec: 505.15 - lr: 0.000046 - momentum: 0.000000
2023-10-11 23:25:15,842 epoch 8 - iter 445/893 - loss 0.01504943 - time (sec): 242.61 - samples/sec: 502.27 - lr: 0.000045 - momentum: 0.000000
2023-10-11 23:26:05,567 epoch 8 - iter 534/893 - loss 0.01505772 - time (sec): 292.34 - samples/sec: 506.57 - lr: 0.000043 - momentum: 0.000000
2023-10-11 23:26:53,841 epoch 8 - iter 623/893 - loss 0.01621767 - time (sec): 340.61 - samples/sec: 502.31 - lr: 0.000041 - momentum: 0.000000
2023-10-11 23:27:43,961 epoch 8 - iter 712/893 - loss 0.01613174 - time (sec): 390.73 - samples/sec: 504.62 - lr: 0.000039 - momentum: 0.000000
2023-10-11 23:28:34,573 epoch 8 - iter 801/893 - loss 0.01616556 - time (sec): 441.35 - samples/sec: 506.32 - lr: 0.000037 - momentum: 0.000000
2023-10-11 23:29:24,349 epoch 8 - iter 890/893 - loss 0.01640991 - time (sec): 491.12 - samples/sec: 504.76 - lr: 0.000036 - momentum: 0.000000
2023-10-11 23:29:25,954 ----------------------------------------------------------------------------------------------------
2023-10-11 23:29:25,955 EPOCH 8 done: loss 0.0164 - lr: 0.000036
2023-10-11 23:29:47,575 DEV : loss 0.1812078058719635 - f1-score (micro avg) 0.8045
2023-10-11 23:29:47,606 ----------------------------------------------------------------------------------------------------
2023-10-11 23:30:40,748 epoch 9 - iter 89/893 - loss 0.01204535 - time (sec): 53.14 - samples/sec: 487.81 - lr: 0.000034 - momentum: 0.000000
2023-10-11 23:31:32,198 epoch 9 - iter 178/893 - loss 0.00976354 - time (sec): 104.59 - samples/sec: 477.49 - lr: 0.000032 - momentum: 0.000000
2023-10-11 23:32:23,605 epoch 9 - iter 267/893 - loss 0.01155278 - time (sec): 156.00 - samples/sec: 476.39 - lr: 0.000030 - momentum: 0.000000
2023-10-11 23:33:14,498 epoch 9 - iter 356/893 - loss 0.01127452 - time (sec): 206.89 - samples/sec: 475.23 - lr: 0.000029 - momentum: 0.000000
2023-10-11 23:34:04,554 epoch 9 - iter 445/893 - loss 0.01129248 - time (sec): 256.95 - samples/sec: 478.02 - lr: 0.000027 - momentum: 0.000000
2023-10-11 23:34:56,240 epoch 9 - iter 534/893 - loss 0.01135105 - time (sec): 308.63 - samples/sec: 483.08 - lr: 0.000025 - momentum: 0.000000
2023-10-11 23:35:48,950 epoch 9 - iter 623/893 - loss 0.01222460 - time (sec): 361.34 - samples/sec: 485.20 - lr: 0.000023 - momentum: 0.000000
2023-10-11 23:36:39,707 epoch 9 - iter 712/893 - loss 0.01277049 - time (sec): 412.10 - samples/sec: 486.22 - lr: 0.000022 - momentum: 0.000000
2023-10-11 23:37:28,891 epoch 9 - iter 801/893 - loss 0.01295373 - time (sec): 461.28 - samples/sec: 486.62 - lr: 0.000020 - momentum: 0.000000
2023-10-11 23:38:17,433 epoch 9 - iter 890/893 - loss 0.01308695 - time (sec): 509.82 - samples/sec: 486.57 - lr: 0.000018 - momentum: 0.000000
2023-10-11 23:38:18,887 ----------------------------------------------------------------------------------------------------
2023-10-11 23:38:18,888 EPOCH 9 done: loss 0.0131 - lr: 0.000018
2023-10-11 23:38:40,916 DEV : loss 0.19328945875167847 - f1-score (micro avg) 0.8091
2023-10-11 23:38:40,947 ----------------------------------------------------------------------------------------------------
2023-10-11 23:39:33,172 epoch 10 - iter 89/893 - loss 0.01252914 - time (sec): 52.22 - samples/sec: 483.30 - lr: 0.000016 - momentum: 0.000000
2023-10-11 23:40:25,294 epoch 10 - iter 178/893 - loss 0.01243200 - time (sec): 104.34 - samples/sec: 483.94 - lr: 0.000014 - momentum: 0.000000
2023-10-11 23:41:17,755 epoch 10 - iter 267/893 - loss 0.01093071 - time (sec): 156.81 - samples/sec: 483.81 - lr: 0.000013 - momentum: 0.000000
2023-10-11 23:42:09,380 epoch 10 - iter 356/893 - loss 0.01152608 - time (sec): 208.43 - samples/sec: 485.20 - lr: 0.000011 - momentum: 0.000000
2023-10-11 23:43:01,810 epoch 10 - iter 445/893 - loss 0.01150384 - time (sec): 260.86 - samples/sec: 484.83 - lr: 0.000009 - momentum: 0.000000
2023-10-11 23:43:52,835 epoch 10 - iter 534/893 - loss 0.01091038 - time (sec): 311.89 - samples/sec: 482.41 - lr: 0.000007 - momentum: 0.000000
2023-10-11 23:44:45,715 epoch 10 - iter 623/893 - loss 0.01107542 - time (sec): 364.77 - samples/sec: 483.08 - lr: 0.000006 - momentum: 0.000000
2023-10-11 23:45:37,730 epoch 10 - iter 712/893 - loss 0.01039016 - time (sec): 416.78 - samples/sec: 480.44 - lr: 0.000004 - momentum: 0.000000
2023-10-11 23:46:29,653 epoch 10 - iter 801/893 - loss 0.01057808 - time (sec): 468.70 - samples/sec: 478.65 - lr: 0.000002 - momentum: 0.000000
2023-10-11 23:47:19,812 epoch 10 - iter 890/893 - loss 0.01067030 - time (sec): 518.86 - samples/sec: 478.32 - lr: 0.000000 - momentum: 0.000000
2023-10-11 23:47:21,250 ----------------------------------------------------------------------------------------------------
2023-10-11 23:47:21,250 EPOCH 10 done: loss 0.0106 - lr: 0.000000
2023-10-11 23:47:44,716 DEV : loss 0.19667339324951172 - f1-score (micro avg) 0.8085
2023-10-11 23:47:45,653 ----------------------------------------------------------------------------------------------------
2023-10-11 23:47:45,656 Loading model from best epoch ...
2023-10-11 23:47:49,477 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-11 23:49:01,259
Results:
- F-score (micro) 0.7008
- F-score (macro) 0.6502
- Accuracy 0.5557

By class:
              precision    recall  f1-score   support

         LOC     0.6994    0.7160    0.7076      1095
         PER     0.7683    0.7767    0.7725      1012
         ORG     0.4524    0.5994    0.5157       357
   HumanProd     0.5349    0.6970    0.6053        33

   micro avg     0.6793    0.7237    0.7008      2497
   macro avg     0.6138    0.6973    0.6502      2497
weighted avg     0.6898    0.7237    0.7051      2497

2023-10-11 23:49:01,259 ----------------------------------------------------------------------------------------------------
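Note: the best checkpoint saved above (epoch 6, dev micro F1 0.8117) can be loaded for inference with the standard Flair API. A minimal, illustrative sketch; the example sentence is hypothetical, and the path is the training base path logged above plus best-model.pt:

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written at the "saving best model" steps above.
tagger = SequenceTagger.load(
    "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

# Hypothetical French example sentence, for illustration only.
sentence = Sentence("Le journal Le Temps est publié à Paris .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(span.text, label.value, f"{label.score:.2f}")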