|
2023-10-10 19:39:45,665 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:39:45,667 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-10 19:39:45,667 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:39:45,667 MultiCorpus: 7142 train + 698 dev + 2570 test sentences |
|
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator |
|
2023-10-10 19:39:45,667 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:39:45,667 Train: 7142 sentences |
|
2023-10-10 19:39:45,667 (train_with_dev=False, train_with_test=False) |
|
2023-10-10 19:39:45,668 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:39:45,668 Training Params: |
|
2023-10-10 19:39:45,668 - learning_rate: "0.00015" |
|
2023-10-10 19:39:45,668 - mini_batch_size: "8" |
|
2023-10-10 19:39:45,668 - max_epochs: "10" |
|
2023-10-10 19:39:45,668 - shuffle: "True" |
|
2023-10-10 19:39:45,668 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:39:45,668 Plugins: |
|
2023-10-10 19:39:45,668 - TensorboardLogger |
|
2023-10-10 19:39:45,668 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-10 19:39:45,668 ---------------------------------------------------------------------------------------------------- |
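The `lr` column in the iteration lines below is consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The following is a minimal sketch of that schedule, not Flair's own implementation: the helper names `linear_warmup_lr` and `lr_at` are hypothetical, and the step counts are derived from the values logged above (7142 training sentences, mini-batch size 8, 10 epochs, peak learning rate 0.00015, warmup fraction 0.1).

```python
import math

PEAK_LR = 0.00015          # learning_rate from "Training Params"
MINI_BATCH_SIZE = 8        # mini_batch_size from "Training Params"
MAX_EPOCHS = 10            # max_epochs from "Training Params"
TRAIN_SENTENCES = 7142     # from the corpus summary
WARMUP_FRACTION = 0.1      # from the LinearScheduler plugin line

# 7142 sentences in batches of 8 gives the 893 iterations per epoch seen in the log.
STEPS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS


def linear_warmup_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS,
                     warmup_fraction=WARMUP_FRACTION):
    """Assumed schedule: linear ramp to peak_lr, then linear decay to 0."""
    warmup_steps = int(round(total_steps * warmup_fraction))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0.0, (total_steps - step) / (total_steps - warmup_steps))
    return peak_lr * remaining


def lr_at(epoch, iteration):
    """Learning rate at a 1-based (epoch, iteration) pair as printed in the log."""
    step = (epoch - 1) * STEPS_PER_EPOCH + iteration
    return linear_warmup_lr(step)


if __name__ == "__main__":
    for epoch, iteration in [(1, 89), (1, 890), (2, 89), (5, 445), (10, 890)]:
        print(f"epoch {epoch} iter {iteration}: lr {lr_at(epoch, iteration):.6f}")
```

The sketch agrees with the logged values to within about one unit in the sixth decimal place; the small differences suggest the logger reports the rate from just before the current scheduler step, which is an inference from the numbers rather than something the log states.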
|
2023-10-10 19:39:45,668 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-10 19:39:45,668 - metric: "('micro avg', 'f1-score')" |
|
2023-10-10 19:39:45,668 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:39:45,669 Computation: |
|
2023-10-10 19:39:45,669 - compute on device: cuda:0 |
|
2023-10-10 19:39:45,669 - embedding storage: none |
|
2023-10-10 19:39:45,669 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:39:45,669 Model training base path: "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-10 19:39:45,669 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:39:45,669 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:39:45,669 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-10 19:40:37,221 epoch 1 - iter 89/893 - loss 2.82906495 - time (sec): 51.55 - samples/sec: 494.24 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-10 19:41:26,623 epoch 1 - iter 178/893 - loss 2.78110436 - time (sec): 100.95 - samples/sec: 499.36 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-10 19:42:16,721 epoch 1 - iter 267/893 - loss 2.60742789 - time (sec): 151.05 - samples/sec: 496.82 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-10 19:43:07,007 epoch 1 - iter 356/893 - loss 2.38685711 - time (sec): 201.34 - samples/sec: 494.15 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-10 19:44:00,063 epoch 1 - iter 445/893 - loss 2.13431932 - time (sec): 254.39 - samples/sec: 494.13 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-10 19:44:51,943 epoch 1 - iter 534/893 - loss 1.92436744 - time (sec): 306.27 - samples/sec: 485.92 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-10 19:45:41,674 epoch 1 - iter 623/893 - loss 1.74254208 - time (sec): 356.00 - samples/sec: 484.27 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-10 19:46:33,920 epoch 1 - iter 712/893 - loss 1.57825868 - time (sec): 408.25 - samples/sec: 484.14 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-10 19:47:28,413 epoch 1 - iter 801/893 - loss 1.44698773 - time (sec): 462.74 - samples/sec: 482.79 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-10 19:48:22,731 epoch 1 - iter 890/893 - loss 1.34261657 - time (sec): 517.06 - samples/sec: 479.59 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-10 19:48:24,352 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:48:24,352 EPOCH 1 done: loss 1.3396 - lr: 0.000149 |
|
2023-10-10 19:48:45,236 DEV : loss 0.2843281924724579 - f1-score (micro avg) 0.2763 |
|
2023-10-10 19:48:45,266 saving best model |
|
2023-10-10 19:48:46,188 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:49:39,164 epoch 2 - iter 89/893 - loss 0.31930581 - time (sec): 52.97 - samples/sec: 499.28 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-10 19:50:29,513 epoch 2 - iter 178/893 - loss 0.31347332 - time (sec): 103.32 - samples/sec: 488.21 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-10 19:51:19,488 epoch 2 - iter 267/893 - loss 0.29696194 - time (sec): 153.30 - samples/sec: 484.60 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-10 19:52:10,169 epoch 2 - iter 356/893 - loss 0.27774706 - time (sec): 203.98 - samples/sec: 487.01 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-10 19:53:00,310 epoch 2 - iter 445/893 - loss 0.26146640 - time (sec): 254.12 - samples/sec: 488.59 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-10 19:53:52,223 epoch 2 - iter 534/893 - loss 0.25100648 - time (sec): 306.03 - samples/sec: 484.05 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-10 19:54:44,777 epoch 2 - iter 623/893 - loss 0.23745422 - time (sec): 358.59 - samples/sec: 483.90 - lr: 0.000138 - momentum: 0.000000 |
|
2023-10-10 19:55:37,395 epoch 2 - iter 712/893 - loss 0.22599111 - time (sec): 411.20 - samples/sec: 485.53 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-10 19:56:28,840 epoch 2 - iter 801/893 - loss 0.21632815 - time (sec): 462.65 - samples/sec: 483.50 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-10 19:57:21,815 epoch 2 - iter 890/893 - loss 0.20811631 - time (sec): 515.62 - samples/sec: 481.20 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-10 19:57:23,363 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:57:23,364 EPOCH 2 done: loss 0.2078 - lr: 0.000133 |
|
2023-10-10 19:57:46,164 DEV : loss 0.11342751234769821 - f1-score (micro avg) 0.737 |
|
2023-10-10 19:57:46,198 saving best model |
|
2023-10-10 19:57:56,671 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 19:58:46,865 epoch 3 - iter 89/893 - loss 0.09254059 - time (sec): 50.19 - samples/sec: 475.99 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-10 19:59:37,979 epoch 3 - iter 178/893 - loss 0.08867836 - time (sec): 101.30 - samples/sec: 487.47 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-10 20:00:28,421 epoch 3 - iter 267/893 - loss 0.08987985 - time (sec): 151.75 - samples/sec: 487.94 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-10 20:01:17,835 epoch 3 - iter 356/893 - loss 0.09300563 - time (sec): 201.16 - samples/sec: 484.15 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-10 20:02:09,174 epoch 3 - iter 445/893 - loss 0.09146828 - time (sec): 252.50 - samples/sec: 488.22 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-10 20:02:59,499 epoch 3 - iter 534/893 - loss 0.08889789 - time (sec): 302.82 - samples/sec: 489.12 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-10 20:03:50,308 epoch 3 - iter 623/893 - loss 0.08580958 - time (sec): 353.63 - samples/sec: 488.96 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-10 20:04:42,049 epoch 3 - iter 712/893 - loss 0.08460481 - time (sec): 405.37 - samples/sec: 489.39 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-10 20:05:35,281 epoch 3 - iter 801/893 - loss 0.08297470 - time (sec): 458.61 - samples/sec: 490.86 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-10 20:06:25,590 epoch 3 - iter 890/893 - loss 0.08298388 - time (sec): 508.91 - samples/sec: 487.34 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-10 20:06:27,249 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:06:27,249 EPOCH 3 done: loss 0.0829 - lr: 0.000117 |
|
2023-10-10 20:06:49,372 DEV : loss 0.1113942340016365 - f1-score (micro avg) 0.7559 |
|
2023-10-10 20:06:49,404 saving best model |
|
2023-10-10 20:06:56,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:07:48,821 epoch 4 - iter 89/893 - loss 0.05345803 - time (sec): 52.76 - samples/sec: 472.25 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-10 20:08:39,233 epoch 4 - iter 178/893 - loss 0.05431726 - time (sec): 103.17 - samples/sec: 476.99 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-10 20:09:31,627 epoch 4 - iter 267/893 - loss 0.05425313 - time (sec): 155.57 - samples/sec: 474.35 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-10 20:10:23,324 epoch 4 - iter 356/893 - loss 0.05786397 - time (sec): 207.26 - samples/sec: 478.70 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-10 20:11:15,726 epoch 4 - iter 445/893 - loss 0.05597032 - time (sec): 259.67 - samples/sec: 483.48 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-10 20:12:07,752 epoch 4 - iter 534/893 - loss 0.05501856 - time (sec): 311.69 - samples/sec: 483.60 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-10 20:13:01,380 epoch 4 - iter 623/893 - loss 0.05376043 - time (sec): 365.32 - samples/sec: 484.68 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-10 20:13:51,750 epoch 4 - iter 712/893 - loss 0.05374869 - time (sec): 415.69 - samples/sec: 484.81 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-10 20:14:42,974 epoch 4 - iter 801/893 - loss 0.05390271 - time (sec): 466.91 - samples/sec: 481.79 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-10 20:15:33,480 epoch 4 - iter 890/893 - loss 0.05392475 - time (sec): 517.42 - samples/sec: 479.54 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-10 20:15:34,961 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:15:34,962 EPOCH 4 done: loss 0.0540 - lr: 0.000100 |
|
2023-10-10 20:15:56,301 DEV : loss 0.11903274804353714 - f1-score (micro avg) 0.7738 |
|
2023-10-10 20:15:56,332 saving best model |
|
2023-10-10 20:16:02,540 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:16:54,836 epoch 5 - iter 89/893 - loss 0.03937382 - time (sec): 52.29 - samples/sec: 486.27 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-10 20:17:46,858 epoch 5 - iter 178/893 - loss 0.03701172 - time (sec): 104.31 - samples/sec: 466.44 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-10 20:18:41,817 epoch 5 - iter 267/893 - loss 0.03814808 - time (sec): 159.27 - samples/sec: 467.57 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-10 20:19:35,645 epoch 5 - iter 356/893 - loss 0.04034811 - time (sec): 213.10 - samples/sec: 472.42 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-10 20:20:27,648 epoch 5 - iter 445/893 - loss 0.04030794 - time (sec): 265.10 - samples/sec: 465.76 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-10 20:21:17,438 epoch 5 - iter 534/893 - loss 0.03985960 - time (sec): 314.89 - samples/sec: 468.02 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-10 20:22:09,623 epoch 5 - iter 623/893 - loss 0.03993508 - time (sec): 367.08 - samples/sec: 469.85 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-10 20:23:02,735 epoch 5 - iter 712/893 - loss 0.03990589 - time (sec): 420.19 - samples/sec: 471.66 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-10 20:23:56,026 epoch 5 - iter 801/893 - loss 0.03955761 - time (sec): 473.48 - samples/sec: 471.16 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-10 20:24:48,739 epoch 5 - iter 890/893 - loss 0.03910254 - time (sec): 526.20 - samples/sec: 471.38 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-10 20:24:50,326 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:24:50,326 EPOCH 5 done: loss 0.0391 - lr: 0.000083 |
|
2023-10-10 20:25:13,897 DEV : loss 0.12937377393245697 - f1-score (micro avg) 0.788 |
|
2023-10-10 20:25:13,940 saving best model |
|
2023-10-10 20:25:16,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:26:09,547 epoch 6 - iter 89/893 - loss 0.02822273 - time (sec): 52.57 - samples/sec: 474.24 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-10 20:27:00,916 epoch 6 - iter 178/893 - loss 0.02803573 - time (sec): 103.93 - samples/sec: 476.62 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-10 20:27:52,841 epoch 6 - iter 267/893 - loss 0.02761811 - time (sec): 155.86 - samples/sec: 482.22 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-10 20:28:41,858 epoch 6 - iter 356/893 - loss 0.02913699 - time (sec): 204.88 - samples/sec: 484.20 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-10 20:29:32,333 epoch 6 - iter 445/893 - loss 0.02861398 - time (sec): 255.35 - samples/sec: 481.80 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-10 20:30:22,684 epoch 6 - iter 534/893 - loss 0.02936834 - time (sec): 305.70 - samples/sec: 483.71 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-10 20:31:16,351 epoch 6 - iter 623/893 - loss 0.02911242 - time (sec): 359.37 - samples/sec: 484.80 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-10 20:32:07,955 epoch 6 - iter 712/893 - loss 0.02921508 - time (sec): 410.97 - samples/sec: 484.48 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-10 20:32:59,979 epoch 6 - iter 801/893 - loss 0.02955267 - time (sec): 463.00 - samples/sec: 485.26 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-10 20:33:49,274 epoch 6 - iter 890/893 - loss 0.03022345 - time (sec): 512.29 - samples/sec: 484.15 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-10 20:33:50,856 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:33:50,857 EPOCH 6 done: loss 0.0301 - lr: 0.000067 |
|
2023-10-10 20:34:13,518 DEV : loss 0.16373801231384277 - f1-score (micro avg) 0.7847 |
|
2023-10-10 20:34:13,556 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:35:05,833 epoch 7 - iter 89/893 - loss 0.01814226 - time (sec): 52.28 - samples/sec: 485.39 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-10 20:35:55,252 epoch 7 - iter 178/893 - loss 0.02138159 - time (sec): 101.69 - samples/sec: 479.04 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-10 20:36:47,244 epoch 7 - iter 267/893 - loss 0.02102136 - time (sec): 153.69 - samples/sec: 482.73 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-10 20:37:37,631 epoch 7 - iter 356/893 - loss 0.02233733 - time (sec): 204.07 - samples/sec: 484.89 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-10 20:38:28,716 epoch 7 - iter 445/893 - loss 0.02235734 - time (sec): 255.16 - samples/sec: 481.90 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-10 20:39:19,282 epoch 7 - iter 534/893 - loss 0.02197453 - time (sec): 305.72 - samples/sec: 485.02 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-10 20:40:10,988 epoch 7 - iter 623/893 - loss 0.02262488 - time (sec): 357.43 - samples/sec: 484.42 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-10 20:41:00,028 epoch 7 - iter 712/893 - loss 0.02251825 - time (sec): 406.47 - samples/sec: 483.34 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-10 20:41:52,314 epoch 7 - iter 801/893 - loss 0.02299809 - time (sec): 458.76 - samples/sec: 485.89 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-10 20:42:43,299 epoch 7 - iter 890/893 - loss 0.02327864 - time (sec): 509.74 - samples/sec: 486.71 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-10 20:42:44,897 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:42:44,897 EPOCH 7 done: loss 0.0233 - lr: 0.000050 |
|
2023-10-10 20:43:07,633 DEV : loss 0.17558923363685608 - f1-score (micro avg) 0.7827 |
|
2023-10-10 20:43:07,671 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:44:00,852 epoch 8 - iter 89/893 - loss 0.01531205 - time (sec): 53.18 - samples/sec: 461.75 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-10 20:44:51,517 epoch 8 - iter 178/893 - loss 0.01532229 - time (sec): 103.84 - samples/sec: 467.12 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-10 20:45:42,040 epoch 8 - iter 267/893 - loss 0.01676001 - time (sec): 154.37 - samples/sec: 466.78 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-10 20:46:33,877 epoch 8 - iter 356/893 - loss 0.01635882 - time (sec): 206.20 - samples/sec: 476.10 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-10 20:47:24,646 epoch 8 - iter 445/893 - loss 0.01650485 - time (sec): 256.97 - samples/sec: 477.47 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-10 20:48:14,620 epoch 8 - iter 534/893 - loss 0.01709783 - time (sec): 306.95 - samples/sec: 475.39 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-10 20:49:06,724 epoch 8 - iter 623/893 - loss 0.01756030 - time (sec): 359.05 - samples/sec: 475.74 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-10 20:49:58,454 epoch 8 - iter 712/893 - loss 0.01732975 - time (sec): 410.78 - samples/sec: 475.08 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-10 20:50:49,934 epoch 8 - iter 801/893 - loss 0.01736550 - time (sec): 462.26 - samples/sec: 478.47 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-10 20:51:42,783 epoch 8 - iter 890/893 - loss 0.01724690 - time (sec): 515.11 - samples/sec: 480.96 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-10 20:51:44,567 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:51:44,568 EPOCH 8 done: loss 0.0174 - lr: 0.000033 |
|
2023-10-10 20:52:07,454 DEV : loss 0.17957079410552979 - f1-score (micro avg) 0.7849 |
|
2023-10-10 20:52:07,486 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 20:52:59,178 epoch 9 - iter 89/893 - loss 0.01800137 - time (sec): 51.69 - samples/sec: 481.68 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-10 20:53:49,523 epoch 9 - iter 178/893 - loss 0.01641771 - time (sec): 102.03 - samples/sec: 478.59 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-10 20:54:39,930 epoch 9 - iter 267/893 - loss 0.01721855 - time (sec): 152.44 - samples/sec: 490.93 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-10 20:55:28,936 epoch 9 - iter 356/893 - loss 0.01603931 - time (sec): 201.45 - samples/sec: 485.42 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-10 20:56:17,981 epoch 9 - iter 445/893 - loss 0.01512545 - time (sec): 250.49 - samples/sec: 486.34 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-10 20:57:09,343 epoch 9 - iter 534/893 - loss 0.01508396 - time (sec): 301.86 - samples/sec: 484.37 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-10 20:57:58,870 epoch 9 - iter 623/893 - loss 0.01469697 - time (sec): 351.38 - samples/sec: 484.78 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-10 20:58:50,106 epoch 9 - iter 712/893 - loss 0.01406217 - time (sec): 402.62 - samples/sec: 486.23 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-10 20:59:40,498 epoch 9 - iter 801/893 - loss 0.01413326 - time (sec): 453.01 - samples/sec: 488.22 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-10 21:00:33,325 epoch 9 - iter 890/893 - loss 0.01414608 - time (sec): 505.84 - samples/sec: 490.12 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-10 21:00:35,031 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:00:35,031 EPOCH 9 done: loss 0.0142 - lr: 0.000017 |
|
2023-10-10 21:00:56,874 DEV : loss 0.19276094436645508 - f1-score (micro avg) 0.778 |
|
2023-10-10 21:00:56,904 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:01:47,781 epoch 10 - iter 89/893 - loss 0.01006358 - time (sec): 50.88 - samples/sec: 496.06 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-10 21:02:37,723 epoch 10 - iter 178/893 - loss 0.01182496 - time (sec): 100.82 - samples/sec: 488.56 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-10 21:03:26,107 epoch 10 - iter 267/893 - loss 0.01307392 - time (sec): 149.20 - samples/sec: 484.27 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-10 21:04:17,976 epoch 10 - iter 356/893 - loss 0.01229535 - time (sec): 201.07 - samples/sec: 490.49 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-10 21:05:09,179 epoch 10 - iter 445/893 - loss 0.01180387 - time (sec): 252.27 - samples/sec: 494.77 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-10 21:05:59,983 epoch 10 - iter 534/893 - loss 0.01212851 - time (sec): 303.08 - samples/sec: 490.61 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-10 21:06:51,056 epoch 10 - iter 623/893 - loss 0.01223689 - time (sec): 354.15 - samples/sec: 494.49 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-10 21:07:41,031 epoch 10 - iter 712/893 - loss 0.01282452 - time (sec): 404.12 - samples/sec: 491.58 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-10 21:08:33,478 epoch 10 - iter 801/893 - loss 0.01253621 - time (sec): 456.57 - samples/sec: 487.48 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-10 21:09:26,121 epoch 10 - iter 890/893 - loss 0.01243382 - time (sec): 509.21 - samples/sec: 487.12 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-10 21:09:27,717 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:09:27,718 EPOCH 10 done: loss 0.0124 - lr: 0.000000 |
|
2023-10-10 21:09:50,838 DEV : loss 0.19964276254177094 - f1-score (micro avg) 0.7778 |
|
2023-10-10 21:09:51,780 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:09:51,782 Loading model from best epoch ... |
|
2023-10-10 21:09:57,984 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
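The 17 tags listed above match the `out_features=17` of the final linear layer: one `O` tag plus four positional prefixes (S, B, E, I) for each of the four entity types. A short sketch that rebuilds the same list, with the variable names chosen here for illustration only:

```python
# Entity types and prefix order as they appear in the logged tag dictionary.
entity_types = ["PER", "LOC", "ORG", "HumanProd"]
prefixes = ["S", "B", "E", "I"]  # single, begin, end, inside

# "O" first, then every prefix for each entity type in turn.
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in prefixes]

print(len(tags))
print(", ".join(tags))
```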
|
2023-10-10 21:11:09,160 |
|
Results: |
|
- F-score (micro) 0.7086 |
|
- F-score (macro) 0.6249 |
|
- Accuracy 0.5641 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         LOC     0.7183    0.7406    0.7293      1095 |
|
         PER     0.7906    0.7648    0.7775      1012 |
|
         ORG     0.4558    0.5630    0.5038       357 |
|
   HumanProd     0.3860    0.6667    0.4889        33 |
|

|
   micro avg     0.6938    0.7241    0.7086      2497 |
|
   macro avg     0.5877    0.6838    0.6249      2497 |
|
weighted avg     0.7057    0.7241    0.7134      2497 |
|
|
|
2023-10-10 21:11:09,161 ---------------------------------------------------------------------------------------------------- |
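The summary rows of the report above can be cross-checked from the per-class rows. The sketch below assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro average as the unweighted mean of per-class F1, weighted average as the support-weighted mean). It works from the rounded four-decimal values in the log, so it can only reproduce the reported figures approximately, and the micro F1 is recomputed from the logged micro precision and recall rather than from raw counts, which the log does not contain.

```python
# (precision, recall, support) copied from the "By class" table in the log.
per_class = {
    "LOC": (0.7183, 0.7406, 1095),
    "PER": (0.7906, 0.7648, 1012),
    "ORG": (0.4558, 0.5630, 357),
    "HumanProd": (0.3860, 0.6667, 33),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


class_f1 = {name: f1(p, r) for name, (p, r, _) in per_class.items()}

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Weighted average: each class contributes in proportion to its support.
total_support = sum(support for _, _, support in per_class.values())
weighted_f1 = sum(
    class_f1[name] * support for name, (_, _, support) in per_class.items()
) / total_support

# Micro F1 from the logged micro-averaged precision and recall.
micro_f1 = f1(0.6938, 0.7241)

if __name__ == "__main__":
    for name, score in class_f1.items():
        print(f"{name:>12} f1 = {score:.4f}")
    print(f"micro = {micro_f1:.4f}  macro = {macro_f1:.4f}  weighted = {weighted_f1:.4f}")
```

The gap between the micro score (0.7086) and the macro score (0.6249) comes from the two low-support classes, ORG and HumanProd, which pull the unweighted mean down while contributing little to the pooled counts.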
|
|