|
2023-10-12 22:42:25,718 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 22:42:25,721 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-12 22:42:25,721 ---------------------------------------------------------------------------------------------------- |
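The module printout above gives every layer shape, so the model size can be estimated by hand. The following is a rough sketch rather than an official figure: it assumes `shared` and `embed_tokens` are one tied weight matrix (counted once), that each `FusedRMSNorm` holds a single weight vector with no bias, and that only block 0 owns the `relative_attention_bias` table, as the printout suggests. All constant and function names are illustrative.

```python
# Shapes copied from the SequenceTagger printout.
D_MODEL = 1472      # hidden size
D_ATTN = 384        # out_features of q/k/v (in_features of o)
D_FF = 3584         # feed-forward inner size
VOCAB = 384         # byte-level vocabulary size
N_BLOCKS = 12       # block 0 plus "(1-11): 11 x T5Block"
REL_BUCKETS, N_HEADS = 32, 6   # relative_attention_bias: Embedding(32, 6)
N_TAGS = 13         # out_features of the final linear layer


def count_parameters() -> int:
    """Estimate the number of weights implied by the printed shapes."""
    embedding = VOCAB * D_MODEL               # assumption: tied, counted once
    attention = 4 * D_MODEL * D_ATTN          # q, k, v, o (bias=False)
    feed_forward = 3 * D_MODEL * D_FF         # wi_0, wi_1, wo (bias=False)
    norms = 2 * D_MODEL                       # two RMSNorm weight vectors per block
    block = attention + feed_forward + norms
    rel_bias = REL_BUCKETS * N_HEADS          # assumption: present in block 0 only
    final_norm = D_MODEL
    head = D_MODEL * N_TAGS + N_TAGS          # linear layer with bias=True
    return embedding + N_BLOCKS * block + rel_bias + final_norm + head


total_parameters = count_parameters()
print(f"{total_parameters:,} parameters")
```

Under these assumptions the tagger has roughly 218 million weights, almost all of them in the twelve feed-forward sublayers.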
|
2023-10-12 22:42:25,721 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences |
|
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator |
|
2023-10-12 22:42:25,721 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 22:42:25,721 Train: 14465 sentences |
|
2023-10-12 22:42:25,721 (train_with_dev=False, train_with_test=False) |
|
2023-10-12 22:42:25,722 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 22:42:25,722 Training Params: |
|
2023-10-12 22:42:25,722 - learning_rate: "0.00015" |
|
2023-10-12 22:42:25,722 - mini_batch_size: "8" |
|
2023-10-12 22:42:25,722 - max_epochs: "10" |
|
2023-10-12 22:42:25,722 - shuffle: "True" |
|
2023-10-12 22:42:25,722 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 22:42:25,722 Plugins: |
|
2023-10-12 22:42:25,722 - TensorboardLogger |
|
2023-10-12 22:42:25,722 - LinearScheduler | warmup_fraction: '0.1' |
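The training parameters and the `LinearScheduler` plugin together determine the learning rate reported on each iteration line of this log. The sketch below is a plain-Python approximation, not Flair's own implementation: it assumes linear warmup from 0 to the peak rate over the first `warmup_fraction` of all optimizer steps, followed by linear decay to 0. The constants are taken from this log; the function and variable names are illustrative.

```python
import math

# Values taken from the "Training Params" and "Plugins" sections of this log.
TRAIN_SENTENCES = 14465
MINI_BATCH_SIZE = 8
MAX_EPOCHS = 10
PEAK_LR = 0.00015
WARMUP_FRACTION = 0.1

# 14465 sentences in batches of 8 need a final partial batch, hence ceil.
STEPS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS
WARMUP_STEPS = int(TOTAL_STEPS * WARMUP_FRACTION)


def linear_schedule_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


# Compare with the logged values at iteration 180 of epochs 1 and 2.
print(f"{linear_schedule_lr(180):.6f}")
print(f"{linear_schedule_lr(STEPS_PER_EPOCH + 180):.6f}")
```

With these numbers the warmup covers exactly the first epoch (1809 of 18090 steps), which is consistent with the learning rate peaking near 0.00015 at the end of epoch 1 and decaying afterwards.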
|
2023-10-12 22:42:25,722 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 22:42:25,722 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-12 22:42:25,722 - metric: "('micro avg', 'f1-score')" |
|
2023-10-12 22:42:25,722 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 22:42:25,722 Computation: |
|
2023-10-12 22:42:25,723 - compute on device: cuda:0 |
|
2023-10-12 22:42:25,723 - embedding storage: none |
|
2023-10-12 22:42:25,723 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 22:42:25,723 Model training base path: "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-12 22:42:25,723 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 22:42:25,723 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 22:42:25,723 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-12 22:44:07,805 epoch 1 - iter 180/1809 - loss 2.57222062 - time (sec): 102.08 - samples/sec: 376.55 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-12 22:45:47,049 epoch 1 - iter 360/1809 - loss 2.35786106 - time (sec): 201.32 - samples/sec: 377.04 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-12 22:47:19,208 epoch 1 - iter 540/1809 - loss 2.01770182 - time (sec): 293.48 - samples/sec: 384.68 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-12 22:48:49,641 epoch 1 - iter 720/1809 - loss 1.66056163 - time (sec): 383.92 - samples/sec: 395.07 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-12 22:50:18,165 epoch 1 - iter 900/1809 - loss 1.38841808 - time (sec): 472.44 - samples/sec: 400.83 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-12 22:51:47,112 epoch 1 - iter 1080/1809 - loss 1.19834151 - time (sec): 561.39 - samples/sec: 402.83 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-12 22:53:15,665 epoch 1 - iter 1260/1809 - loss 1.05788394 - time (sec): 649.94 - samples/sec: 404.31 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-12 22:54:46,068 epoch 1 - iter 1440/1809 - loss 0.94563350 - time (sec): 740.34 - samples/sec: 406.31 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-12 22:56:17,923 epoch 1 - iter 1620/1809 - loss 0.85880262 - time (sec): 832.20 - samples/sec: 407.21 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-12 22:57:51,086 epoch 1 - iter 1800/1809 - loss 0.78348215 - time (sec): 925.36 - samples/sec: 408.35 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-12 22:57:55,498 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 22:57:55,498 EPOCH 1 done: loss 0.7800 - lr: 0.000149 |
|
2023-10-12 22:58:31,841 DEV : loss 0.1441633254289627 - f1-score (micro avg) 0.3673 |
|
2023-10-12 22:58:31,902 saving best model |
|
2023-10-12 22:58:32,765 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 23:00:09,462 epoch 2 - iter 180/1809 - loss 0.11170747 - time (sec): 96.69 - samples/sec: 400.61 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-12 23:01:44,813 epoch 2 - iter 360/1809 - loss 0.11397447 - time (sec): 192.05 - samples/sec: 403.34 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-12 23:03:16,132 epoch 2 - iter 540/1809 - loss 0.11058507 - time (sec): 283.36 - samples/sec: 406.96 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-12 23:04:47,107 epoch 2 - iter 720/1809 - loss 0.10972006 - time (sec): 374.34 - samples/sec: 406.43 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-12 23:06:17,438 epoch 2 - iter 900/1809 - loss 0.10675904 - time (sec): 464.67 - samples/sec: 405.71 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-12 23:07:48,550 epoch 2 - iter 1080/1809 - loss 0.10643316 - time (sec): 555.78 - samples/sec: 408.15 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-12 23:09:19,398 epoch 2 - iter 1260/1809 - loss 0.10422146 - time (sec): 646.63 - samples/sec: 409.03 - lr: 0.000138 - momentum: 0.000000 |
|
2023-10-12 23:10:52,238 epoch 2 - iter 1440/1809 - loss 0.10187179 - time (sec): 739.47 - samples/sec: 409.41 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-12 23:12:23,351 epoch 2 - iter 1620/1809 - loss 0.09902556 - time (sec): 830.58 - samples/sec: 410.91 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-12 23:13:54,328 epoch 2 - iter 1800/1809 - loss 0.09837045 - time (sec): 921.56 - samples/sec: 410.24 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-12 23:13:58,390 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 23:13:58,390 EPOCH 2 done: loss 0.0981 - lr: 0.000133 |
|
2023-10-12 23:14:36,178 DEV : loss 0.0997246503829956 - f1-score (micro avg) 0.6206 |
|
2023-10-12 23:14:36,235 saving best model |
|
2023-10-12 23:14:38,819 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 23:16:13,637 epoch 3 - iter 180/1809 - loss 0.06267834 - time (sec): 94.81 - samples/sec: 402.42 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-12 23:17:46,059 epoch 3 - iter 360/1809 - loss 0.06023491 - time (sec): 187.24 - samples/sec: 408.98 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-12 23:19:16,706 epoch 3 - iter 540/1809 - loss 0.06190782 - time (sec): 277.88 - samples/sec: 407.79 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-12 23:20:52,077 epoch 3 - iter 720/1809 - loss 0.06109607 - time (sec): 373.25 - samples/sec: 403.10 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-12 23:22:25,696 epoch 3 - iter 900/1809 - loss 0.06165193 - time (sec): 466.87 - samples/sec: 403.46 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-12 23:23:59,309 epoch 3 - iter 1080/1809 - loss 0.06206747 - time (sec): 560.49 - samples/sec: 403.93 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-12 23:25:33,753 epoch 3 - iter 1260/1809 - loss 0.06194083 - time (sec): 654.93 - samples/sec: 404.63 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-12 23:27:05,673 epoch 3 - iter 1440/1809 - loss 0.06118851 - time (sec): 746.85 - samples/sec: 404.71 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-12 23:28:38,308 epoch 3 - iter 1620/1809 - loss 0.06177140 - time (sec): 839.48 - samples/sec: 405.10 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-12 23:30:11,083 epoch 3 - iter 1800/1809 - loss 0.06083773 - time (sec): 932.26 - samples/sec: 405.59 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-12 23:30:15,308 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 23:30:15,309 EPOCH 3 done: loss 0.0608 - lr: 0.000117 |
|
2023-10-12 23:30:54,908 DEV : loss 0.14741627871990204 - f1-score (micro avg) 0.6255 |
|
2023-10-12 23:30:54,971 saving best model |
|
2023-10-12 23:30:57,602 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 23:32:29,140 epoch 4 - iter 180/1809 - loss 0.04432308 - time (sec): 91.53 - samples/sec: 402.88 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-12 23:34:01,093 epoch 4 - iter 360/1809 - loss 0.04554957 - time (sec): 183.49 - samples/sec: 415.77 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-12 23:35:32,639 epoch 4 - iter 540/1809 - loss 0.04344483 - time (sec): 275.03 - samples/sec: 412.63 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-12 23:37:05,804 epoch 4 - iter 720/1809 - loss 0.04221052 - time (sec): 368.20 - samples/sec: 408.78 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-12 23:38:39,875 epoch 4 - iter 900/1809 - loss 0.04316114 - time (sec): 462.27 - samples/sec: 405.17 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-12 23:40:13,662 epoch 4 - iter 1080/1809 - loss 0.04415013 - time (sec): 556.05 - samples/sec: 405.42 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-12 23:41:44,160 epoch 4 - iter 1260/1809 - loss 0.04402235 - time (sec): 646.55 - samples/sec: 406.88 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-12 23:43:15,665 epoch 4 - iter 1440/1809 - loss 0.04423013 - time (sec): 738.06 - samples/sec: 407.84 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-12 23:44:47,822 epoch 4 - iter 1620/1809 - loss 0.04340254 - time (sec): 830.22 - samples/sec: 409.92 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-12 23:46:22,508 epoch 4 - iter 1800/1809 - loss 0.04342319 - time (sec): 924.90 - samples/sec: 408.87 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-12 23:46:26,848 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 23:46:26,848 EPOCH 4 done: loss 0.0437 - lr: 0.000100 |
|
2023-10-12 23:47:04,639 DEV : loss 0.1756805181503296 - f1-score (micro avg) 0.6169 |
|
2023-10-12 23:47:04,696 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 23:48:38,776 epoch 5 - iter 180/1809 - loss 0.03157894 - time (sec): 94.08 - samples/sec: 407.31 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-12 23:50:11,001 epoch 5 - iter 360/1809 - loss 0.02830269 - time (sec): 186.30 - samples/sec: 411.52 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-12 23:51:41,162 epoch 5 - iter 540/1809 - loss 0.02843117 - time (sec): 276.46 - samples/sec: 409.58 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-12 23:53:15,346 epoch 5 - iter 720/1809 - loss 0.03067151 - time (sec): 370.65 - samples/sec: 403.73 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-12 23:54:49,546 epoch 5 - iter 900/1809 - loss 0.03160142 - time (sec): 464.85 - samples/sec: 402.68 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-12 23:56:20,496 epoch 5 - iter 1080/1809 - loss 0.03108926 - time (sec): 555.80 - samples/sec: 405.37 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-12 23:57:47,978 epoch 5 - iter 1260/1809 - loss 0.03185705 - time (sec): 643.28 - samples/sec: 408.31 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-12 23:59:20,267 epoch 5 - iter 1440/1809 - loss 0.03333433 - time (sec): 735.57 - samples/sec: 407.23 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-13 00:00:56,869 epoch 5 - iter 1620/1809 - loss 0.03273045 - time (sec): 832.17 - samples/sec: 408.49 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-13 00:02:34,892 epoch 5 - iter 1800/1809 - loss 0.03316525 - time (sec): 930.19 - samples/sec: 406.50 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-13 00:02:39,155 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 00:02:39,155 EPOCH 5 done: loss 0.0331 - lr: 0.000083 |
|
2023-10-13 00:03:20,597 DEV : loss 0.23575519025325775 - f1-score (micro avg) 0.617 |
|
2023-10-13 00:03:20,661 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 00:04:52,695 epoch 6 - iter 180/1809 - loss 0.02388711 - time (sec): 92.03 - samples/sec: 408.81 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-13 00:06:24,791 epoch 6 - iter 360/1809 - loss 0.02401339 - time (sec): 184.13 - samples/sec: 411.47 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-13 00:07:59,259 epoch 6 - iter 540/1809 - loss 0.02310670 - time (sec): 278.60 - samples/sec: 406.14 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-13 00:09:31,467 epoch 6 - iter 720/1809 - loss 0.02334831 - time (sec): 370.80 - samples/sec: 408.26 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-13 00:11:06,383 epoch 6 - iter 900/1809 - loss 0.02452192 - time (sec): 465.72 - samples/sec: 407.11 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-13 00:12:41,135 epoch 6 - iter 1080/1809 - loss 0.02459648 - time (sec): 560.47 - samples/sec: 405.40 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-13 00:14:17,989 epoch 6 - iter 1260/1809 - loss 0.02521649 - time (sec): 657.33 - samples/sec: 403.03 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-13 00:15:54,582 epoch 6 - iter 1440/1809 - loss 0.02521885 - time (sec): 753.92 - samples/sec: 399.88 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-13 00:17:34,466 epoch 6 - iter 1620/1809 - loss 0.02506005 - time (sec): 853.80 - samples/sec: 397.24 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-13 00:19:10,740 epoch 6 - iter 1800/1809 - loss 0.02442039 - time (sec): 950.08 - samples/sec: 397.99 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-13 00:19:15,046 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 00:19:15,047 EPOCH 6 done: loss 0.0243 - lr: 0.000067 |
|
2023-10-13 00:19:55,967 DEV : loss 0.2817913591861725 - f1-score (micro avg) 0.6398 |
|
2023-10-13 00:19:56,045 saving best model |
|
2023-10-13 00:19:57,138 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 00:21:32,179 epoch 7 - iter 180/1809 - loss 0.01493884 - time (sec): 95.04 - samples/sec: 388.91 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-13 00:23:05,610 epoch 7 - iter 360/1809 - loss 0.01507536 - time (sec): 188.47 - samples/sec: 401.81 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-13 00:24:41,229 epoch 7 - iter 540/1809 - loss 0.01649935 - time (sec): 284.09 - samples/sec: 399.44 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-13 00:26:17,385 epoch 7 - iter 720/1809 - loss 0.01746454 - time (sec): 380.24 - samples/sec: 397.51 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-13 00:27:48,257 epoch 7 - iter 900/1809 - loss 0.01772093 - time (sec): 471.12 - samples/sec: 400.88 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-13 00:29:15,753 epoch 7 - iter 1080/1809 - loss 0.01691637 - time (sec): 558.61 - samples/sec: 404.51 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-13 00:30:47,556 epoch 7 - iter 1260/1809 - loss 0.01716291 - time (sec): 650.42 - samples/sec: 405.67 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-13 00:32:24,463 epoch 7 - iter 1440/1809 - loss 0.01829192 - time (sec): 747.32 - samples/sec: 403.28 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-13 00:34:06,279 epoch 7 - iter 1620/1809 - loss 0.01866002 - time (sec): 849.14 - samples/sec: 400.09 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-13 00:35:43,193 epoch 7 - iter 1800/1809 - loss 0.01826084 - time (sec): 946.05 - samples/sec: 399.82 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-13 00:35:47,355 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 00:35:47,356 EPOCH 7 done: loss 0.0182 - lr: 0.000050 |
|
2023-10-13 00:36:26,299 DEV : loss 0.3023461401462555 - f1-score (micro avg) 0.6502 |
|
2023-10-13 00:36:26,356 saving best model |
|
2023-10-13 00:36:28,945 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 00:37:59,940 epoch 8 - iter 180/1809 - loss 0.01451939 - time (sec): 90.99 - samples/sec: 412.53 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 00:39:33,475 epoch 8 - iter 360/1809 - loss 0.01255235 - time (sec): 184.52 - samples/sec: 411.41 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 00:41:07,633 epoch 8 - iter 540/1809 - loss 0.01182338 - time (sec): 278.68 - samples/sec: 408.87 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 00:42:39,556 epoch 8 - iter 720/1809 - loss 0.01274256 - time (sec): 370.61 - samples/sec: 412.85 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 00:44:10,716 epoch 8 - iter 900/1809 - loss 0.01253653 - time (sec): 461.77 - samples/sec: 414.98 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 00:45:38,696 epoch 8 - iter 1080/1809 - loss 0.01269966 - time (sec): 549.75 - samples/sec: 414.13 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-13 00:47:06,823 epoch 8 - iter 1260/1809 - loss 0.01283451 - time (sec): 637.87 - samples/sec: 415.05 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-13 00:48:38,750 epoch 8 - iter 1440/1809 - loss 0.01300464 - time (sec): 729.80 - samples/sec: 415.22 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 00:50:07,404 epoch 8 - iter 1620/1809 - loss 0.01320616 - time (sec): 818.45 - samples/sec: 416.97 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-13 00:51:35,459 epoch 8 - iter 1800/1809 - loss 0.01348805 - time (sec): 906.51 - samples/sec: 417.43 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-13 00:51:39,256 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 00:51:39,256 EPOCH 8 done: loss 0.0134 - lr: 0.000033 |
|
2023-10-13 00:52:16,475 DEV : loss 0.3305288553237915 - f1-score (micro avg) 0.6487 |
|
2023-10-13 00:52:16,533 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 00:53:46,221 epoch 9 - iter 180/1809 - loss 0.00593172 - time (sec): 89.69 - samples/sec: 414.24 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 00:55:18,160 epoch 9 - iter 360/1809 - loss 0.00911510 - time (sec): 181.63 - samples/sec: 411.53 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 00:56:48,486 epoch 9 - iter 540/1809 - loss 0.00925733 - time (sec): 271.95 - samples/sec: 412.89 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 00:58:22,172 epoch 9 - iter 720/1809 - loss 0.00972486 - time (sec): 365.64 - samples/sec: 416.61 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 00:59:59,270 epoch 9 - iter 900/1809 - loss 0.00982915 - time (sec): 462.74 - samples/sec: 410.07 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 01:01:31,167 epoch 9 - iter 1080/1809 - loss 0.00992239 - time (sec): 554.63 - samples/sec: 410.23 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 01:03:03,196 epoch 9 - iter 1260/1809 - loss 0.00987652 - time (sec): 646.66 - samples/sec: 408.94 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 01:04:36,184 epoch 9 - iter 1440/1809 - loss 0.00999639 - time (sec): 739.65 - samples/sec: 409.36 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 01:06:10,125 epoch 9 - iter 1620/1809 - loss 0.00981387 - time (sec): 833.59 - samples/sec: 409.53 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 01:07:42,547 epoch 9 - iter 1800/1809 - loss 0.00963508 - time (sec): 926.01 - samples/sec: 408.65 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 01:07:46,624 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:07:46,624 EPOCH 9 done: loss 0.0096 - lr: 0.000017 |
|
2023-10-13 01:08:26,098 DEV : loss 0.36490538716316223 - f1-score (micro avg) 0.6474 |
|
2023-10-13 01:08:26,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:09:58,926 epoch 10 - iter 180/1809 - loss 0.01093935 - time (sec): 92.77 - samples/sec: 407.46 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 01:11:31,538 epoch 10 - iter 360/1809 - loss 0.00895096 - time (sec): 185.38 - samples/sec: 404.83 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 01:13:04,293 epoch 10 - iter 540/1809 - loss 0.00986715 - time (sec): 278.13 - samples/sec: 405.74 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 01:14:35,898 epoch 10 - iter 720/1809 - loss 0.00934801 - time (sec): 369.74 - samples/sec: 405.71 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 01:16:09,995 epoch 10 - iter 900/1809 - loss 0.00920377 - time (sec): 463.84 - samples/sec: 405.65 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 01:17:41,994 epoch 10 - iter 1080/1809 - loss 0.00843499 - time (sec): 555.83 - samples/sec: 406.93 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 01:19:11,885 epoch 10 - iter 1260/1809 - loss 0.00837965 - time (sec): 645.72 - samples/sec: 409.24 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 01:20:42,654 epoch 10 - iter 1440/1809 - loss 0.00861846 - time (sec): 736.49 - samples/sec: 410.76 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 01:22:12,500 epoch 10 - iter 1620/1809 - loss 0.00843137 - time (sec): 826.34 - samples/sec: 412.02 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 01:23:41,817 epoch 10 - iter 1800/1809 - loss 0.00810910 - time (sec): 915.66 - samples/sec: 413.30 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 01:23:45,653 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:23:45,653 EPOCH 10 done: loss 0.0081 - lr: 0.000000 |
|
2023-10-13 01:24:25,128 DEV : loss 0.3651101589202881 - f1-score (micro avg) 0.6381 |
|
2023-10-13 01:24:26,124 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:24:26,126 Loading model from best epoch ... |
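The "saving best model" lines follow from the DEV micro F1 scores logged after each epoch. As a minimal sketch, assuming a checkpoint is written only when the dev score strictly exceeds the best score so far (the exact comparison Flair uses is not shown in this log), the selection can be replayed as follows. The helper name is hypothetical.

```python
# DEV f1-score (micro avg) per epoch, copied from this log.
DEV_F1 = [0.3673, 0.6206, 0.6255, 0.6169, 0.617,
          0.6398, 0.6502, 0.6487, 0.6474, 0.6381]


def epochs_that_save(scores):
    """Return the 1-based epochs whose score beats every earlier epoch."""
    saved, best = [], float("-inf")
    for epoch, score in enumerate(scores, start=1):
        if score > best:  # assumption: strict improvement triggers a save
            best = score
            saved.append(epoch)
    return saved


saved_epochs = epochs_that_save(DEV_F1)
best_epoch = saved_epochs[-1]
print(saved_epochs, best_epoch, DEV_F1[best_epoch - 1])
```

The replay yields saves after epochs 1, 2, 3, 6 and 7, matching the log, so `best-model.pt` holds the epoch 7 weights (dev F1 0.6502) even though the dev loss was lowest after epoch 2.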
|
2023-10-13 01:24:31,655 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org |
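The 13 tags are the outside tag `O` plus four positional prefixes (`S`, `B`, `E`, `I`) for each of the three entity types, which is why the final linear layer has 13 outputs. A small sketch that rebuilds the dictionary in the logged order; the function name is illustrative and this is not how Flair constructs it internally.

```python
ENTITY_TYPES = ["loc", "pers", "org"]
PREFIXES = ["S", "B", "E", "I"]  # single, begin, end, inside


def build_bioes_tags(entity_types):
    """Build a BIOES tag list: 'O' first, then S/B/E/I for each type."""
    tags = ["O"]
    for entity_type in entity_types:
        tags.extend(f"{prefix}-{entity_type}" for prefix in PREFIXES)
    return tags


tags = build_bioes_tags(ENTITY_TYPES)
print(len(tags), ", ".join(tags))
```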
|
2023-10-13 01:25:29,437 |
|
Results: |
|
- F-score (micro) 0.6478 |
|
- F-score (macro) 0.5104 |
|
- Accuracy 0.4908 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.6357    0.7766    0.6992       591 |
|
        pers     0.5565    0.7591    0.6422       357 |
|
         org     0.2241    0.1646    0.1898        79 |
|
|
|
   micro avg     0.5864    0.7235    0.6478      1027 |
|
   macro avg     0.4721    0.5668    0.5104      1027 |
|
weighted avg     0.5765    0.7235    0.6402      1027 |
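The three average rows can be re-derived from the per-class rows, which is a quick consistency check on the report. The sketch below uses only the rounded figures printed in this log, so it reproduces the averages to four decimals rather than exactly. The micro F1 is the harmonic mean of the micro precision and recall, the macro F1 is the unweighted mean of the class F1 scores, and the weighted F1 weights each class by its support.

```python
# (precision, recall, f1, support) per class, copied from the report.
BY_CLASS = {
    "loc": (0.6357, 0.7766, 0.6992, 591),
    "pers": (0.5565, 0.7591, 0.6422, 357),
    "org": (0.2241, 0.1646, 0.1898, 79),
}
MICRO_PRECISION, MICRO_RECALL = 0.5864, 0.7235


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in BY_CLASS.values())
micro_f1 = f1(MICRO_PRECISION, MICRO_RECALL)
macro_f1 = sum(row[2] for row in BY_CLASS.values()) / len(BY_CLASS)
weighted_f1 = sum(row[2] * row[3] for row in BY_CLASS.values()) / total_support

print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

The gap between the micro (0.6478) and macro (0.5104) scores comes almost entirely from the `org` class, which has only 79 test entities and an F1 of 0.1898.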
|
|
|
2023-10-13 01:25:29,437 ---------------------------------------------------------------------------------------------------- |
|
|