2023-10-15 04:09:08,570 ----------------------------------------------------------------------------------------------------
2023-10-15 04:09:08,571 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 04:09:08,571 ----------------------------------------------------------------------------------------------------
2023-10-15 04:09:08,572 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-15 04:09:08,572 ----------------------------------------------------------------------------------------------------
2023-10-15 04:09:08,572 Train: 3575 sentences
2023-10-15 04:09:08,572 (train_with_dev=False, train_with_test=False)
2023-10-15 04:09:08,572 ----------------------------------------------------------------------------------------------------
2023-10-15 04:09:08,572 Training Params:
2023-10-15 04:09:08,572 - learning_rate: "0.00015"
2023-10-15 04:09:08,572 - mini_batch_size: "4"
2023-10-15 04:09:08,572 - max_epochs: "10"
2023-10-15 04:09:08,572 - shuffle: "True"
2023-10-15 04:09:08,572 ----------------------------------------------------------------------------------------------------
2023-10-15 04:09:08,572 Plugins:
2023-10-15 04:09:08,572 - TensorboardLogger
2023-10-15 04:09:08,572 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 04:09:08,572 ----------------------------------------------------------------------------------------------------
2023-10-15 04:09:08,572 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 04:09:08,572 - metric: "('micro avg', 'f1-score')"
2023-10-15 04:09:08,572 ----------------------------------------------------------------------------------------------------
2023-10-15 04:09:08,572 Computation:
2023-10-15 04:09:08,572 - compute on device: cuda:0
2023-10-15 04:09:08,572 - embedding storage: none
2023-10-15 04:09:08,572 ----------------------------------------------------------------------------------------------------
2023-10-15 04:09:08,572 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5"
2023-10-15 04:09:08,572 ----------------------------------------------------------------------------------------------------
2023-10-15 04:09:08,572 ----------------------------------------------------------------------------------------------------
2023-10-15 04:09:08,573 Logging anything other than scalars to TensorBoard is currently not supported.
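The training parameters and the LinearScheduler plugin (warmup_fraction 0.1) listed above account for the `lr` column in the per-iteration lines: the rate rises linearly to the peak of 0.00015 during the first tenth of training and then decays linearly to zero. The sketch below illustrates that shape. The function name `linear_warmup_lr`, the total step count of 894 x 10, and the choice to evaluate the rate at step `iter - 1` are assumptions made for illustration, not the trainer's actual implementation, so the last printed digit may differ from the log for some iterations.

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction):
    """Linear warmup to peak_lr, then linear decay to zero (illustrative sketch)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale the peak rate by the fraction of warmup completed.
        return peak_lr * step / warmup_steps
    # Decay phase: scale by the fraction of post-warmup steps still remaining.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Values taken from the log header; TOTAL_STEPS is an assumed approximation.
ITERS_PER_EPOCH = 894
MAX_EPOCHS = 10
PEAK_LR = 0.00015
WARMUP_FRACTION = 0.1
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS


def lr_at(step):
    """Learning rate at a global step, using the assumed schedule above."""
    return linear_warmup_lr(step, TOTAL_STEPS, PEAK_LR, WARMUP_FRACTION)


if __name__ == "__main__":
    # Print the rate at the start of each logged interval in epochs 1 and 2.
    for epoch in (1, 2):
        for it in range(89, 891, 89):
            step = (epoch - 1) * ITERS_PER_EPOCH + it - 1
            print(f"epoch {epoch} - iter {it}/894 - lr: {lr_at(step):.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1, which is consistent with the logged rate peaking near 0.000149 at the end of epoch 1 and falling from epoch 2 onward.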
2023-10-15 04:09:25,277 epoch 1 - iter 89/894 - loss 3.01669693 - time (sec): 16.70 - samples/sec: 522.34 - lr: 0.000015 - momentum: 0.000000
2023-10-15 04:09:41,508 epoch 1 - iter 178/894 - loss 2.96493085 - time (sec): 32.93 - samples/sec: 509.70 - lr: 0.000030 - momentum: 0.000000
2023-10-15 04:09:58,222 epoch 1 - iter 267/894 - loss 2.79712665 - time (sec): 49.65 - samples/sec: 509.72 - lr: 0.000045 - momentum: 0.000000
2023-10-15 04:10:15,109 epoch 1 - iter 356/894 - loss 2.58130911 - time (sec): 66.54 - samples/sec: 518.52 - lr: 0.000060 - momentum: 0.000000
2023-10-15 04:10:32,072 epoch 1 - iter 445/894 - loss 2.34533400 - time (sec): 83.50 - samples/sec: 524.01 - lr: 0.000074 - momentum: 0.000000
2023-10-15 04:10:49,283 epoch 1 - iter 534/894 - loss 2.11215980 - time (sec): 100.71 - samples/sec: 522.58 - lr: 0.000089 - momentum: 0.000000
2023-10-15 04:11:05,165 epoch 1 - iter 623/894 - loss 1.92635674 - time (sec): 116.59 - samples/sec: 515.69 - lr: 0.000104 - momentum: 0.000000
2023-10-15 04:11:23,626 epoch 1 - iter 712/894 - loss 1.73388268 - time (sec): 135.05 - samples/sec: 513.95 - lr: 0.000119 - momentum: 0.000000
2023-10-15 04:11:39,671 epoch 1 - iter 801/894 - loss 1.61191467 - time (sec): 151.10 - samples/sec: 511.70 - lr: 0.000134 - momentum: 0.000000
2023-10-15 04:11:56,352 epoch 1 - iter 890/894 - loss 1.48758363 - time (sec): 167.78 - samples/sec: 513.37 - lr: 0.000149 - momentum: 0.000000
2023-10-15 04:11:57,084 ----------------------------------------------------------------------------------------------------
2023-10-15 04:11:57,084 EPOCH 1 done: loss 1.4828 - lr: 0.000149
2023-10-15 04:12:20,886 DEV : loss 0.3928602635860443 - f1-score (micro avg) 0.0
2023-10-15 04:12:20,914 ----------------------------------------------------------------------------------------------------
2023-10-15 04:12:37,848 epoch 2 - iter 89/894 - loss 0.40241694 - time (sec): 16.93 - samples/sec: 532.50 - lr: 0.000148 - momentum: 0.000000
2023-10-15 04:12:54,611 epoch 2 - iter 178/894 - loss 0.36809400 - time (sec): 33.70 - samples/sec: 517.95 - lr: 0.000147 - momentum: 0.000000
2023-10-15 04:13:10,826 epoch 2 - iter 267/894 - loss 0.37648462 - time (sec): 49.91 - samples/sec: 505.92 - lr: 0.000145 - momentum: 0.000000
2023-10-15 04:13:27,441 epoch 2 - iter 356/894 - loss 0.36497012 - time (sec): 66.53 - samples/sec: 511.54 - lr: 0.000143 - momentum: 0.000000
2023-10-15 04:13:43,772 epoch 2 - iter 445/894 - loss 0.34104738 - time (sec): 82.86 - samples/sec: 511.37 - lr: 0.000142 - momentum: 0.000000
2023-10-15 04:14:00,746 epoch 2 - iter 534/894 - loss 0.32763633 - time (sec): 99.83 - samples/sec: 511.90 - lr: 0.000140 - momentum: 0.000000
2023-10-15 04:14:19,135 epoch 2 - iter 623/894 - loss 0.32056524 - time (sec): 118.22 - samples/sec: 511.61 - lr: 0.000138 - momentum: 0.000000
2023-10-15 04:14:35,357 epoch 2 - iter 712/894 - loss 0.30819682 - time (sec): 134.44 - samples/sec: 509.93 - lr: 0.000137 - momentum: 0.000000
2023-10-15 04:14:51,947 epoch 2 - iter 801/894 - loss 0.29742880 - time (sec): 151.03 - samples/sec: 510.18 - lr: 0.000135 - momentum: 0.000000
2023-10-15 04:15:08,857 epoch 2 - iter 890/894 - loss 0.28678667 - time (sec): 167.94 - samples/sec: 512.85 - lr: 0.000133 - momentum: 0.000000
2023-10-15 04:15:09,584 ----------------------------------------------------------------------------------------------------
2023-10-15 04:15:09,585 EPOCH 2 done: loss 0.2860 - lr: 0.000133
2023-10-15 04:15:35,564 DEV : loss 0.19997233152389526 - f1-score (micro avg) 0.6136
2023-10-15 04:15:35,590 saving best model
2023-10-15 04:15:36,195 ----------------------------------------------------------------------------------------------------
2023-10-15 04:15:53,031 epoch 3 - iter 89/894 - loss 0.16944357 - time (sec): 16.83 - samples/sec: 518.92 - lr: 0.000132 - momentum: 0.000000
2023-10-15 04:16:09,384 epoch 3 - iter 178/894 - loss 0.15854015 - time (sec): 33.19 - samples/sec: 504.82 - lr: 0.000130 - momentum: 0.000000
2023-10-15 04:16:28,515 epoch 3 - iter 267/894 - loss 0.16442094 - time (sec): 52.32 - samples/sec: 522.63 - lr: 0.000128 - momentum: 0.000000
2023-10-15 04:16:45,676 epoch 3 - iter 356/894 - loss 0.15557875 - time (sec): 69.48 - samples/sec: 526.84 - lr: 0.000127 - momentum: 0.000000
2023-10-15 04:17:02,066 epoch 3 - iter 445/894 - loss 0.15267942 - time (sec): 85.87 - samples/sec: 522.35 - lr: 0.000125 - momentum: 0.000000
2023-10-15 04:17:17,826 epoch 3 - iter 534/894 - loss 0.14961009 - time (sec): 101.63 - samples/sec: 514.91 - lr: 0.000123 - momentum: 0.000000
2023-10-15 04:17:33,775 epoch 3 - iter 623/894 - loss 0.15096735 - time (sec): 117.58 - samples/sec: 511.73 - lr: 0.000122 - momentum: 0.000000
2023-10-15 04:17:50,817 epoch 3 - iter 712/894 - loss 0.14472220 - time (sec): 134.62 - samples/sec: 517.05 - lr: 0.000120 - momentum: 0.000000
2023-10-15 04:18:07,136 epoch 3 - iter 801/894 - loss 0.14338584 - time (sec): 150.94 - samples/sec: 514.75 - lr: 0.000118 - momentum: 0.000000
2023-10-15 04:18:23,674 epoch 3 - iter 890/894 - loss 0.13872605 - time (sec): 167.48 - samples/sec: 514.60 - lr: 0.000117 - momentum: 0.000000
2023-10-15 04:18:24,379 ----------------------------------------------------------------------------------------------------
2023-10-15 04:18:24,379 EPOCH 3 done: loss 0.1388 - lr: 0.000117
2023-10-15 04:18:50,277 DEV : loss 0.1811581701040268 - f1-score (micro avg) 0.7212
2023-10-15 04:18:50,303 saving best model
2023-10-15 04:18:53,369 ----------------------------------------------------------------------------------------------------
2023-10-15 04:19:10,048 epoch 4 - iter 89/894 - loss 0.07264337 - time (sec): 16.68 - samples/sec: 496.21 - lr: 0.000115 - momentum: 0.000000
2023-10-15 04:19:26,851 epoch 4 - iter 178/894 - loss 0.08074117 - time (sec): 33.48 - samples/sec: 513.75 - lr: 0.000113 - momentum: 0.000000
2023-10-15 04:19:45,305 epoch 4 - iter 267/894 - loss 0.08175014 - time (sec): 51.93 - samples/sec: 516.80 - lr: 0.000112 - momentum: 0.000000
2023-10-15 04:20:02,733 epoch 4 - iter 356/894 - loss 0.07864725 - time (sec): 69.36 - samples/sec: 522.87 - lr: 0.000110 - momentum: 0.000000
2023-10-15 04:20:19,324 epoch 4 - iter 445/894 - loss 0.08009647 - time (sec): 85.95 - samples/sec: 519.46 - lr: 0.000108 - momentum: 0.000000
2023-10-15 04:20:36,313 epoch 4 - iter 534/894 - loss 0.07890887 - time (sec): 102.94 - samples/sec: 519.76 - lr: 0.000107 - momentum: 0.000000
2023-10-15 04:20:52,676 epoch 4 - iter 623/894 - loss 0.07867496 - time (sec): 119.30 - samples/sec: 517.47 - lr: 0.000105 - momentum: 0.000000
2023-10-15 04:21:09,372 epoch 4 - iter 712/894 - loss 0.07953826 - time (sec): 136.00 - samples/sec: 513.81 - lr: 0.000103 - momentum: 0.000000
2023-10-15 04:21:25,511 epoch 4 - iter 801/894 - loss 0.08107232 - time (sec): 152.14 - samples/sec: 510.97 - lr: 0.000102 - momentum: 0.000000
2023-10-15 04:21:42,113 epoch 4 - iter 890/894 - loss 0.08004768 - time (sec): 168.74 - samples/sec: 511.37 - lr: 0.000100 - momentum: 0.000000
2023-10-15 04:21:42,769 ----------------------------------------------------------------------------------------------------
2023-10-15 04:21:42,769 EPOCH 4 done: loss 0.0798 - lr: 0.000100
2023-10-15 04:22:08,721 DEV : loss 0.1746881902217865 - f1-score (micro avg) 0.7458
2023-10-15 04:22:08,747 saving best model
2023-10-15 04:22:12,600 ----------------------------------------------------------------------------------------------------
2023-10-15 04:22:29,474 epoch 5 - iter 89/894 - loss 0.05705340 - time (sec): 16.87 - samples/sec: 527.26 - lr: 0.000098 - momentum: 0.000000
2023-10-15 04:22:46,746 epoch 5 - iter 178/894 - loss 0.05878435 - time (sec): 34.14 - samples/sec: 532.89 - lr: 0.000097 - momentum: 0.000000
2023-10-15 04:23:03,102 epoch 5 - iter 267/894 - loss 0.05451082 - time (sec): 50.50 - samples/sec: 526.04 - lr: 0.000095 - momentum: 0.000000
2023-10-15 04:23:19,958 epoch 5 - iter 356/894 - loss 0.05267294 - time (sec): 67.36 - samples/sec: 526.25 - lr: 0.000093 - momentum: 0.000000
2023-10-15 04:23:38,442 epoch 5 - iter 445/894 - loss 0.05334504 - time (sec): 85.84 - samples/sec: 524.68 - lr: 0.000092 - momentum: 0.000000
2023-10-15 04:23:55,070 epoch 5 - iter 534/894 - loss 0.05556145 - time (sec): 102.47 - samples/sec: 523.18 - lr: 0.000090 - momentum: 0.000000
2023-10-15 04:24:11,507 epoch 5 - iter 623/894 - loss 0.05529169 - time (sec): 118.90 - samples/sec: 520.68 - lr: 0.000088 - momentum: 0.000000
2023-10-15 04:24:27,838 epoch 5 - iter 712/894 - loss 0.05506619 - time (sec): 135.24 - samples/sec: 516.46 - lr: 0.000087 - momentum: 0.000000
2023-10-15 04:24:44,106 epoch 5 - iter 801/894 - loss 0.05292623 - time (sec): 151.50 - samples/sec: 513.68 - lr: 0.000085 - momentum: 0.000000
2023-10-15 04:25:00,787 epoch 5 - iter 890/894 - loss 0.05178979 - time (sec): 168.19 - samples/sec: 512.87 - lr: 0.000083 - momentum: 0.000000
2023-10-15 04:25:01,462 ----------------------------------------------------------------------------------------------------
2023-10-15 04:25:01,462 EPOCH 5 done: loss 0.0516 - lr: 0.000083
2023-10-15 04:25:27,317 DEV : loss 0.1961323320865631 - f1-score (micro avg) 0.7533
2023-10-15 04:25:27,343 saving best model
2023-10-15 04:25:30,115 ----------------------------------------------------------------------------------------------------
2023-10-15 04:25:46,977 epoch 6 - iter 89/894 - loss 0.02773905 - time (sec): 16.86 - samples/sec: 508.69 - lr: 0.000082 - momentum: 0.000000
2023-10-15 04:26:05,774 epoch 6 - iter 178/894 - loss 0.03176464 - time (sec): 35.66 - samples/sec: 503.28 - lr: 0.000080 - momentum: 0.000000
2023-10-15 04:26:22,897 epoch 6 - iter 267/894 - loss 0.03165230 - time (sec): 52.78 - samples/sec: 507.70 - lr: 0.000078 - momentum: 0.000000
2023-10-15 04:26:40,037 epoch 6 - iter 356/894 - loss 0.02837219 - time (sec): 69.92 - samples/sec: 512.12 - lr: 0.000077 - momentum: 0.000000
2023-10-15 04:26:56,297 epoch 6 - iter 445/894 - loss 0.02927562 - time (sec): 86.18 - samples/sec: 509.14 - lr: 0.000075 - momentum: 0.000000
2023-10-15 04:27:13,036 epoch 6 - iter 534/894 - loss 0.02852822 - time (sec): 102.92 - samples/sec: 511.49 - lr: 0.000073 - momentum: 0.000000
2023-10-15 04:27:30,324 epoch 6 - iter 623/894 - loss 0.03104351 - time (sec): 120.21 - samples/sec: 512.56 - lr: 0.000072 - momentum: 0.000000
2023-10-15 04:27:46,861 epoch 6 - iter 712/894 - loss 0.03080534 - time (sec): 136.75 - samples/sec: 510.43 - lr: 0.000070 - momentum: 0.000000
2023-10-15 04:28:03,322 epoch 6 - iter 801/894 - loss 0.03026852 - time (sec): 153.21 - samples/sec: 507.43 - lr: 0.000068 - momentum: 0.000000
2023-10-15 04:28:20,062 epoch 6 - iter 890/894 - loss 0.02999441 - time (sec): 169.95 - samples/sec: 507.21 - lr: 0.000067 - momentum: 0.000000
2023-10-15 04:28:20,777 ----------------------------------------------------------------------------------------------------
2023-10-15 04:28:20,777 EPOCH 6 done: loss 0.0301 - lr: 0.000067
2023-10-15 04:28:46,864 DEV : loss 0.21011091768741608 - f1-score (micro avg) 0.7392
2023-10-15 04:28:46,891 ----------------------------------------------------------------------------------------------------
2023-10-15 04:29:03,183 epoch 7 - iter 89/894 - loss 0.01509648 - time (sec): 16.29 - samples/sec: 497.89 - lr: 0.000065 - momentum: 0.000000
2023-10-15 04:29:20,466 epoch 7 - iter 178/894 - loss 0.01768168 - time (sec): 33.57 - samples/sec: 521.88 - lr: 0.000063 - momentum: 0.000000
2023-10-15 04:29:36,944 epoch 7 - iter 267/894 - loss 0.01723374 - time (sec): 50.05 - samples/sec: 515.64 - lr: 0.000062 - momentum: 0.000000
2023-10-15 04:29:53,834 epoch 7 - iter 356/894 - loss 0.01926101 - time (sec): 66.94 - samples/sec: 519.06 - lr: 0.000060 - momentum: 0.000000
2023-10-15 04:30:12,591 epoch 7 - iter 445/894 - loss 0.02063806 - time (sec): 85.70 - samples/sec: 519.28 - lr: 0.000058 - momentum: 0.000000
2023-10-15 04:30:29,223 epoch 7 - iter 534/894 - loss 0.02231732 - time (sec): 102.33 - samples/sec: 516.46 - lr: 0.000057 - momentum: 0.000000
2023-10-15 04:30:45,697 epoch 7 - iter 623/894 - loss 0.02186638 - time (sec): 118.80 - samples/sec: 515.38 - lr: 0.000055 - momentum: 0.000000
2023-10-15 04:31:02,125 epoch 7 - iter 712/894 - loss 0.02151176 - time (sec): 135.23 - samples/sec: 513.26 - lr: 0.000053 - momentum: 0.000000
2023-10-15 04:31:18,350 epoch 7 - iter 801/894 - loss 0.02251965 - time (sec): 151.46 - samples/sec: 510.54 - lr: 0.000052 - momentum: 0.000000
2023-10-15 04:31:35,670 epoch 7 - iter 890/894 - loss 0.02216851 - time (sec): 168.78 - samples/sec: 511.33 - lr: 0.000050 - momentum: 0.000000
2023-10-15 04:31:36,315 ----------------------------------------------------------------------------------------------------
2023-10-15 04:31:36,315 EPOCH 7 done: loss 0.0221 - lr: 0.000050
2023-10-15 04:32:02,469 DEV : loss 0.21213364601135254 - f1-score (micro avg) 0.7496
2023-10-15 04:32:02,496 ----------------------------------------------------------------------------------------------------
2023-10-15 04:32:19,708 epoch 8 - iter 89/894 - loss 0.01169276 - time (sec): 17.21 - samples/sec: 491.26 - lr: 0.000048 - momentum: 0.000000
2023-10-15 04:32:38,228 epoch 8 - iter 178/894 - loss 0.02044703 - time (sec): 35.73 - samples/sec: 515.58 - lr: 0.000047 - momentum: 0.000000
2023-10-15 04:32:54,827 epoch 8 - iter 267/894 - loss 0.01832686 - time (sec): 52.33 - samples/sec: 519.69 - lr: 0.000045 - momentum: 0.000000
2023-10-15 04:33:11,310 epoch 8 - iter 356/894 - loss 0.01846364 - time (sec): 68.81 - samples/sec: 517.95 - lr: 0.000043 - momentum: 0.000000
2023-10-15 04:33:27,641 epoch 8 - iter 445/894 - loss 0.01842268 - time (sec): 85.14 - samples/sec: 516.53 - lr: 0.000042 - momentum: 0.000000
2023-10-15 04:33:44,139 epoch 8 - iter 534/894 - loss 0.01721769 - time (sec): 101.64 - samples/sec: 513.08 - lr: 0.000040 - momentum: 0.000000
2023-10-15 04:34:01,049 epoch 8 - iter 623/894 - loss 0.01554985 - time (sec): 118.55 - samples/sec: 514.36 - lr: 0.000038 - momentum: 0.000000
2023-10-15 04:34:17,851 epoch 8 - iter 712/894 - loss 0.01536733 - time (sec): 135.35 - samples/sec: 516.76 - lr: 0.000037 - momentum: 0.000000
2023-10-15 04:34:34,020 epoch 8 - iter 801/894 - loss 0.01479547 - time (sec): 151.52 - samples/sec: 512.89 - lr: 0.000035 - momentum: 0.000000
2023-10-15 04:34:50,629 epoch 8 - iter 890/894 - loss 0.01443364 - time (sec): 168.13 - samples/sec: 513.01 - lr: 0.000033 - momentum: 0.000000
2023-10-15 04:34:51,277 ----------------------------------------------------------------------------------------------------
2023-10-15 04:34:51,277 EPOCH 8 done: loss 0.0144 - lr: 0.000033
2023-10-15 04:35:17,695 DEV : loss 0.23843105137348175 - f1-score (micro avg) 0.7599
2023-10-15 04:35:17,725 saving best model
2023-10-15 04:35:20,871 ----------------------------------------------------------------------------------------------------
2023-10-15 04:35:37,980 epoch 9 - iter 89/894 - loss 0.01583546 - time (sec): 17.11 - samples/sec: 526.25 - lr: 0.000032 - momentum: 0.000000
2023-10-15 04:35:54,638 epoch 9 - iter 178/894 - loss 0.01093487 - time (sec): 33.77 - samples/sec: 526.20 - lr: 0.000030 - momentum: 0.000000
2023-10-15 04:36:11,458 epoch 9 - iter 267/894 - loss 0.00930857 - time (sec): 50.59 - samples/sec: 519.11 - lr: 0.000028 - momentum: 0.000000
2023-10-15 04:36:27,958 epoch 9 - iter 356/894 - loss 0.00861726 - time (sec): 67.09 - samples/sec: 515.16 - lr: 0.000027 - momentum: 0.000000
2023-10-15 04:36:44,252 epoch 9 - iter 445/894 - loss 0.00776441 - time (sec): 83.38 - samples/sec: 511.51 - lr: 0.000025 - momentum: 0.000000
2023-10-15 04:37:01,051 epoch 9 - iter 534/894 - loss 0.00737818 - time (sec): 100.18 - samples/sec: 511.52 - lr: 0.000023 - momentum: 0.000000
2023-10-15 04:37:18,832 epoch 9 - iter 623/894 - loss 0.00817716 - time (sec): 117.96 - samples/sec: 514.90 - lr: 0.000022 - momentum: 0.000000
2023-10-15 04:37:35,299 epoch 9 - iter 712/894 - loss 0.00840400 - time (sec): 134.43 - samples/sec: 511.00 - lr: 0.000020 - momentum: 0.000000
2023-10-15 04:37:53,566 epoch 9 - iter 801/894 - loss 0.01000822 - time (sec): 152.69 - samples/sec: 509.87 - lr: 0.000019 - momentum: 0.000000
2023-10-15 04:38:10,401 epoch 9 - iter 890/894 - loss 0.01001162 - time (sec): 169.53 - samples/sec: 508.91 - lr: 0.000017 - momentum: 0.000000
2023-10-15 04:38:11,057 ----------------------------------------------------------------------------------------------------
2023-10-15 04:38:11,057 EPOCH 9 done: loss 0.0100 - lr: 0.000017
2023-10-15 04:38:36,962 DEV : loss 0.24341297149658203 - f1-score (micro avg) 0.7577
2023-10-15 04:38:36,988 ----------------------------------------------------------------------------------------------------
2023-10-15 04:38:53,623 epoch 10 - iter 89/894 - loss 0.00631867 - time (sec): 16.63 - samples/sec: 496.94 - lr: 0.000015 - momentum: 0.000000
2023-10-15 04:39:10,822 epoch 10 - iter 178/894 - loss 0.00579084 - time (sec): 33.83 - samples/sec: 517.16 - lr: 0.000013 - momentum: 0.000000
2023-10-15 04:39:27,560 epoch 10 - iter 267/894 - loss 0.00739788 - time (sec): 50.57 - samples/sec: 521.17 - lr: 0.000012 - momentum: 0.000000
2023-10-15 04:39:46,005 epoch 10 - iter 356/894 - loss 0.00899480 - time (sec): 69.02 - samples/sec: 519.49 - lr: 0.000010 - momentum: 0.000000
2023-10-15 04:40:02,766 epoch 10 - iter 445/894 - loss 0.00779425 - time (sec): 85.78 - samples/sec: 512.73 - lr: 0.000008 - momentum: 0.000000
2023-10-15 04:40:18,916 epoch 10 - iter 534/894 - loss 0.00861217 - time (sec): 101.93 - samples/sec: 508.64 - lr: 0.000007 - momentum: 0.000000
2023-10-15 04:40:35,444 epoch 10 - iter 623/894 - loss 0.00785899 - time (sec): 118.45 - samples/sec: 512.67 - lr: 0.000005 - momentum: 0.000000
2023-10-15 04:40:51,915 epoch 10 - iter 712/894 - loss 0.00783975 - time (sec): 134.93 - samples/sec: 511.39 - lr: 0.000004 - momentum: 0.000000
2023-10-15 04:41:08,151 epoch 10 - iter 801/894 - loss 0.00785490 - time (sec): 151.16 - samples/sec: 507.30 - lr: 0.000002 - momentum: 0.000000
2023-10-15 04:41:25,522 epoch 10 - iter 890/894 - loss 0.00775187 - time (sec): 168.53 - samples/sec: 511.62 - lr: 0.000000 - momentum: 0.000000
2023-10-15 04:41:26,204 ----------------------------------------------------------------------------------------------------
2023-10-15 04:41:26,204 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-15 04:41:52,334 DEV : loss 0.24051038920879364 - f1-score (micro avg) 0.7591
2023-10-15 04:41:52,948 ----------------------------------------------------------------------------------------------------
2023-10-15 04:41:52,949 Loading model from best epoch ...
2023-10-15 04:42:00,352 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-15 04:42:22,999
Results:
- F-score (micro) 0.7589
- F-score (macro) 0.6502
- Accuracy 0.629

By class:
              precision    recall  f1-score   support

         loc     0.8484    0.8641    0.8562       596
        pers     0.6733    0.8168    0.7381       333
         org     0.5600    0.5303    0.5447       132
        prod     0.5439    0.4697    0.5041        66
        time     0.5849    0.6327    0.6078        49

   micro avg     0.7376    0.7815    0.7589      1176
   macro avg     0.6421    0.6627    0.6502      1176
weighted avg     0.7384    0.7815    0.7577      1176

2023-10-15 04:42:22,999 ----------------------------------------------------------------------------------------------------
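The three average rows of the by-class table above can be cross-checked from the five per-class rows. The sketch below recomputes them in plain Python. The per-class numbers are copied from the table. The integer true-positive and predicted-span counts are not in the log: they are an assumption, recovered here by rounding `recall * support` and `true_positives / precision`. The Accuracy figure is not reproduced, since its definition is not stated in the log.

```python
# Per-class rows copied from the "By class" table:
# (precision, recall, f1-score, support)
rows = {
    "loc":  (0.8484, 0.8641, 0.8562, 596),
    "pers": (0.6733, 0.8168, 0.7381, 333),
    "org":  (0.5600, 0.5303, 0.5447, 132),
    "prod": (0.5439, 0.4697, 0.5041, 66),
    "time": (0.5849, 0.6327, 0.6078, 49),
}

total_support = sum(support for _, _, _, support in rows.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

# Weighted average: per-class F1 weighted by the number of gold spans.
weighted_f1 = sum(f1 * support for _, _, f1, support in rows.values()) / total_support

# Assumption: recover integer counts from the rounded ratios in the table.
true_pos = {name: round(rec * sup) for name, (_, rec, _, sup) in rows.items()}
predicted = {name: round(true_pos[name] / rows[name][0]) for name in rows}

# Micro average: pool the counts over all classes before taking ratios.
micro_precision = sum(true_pos.values()) / sum(predicted.values())
micro_recall = sum(true_pos.values()) / total_support
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

if __name__ == "__main__":
    print(f"micro avg    P {micro_precision:.4f}  R {micro_recall:.4f}  F1 {micro_f1:.4f}")
    print(f"macro avg    F1 {macro_f1:.4f}")
    print(f"weighted avg F1 {weighted_f1:.4f}")
```

With the recovered counts, the micro, macro, and weighted F1 values round to 0.7589, 0.6502, and 0.7577, which agree with the averages reported in the table. The gap between micro and macro F1 reflects the class imbalance: `loc` and `pers` account for 929 of the 1176 gold spans and score well above `org`, `prod`, and `time`.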
|