|
2023-10-14 20:04:40,573 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:40,575 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-14 20:04:40,575 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:40,575 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-14 20:04:40,575 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:40,575 Train: 3575 sentences |
|
2023-10-14 20:04:40,575 (train_with_dev=False, train_with_test=False) |
|
2023-10-14 20:04:40,575 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:40,575 Training Params: |
|
2023-10-14 20:04:40,575 - learning_rate: "0.00016" |
|
2023-10-14 20:04:40,575 - mini_batch_size: "4" |
|
2023-10-14 20:04:40,575 - max_epochs: "10" |
|
2023-10-14 20:04:40,575 - shuffle: "True" |
|
2023-10-14 20:04:40,575 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:40,575 Plugins: |
|
2023-10-14 20:04:40,575 - TensorboardLogger |
|
2023-10-14 20:04:40,575 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-14 20:04:40,575 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:40,575 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-14 20:04:40,575 - metric: "('micro avg', 'f1-score')" |
|
2023-10-14 20:04:40,575 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:40,575 Computation: |
|
2023-10-14 20:04:40,575 - compute on device: cuda:0 |
|
2023-10-14 20:04:40,575 - embedding storage: none |
|
2023-10-14 20:04:40,575 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:40,576 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-14 20:04:40,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:40,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:40,576 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-14 20:04:57,508 epoch 1 - iter 89/894 - loss 3.04433325 - time (sec): 16.93 - samples/sec: 546.74 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 20:05:14,096 epoch 1 - iter 178/894 - loss 3.00697322 - time (sec): 33.52 - samples/sec: 521.10 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-14 20:05:30,548 epoch 1 - iter 267/894 - loss 2.85458665 - time (sec): 49.97 - samples/sec: 514.10 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-14 20:05:47,119 epoch 1 - iter 356/894 - loss 2.63070465 - time (sec): 66.54 - samples/sec: 516.75 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-14 20:06:02,771 epoch 1 - iter 445/894 - loss 2.41357097 - time (sec): 82.19 - samples/sec: 508.03 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-14 20:06:19,218 epoch 1 - iter 534/894 - loss 2.15809760 - time (sec): 98.64 - samples/sec: 507.30 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-14 20:06:36,432 epoch 1 - iter 623/894 - loss 1.90629951 - time (sec): 115.86 - samples/sec: 512.68 - lr: 0.000111 - momentum: 0.000000 |
|
2023-10-14 20:06:52,947 epoch 1 - iter 712/894 - loss 1.73655924 - time (sec): 132.37 - samples/sec: 513.88 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-14 20:07:11,799 epoch 1 - iter 801/894 - loss 1.57462985 - time (sec): 151.22 - samples/sec: 516.25 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-14 20:07:28,037 epoch 1 - iter 890/894 - loss 1.46293211 - time (sec): 167.46 - samples/sec: 514.09 - lr: 0.000159 - momentum: 0.000000 |
|
2023-10-14 20:07:28,794 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:07:28,794 EPOCH 1 done: loss 1.4575 - lr: 0.000159 |
|
2023-10-14 20:07:51,443 DEV : loss 0.339751273393631 - f1-score (micro avg) 0.0234 |
|
2023-10-14 20:07:51,469 saving best model |
|
2023-10-14 20:07:52,081 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:08:08,569 epoch 2 - iter 89/894 - loss 0.36784261 - time (sec): 16.49 - samples/sec: 517.02 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-14 20:08:25,152 epoch 2 - iter 178/894 - loss 0.35533195 - time (sec): 33.07 - samples/sec: 517.57 - lr: 0.000156 - momentum: 0.000000 |
|
2023-10-14 20:08:42,147 epoch 2 - iter 267/894 - loss 0.33472166 - time (sec): 50.06 - samples/sec: 529.77 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-14 20:09:00,581 epoch 2 - iter 356/894 - loss 0.32234839 - time (sec): 68.50 - samples/sec: 525.59 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-14 20:09:17,317 epoch 2 - iter 445/894 - loss 0.30927331 - time (sec): 85.23 - samples/sec: 523.45 - lr: 0.000151 - momentum: 0.000000 |
|
2023-10-14 20:09:34,281 epoch 2 - iter 534/894 - loss 0.29646500 - time (sec): 102.20 - samples/sec: 523.10 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-14 20:09:50,684 epoch 2 - iter 623/894 - loss 0.29525988 - time (sec): 118.60 - samples/sec: 518.76 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-14 20:10:07,451 epoch 2 - iter 712/894 - loss 0.29082523 - time (sec): 135.37 - samples/sec: 517.81 - lr: 0.000146 - momentum: 0.000000 |
|
2023-10-14 20:10:24,559 epoch 2 - iter 801/894 - loss 0.28139580 - time (sec): 152.48 - samples/sec: 516.17 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-14 20:10:40,763 epoch 2 - iter 890/894 - loss 0.27716837 - time (sec): 168.68 - samples/sec: 511.28 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-14 20:10:41,447 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:10:41,448 EPOCH 2 done: loss 0.2772 - lr: 0.000142 |
|
2023-10-14 20:11:06,565 DEV : loss 0.19352570176124573 - f1-score (micro avg) 0.6235 |
|
2023-10-14 20:11:06,591 saving best model |
|
2023-10-14 20:11:11,217 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:11:27,822 epoch 3 - iter 89/894 - loss 0.20345638 - time (sec): 16.60 - samples/sec: 498.08 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-14 20:11:44,065 epoch 3 - iter 178/894 - loss 0.18152700 - time (sec): 32.85 - samples/sec: 500.96 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-14 20:12:01,084 epoch 3 - iter 267/894 - loss 0.17490578 - time (sec): 49.87 - samples/sec: 504.95 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-14 20:12:17,425 epoch 3 - iter 356/894 - loss 0.17202406 - time (sec): 66.21 - samples/sec: 508.41 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-14 20:12:36,005 epoch 3 - iter 445/894 - loss 0.16603411 - time (sec): 84.79 - samples/sec: 517.44 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-14 20:12:52,303 epoch 3 - iter 534/894 - loss 0.16465909 - time (sec): 101.08 - samples/sec: 516.75 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-14 20:13:08,435 epoch 3 - iter 623/894 - loss 0.15551952 - time (sec): 117.22 - samples/sec: 513.59 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-14 20:13:24,463 epoch 3 - iter 712/894 - loss 0.14985602 - time (sec): 133.24 - samples/sec: 512.13 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-14 20:13:41,373 epoch 3 - iter 801/894 - loss 0.14468106 - time (sec): 150.15 - samples/sec: 515.69 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-14 20:13:57,801 epoch 3 - iter 890/894 - loss 0.14031154 - time (sec): 166.58 - samples/sec: 516.59 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-14 20:13:58,586 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:13:58,587 EPOCH 3 done: loss 0.1400 - lr: 0.000125 |
|
2023-10-14 20:14:23,740 DEV : loss 0.16954360902309418 - f1-score (micro avg) 0.6643 |
|
2023-10-14 20:14:23,767 saving best model |
|
2023-10-14 20:14:27,052 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:14:43,421 epoch 4 - iter 89/894 - loss 0.08846701 - time (sec): 16.37 - samples/sec: 515.92 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-14 20:14:59,309 epoch 4 - iter 178/894 - loss 0.09164804 - time (sec): 32.26 - samples/sec: 501.01 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-14 20:15:15,593 epoch 4 - iter 267/894 - loss 0.08809052 - time (sec): 48.54 - samples/sec: 501.81 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-14 20:15:32,110 epoch 4 - iter 356/894 - loss 0.09020029 - time (sec): 65.06 - samples/sec: 501.19 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-14 20:15:48,299 epoch 4 - iter 445/894 - loss 0.08499223 - time (sec): 81.24 - samples/sec: 500.34 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-14 20:16:05,170 epoch 4 - iter 534/894 - loss 0.08110597 - time (sec): 98.12 - samples/sec: 506.20 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-14 20:16:21,605 epoch 4 - iter 623/894 - loss 0.07754561 - time (sec): 114.55 - samples/sec: 505.87 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-14 20:16:38,148 epoch 4 - iter 712/894 - loss 0.07756119 - time (sec): 131.09 - samples/sec: 504.72 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-14 20:16:57,069 epoch 4 - iter 801/894 - loss 0.07686532 - time (sec): 150.01 - samples/sec: 508.09 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-14 20:17:15,151 epoch 4 - iter 890/894 - loss 0.07342823 - time (sec): 168.10 - samples/sec: 512.26 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-14 20:17:15,919 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:17:15,920 EPOCH 4 done: loss 0.0732 - lr: 0.000107 |
|
2023-10-14 20:17:41,184 DEV : loss 0.17608195543289185 - f1-score (micro avg) 0.7396 |
|
2023-10-14 20:17:41,211 saving best model |
|
2023-10-14 20:17:41,886 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:17:58,085 epoch 5 - iter 89/894 - loss 0.04773510 - time (sec): 16.20 - samples/sec: 480.18 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-14 20:18:14,911 epoch 5 - iter 178/894 - loss 0.04110256 - time (sec): 33.02 - samples/sec: 489.92 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-14 20:18:32,101 epoch 5 - iter 267/894 - loss 0.04134861 - time (sec): 50.21 - samples/sec: 500.72 - lr: 0.000101 - momentum: 0.000000 |
|
2023-10-14 20:18:48,790 epoch 5 - iter 356/894 - loss 0.04668182 - time (sec): 66.90 - samples/sec: 503.49 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-14 20:19:05,227 epoch 5 - iter 445/894 - loss 0.04405493 - time (sec): 83.34 - samples/sec: 503.64 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-14 20:19:23,909 epoch 5 - iter 534/894 - loss 0.04580574 - time (sec): 102.02 - samples/sec: 505.88 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-14 20:19:40,166 epoch 5 - iter 623/894 - loss 0.04527345 - time (sec): 118.28 - samples/sec: 505.78 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-14 20:19:56,921 epoch 5 - iter 712/894 - loss 0.04590300 - time (sec): 135.03 - samples/sec: 509.19 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-14 20:20:13,926 epoch 5 - iter 801/894 - loss 0.04752093 - time (sec): 152.04 - samples/sec: 509.41 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-14 20:20:30,503 epoch 5 - iter 890/894 - loss 0.04824072 - time (sec): 168.62 - samples/sec: 510.41 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-14 20:20:31,281 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:20:31,281 EPOCH 5 done: loss 0.0486 - lr: 0.000089 |
|
2023-10-14 20:20:56,057 DEV : loss 0.20735777914524078 - f1-score (micro avg) 0.7229 |
|
2023-10-14 20:20:56,085 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:21:12,577 epoch 6 - iter 89/894 - loss 0.01695411 - time (sec): 16.49 - samples/sec: 525.25 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-14 20:21:28,712 epoch 6 - iter 178/894 - loss 0.02446819 - time (sec): 32.63 - samples/sec: 520.04 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-14 20:21:45,112 epoch 6 - iter 267/894 - loss 0.02659230 - time (sec): 49.03 - samples/sec: 518.96 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-14 20:22:01,419 epoch 6 - iter 356/894 - loss 0.02527246 - time (sec): 65.33 - samples/sec: 520.98 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-14 20:22:17,554 epoch 6 - iter 445/894 - loss 0.02492951 - time (sec): 81.47 - samples/sec: 517.28 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-14 20:22:35,796 epoch 6 - iter 534/894 - loss 0.02839357 - time (sec): 99.71 - samples/sec: 519.78 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-14 20:22:52,541 epoch 6 - iter 623/894 - loss 0.02825206 - time (sec): 116.45 - samples/sec: 523.70 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-14 20:23:09,200 epoch 6 - iter 712/894 - loss 0.02825234 - time (sec): 133.11 - samples/sec: 521.37 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-14 20:23:25,181 epoch 6 - iter 801/894 - loss 0.03010817 - time (sec): 149.10 - samples/sec: 518.59 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-14 20:23:41,905 epoch 6 - iter 890/894 - loss 0.03021347 - time (sec): 165.82 - samples/sec: 519.76 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-14 20:23:42,597 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:23:42,597 EPOCH 6 done: loss 0.0302 - lr: 0.000071 |
|
2023-10-14 20:24:07,543 DEV : loss 0.2202872484922409 - f1-score (micro avg) 0.7455 |
|
2023-10-14 20:24:07,569 saving best model |
|
2023-10-14 20:24:11,367 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:24:29,869 epoch 7 - iter 89/894 - loss 0.02780046 - time (sec): 18.50 - samples/sec: 527.16 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-14 20:24:46,557 epoch 7 - iter 178/894 - loss 0.02662654 - time (sec): 35.19 - samples/sec: 523.71 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-14 20:25:03,009 epoch 7 - iter 267/894 - loss 0.03213013 - time (sec): 51.64 - samples/sec: 510.98 - lr: 0.000066 - momentum: 0.000000 |
|
2023-10-14 20:25:19,405 epoch 7 - iter 356/894 - loss 0.02646254 - time (sec): 68.03 - samples/sec: 511.03 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-14 20:25:36,170 epoch 7 - iter 445/894 - loss 0.02557549 - time (sec): 84.80 - samples/sec: 512.87 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-14 20:25:53,249 epoch 7 - iter 534/894 - loss 0.02364430 - time (sec): 101.88 - samples/sec: 514.82 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-14 20:26:09,746 epoch 7 - iter 623/894 - loss 0.02404203 - time (sec): 118.38 - samples/sec: 513.81 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-14 20:26:26,140 epoch 7 - iter 712/894 - loss 0.02297158 - time (sec): 134.77 - samples/sec: 514.00 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-14 20:26:42,777 epoch 7 - iter 801/894 - loss 0.02189655 - time (sec): 151.41 - samples/sec: 514.26 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-14 20:26:59,291 epoch 7 - iter 890/894 - loss 0.02080018 - time (sec): 167.92 - samples/sec: 513.25 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-14 20:27:00,016 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:27:00,016 EPOCH 7 done: loss 0.0208 - lr: 0.000053 |
|
2023-10-14 20:27:25,243 DEV : loss 0.24006003141403198 - f1-score (micro avg) 0.7596 |
|
2023-10-14 20:27:25,270 saving best model |
|
2023-10-14 20:27:29,580 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:27:45,981 epoch 8 - iter 89/894 - loss 0.01819202 - time (sec): 16.40 - samples/sec: 505.35 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-14 20:28:02,806 epoch 8 - iter 178/894 - loss 0.02002471 - time (sec): 33.22 - samples/sec: 508.31 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-14 20:28:18,996 epoch 8 - iter 267/894 - loss 0.01725648 - time (sec): 49.41 - samples/sec: 504.36 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-14 20:28:36,157 epoch 8 - iter 356/894 - loss 0.01616480 - time (sec): 66.57 - samples/sec: 518.25 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-14 20:28:53,072 epoch 8 - iter 445/894 - loss 0.01677385 - time (sec): 83.49 - samples/sec: 523.17 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-14 20:29:09,355 epoch 8 - iter 534/894 - loss 0.01546777 - time (sec): 99.77 - samples/sec: 518.34 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-14 20:29:27,470 epoch 8 - iter 623/894 - loss 0.01582854 - time (sec): 117.89 - samples/sec: 514.01 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-14 20:29:44,191 epoch 8 - iter 712/894 - loss 0.01615447 - time (sec): 134.61 - samples/sec: 513.46 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-14 20:30:00,775 epoch 8 - iter 801/894 - loss 0.01593930 - time (sec): 151.19 - samples/sec: 511.91 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-14 20:30:17,616 epoch 8 - iter 890/894 - loss 0.01516070 - time (sec): 168.03 - samples/sec: 513.74 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-14 20:30:18,252 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:30:18,252 EPOCH 8 done: loss 0.0151 - lr: 0.000036 |
|
2023-10-14 20:30:43,109 DEV : loss 0.23652133345603943 - f1-score (micro avg) 0.7519 |
|
2023-10-14 20:30:43,135 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:31:01,634 epoch 9 - iter 89/894 - loss 0.01814129 - time (sec): 18.50 - samples/sec: 529.30 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-14 20:31:18,702 epoch 9 - iter 178/894 - loss 0.01179305 - time (sec): 35.57 - samples/sec: 533.30 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-14 20:31:35,620 epoch 9 - iter 267/894 - loss 0.01076027 - time (sec): 52.48 - samples/sec: 527.10 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 20:31:52,398 epoch 9 - iter 356/894 - loss 0.00961319 - time (sec): 69.26 - samples/sec: 528.25 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 20:32:08,429 epoch 9 - iter 445/894 - loss 0.01201137 - time (sec): 85.29 - samples/sec: 520.40 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 20:32:24,799 epoch 9 - iter 534/894 - loss 0.01096827 - time (sec): 101.66 - samples/sec: 517.67 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 20:32:41,012 epoch 9 - iter 623/894 - loss 0.01012014 - time (sec): 117.87 - samples/sec: 513.10 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 20:32:57,699 epoch 9 - iter 712/894 - loss 0.01039388 - time (sec): 134.56 - samples/sec: 513.70 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 20:33:14,257 epoch 9 - iter 801/894 - loss 0.01006521 - time (sec): 151.12 - samples/sec: 512.95 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 20:33:31,112 epoch 9 - iter 890/894 - loss 0.01011928 - time (sec): 167.98 - samples/sec: 513.55 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 20:33:31,781 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:33:31,781 EPOCH 9 done: loss 0.0101 - lr: 0.000018 |
|
2023-10-14 20:33:57,240 DEV : loss 0.25627970695495605 - f1-score (micro avg) 0.7519 |
|
2023-10-14 20:33:57,266 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:34:14,039 epoch 10 - iter 89/894 - loss 0.01200696 - time (sec): 16.77 - samples/sec: 526.50 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 20:34:30,197 epoch 10 - iter 178/894 - loss 0.00857499 - time (sec): 32.93 - samples/sec: 504.69 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 20:34:46,697 epoch 10 - iter 267/894 - loss 0.00703922 - time (sec): 49.43 - samples/sec: 506.28 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 20:35:04,040 epoch 10 - iter 356/894 - loss 0.00654044 - time (sec): 66.77 - samples/sec: 513.17 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 20:35:22,595 epoch 10 - iter 445/894 - loss 0.00710455 - time (sec): 85.33 - samples/sec: 517.56 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 20:35:39,208 epoch 10 - iter 534/894 - loss 0.00709956 - time (sec): 101.94 - samples/sec: 517.20 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 20:35:55,393 epoch 10 - iter 623/894 - loss 0.00690966 - time (sec): 118.13 - samples/sec: 512.29 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 20:36:11,258 epoch 10 - iter 712/894 - loss 0.00715382 - time (sec): 133.99 - samples/sec: 509.56 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 20:36:28,507 epoch 10 - iter 801/894 - loss 0.00646675 - time (sec): 151.24 - samples/sec: 513.66 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 20:36:44,895 epoch 10 - iter 890/894 - loss 0.00727492 - time (sec): 167.63 - samples/sec: 514.63 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-14 20:36:45,547 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:36:45,548 EPOCH 10 done: loss 0.0073 - lr: 0.000000 |
|
2023-10-14 20:37:10,711 DEV : loss 0.26143890619277954 - f1-score (micro avg) 0.75 |
|
2023-10-14 20:37:11,338 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:37:11,339 Loading model from best epoch ... |
|
2023-10-14 20:37:13,530 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-14 20:37:35,355 |
|
Results: |
|
- F-score (micro) 0.759 |
|
- F-score (macro) 0.6755 |
|
- Accuracy 0.6254 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         loc     0.8396    0.8607    0.8500       596 |
|
        pers     0.6815    0.7838    0.7291       333 |
|
         org     0.5397    0.5152    0.5271       132 |
|
        prod     0.6140    0.5303    0.5691        66 |
|
        time     0.7333    0.6735    0.7021        49 |
|

|
   micro avg     0.7447    0.7738    0.7590      1176 |
|
   macro avg     0.6816    0.6727    0.6755      1176 |
|
weighted avg     0.7441    0.7738    0.7576      1176 |
|
|
|
2023-10-14 20:37:35,355 ---------------------------------------------------------------------------------------------------- |
|
|