|
2023-10-14 19:31:38,478 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:31:38,479 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-14 19:31:38,479 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:31:38,479 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-14 19:31:38,480 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:31:38,480 Train: 3575 sentences |
|
2023-10-14 19:31:38,480 (train_with_dev=False, train_with_test=False) |
|
2023-10-14 19:31:38,480 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:31:38,480 Training Params: |
|
2023-10-14 19:31:38,480 - learning_rate: "0.00015" |
|
2023-10-14 19:31:38,480 - mini_batch_size: "4" |
|
2023-10-14 19:31:38,480 - max_epochs: "10" |
|
2023-10-14 19:31:38,480 - shuffle: "True" |
|
2023-10-14 19:31:38,480 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:31:38,480 Plugins: |
|
2023-10-14 19:31:38,480 - TensorboardLogger |
|
2023-10-14 19:31:38,480 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-14 19:31:38,480 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:31:38,480 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-14 19:31:38,480 - metric: "('micro avg', 'f1-score')" |
|
2023-10-14 19:31:38,480 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:31:38,480 Computation: |
|
2023-10-14 19:31:38,480 - compute on device: cuda:0 |
|
2023-10-14 19:31:38,480 - embedding storage: none |
|
2023-10-14 19:31:38,480 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:31:38,480 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-14 19:31:38,480 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:31:38,481 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:31:38,481 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-14 19:31:55,583 epoch 1 - iter 89/894 - loss 3.04489387 - time (sec): 17.10 - samples/sec: 541.29 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 19:32:11,920 epoch 1 - iter 178/894 - loss 3.01135120 - time (sec): 33.44 - samples/sec: 522.36 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 19:32:28,430 epoch 1 - iter 267/894 - loss 2.86985611 - time (sec): 49.95 - samples/sec: 514.33 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-14 19:32:45,243 epoch 1 - iter 356/894 - loss 2.65587997 - time (sec): 66.76 - samples/sec: 515.06 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-14 19:33:01,082 epoch 1 - iter 445/894 - loss 2.44878120 - time (sec): 82.60 - samples/sec: 505.53 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-14 19:33:17,616 epoch 1 - iter 534/894 - loss 2.20154310 - time (sec): 99.13 - samples/sec: 504.78 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-14 19:33:34,891 epoch 1 - iter 623/894 - loss 1.94916014 - time (sec): 116.41 - samples/sec: 510.23 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-14 19:33:51,511 epoch 1 - iter 712/894 - loss 1.77451894 - time (sec): 133.03 - samples/sec: 511.33 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-14 19:34:10,506 epoch 1 - iter 801/894 - loss 1.61163646 - time (sec): 152.02 - samples/sec: 513.52 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-14 19:34:26,780 epoch 1 - iter 890/894 - loss 1.49843666 - time (sec): 168.30 - samples/sec: 511.53 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-14 19:34:27,544 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:34:27,545 EPOCH 1 done: loss 1.4929 - lr: 0.000149 |
|
2023-10-14 19:34:50,649 DEV : loss 0.3523489832878113 - f1-score (micro avg) 0.0556 |
|
2023-10-14 19:34:50,675 saving best model |
|
2023-10-14 19:34:51,285 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:35:07,818 epoch 2 - iter 89/894 - loss 0.38002707 - time (sec): 16.53 - samples/sec: 515.60 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-14 19:35:24,321 epoch 2 - iter 178/894 - loss 0.36652988 - time (sec): 33.04 - samples/sec: 518.11 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-14 19:35:41,326 epoch 2 - iter 267/894 - loss 0.34661944 - time (sec): 50.04 - samples/sec: 530.03 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-14 19:35:59,999 epoch 2 - iter 356/894 - loss 0.33679946 - time (sec): 68.71 - samples/sec: 523.95 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-14 19:36:16,710 epoch 2 - iter 445/894 - loss 0.32450464 - time (sec): 85.42 - samples/sec: 522.29 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-14 19:36:33,787 epoch 2 - iter 534/894 - loss 0.31215668 - time (sec): 102.50 - samples/sec: 521.55 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-14 19:36:50,184 epoch 2 - iter 623/894 - loss 0.31216230 - time (sec): 118.90 - samples/sec: 517.47 - lr: 0.000138 - momentum: 0.000000 |
|
2023-10-14 19:37:06,813 epoch 2 - iter 712/894 - loss 0.30747611 - time (sec): 135.53 - samples/sec: 517.21 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-14 19:37:23,698 epoch 2 - iter 801/894 - loss 0.29758927 - time (sec): 152.41 - samples/sec: 516.39 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-14 19:37:39,603 epoch 2 - iter 890/894 - loss 0.29413445 - time (sec): 168.32 - samples/sec: 512.38 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-14 19:37:40,270 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:37:40,270 EPOCH 2 done: loss 0.2940 - lr: 0.000133 |
|
2023-10-14 19:38:05,408 DEV : loss 0.2095714658498764 - f1-score (micro avg) 0.5995 |
|
2023-10-14 19:38:05,434 saving best model |
|
2023-10-14 19:38:10,190 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:38:26,702 epoch 3 - iter 89/894 - loss 0.21975580 - time (sec): 16.51 - samples/sec: 500.92 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-14 19:38:43,023 epoch 3 - iter 178/894 - loss 0.19576679 - time (sec): 32.83 - samples/sec: 501.22 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-14 19:39:00,122 epoch 3 - iter 267/894 - loss 0.19131051 - time (sec): 49.93 - samples/sec: 504.32 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-14 19:39:16,560 epoch 3 - iter 356/894 - loss 0.19069530 - time (sec): 66.37 - samples/sec: 507.18 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-14 19:39:35,222 epoch 3 - iter 445/894 - loss 0.18443738 - time (sec): 85.03 - samples/sec: 515.97 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-14 19:39:51,748 epoch 3 - iter 534/894 - loss 0.18034120 - time (sec): 101.56 - samples/sec: 514.35 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-14 19:40:07,984 epoch 3 - iter 623/894 - loss 0.17040918 - time (sec): 117.79 - samples/sec: 511.08 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-14 19:40:24,143 epoch 3 - iter 712/894 - loss 0.16379017 - time (sec): 133.95 - samples/sec: 509.43 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-14 19:40:41,141 epoch 3 - iter 801/894 - loss 0.15879809 - time (sec): 150.95 - samples/sec: 512.98 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-14 19:40:57,726 epoch 3 - iter 890/894 - loss 0.15341354 - time (sec): 167.53 - samples/sec: 513.65 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-14 19:40:58,514 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:40:58,515 EPOCH 3 done: loss 0.1532 - lr: 0.000117 |
|
2023-10-14 19:41:23,911 DEV : loss 0.17346230149269104 - f1-score (micro avg) 0.7064 |
|
2023-10-14 19:41:23,937 saving best model |
|
2023-10-14 19:41:28,273 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:41:45,109 epoch 4 - iter 89/894 - loss 0.09844153 - time (sec): 16.83 - samples/sec: 501.59 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-14 19:42:01,109 epoch 4 - iter 178/894 - loss 0.10260624 - time (sec): 32.83 - samples/sec: 492.17 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-14 19:42:17,395 epoch 4 - iter 267/894 - loss 0.09857008 - time (sec): 49.12 - samples/sec: 495.86 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-14 19:42:33,912 epoch 4 - iter 356/894 - loss 0.09918121 - time (sec): 65.64 - samples/sec: 496.75 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-14 19:42:50,110 epoch 4 - iter 445/894 - loss 0.09246905 - time (sec): 81.84 - samples/sec: 496.73 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-14 19:43:06,959 epoch 4 - iter 534/894 - loss 0.08940730 - time (sec): 98.68 - samples/sec: 503.28 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-14 19:43:23,411 epoch 4 - iter 623/894 - loss 0.08504805 - time (sec): 115.14 - samples/sec: 503.30 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-14 19:43:39,743 epoch 4 - iter 712/894 - loss 0.08499619 - time (sec): 131.47 - samples/sec: 503.29 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-14 19:43:58,439 epoch 4 - iter 801/894 - loss 0.08586194 - time (sec): 150.17 - samples/sec: 507.58 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-14 19:44:16,248 epoch 4 - iter 890/894 - loss 0.08240795 - time (sec): 167.97 - samples/sec: 512.64 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-14 19:44:17,005 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:44:17,006 EPOCH 4 done: loss 0.0821 - lr: 0.000100 |
|
2023-10-14 19:44:42,068 DEV : loss 0.1738642454147339 - f1-score (micro avg) 0.7313 |
|
2023-10-14 19:44:42,095 saving best model |
|
2023-10-14 19:44:45,496 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:45:01,636 epoch 5 - iter 89/894 - loss 0.04568145 - time (sec): 16.14 - samples/sec: 481.94 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-14 19:45:18,218 epoch 5 - iter 178/894 - loss 0.04121839 - time (sec): 32.72 - samples/sec: 494.46 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-14 19:45:35,198 epoch 5 - iter 267/894 - loss 0.04001861 - time (sec): 49.70 - samples/sec: 505.89 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-14 19:45:51,772 epoch 5 - iter 356/894 - loss 0.04722101 - time (sec): 66.28 - samples/sec: 508.26 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-14 19:46:08,157 epoch 5 - iter 445/894 - loss 0.04470699 - time (sec): 82.66 - samples/sec: 507.78 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-14 19:46:26,783 epoch 5 - iter 534/894 - loss 0.04750186 - time (sec): 101.29 - samples/sec: 509.56 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-14 19:46:43,080 epoch 5 - iter 623/894 - loss 0.04827928 - time (sec): 117.58 - samples/sec: 508.77 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-14 19:46:59,856 epoch 5 - iter 712/894 - loss 0.05022362 - time (sec): 134.36 - samples/sec: 511.75 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-14 19:47:16,471 epoch 5 - iter 801/894 - loss 0.05136390 - time (sec): 150.97 - samples/sec: 513.00 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-14 19:47:32,967 epoch 5 - iter 890/894 - loss 0.05218871 - time (sec): 167.47 - samples/sec: 513.90 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-14 19:47:33,747 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:47:33,747 EPOCH 5 done: loss 0.0532 - lr: 0.000083 |
|
2023-10-14 19:47:58,630 DEV : loss 0.22192847728729248 - f1-score (micro avg) 0.7529 |
|
2023-10-14 19:47:58,656 saving best model |
|
2023-10-14 19:48:01,734 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:48:18,652 epoch 6 - iter 89/894 - loss 0.02019978 - time (sec): 16.92 - samples/sec: 512.06 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-14 19:48:34,931 epoch 6 - iter 178/894 - loss 0.02444540 - time (sec): 33.19 - samples/sec: 511.14 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-14 19:48:51,471 epoch 6 - iter 267/894 - loss 0.02913359 - time (sec): 49.73 - samples/sec: 511.58 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-14 19:49:07,947 epoch 6 - iter 356/894 - loss 0.02770752 - time (sec): 66.21 - samples/sec: 514.07 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-14 19:49:24,294 epoch 6 - iter 445/894 - loss 0.02840373 - time (sec): 82.56 - samples/sec: 510.46 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-14 19:49:42,688 epoch 6 - iter 534/894 - loss 0.03156363 - time (sec): 100.95 - samples/sec: 513.38 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-14 19:49:59,604 epoch 6 - iter 623/894 - loss 0.03216127 - time (sec): 117.87 - samples/sec: 517.42 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-14 19:50:16,354 epoch 6 - iter 712/894 - loss 0.03207254 - time (sec): 134.62 - samples/sec: 515.54 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-14 19:50:32,553 epoch 6 - iter 801/894 - loss 0.03414003 - time (sec): 150.82 - samples/sec: 512.67 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-14 19:50:49,459 epoch 6 - iter 890/894 - loss 0.03363307 - time (sec): 167.72 - samples/sec: 513.86 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-14 19:50:50,160 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:50:50,160 EPOCH 6 done: loss 0.0336 - lr: 0.000067 |
|
2023-10-14 19:51:15,260 DEV : loss 0.2039400190114975 - f1-score (micro avg) 0.734 |
|
2023-10-14 19:51:15,286 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:51:33,717 epoch 7 - iter 89/894 - loss 0.03261519 - time (sec): 18.43 - samples/sec: 529.16 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-14 19:51:50,276 epoch 7 - iter 178/894 - loss 0.02744893 - time (sec): 34.99 - samples/sec: 526.69 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-14 19:52:06,313 epoch 7 - iter 267/894 - loss 0.03297580 - time (sec): 51.03 - samples/sec: 517.14 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-14 19:52:22,564 epoch 7 - iter 356/894 - loss 0.02788303 - time (sec): 67.28 - samples/sec: 516.79 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-14 19:52:39,137 epoch 7 - iter 445/894 - loss 0.02518628 - time (sec): 83.85 - samples/sec: 518.68 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-14 19:52:56,093 epoch 7 - iter 534/894 - loss 0.02351345 - time (sec): 100.81 - samples/sec: 520.30 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-14 19:53:12,453 epoch 7 - iter 623/894 - loss 0.02380859 - time (sec): 117.17 - samples/sec: 519.12 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-14 19:53:28,739 epoch 7 - iter 712/894 - loss 0.02328976 - time (sec): 133.45 - samples/sec: 519.08 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-14 19:53:45,255 epoch 7 - iter 801/894 - loss 0.02298144 - time (sec): 149.97 - samples/sec: 519.20 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-14 19:54:01,703 epoch 7 - iter 890/894 - loss 0.02165727 - time (sec): 166.42 - samples/sec: 517.90 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-14 19:54:02,416 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:54:02,416 EPOCH 7 done: loss 0.0218 - lr: 0.000050 |
|
2023-10-14 19:54:27,377 DEV : loss 0.22965174913406372 - f1-score (micro avg) 0.7541 |
|
2023-10-14 19:54:27,404 saving best model |
|
2023-10-14 19:54:31,666 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:54:48,327 epoch 8 - iter 89/894 - loss 0.02163776 - time (sec): 16.66 - samples/sec: 497.44 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-14 19:55:05,147 epoch 8 - iter 178/894 - loss 0.02358524 - time (sec): 33.48 - samples/sec: 504.43 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-14 19:55:21,010 epoch 8 - iter 267/894 - loss 0.01955824 - time (sec): 49.34 - samples/sec: 505.08 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-14 19:55:37,848 epoch 8 - iter 356/894 - loss 0.01772872 - time (sec): 66.18 - samples/sec: 521.34 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-14 19:55:54,766 epoch 8 - iter 445/894 - loss 0.01662355 - time (sec): 83.10 - samples/sec: 525.63 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-14 19:56:10,901 epoch 8 - iter 534/894 - loss 0.01594968 - time (sec): 99.23 - samples/sec: 521.15 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-14 19:56:28,814 epoch 8 - iter 623/894 - loss 0.01663856 - time (sec): 117.15 - samples/sec: 517.27 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-14 19:56:45,270 epoch 8 - iter 712/894 - loss 0.01649056 - time (sec): 133.60 - samples/sec: 517.33 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-14 19:57:01,811 epoch 8 - iter 801/894 - loss 0.01565139 - time (sec): 150.14 - samples/sec: 515.49 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-14 19:57:18,575 epoch 8 - iter 890/894 - loss 0.01524975 - time (sec): 166.91 - samples/sec: 517.20 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-14 19:57:19,212 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:57:19,212 EPOCH 8 done: loss 0.0152 - lr: 0.000033 |
|
2023-10-14 19:57:44,067 DEV : loss 0.24145588278770447 - f1-score (micro avg) 0.7493 |
|
2023-10-14 19:57:44,093 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 19:58:02,689 epoch 9 - iter 89/894 - loss 0.01188183 - time (sec): 18.60 - samples/sec: 526.54 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-14 19:58:19,786 epoch 9 - iter 178/894 - loss 0.01216627 - time (sec): 35.69 - samples/sec: 531.41 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 19:58:36,437 epoch 9 - iter 267/894 - loss 0.00981157 - time (sec): 52.34 - samples/sec: 528.51 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 19:58:53,092 epoch 9 - iter 356/894 - loss 0.01043168 - time (sec): 69.00 - samples/sec: 530.26 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 19:59:09,041 epoch 9 - iter 445/894 - loss 0.01268618 - time (sec): 84.95 - samples/sec: 522.52 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 19:59:25,420 epoch 9 - iter 534/894 - loss 0.01178358 - time (sec): 101.33 - samples/sec: 519.39 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 19:59:41,540 epoch 9 - iter 623/894 - loss 0.01061545 - time (sec): 117.45 - samples/sec: 514.98 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 19:59:58,141 epoch 9 - iter 712/894 - loss 0.01084988 - time (sec): 134.05 - samples/sec: 515.67 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 20:00:14,586 epoch 9 - iter 801/894 - loss 0.01043690 - time (sec): 150.49 - samples/sec: 515.10 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 20:00:31,323 epoch 9 - iter 890/894 - loss 0.01119584 - time (sec): 167.23 - samples/sec: 515.84 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 20:00:31,987 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:00:31,987 EPOCH 9 done: loss 0.0112 - lr: 0.000017 |
|
2023-10-14 20:00:57,595 DEV : loss 0.2552002966403961 - f1-score (micro avg) 0.7442 |
|
2023-10-14 20:00:57,621 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:01:14,415 epoch 10 - iter 89/894 - loss 0.01578049 - time (sec): 16.79 - samples/sec: 525.81 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 20:01:30,506 epoch 10 - iter 178/894 - loss 0.01155094 - time (sec): 32.88 - samples/sec: 505.39 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 20:01:47,360 epoch 10 - iter 267/894 - loss 0.00867733 - time (sec): 49.74 - samples/sec: 503.14 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 20:02:04,763 epoch 10 - iter 356/894 - loss 0.00811223 - time (sec): 67.14 - samples/sec: 510.36 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 20:02:23,485 epoch 10 - iter 445/894 - loss 0.00905890 - time (sec): 85.86 - samples/sec: 514.33 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 20:02:40,436 epoch 10 - iter 534/894 - loss 0.00886633 - time (sec): 102.81 - samples/sec: 512.81 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 20:02:56,856 epoch 10 - iter 623/894 - loss 0.00827956 - time (sec): 119.23 - samples/sec: 507.53 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 20:03:12,609 epoch 10 - iter 712/894 - loss 0.00920467 - time (sec): 134.99 - samples/sec: 505.80 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 20:03:29,677 epoch 10 - iter 801/894 - loss 0.00867601 - time (sec): 152.05 - samples/sec: 510.91 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 20:03:45,854 epoch 10 - iter 890/894 - loss 0.01003975 - time (sec): 168.23 - samples/sec: 512.78 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-14 20:03:46,499 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:03:46,499 EPOCH 10 done: loss 0.0100 - lr: 0.000000 |
|
2023-10-14 20:04:11,876 DEV : loss 0.2602584660053253 - f1-score (micro avg) 0.7463 |
|
2023-10-14 20:04:12,521 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:04:12,523 Loading model from best epoch ... |
|
2023-10-14 20:04:14,869 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-14 20:04:36,455 |
|
Results: |
|
- F-score (micro) 0.7586 |
|
- F-score (macro) 0.6527 |
|
- Accuracy 0.6278 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         loc     0.8445    0.8658    0.8550       596 |
|
        pers     0.6979    0.8048    0.7476       333 |
|
         org     0.4964    0.5227    0.5092       132 |
|
        prod     0.6000    0.5000    0.5455        66 |
|
        time     0.6000    0.6122    0.6061        49 |
|

|
   micro avg     0.7393    0.7789    0.7586      1176 |
|
   macro avg     0.6478    0.6611    0.6527      1176 |
|
weighted avg     0.7400    0.7789    0.7580      1176 |
|
|
|
2023-10-14 20:04:36,456 ---------------------------------------------------------------------------------------------------- |
|
|