|
2023-10-12 17:24:51,784 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:24:51,787 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-12 17:24:51,787 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:24:51,787 MultiCorpus: 7936 train + 992 dev + 992 test sentences |
|
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr |
|
2023-10-12 17:24:51,787 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:24:51,787 Train: 7936 sentences |
|
2023-10-12 17:24:51,787 (train_with_dev=False, train_with_test=False) |
|
2023-10-12 17:24:51,787 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:24:51,787 Training Params: |
|
2023-10-12 17:24:51,787 - learning_rate: "0.00015" |
|
2023-10-12 17:24:51,787 - mini_batch_size: "4" |
|
2023-10-12 17:24:51,788 - max_epochs: "10" |
|
2023-10-12 17:24:51,788 - shuffle: "True" |
|
2023-10-12 17:24:51,788 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:24:51,788 Plugins: |
|
2023-10-12 17:24:51,788 - TensorboardLogger |
|
2023-10-12 17:24:51,788 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-12 17:24:51,788 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:24:51,788 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-12 17:24:51,788 - metric: "('micro avg', 'f1-score')" |
|
2023-10-12 17:24:51,788 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:24:51,788 Computation: |
|
2023-10-12 17:24:51,788 - compute on device: cuda:0 |
|
2023-10-12 17:24:51,788 - embedding storage: none |
|
2023-10-12 17:24:51,788 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:24:51,788 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-12 17:24:51,789 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:24:51,789 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:24:51,789 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-12 17:25:42,845 epoch 1 - iter 198/1984 - loss 2.55536644 - time (sec): 51.05 - samples/sec: 322.35 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-12 17:26:33,679 epoch 1 - iter 396/1984 - loss 2.41138788 - time (sec): 101.89 - samples/sec: 306.71 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-12 17:27:25,852 epoch 1 - iter 594/1984 - loss 2.08406574 - time (sec): 154.06 - samples/sec: 311.46 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-12 17:28:17,880 epoch 1 - iter 792/1984 - loss 1.76238592 - time (sec): 206.09 - samples/sec: 309.43 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-12 17:29:09,576 epoch 1 - iter 990/1984 - loss 1.49063858 - time (sec): 257.79 - samples/sec: 311.56 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-12 17:30:03,660 epoch 1 - iter 1188/1984 - loss 1.28530974 - time (sec): 311.87 - samples/sec: 308.92 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-12 17:30:54,807 epoch 1 - iter 1386/1984 - loss 1.13019078 - time (sec): 363.02 - samples/sec: 313.91 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-12 17:31:46,000 epoch 1 - iter 1584/1984 - loss 1.01185667 - time (sec): 414.21 - samples/sec: 316.00 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-12 17:32:36,962 epoch 1 - iter 1782/1984 - loss 0.92502162 - time (sec): 465.17 - samples/sec: 315.93 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-12 17:33:30,270 epoch 1 - iter 1980/1984 - loss 0.84977154 - time (sec): 518.48 - samples/sec: 315.76 - lr: 0.000150 - momentum: 0.000000 |
|
2023-10-12 17:33:31,299 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:33:31,300 EPOCH 1 done: loss 0.8485 - lr: 0.000150 |
|
2023-10-12 17:33:56,893 DEV : loss 0.16075488924980164 - f1-score (micro avg) 0.5739 |
|
2023-10-12 17:33:56,945 saving best model |
|
2023-10-12 17:33:57,979 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:34:51,494 epoch 2 - iter 198/1984 - loss 0.19639994 - time (sec): 53.51 - samples/sec: 310.19 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-12 17:35:42,055 epoch 2 - iter 396/1984 - loss 0.17680493 - time (sec): 104.07 - samples/sec: 313.66 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-12 17:36:33,064 epoch 2 - iter 594/1984 - loss 0.16946243 - time (sec): 155.08 - samples/sec: 316.75 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-12 17:37:25,539 epoch 2 - iter 792/1984 - loss 0.16300567 - time (sec): 207.56 - samples/sec: 314.80 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-12 17:38:22,072 epoch 2 - iter 990/1984 - loss 0.15336618 - time (sec): 264.09 - samples/sec: 312.60 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-12 17:39:17,838 epoch 2 - iter 1188/1984 - loss 0.14671314 - time (sec): 319.86 - samples/sec: 308.67 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-12 17:40:13,646 epoch 2 - iter 1386/1984 - loss 0.14319007 - time (sec): 375.66 - samples/sec: 304.39 - lr: 0.000138 - momentum: 0.000000 |
|
2023-10-12 17:41:10,943 epoch 2 - iter 1584/1984 - loss 0.13786458 - time (sec): 432.96 - samples/sec: 304.59 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-12 17:42:07,865 epoch 2 - iter 1782/1984 - loss 0.13502095 - time (sec): 489.88 - samples/sec: 303.54 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-12 17:43:01,351 epoch 2 - iter 1980/1984 - loss 0.13266822 - time (sec): 543.37 - samples/sec: 301.33 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-12 17:43:02,387 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:43:02,387 EPOCH 2 done: loss 0.1326 - lr: 0.000133 |
|
2023-10-12 17:43:28,049 DEV : loss 0.08952224254608154 - f1-score (micro avg) 0.7284 |
|
2023-10-12 17:43:28,095 saving best model |
|
2023-10-12 17:43:38,922 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:44:36,088 epoch 3 - iter 198/1984 - loss 0.08127744 - time (sec): 57.16 - samples/sec: 306.32 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-12 17:45:31,021 epoch 3 - iter 396/1984 - loss 0.08641176 - time (sec): 112.09 - samples/sec: 313.84 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-12 17:46:25,458 epoch 3 - iter 594/1984 - loss 0.08807778 - time (sec): 166.53 - samples/sec: 305.34 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-12 17:47:20,350 epoch 3 - iter 792/1984 - loss 0.08782288 - time (sec): 221.42 - samples/sec: 302.47 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-12 17:48:15,286 epoch 3 - iter 990/1984 - loss 0.08496675 - time (sec): 276.36 - samples/sec: 300.64 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-12 17:49:07,276 epoch 3 - iter 1188/1984 - loss 0.08486362 - time (sec): 328.35 - samples/sec: 301.13 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-12 17:50:03,093 epoch 3 - iter 1386/1984 - loss 0.08414590 - time (sec): 384.16 - samples/sec: 301.44 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-12 17:50:55,630 epoch 3 - iter 1584/1984 - loss 0.08148581 - time (sec): 436.70 - samples/sec: 304.47 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-12 17:51:51,927 epoch 3 - iter 1782/1984 - loss 0.08077785 - time (sec): 493.00 - samples/sec: 301.11 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-12 17:52:47,909 epoch 3 - iter 1980/1984 - loss 0.08075089 - time (sec): 548.98 - samples/sec: 298.09 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-12 17:52:49,212 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:52:49,213 EPOCH 3 done: loss 0.0808 - lr: 0.000117 |
|
2023-10-12 17:53:19,578 DEV : loss 0.10335122048854828 - f1-score (micro avg) 0.7415 |
|
2023-10-12 17:53:19,624 saving best model |
|
2023-10-12 17:53:22,236 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 17:54:20,197 epoch 4 - iter 198/1984 - loss 0.04733732 - time (sec): 57.95 - samples/sec: 283.02 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-12 17:55:13,014 epoch 4 - iter 396/1984 - loss 0.05655464 - time (sec): 110.77 - samples/sec: 298.62 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-12 17:56:08,456 epoch 4 - iter 594/1984 - loss 0.05633847 - time (sec): 166.21 - samples/sec: 301.27 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-12 17:57:00,472 epoch 4 - iter 792/1984 - loss 0.05909280 - time (sec): 218.23 - samples/sec: 301.30 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-12 17:57:54,142 epoch 4 - iter 990/1984 - loss 0.06026255 - time (sec): 271.90 - samples/sec: 306.35 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-12 17:58:46,218 epoch 4 - iter 1188/1984 - loss 0.05861280 - time (sec): 323.97 - samples/sec: 309.30 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-12 17:59:38,647 epoch 4 - iter 1386/1984 - loss 0.05877059 - time (sec): 376.40 - samples/sec: 307.34 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-12 18:00:33,949 epoch 4 - iter 1584/1984 - loss 0.05806944 - time (sec): 431.71 - samples/sec: 306.80 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-12 18:01:25,964 epoch 4 - iter 1782/1984 - loss 0.05889933 - time (sec): 483.72 - samples/sec: 305.46 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-12 18:02:18,642 epoch 4 - iter 1980/1984 - loss 0.05870714 - time (sec): 536.40 - samples/sec: 304.90 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-12 18:02:19,847 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:02:19,848 EPOCH 4 done: loss 0.0587 - lr: 0.000100 |
|
2023-10-12 18:02:44,009 DEV : loss 0.12530608475208282 - f1-score (micro avg) 0.7549 |
|
2023-10-12 18:02:44,047 saving best model |
|
2023-10-12 18:02:47,075 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:03:45,754 epoch 5 - iter 198/1984 - loss 0.03684763 - time (sec): 58.67 - samples/sec: 278.92 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-12 18:04:42,683 epoch 5 - iter 396/1984 - loss 0.03542942 - time (sec): 115.60 - samples/sec: 283.18 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-12 18:05:33,118 epoch 5 - iter 594/1984 - loss 0.03923047 - time (sec): 166.04 - samples/sec: 291.45 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-12 18:06:24,269 epoch 5 - iter 792/1984 - loss 0.04070365 - time (sec): 217.19 - samples/sec: 297.96 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-12 18:07:16,126 epoch 5 - iter 990/1984 - loss 0.03866158 - time (sec): 269.05 - samples/sec: 300.33 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-12 18:08:11,590 epoch 5 - iter 1188/1984 - loss 0.04048969 - time (sec): 324.51 - samples/sec: 300.36 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-12 18:09:06,534 epoch 5 - iter 1386/1984 - loss 0.04244996 - time (sec): 379.45 - samples/sec: 301.32 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-12 18:10:03,788 epoch 5 - iter 1584/1984 - loss 0.04340599 - time (sec): 436.71 - samples/sec: 299.41 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-12 18:10:55,718 epoch 5 - iter 1782/1984 - loss 0.04308350 - time (sec): 488.64 - samples/sec: 302.17 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-12 18:11:49,526 epoch 5 - iter 1980/1984 - loss 0.04432984 - time (sec): 542.44 - samples/sec: 301.87 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-12 18:11:50,632 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:11:50,632 EPOCH 5 done: loss 0.0443 - lr: 0.000083 |
|
2023-10-12 18:12:18,411 DEV : loss 0.1436772644519806 - f1-score (micro avg) 0.7392 |
|
2023-10-12 18:12:18,452 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:13:10,838 epoch 6 - iter 198/1984 - loss 0.03726587 - time (sec): 52.38 - samples/sec: 316.11 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-12 18:14:06,410 epoch 6 - iter 396/1984 - loss 0.02956098 - time (sec): 107.96 - samples/sec: 303.22 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-12 18:14:58,937 epoch 6 - iter 594/1984 - loss 0.03136481 - time (sec): 160.48 - samples/sec: 306.32 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-12 18:15:54,784 epoch 6 - iter 792/1984 - loss 0.03209960 - time (sec): 216.33 - samples/sec: 303.47 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-12 18:16:47,593 epoch 6 - iter 990/1984 - loss 0.03287748 - time (sec): 269.14 - samples/sec: 306.09 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-12 18:17:44,743 epoch 6 - iter 1188/1984 - loss 0.03282659 - time (sec): 326.29 - samples/sec: 300.73 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-12 18:18:38,567 epoch 6 - iter 1386/1984 - loss 0.03292277 - time (sec): 380.11 - samples/sec: 301.47 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-12 18:19:30,537 epoch 6 - iter 1584/1984 - loss 0.03367855 - time (sec): 432.08 - samples/sec: 302.95 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-12 18:20:27,404 epoch 6 - iter 1782/1984 - loss 0.03441227 - time (sec): 488.95 - samples/sec: 301.11 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-12 18:21:20,899 epoch 6 - iter 1980/1984 - loss 0.03431832 - time (sec): 542.45 - samples/sec: 301.61 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-12 18:21:21,962 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:21:21,962 EPOCH 6 done: loss 0.0342 - lr: 0.000067 |
|
2023-10-12 18:21:48,190 DEV : loss 0.18953415751457214 - f1-score (micro avg) 0.7546 |
|
2023-10-12 18:21:48,236 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:22:42,513 epoch 7 - iter 198/1984 - loss 0.01880897 - time (sec): 54.27 - samples/sec: 300.69 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-12 18:23:36,788 epoch 7 - iter 396/1984 - loss 0.02129238 - time (sec): 108.55 - samples/sec: 310.17 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-12 18:24:27,787 epoch 7 - iter 594/1984 - loss 0.02038739 - time (sec): 159.55 - samples/sec: 308.46 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-12 18:25:18,964 epoch 7 - iter 792/1984 - loss 0.02267150 - time (sec): 210.73 - samples/sec: 305.18 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-12 18:26:11,225 epoch 7 - iter 990/1984 - loss 0.02325419 - time (sec): 262.99 - samples/sec: 308.16 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-12 18:27:02,156 epoch 7 - iter 1188/1984 - loss 0.02443802 - time (sec): 313.92 - samples/sec: 310.04 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-12 18:27:53,992 epoch 7 - iter 1386/1984 - loss 0.02362788 - time (sec): 365.75 - samples/sec: 310.65 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-12 18:28:46,404 epoch 7 - iter 1584/1984 - loss 0.02445248 - time (sec): 418.17 - samples/sec: 310.92 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-12 18:29:42,092 epoch 7 - iter 1782/1984 - loss 0.02416577 - time (sec): 473.85 - samples/sec: 310.85 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-12 18:30:37,599 epoch 7 - iter 1980/1984 - loss 0.02497708 - time (sec): 529.36 - samples/sec: 308.90 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-12 18:30:38,861 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:30:38,862 EPOCH 7 done: loss 0.0251 - lr: 0.000050 |
|
2023-10-12 18:31:11,064 DEV : loss 0.2033829540014267 - f1-score (micro avg) 0.735 |
|
2023-10-12 18:31:11,110 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:32:08,080 epoch 8 - iter 198/1984 - loss 0.01615564 - time (sec): 56.97 - samples/sec: 290.72 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-12 18:33:04,407 epoch 8 - iter 396/1984 - loss 0.01873509 - time (sec): 113.29 - samples/sec: 297.31 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-12 18:34:00,564 epoch 8 - iter 594/1984 - loss 0.01784955 - time (sec): 169.45 - samples/sec: 304.92 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-12 18:34:54,773 epoch 8 - iter 792/1984 - loss 0.01716047 - time (sec): 223.66 - samples/sec: 301.53 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-12 18:35:48,316 epoch 8 - iter 990/1984 - loss 0.01676986 - time (sec): 277.20 - samples/sec: 299.97 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-12 18:36:43,833 epoch 8 - iter 1188/1984 - loss 0.01653262 - time (sec): 332.72 - samples/sec: 296.71 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-12 18:37:37,320 epoch 8 - iter 1386/1984 - loss 0.01775393 - time (sec): 386.21 - samples/sec: 295.96 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-12 18:38:30,408 epoch 8 - iter 1584/1984 - loss 0.01695689 - time (sec): 439.30 - samples/sec: 296.91 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-12 18:39:23,967 epoch 8 - iter 1782/1984 - loss 0.01771032 - time (sec): 492.85 - samples/sec: 298.27 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-12 18:40:16,256 epoch 8 - iter 1980/1984 - loss 0.01950296 - time (sec): 545.14 - samples/sec: 299.99 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-12 18:40:17,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:40:17,577 EPOCH 8 done: loss 0.0195 - lr: 0.000033 |
|
2023-10-12 18:40:48,348 DEV : loss 0.20685595273971558 - f1-score (micro avg) 0.7506 |
|
2023-10-12 18:40:48,388 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:41:43,210 epoch 9 - iter 198/1984 - loss 0.01469186 - time (sec): 54.82 - samples/sec: 297.92 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-12 18:42:34,820 epoch 9 - iter 396/1984 - loss 0.01231871 - time (sec): 106.43 - samples/sec: 307.79 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-12 18:43:28,721 epoch 9 - iter 594/1984 - loss 0.01418213 - time (sec): 160.33 - samples/sec: 308.89 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-12 18:44:21,248 epoch 9 - iter 792/1984 - loss 0.01449414 - time (sec): 212.86 - samples/sec: 309.93 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-12 18:45:15,929 epoch 9 - iter 990/1984 - loss 0.01310674 - time (sec): 267.54 - samples/sec: 311.53 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-12 18:46:08,149 epoch 9 - iter 1188/1984 - loss 0.01385134 - time (sec): 319.76 - samples/sec: 312.20 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-12 18:46:58,627 epoch 9 - iter 1386/1984 - loss 0.01328832 - time (sec): 370.24 - samples/sec: 313.82 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-12 18:47:51,195 epoch 9 - iter 1584/1984 - loss 0.01329775 - time (sec): 422.80 - samples/sec: 313.30 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-12 18:48:46,351 epoch 9 - iter 1782/1984 - loss 0.01366045 - time (sec): 477.96 - samples/sec: 309.53 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-12 18:49:40,513 epoch 9 - iter 1980/1984 - loss 0.01397599 - time (sec): 532.12 - samples/sec: 307.31 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-12 18:49:41,587 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:49:41,588 EPOCH 9 done: loss 0.0140 - lr: 0.000017 |
|
2023-10-12 18:50:07,293 DEV : loss 0.2238703966140747 - f1-score (micro avg) 0.7411 |
|
2023-10-12 18:50:07,343 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:51:01,512 epoch 10 - iter 198/1984 - loss 0.01050438 - time (sec): 54.17 - samples/sec: 309.79 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-12 18:51:53,307 epoch 10 - iter 396/1984 - loss 0.00772763 - time (sec): 105.96 - samples/sec: 305.80 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-12 18:52:48,019 epoch 10 - iter 594/1984 - loss 0.01012820 - time (sec): 160.67 - samples/sec: 299.59 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-12 18:53:41,519 epoch 10 - iter 792/1984 - loss 0.00947119 - time (sec): 214.17 - samples/sec: 301.96 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-12 18:54:40,509 epoch 10 - iter 990/1984 - loss 0.01022609 - time (sec): 273.16 - samples/sec: 300.88 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-12 18:55:34,282 epoch 10 - iter 1188/1984 - loss 0.00989781 - time (sec): 326.94 - samples/sec: 302.53 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-12 18:56:29,254 epoch 10 - iter 1386/1984 - loss 0.00983913 - time (sec): 381.91 - samples/sec: 300.51 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-12 18:57:21,637 epoch 10 - iter 1584/1984 - loss 0.01002305 - time (sec): 434.29 - samples/sec: 301.31 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-12 18:58:16,687 epoch 10 - iter 1782/1984 - loss 0.00997923 - time (sec): 489.34 - samples/sec: 301.28 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-12 18:59:09,635 epoch 10 - iter 1980/1984 - loss 0.01053401 - time (sec): 542.29 - samples/sec: 301.94 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-12 18:59:10,685 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:59:10,685 EPOCH 10 done: loss 0.0105 - lr: 0.000000 |
|
2023-10-12 18:59:40,730 DEV : loss 0.22921979427337646 - f1-score (micro avg) 0.7538 |
|
2023-10-12 18:59:41,817 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 18:59:41,819 Loading model from best epoch ... |
|
2023-10-12 18:59:46,867 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-12 19:00:16,825 |
|
Results: |
|
- F-score (micro) 0.763 |
|
- F-score (macro) 0.6832 |
|
- Accuracy 0.6361 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
         LOC     0.7887    0.8489    0.8176       655 |
         PER     0.7465    0.7265    0.7364       223 |
         ORG     0.5421    0.4567    0.4957       127 |
|
   micro avg     0.7541    0.7721    0.7630      1005 |
   macro avg     0.6924    0.6773    0.6832      1005 |
weighted avg     0.7481    0.7721    0.7589      1005 |
|
|
|
2023-10-12 19:00:16,826 ---------------------------------------------------------------------------------------------------- |
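The averaged rows in the "By class" report can be cross-checked against the per-class rows. The following is a minimal sketch, not part of the logged run: the helper names (`macro_f1`, `weighted_f1`, `f1_from`) are hypothetical, and the inputs are the four-decimal values printed above, so recomputed figures should only be expected to agree to about that precision.

```python
# Per-class rows copied from the "By class" report:
# (precision, recall, f1-score, support). Values are already rounded in the log.
report = {
    "LOC": (0.7887, 0.8489, 0.8176, 655),
    "PER": (0.7465, 0.7265, 0.7364, 223),
    "ORG": (0.5421, 0.4567, 0.4957, 127),
}


def macro_f1(rows):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1, _ in rows.values()) / len(rows)


def weighted_f1(rows):
    """Mean of the per-class F1 scores, weighted by class support."""
    total = sum(support for _, _, _, support in rows.values())
    return sum(f1 * support for _, _, f1, support in rows.values()) / total


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


print("macro F1:   ", round(macro_f1(report), 4))
print("weighted F1:", round(weighted_f1(report), 4))
# Micro F1 recomputed from the logged micro-averaged precision and recall.
print("micro F1:   ", round(f1_from(0.7541, 0.7721), 4))
```

The gap between the micro and macro F-scores reflects class imbalance: LOC contributes 655 of the 1005 gold spans and scores highest, while ORG (127 spans) scores lowest and pulls the unweighted macro average down.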
|
|
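The `lr` column in the iteration lines is consistent with a linear warmup followed by a linear decay to zero, as suggested by the `LinearScheduler` plugin with `warmup_fraction: '0.1'`. The sketch below is an assumption-based reconstruction, not Flair's implementation: the function name `linear_schedule_lr` is hypothetical, the total step count is taken as 10 epochs x 1984 iterations, and the mapping from a logged iteration to a global step may be off by one relative to the actual scheduler.

```python
def linear_schedule_lr(step, peak_lr=0.00015, total_steps=10 * 1984,
                       warmup_fraction=0.1):
    """Learning rate at a global step under linear warmup, then linear decay.

    Assumed schedule: the rate rises from 0 to peak_lr over the first
    warmup_fraction of all steps, then falls linearly to 0 at total_steps.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Compare against a few logged values (global step = (epoch - 1) * 1984 + iter).
for epoch, it in [(1, 198), (1, 1980), (2, 198), (2, 1980), (10, 1980)]:
    step = (epoch - 1) * 1984 + it
    print(f"epoch {epoch} iter {it}: lr ~ {linear_schedule_lr(step):.6f}")
```

Under these assumptions the warmup spans exactly the first epoch (1984 steps), which matches the log: the rate peaks at 0.000150 at the end of epoch 1 and decays from epoch 2 onward.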