2023-10-14 00:40:53,759 ----------------------------------------------------------------------------------------------------
2023-10-14 00:40:53,762 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 00:40:53,762 ----------------------------------------------------------------------------------------------------
2023-10-14 00:40:53,762 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-14 00:40:53,762 ----------------------------------------------------------------------------------------------------
2023-10-14 00:40:53,763 Train: 6183 sentences
2023-10-14 00:40:53,763 (train_with_dev=False, train_with_test=False)
2023-10-14 00:40:53,763 ----------------------------------------------------------------------------------------------------
2023-10-14 00:40:53,763 Training Params:
2023-10-14 00:40:53,763 - learning_rate: "0.00015"
2023-10-14 00:40:53,763 - mini_batch_size: "4"
2023-10-14 00:40:53,763 - max_epochs: "10"
2023-10-14 00:40:53,763 - shuffle: "True"
2023-10-14 00:40:53,763 ----------------------------------------------------------------------------------------------------
2023-10-14 00:40:53,763 Plugins:
2023-10-14 00:40:53,763 - TensorboardLogger
2023-10-14 00:40:53,763 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 00:40:53,763 ----------------------------------------------------------------------------------------------------
2023-10-14 00:40:53,764 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 00:40:53,764 - metric: "('micro avg', 'f1-score')"
2023-10-14 00:40:53,764 ----------------------------------------------------------------------------------------------------
2023-10-14 00:40:53,764 Computation:
2023-10-14 00:40:53,764 - compute on device: cuda:0
2023-10-14 00:40:53,764 - embedding storage: none
2023-10-14 00:40:53,764 ----------------------------------------------------------------------------------------------------
2023-10-14 00:40:53,764 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-4"
2023-10-14 00:40:53,764 ----------------------------------------------------------------------------------------------------
2023-10-14 00:40:53,764 ----------------------------------------------------------------------------------------------------
2023-10-14 00:40:53,764 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-14 00:41:37,315 epoch 1 - iter 154/1546 - loss 2.56873151 - time (sec): 43.55 - samples/sec: 288.51 - lr: 0.000015 - momentum: 0.000000
2023-10-14 00:42:21,594 epoch 1 - iter 308/1546 - loss 2.43379623 - time (sec): 87.83 - samples/sec: 291.12 - lr: 0.000030 - momentum: 0.000000
2023-10-14 00:43:04,591 epoch 1 - iter 462/1546 - loss 2.16358210 - time (sec): 130.82 - samples/sec: 293.17 - lr: 0.000045 - momentum: 0.000000
2023-10-14 00:43:47,729 epoch 1 - iter 616/1546 - loss 1.89703690 - time (sec): 173.96 - samples/sec: 286.04 - lr: 0.000060 - momentum: 0.000000
2023-10-14 00:44:31,467 epoch 1 - iter 770/1546 - loss 1.61404026 - time (sec): 217.70 - samples/sec: 285.41 - lr: 0.000075 - momentum: 0.000000
2023-10-14 00:45:15,241 epoch 1 - iter 924/1546 - loss 1.39110007 - time (sec): 261.47 - samples/sec: 282.67 - lr: 0.000090 - momentum: 0.000000
2023-10-14 00:45:59,472 epoch 1 - iter 1078/1546 - loss 1.22539657 - time (sec): 305.71 - samples/sec: 282.34 - lr: 0.000104 - momentum: 0.000000
2023-10-14 00:46:43,473 epoch 1 - iter 1232/1546 - loss 1.09447369 - time (sec): 349.71 - samples/sec: 282.98 - lr: 0.000119 - momentum: 0.000000
2023-10-14 00:47:26,775 epoch 1 - iter 1386/1546 - loss 0.98722789 - time (sec): 393.01 - samples/sec: 284.48 - lr: 0.000134 - momentum: 0.000000
2023-10-14 00:48:09,530 epoch 1 - iter 1540/1546 - loss 0.90394722 - time (sec): 435.76 - samples/sec: 284.17 - lr: 0.000149 - momentum: 0.000000
2023-10-14 00:48:11,100 ----------------------------------------------------------------------------------------------------
2023-10-14 00:48:11,100 EPOCH 1 done: loss 0.9011 - lr: 0.000149
2023-10-14 00:48:28,371 DEV : loss 0.0812714695930481 - f1-score (micro avg) 0.5821
2023-10-14 00:48:28,400 saving best model
2023-10-14 00:48:29,339 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:11,816 epoch 2 - iter 154/1546 - loss 0.10318258 - time (sec): 42.47 - samples/sec: 258.58 - lr: 0.000148 - momentum: 0.000000
2023-10-14 00:49:55,140 epoch 2 - iter 308/1546 - loss 0.10981228 - time (sec): 85.80 - samples/sec: 280.28 - lr: 0.000147 - momentum: 0.000000
2023-10-14 00:50:38,438 epoch 2 - iter 462/1546 - loss 0.10677416 - time (sec): 129.10 - samples/sec: 279.42 - lr: 0.000145 - momentum: 0.000000
2023-10-14 00:51:21,839 epoch 2 - iter 616/1546 - loss 0.10494738 - time (sec): 172.50 - samples/sec: 283.09 - lr: 0.000143 - momentum: 0.000000
2023-10-14 00:52:05,915 epoch 2 - iter 770/1546 - loss 0.10120027 - time (sec): 216.57 - samples/sec: 285.19 - lr: 0.000142 - momentum: 0.000000
2023-10-14 00:52:49,780 epoch 2 - iter 924/1546 - loss 0.09745622 - time (sec): 260.44 - samples/sec: 287.52 - lr: 0.000140 - momentum: 0.000000
2023-10-14 00:53:33,706 epoch 2 - iter 1078/1546 - loss 0.09614656 - time (sec): 304.36 - samples/sec: 285.46 - lr: 0.000138 - momentum: 0.000000
2023-10-14 00:54:16,839 epoch 2 - iter 1232/1546 - loss 0.09374842 - time (sec): 347.50 - samples/sec: 285.24 - lr: 0.000137 - momentum: 0.000000
2023-10-14 00:54:59,232 epoch 2 - iter 1386/1546 - loss 0.09264200 - time (sec): 389.89 - samples/sec: 284.46 - lr: 0.000135 - momentum: 0.000000
2023-10-14 00:55:42,457 epoch 2 - iter 1540/1546 - loss 0.09188658 - time (sec): 433.12 - samples/sec: 285.66 - lr: 0.000133 - momentum: 0.000000
2023-10-14 00:55:44,125 ----------------------------------------------------------------------------------------------------
2023-10-14 00:55:44,125 EPOCH 2 done: loss 0.0918 - lr: 0.000133
2023-10-14 00:56:01,105 DEV : loss 0.05896108224987984 - f1-score (micro avg) 0.753
2023-10-14 00:56:01,138 saving best model
2023-10-14 00:56:02,107 ----------------------------------------------------------------------------------------------------
2023-10-14 00:56:45,802 epoch 3 - iter 154/1546 - loss 0.03817025 - time (sec): 43.69 - samples/sec: 290.81 - lr: 0.000132 - momentum: 0.000000
2023-10-14 00:57:30,409 epoch 3 - iter 308/1546 - loss 0.04755001 - time (sec): 88.30 - samples/sec: 285.37 - lr: 0.000130 - momentum: 0.000000
2023-10-14 00:58:14,419 epoch 3 - iter 462/1546 - loss 0.05104447 - time (sec): 132.31 - samples/sec: 287.17 - lr: 0.000128 - momentum: 0.000000
2023-10-14 00:58:59,278 epoch 3 - iter 616/1546 - loss 0.05222974 - time (sec): 177.17 - samples/sec: 284.35 - lr: 0.000127 - momentum: 0.000000
2023-10-14 00:59:42,378 epoch 3 - iter 770/1546 - loss 0.05664685 - time (sec): 220.27 - samples/sec: 283.57 - lr: 0.000125 - momentum: 0.000000
2023-10-14 01:00:26,466 epoch 3 - iter 924/1546 - loss 0.05615839 - time (sec): 264.36 - samples/sec: 283.26 - lr: 0.000123 - momentum: 0.000000
2023-10-14 01:01:10,213 epoch 3 - iter 1078/1546 - loss 0.05563682 - time (sec): 308.10 - samples/sec: 283.56 - lr: 0.000122 - momentum: 0.000000
2023-10-14 01:01:53,935 epoch 3 - iter 1232/1546 - loss 0.05496907 - time (sec): 351.83 - samples/sec: 283.05 - lr: 0.000120 - momentum: 0.000000
2023-10-14 01:02:36,567 epoch 3 - iter 1386/1546 - loss 0.05420399 - time (sec): 394.46 - samples/sec: 282.68 - lr: 0.000118 - momentum: 0.000000
2023-10-14 01:03:19,488 epoch 3 - iter 1540/1546 - loss 0.05358357 - time (sec): 437.38 - samples/sec: 283.24 - lr: 0.000117 - momentum: 0.000000
2023-10-14 01:03:21,125 ----------------------------------------------------------------------------------------------------
2023-10-14 01:03:21,126 EPOCH 3 done: loss 0.0535 - lr: 0.000117
2023-10-14 01:03:38,668 DEV : loss 0.05659706890583038 - f1-score (micro avg) 0.8127
2023-10-14 01:03:38,696 saving best model
2023-10-14 01:03:39,676 ----------------------------------------------------------------------------------------------------
2023-10-14 01:04:22,928 epoch 4 - iter 154/1546 - loss 0.02880123 - time (sec): 43.25 - samples/sec: 282.77 - lr: 0.000115 - momentum: 0.000000
2023-10-14 01:05:05,591 epoch 4 - iter 308/1546 - loss 0.03320371 - time (sec): 85.91 - samples/sec: 281.37 - lr: 0.000113 - momentum: 0.000000
2023-10-14 01:05:47,828 epoch 4 - iter 462/1546 - loss 0.03355342 - time (sec): 128.15 - samples/sec: 277.43 - lr: 0.000112 - momentum: 0.000000
2023-10-14 01:06:32,557 epoch 4 - iter 616/1546 - loss 0.03292566 - time (sec): 172.88 - samples/sec: 281.16 - lr: 0.000110 - momentum: 0.000000
2023-10-14 01:07:16,430 epoch 4 - iter 770/1546 - loss 0.03538356 - time (sec): 216.75 - samples/sec: 283.12 - lr: 0.000108 - momentum: 0.000000
2023-10-14 01:08:00,152 epoch 4 - iter 924/1546 - loss 0.03499173 - time (sec): 260.47 - samples/sec: 282.29 - lr: 0.000107 - momentum: 0.000000
2023-10-14 01:08:43,630 epoch 4 - iter 1078/1546 - loss 0.03358414 - time (sec): 303.95 - samples/sec: 282.46 - lr: 0.000105 - momentum: 0.000000
2023-10-14 01:09:26,021 epoch 4 - iter 1232/1546 - loss 0.03409530 - time (sec): 346.34 - samples/sec: 282.45 - lr: 0.000103 - momentum: 0.000000
2023-10-14 01:10:10,598 epoch 4 - iter 1386/1546 - loss 0.03269230 - time (sec): 390.92 - samples/sec: 284.77 - lr: 0.000102 - momentum: 0.000000
2023-10-14 01:10:54,547 epoch 4 - iter 1540/1546 - loss 0.03249523 - time (sec): 434.87 - samples/sec: 284.58 - lr: 0.000100 - momentum: 0.000000
2023-10-14 01:10:56,205 ----------------------------------------------------------------------------------------------------
2023-10-14 01:10:56,206 EPOCH 4 done: loss 0.0326 - lr: 0.000100
2023-10-14 01:11:14,223 DEV : loss 0.06543166935443878 - f1-score (micro avg) 0.8296
2023-10-14 01:11:14,257 saving best model
2023-10-14 01:11:16,978 ----------------------------------------------------------------------------------------------------
2023-10-14 01:12:01,955 epoch 5 - iter 154/1546 - loss 0.02269561 - time (sec): 44.97 - samples/sec: 277.46 - lr: 0.000098 - momentum: 0.000000
2023-10-14 01:12:45,258 epoch 5 - iter 308/1546 - loss 0.01877881 - time (sec): 88.28 - samples/sec: 282.06 - lr: 0.000097 - momentum: 0.000000
2023-10-14 01:13:28,365 epoch 5 - iter 462/1546 - loss 0.01896534 - time (sec): 131.38 - samples/sec: 284.70 - lr: 0.000095 - momentum: 0.000000
2023-10-14 01:14:12,356 epoch 5 - iter 616/1546 - loss 0.01903863 - time (sec): 175.37 - samples/sec: 282.49 - lr: 0.000093 - momentum: 0.000000
2023-10-14 01:14:55,863 epoch 5 - iter 770/1546 - loss 0.01875930 - time (sec): 218.88 - samples/sec: 285.22 - lr: 0.000092 - momentum: 0.000000
2023-10-14 01:15:39,889 epoch 5 - iter 924/1546 - loss 0.01806369 - time (sec): 262.91 - samples/sec: 285.58 - lr: 0.000090 - momentum: 0.000000
2023-10-14 01:16:24,044 epoch 5 - iter 1078/1546 - loss 0.01918173 - time (sec): 307.06 - samples/sec: 285.19 - lr: 0.000088 - momentum: 0.000000
2023-10-14 01:17:07,380 epoch 5 - iter 1232/1546 - loss 0.01910152 - time (sec): 350.40 - samples/sec: 282.90 - lr: 0.000087 - momentum: 0.000000
2023-10-14 01:17:50,259 epoch 5 - iter 1386/1546 - loss 0.02046787 - time (sec): 393.28 - samples/sec: 283.93 - lr: 0.000085 - momentum: 0.000000
2023-10-14 01:18:34,363 epoch 5 - iter 1540/1546 - loss 0.02101615 - time (sec): 437.38 - samples/sec: 282.89 - lr: 0.000083 - momentum: 0.000000
2023-10-14 01:18:36,049 ----------------------------------------------------------------------------------------------------
2023-10-14 01:18:36,049 EPOCH 5 done: loss 0.0210 - lr: 0.000083
2023-10-14 01:18:53,025 DEV : loss 0.07295508682727814 - f1-score (micro avg) 0.8114
2023-10-14 01:18:53,055 ----------------------------------------------------------------------------------------------------
2023-10-14 01:19:37,032 epoch 6 - iter 154/1546 - loss 0.01805609 - time (sec): 43.98 - samples/sec: 280.23 - lr: 0.000082 - momentum: 0.000000
2023-10-14 01:20:20,915 epoch 6 - iter 308/1546 - loss 0.01358401 - time (sec): 87.86 - samples/sec: 282.61 - lr: 0.000080 - momentum: 0.000000
2023-10-14 01:21:04,910 epoch 6 - iter 462/1546 - loss 0.01234981 - time (sec): 131.85 - samples/sec: 286.36 - lr: 0.000078 - momentum: 0.000000
2023-10-14 01:21:48,433 epoch 6 - iter 616/1546 - loss 0.01241203 - time (sec): 175.38 - samples/sec: 286.78 - lr: 0.000077 - momentum: 0.000000
2023-10-14 01:22:31,732 epoch 6 - iter 770/1546 - loss 0.01401568 - time (sec): 218.67 - samples/sec: 283.93 - lr: 0.000075 - momentum: 0.000000
2023-10-14 01:23:15,331 epoch 6 - iter 924/1546 - loss 0.01424007 - time (sec): 262.27 - samples/sec: 282.40 - lr: 0.000073 - momentum: 0.000000
2023-10-14 01:23:59,294 epoch 6 - iter 1078/1546 - loss 0.01451504 - time (sec): 306.24 - samples/sec: 282.27 - lr: 0.000072 - momentum: 0.000000
2023-10-14 01:24:42,465 epoch 6 - iter 1232/1546 - loss 0.01474960 - time (sec): 349.41 - samples/sec: 281.30 - lr: 0.000070 - momentum: 0.000000
2023-10-14 01:25:25,815 epoch 6 - iter 1386/1546 - loss 0.01493525 - time (sec): 392.76 - samples/sec: 281.51 - lr: 0.000068 - momentum: 0.000000
2023-10-14 01:26:09,605 epoch 6 - iter 1540/1546 - loss 0.01413867 - time (sec): 436.55 - samples/sec: 283.39 - lr: 0.000067 - momentum: 0.000000
2023-10-14 01:26:11,277 ----------------------------------------------------------------------------------------------------
2023-10-14 01:26:11,277 EPOCH 6 done: loss 0.0143 - lr: 0.000067
2023-10-14 01:26:29,293 DEV : loss 0.07670143991708755 - f1-score (micro avg) 0.831
2023-10-14 01:26:29,335 saving best model
2023-10-14 01:26:31,945 ----------------------------------------------------------------------------------------------------
2023-10-14 01:27:18,143 epoch 7 - iter 154/1546 - loss 0.00975317 - time (sec): 46.19 - samples/sec: 297.57 - lr: 0.000065 - momentum: 0.000000
2023-10-14 01:28:00,843 epoch 7 - iter 308/1546 - loss 0.00891956 - time (sec): 88.89 - samples/sec: 292.11 - lr: 0.000063 - momentum: 0.000000
2023-10-14 01:28:44,318 epoch 7 - iter 462/1546 - loss 0.00997728 - time (sec): 132.37 - samples/sec: 290.55 - lr: 0.000062 - momentum: 0.000000
2023-10-14 01:29:26,892 epoch 7 - iter 616/1546 - loss 0.00930809 - time (sec): 174.94 - samples/sec: 290.48 - lr: 0.000060 - momentum: 0.000000
2023-10-14 01:30:09,203 epoch 7 - iter 770/1546 - loss 0.00916108 - time (sec): 217.25 - samples/sec: 287.44 - lr: 0.000058 - momentum: 0.000000
2023-10-14 01:30:51,937 epoch 7 - iter 924/1546 - loss 0.00924439 - time (sec): 259.99 - samples/sec: 290.46 - lr: 0.000057 - momentum: 0.000000
2023-10-14 01:31:33,959 epoch 7 - iter 1078/1546 - loss 0.01049403 - time (sec): 302.01 - samples/sec: 290.64 - lr: 0.000055 - momentum: 0.000000
2023-10-14 01:32:16,483 epoch 7 - iter 1232/1546 - loss 0.01043974 - time (sec): 344.53 - samples/sec: 288.74 - lr: 0.000053 - momentum: 0.000000
2023-10-14 01:32:59,452 epoch 7 - iter 1386/1546 - loss 0.00997770 - time (sec): 387.50 - samples/sec: 287.00 - lr: 0.000052 - momentum: 0.000000
2023-10-14 01:33:42,672 epoch 7 - iter 1540/1546 - loss 0.00960048 - time (sec): 430.72 - samples/sec: 287.24 - lr: 0.000050 - momentum: 0.000000
2023-10-14 01:33:44,315 ----------------------------------------------------------------------------------------------------
2023-10-14 01:33:44,316 EPOCH 7 done: loss 0.0096 - lr: 0.000050
2023-10-14 01:34:01,530 DEV : loss 0.0880291685461998 - f1-score (micro avg) 0.8364
2023-10-14 01:34:01,560 saving best model
2023-10-14 01:34:04,174 ----------------------------------------------------------------------------------------------------
2023-10-14 01:34:47,366 epoch 8 - iter 154/1546 - loss 0.00390031 - time (sec): 43.19 - samples/sec: 293.43 - lr: 0.000048 - momentum: 0.000000
2023-10-14 01:35:29,570 epoch 8 - iter 308/1546 - loss 0.00354778 - time (sec): 85.39 - samples/sec: 283.34 - lr: 0.000047 - momentum: 0.000000
2023-10-14 01:36:12,267 epoch 8 - iter 462/1546 - loss 0.00558009 - time (sec): 128.09 - samples/sec: 285.33 - lr: 0.000045 - momentum: 0.000000
2023-10-14 01:36:56,668 epoch 8 - iter 616/1546 - loss 0.00613750 - time (sec): 172.49 - samples/sec: 280.88 - lr: 0.000043 - momentum: 0.000000
2023-10-14 01:37:43,215 epoch 8 - iter 770/1546 - loss 0.00600191 - time (sec): 219.04 - samples/sec: 278.43 - lr: 0.000042 - momentum: 0.000000
2023-10-14 01:38:29,251 epoch 8 - iter 924/1546 - loss 0.00583790 - time (sec): 265.07 - samples/sec: 277.56 - lr: 0.000040 - momentum: 0.000000
2023-10-14 01:39:14,114 epoch 8 - iter 1078/1546 - loss 0.00603055 - time (sec): 309.94 - samples/sec: 274.06 - lr: 0.000038 - momentum: 0.000000
2023-10-14 01:40:01,383 epoch 8 - iter 1232/1546 - loss 0.00653849 - time (sec): 357.21 - samples/sec: 274.08 - lr: 0.000037 - momentum: 0.000000
2023-10-14 01:40:49,090 epoch 8 - iter 1386/1546 - loss 0.00610073 - time (sec): 404.91 - samples/sec: 274.75 - lr: 0.000035 - momentum: 0.000000
2023-10-14 01:41:35,518 epoch 8 - iter 1540/1546 - loss 0.00595387 - time (sec): 451.34 - samples/sec: 274.45 - lr: 0.000033 - momentum: 0.000000
2023-10-14 01:41:37,155 ----------------------------------------------------------------------------------------------------
2023-10-14 01:41:37,155 EPOCH 8 done: loss 0.0059 - lr: 0.000033
2023-10-14 01:41:54,940 DEV : loss 0.09291724860668182 - f1-score (micro avg) 0.832
2023-10-14 01:41:54,969 ----------------------------------------------------------------------------------------------------
2023-10-14 01:42:39,414 epoch 9 - iter 154/1546 - loss 0.00195984 - time (sec): 44.44 - samples/sec: 283.42 - lr: 0.000032 - momentum: 0.000000
2023-10-14 01:43:23,371 epoch 9 - iter 308/1546 - loss 0.00226573 - time (sec): 88.40 - samples/sec: 280.80 - lr: 0.000030 - momentum: 0.000000
2023-10-14 01:44:07,873 epoch 9 - iter 462/1546 - loss 0.00286662 - time (sec): 132.90 - samples/sec: 284.35 - lr: 0.000028 - momentum: 0.000000
2023-10-14 01:44:52,315 epoch 9 - iter 616/1546 - loss 0.00407165 - time (sec): 177.34 - samples/sec: 284.00 - lr: 0.000027 - momentum: 0.000000
2023-10-14 01:45:35,291 epoch 9 - iter 770/1546 - loss 0.00463367 - time (sec): 220.32 - samples/sec: 282.55 - lr: 0.000025 - momentum: 0.000000
2023-10-14 01:46:18,120 epoch 9 - iter 924/1546 - loss 0.00464126 - time (sec): 263.15 - samples/sec: 283.13 - lr: 0.000023 - momentum: 0.000000
2023-10-14 01:46:58,607 epoch 9 - iter 1078/1546 - loss 0.00453876 - time (sec): 303.64 - samples/sec: 288.14 - lr: 0.000022 - momentum: 0.000000
2023-10-14 01:47:38,679 epoch 9 - iter 1232/1546 - loss 0.00471355 - time (sec): 343.71 - samples/sec: 290.15 - lr: 0.000020 - momentum: 0.000000
2023-10-14 01:48:19,657 epoch 9 - iter 1386/1546 - loss 0.00446859 - time (sec): 384.69 - samples/sec: 292.58 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:49:01,929 epoch 9 - iter 1540/1546 - loss 0.00448726 - time (sec): 426.96 - samples/sec: 289.78 - lr: 0.000017 - momentum: 0.000000
2023-10-14 01:49:03,704 ----------------------------------------------------------------------------------------------------
2023-10-14 01:49:03,705 EPOCH 9 done: loss 0.0045 - lr: 0.000017
2023-10-14 01:49:20,835 DEV : loss 0.10131476074457169 - f1-score (micro avg) 0.8276
2023-10-14 01:49:20,865 ----------------------------------------------------------------------------------------------------
2023-10-14 01:50:04,766 epoch 10 - iter 154/1546 - loss 0.00119822 - time (sec): 43.90 - samples/sec: 288.50 - lr: 0.000015 - momentum: 0.000000
2023-10-14 01:50:47,340 epoch 10 - iter 308/1546 - loss 0.00178207 - time (sec): 86.47 - samples/sec: 277.32 - lr: 0.000013 - momentum: 0.000000
2023-10-14 01:51:30,796 epoch 10 - iter 462/1546 - loss 0.00153142 - time (sec): 129.93 - samples/sec: 283.13 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:52:14,216 epoch 10 - iter 616/1546 - loss 0.00162289 - time (sec): 173.35 - samples/sec: 286.02 - lr: 0.000010 - momentum: 0.000000
2023-10-14 01:52:57,272 epoch 10 - iter 770/1546 - loss 0.00187635 - time (sec): 216.40 - samples/sec: 285.41 - lr: 0.000008 - momentum: 0.000000
2023-10-14 01:53:41,326 epoch 10 - iter 924/1546 - loss 0.00182691 - time (sec): 260.46 - samples/sec: 284.76 - lr: 0.000007 - momentum: 0.000000
2023-10-14 01:54:24,005 epoch 10 - iter 1078/1546 - loss 0.00196491 - time (sec): 303.14 - samples/sec: 284.52 - lr: 0.000005 - momentum: 0.000000
2023-10-14 01:55:07,582 epoch 10 - iter 1232/1546 - loss 0.00209036 - time (sec): 346.71 - samples/sec: 285.53 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:55:51,488 epoch 10 - iter 1386/1546 - loss 0.00233145 - time (sec): 390.62 - samples/sec: 283.48 - lr: 0.000002 - momentum: 0.000000
2023-10-14 01:56:35,504 epoch 10 - iter 1540/1546 - loss 0.00253840 - time (sec): 434.64 - samples/sec: 284.86 - lr: 0.000000 - momentum: 0.000000
2023-10-14 01:56:37,127 ----------------------------------------------------------------------------------------------------
2023-10-14 01:56:37,127 EPOCH 10 done: loss 0.0025 - lr: 0.000000
2023-10-14 01:56:55,037 DEV : loss 0.10494109988212585 - f1-score (micro avg) 0.8259
2023-10-14 01:56:55,989 ----------------------------------------------------------------------------------------------------
2023-10-14 01:56:55,991 Loading model from best epoch ...
2023-10-14 01:57:00,432 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-14 01:57:55,195
Results:
- F-score (micro) 0.7978
- F-score (macro) 0.713
- Accuracy 0.6828
By class:
              precision    recall  f1-score   support
         LOC     0.8436    0.8552    0.8493       946
    BUILDING     0.5588    0.5135    0.5352       185
      STREET     0.7414    0.7679    0.7544        56
   micro avg     0.7978    0.7978    0.7978      1187
   macro avg     0.7146    0.7122    0.7130      1187
weighted avg     0.7944    0.7978    0.7959      1187
2023-10-14 01:57:55,195 ----------------------------------------------------------------------------------------------------