2023-10-08 21:48:25,144 ----------------------------------------------------------------------------------------------------
2023-10-08 21:48:25,145 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-08 21:48:25,145 ----------------------------------------------------------------------------------------------------
2023-10-08 21:48:25,145 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-08 21:48:25,145 ----------------------------------------------------------------------------------------------------
2023-10-08 21:48:25,145 Train: 966 sentences
2023-10-08 21:48:25,145 (train_with_dev=False, train_with_test=False)
2023-10-08 21:48:25,145 ----------------------------------------------------------------------------------------------------
2023-10-08 21:48:25,146 Training Params:
2023-10-08 21:48:25,146 - learning_rate: "0.00015"
2023-10-08 21:48:25,146 - mini_batch_size: "4"
2023-10-08 21:48:25,146 - max_epochs: "10"
2023-10-08 21:48:25,146 - shuffle: "True"
2023-10-08 21:48:25,146 ----------------------------------------------------------------------------------------------------
2023-10-08 21:48:25,146 Plugins:
2023-10-08 21:48:25,146 - TensorboardLogger
2023-10-08 21:48:25,146 - LinearScheduler | warmup_fraction: '0.1'
2023-10-08 21:48:25,146 ----------------------------------------------------------------------------------------------------
2023-10-08 21:48:25,146 Final evaluation on model from best epoch (best-model.pt)
2023-10-08 21:48:25,146 - metric: "('micro avg', 'f1-score')"
2023-10-08 21:48:25,146 ----------------------------------------------------------------------------------------------------
2023-10-08 21:48:25,146 Computation:
2023-10-08 21:48:25,146 - compute on device: cuda:0
2023-10-08 21:48:25,146 - embedding storage: none
2023-10-08 21:48:25,146 ----------------------------------------------------------------------------------------------------
2023-10-08 21:48:25,146 Model training base path: "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-4"
2023-10-08 21:48:25,146 ----------------------------------------------------------------------------------------------------
2023-10-08 21:48:25,146 ----------------------------------------------------------------------------------------------------
2023-10-08 21:48:25,147 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-08 21:48:35,261 epoch 1 - iter 24/242 - loss 3.24936976 - time (sec): 10.11 - samples/sec: 248.78 - lr: 0.000014 - momentum: 0.000000
2023-10-08 21:48:44,932 epoch 1 - iter 48/242 - loss 3.24066758 - time (sec): 19.78 - samples/sec: 243.27 - lr: 0.000029 - momentum: 0.000000
2023-10-08 21:48:55,354 epoch 1 - iter 72/242 - loss 3.22032822 - time (sec): 30.21 - samples/sec: 246.04 - lr: 0.000044 - momentum: 0.000000
2023-10-08 21:49:05,319 epoch 1 - iter 96/242 - loss 3.17744610 - time (sec): 40.17 - samples/sec: 245.10 - lr: 0.000059 - momentum: 0.000000
2023-10-08 21:49:15,721 epoch 1 - iter 120/242 - loss 3.09588783 - time (sec): 50.57 - samples/sec: 246.51 - lr: 0.000074 - momentum: 0.000000
2023-10-08 21:49:25,624 epoch 1 - iter 144/242 - loss 2.99712605 - time (sec): 60.48 - samples/sec: 246.36 - lr: 0.000089 - momentum: 0.000000
2023-10-08 21:49:35,626 epoch 1 - iter 168/242 - loss 2.88979612 - time (sec): 70.48 - samples/sec: 245.73 - lr: 0.000104 - momentum: 0.000000
2023-10-08 21:49:45,530 epoch 1 - iter 192/242 - loss 2.78290090 - time (sec): 80.38 - samples/sec: 244.31 - lr: 0.000118 - momentum: 0.000000
2023-10-08 21:49:55,928 epoch 1 - iter 216/242 - loss 2.65496565 - time (sec): 90.78 - samples/sec: 245.28 - lr: 0.000133 - momentum: 0.000000
2023-10-08 21:50:05,679 epoch 1 - iter 240/242 - loss 2.53106870 - time (sec): 100.53 - samples/sec: 245.14 - lr: 0.000148 - momentum: 0.000000
2023-10-08 21:50:06,229 ----------------------------------------------------------------------------------------------------
2023-10-08 21:50:06,230 EPOCH 1 done: loss 2.5259 - lr: 0.000148
2023-10-08 21:50:12,701 DEV : loss 1.1719764471054077 - f1-score (micro avg) 0.0
2023-10-08 21:50:12,707 ----------------------------------------------------------------------------------------------------
2023-10-08 21:50:22,760 epoch 2 - iter 24/242 - loss 1.11332494 - time (sec): 10.05 - samples/sec: 243.75 - lr: 0.000148 - momentum: 0.000000
2023-10-08 21:50:33,542 epoch 2 - iter 48/242 - loss 0.95655996 - time (sec): 20.83 - samples/sec: 247.10 - lr: 0.000147 - momentum: 0.000000
2023-10-08 21:50:43,013 epoch 2 - iter 72/242 - loss 0.89066832 - time (sec): 30.30 - samples/sec: 242.01 - lr: 0.000145 - momentum: 0.000000
2023-10-08 21:50:53,116 epoch 2 - iter 96/242 - loss 0.82372175 - time (sec): 40.41 - samples/sec: 245.55 - lr: 0.000143 - momentum: 0.000000
2023-10-08 21:51:03,053 epoch 2 - iter 120/242 - loss 0.79572040 - time (sec): 50.34 - samples/sec: 246.74 - lr: 0.000142 - momentum: 0.000000
2023-10-08 21:51:12,702 epoch 2 - iter 144/242 - loss 0.74397531 - time (sec): 59.99 - samples/sec: 245.94 - lr: 0.000140 - momentum: 0.000000
2023-10-08 21:51:23,071 epoch 2 - iter 168/242 - loss 0.71229270 - time (sec): 70.36 - samples/sec: 246.20 - lr: 0.000139 - momentum: 0.000000
2023-10-08 21:51:33,943 epoch 2 - iter 192/242 - loss 0.67570002 - time (sec): 81.23 - samples/sec: 246.48 - lr: 0.000137 - momentum: 0.000000
2023-10-08 21:51:43,578 epoch 2 - iter 216/242 - loss 0.65912798 - time (sec): 90.87 - samples/sec: 245.00 - lr: 0.000135 - momentum: 0.000000
2023-10-08 21:51:53,476 epoch 2 - iter 240/242 - loss 0.63104209 - time (sec): 100.77 - samples/sec: 244.18 - lr: 0.000134 - momentum: 0.000000
2023-10-08 21:51:54,066 ----------------------------------------------------------------------------------------------------
2023-10-08 21:51:54,066 EPOCH 2 done: loss 0.6322 - lr: 0.000134
2023-10-08 21:52:00,544 DEV : loss 0.3960220217704773 - f1-score (micro avg) 0.0135
2023-10-08 21:52:00,550 saving best model
2023-10-08 21:52:01,401 ----------------------------------------------------------------------------------------------------
2023-10-08 21:52:10,877 epoch 3 - iter 24/242 - loss 0.40188551 - time (sec): 9.47 - samples/sec: 235.37 - lr: 0.000132 - momentum: 0.000000
2023-10-08 21:52:20,706 epoch 3 - iter 48/242 - loss 0.37672379 - time (sec): 19.30 - samples/sec: 236.06 - lr: 0.000130 - momentum: 0.000000
2023-10-08 21:52:31,476 epoch 3 - iter 72/242 - loss 0.35427374 - time (sec): 30.07 - samples/sec: 239.48 - lr: 0.000128 - momentum: 0.000000
2023-10-08 21:52:40,852 epoch 3 - iter 96/242 - loss 0.34923620 - time (sec): 39.45 - samples/sec: 240.10 - lr: 0.000127 - momentum: 0.000000
2023-10-08 21:52:50,923 epoch 3 - iter 120/242 - loss 0.33922801 - time (sec): 49.52 - samples/sec: 241.13 - lr: 0.000125 - momentum: 0.000000
2023-10-08 21:53:01,236 epoch 3 - iter 144/242 - loss 0.33523826 - time (sec): 59.83 - samples/sec: 243.12 - lr: 0.000124 - momentum: 0.000000
2023-10-08 21:53:12,104 epoch 3 - iter 168/242 - loss 0.32710972 - time (sec): 70.70 - samples/sec: 245.21 - lr: 0.000122 - momentum: 0.000000
2023-10-08 21:53:23,097 epoch 3 - iter 192/242 - loss 0.31052856 - time (sec): 81.69 - samples/sec: 246.04 - lr: 0.000120 - momentum: 0.000000
2023-10-08 21:53:32,472 epoch 3 - iter 216/242 - loss 0.30284166 - time (sec): 91.07 - samples/sec: 243.57 - lr: 0.000119 - momentum: 0.000000
2023-10-08 21:53:42,481 epoch 3 - iter 240/242 - loss 0.29493426 - time (sec): 101.08 - samples/sec: 243.64 - lr: 0.000117 - momentum: 0.000000
2023-10-08 21:53:43,061 ----------------------------------------------------------------------------------------------------
2023-10-08 21:53:43,061 EPOCH 3 done: loss 0.2940 - lr: 0.000117
2023-10-08 21:53:49,517 DEV : loss 0.24857072532176971 - f1-score (micro avg) 0.5387
2023-10-08 21:53:49,522 saving best model
2023-10-08 21:53:53,567 ----------------------------------------------------------------------------------------------------
2023-10-08 21:54:03,189 epoch 4 - iter 24/242 - loss 0.28695901 - time (sec): 9.62 - samples/sec: 244.51 - lr: 0.000115 - momentum: 0.000000
2023-10-08 21:54:13,509 epoch 4 - iter 48/242 - loss 0.25050755 - time (sec): 19.94 - samples/sec: 250.21 - lr: 0.000113 - momentum: 0.000000
2023-10-08 21:54:24,332 epoch 4 - iter 72/242 - loss 0.22200255 - time (sec): 30.76 - samples/sec: 250.20 - lr: 0.000112 - momentum: 0.000000
2023-10-08 21:54:33,704 epoch 4 - iter 96/242 - loss 0.21233290 - time (sec): 40.13 - samples/sec: 250.38 - lr: 0.000110 - momentum: 0.000000
2023-10-08 21:54:43,601 epoch 4 - iter 120/242 - loss 0.20391357 - time (sec): 50.03 - samples/sec: 253.24 - lr: 0.000109 - momentum: 0.000000
2023-10-08 21:54:53,848 epoch 4 - iter 144/242 - loss 0.19149682 - time (sec): 60.28 - samples/sec: 256.35 - lr: 0.000107 - momentum: 0.000000
2023-10-08 21:55:03,302 epoch 4 - iter 168/242 - loss 0.18804341 - time (sec): 69.73 - samples/sec: 256.97 - lr: 0.000105 - momentum: 0.000000
2023-10-08 21:55:12,735 epoch 4 - iter 192/242 - loss 0.18596665 - time (sec): 79.17 - samples/sec: 256.59 - lr: 0.000104 - momentum: 0.000000
2023-10-08 21:55:21,601 epoch 4 - iter 216/242 - loss 0.18153213 - time (sec): 88.03 - samples/sec: 256.09 - lr: 0.000102 - momentum: 0.000000
2023-10-08 21:55:30,210 epoch 4 - iter 240/242 - loss 0.18106988 - time (sec): 96.64 - samples/sec: 255.32 - lr: 0.000100 - momentum: 0.000000
2023-10-08 21:55:30,667 ----------------------------------------------------------------------------------------------------
2023-10-08 21:55:30,667 EPOCH 4 done: loss 0.1812 - lr: 0.000100
2023-10-08 21:55:36,479 DEV : loss 0.16338133811950684 - f1-score (micro avg) 0.7583
2023-10-08 21:55:36,485 saving best model
2023-10-08 21:55:40,871 ----------------------------------------------------------------------------------------------------
2023-10-08 21:55:49,856 epoch 5 - iter 24/242 - loss 0.11498730 - time (sec): 8.98 - samples/sec: 256.37 - lr: 0.000098 - momentum: 0.000000
2023-10-08 21:55:59,088 epoch 5 - iter 48/242 - loss 0.11312218 - time (sec): 18.22 - samples/sec: 257.20 - lr: 0.000097 - momentum: 0.000000
2023-10-08 21:56:08,767 epoch 5 - iter 72/242 - loss 0.11664110 - time (sec): 27.89 - samples/sec: 263.71 - lr: 0.000095 - momentum: 0.000000
2023-10-08 21:56:18,162 epoch 5 - iter 96/242 - loss 0.12130217 - time (sec): 37.29 - samples/sec: 261.06 - lr: 0.000094 - momentum: 0.000000
2023-10-08 21:56:27,699 epoch 5 - iter 120/242 - loss 0.11678047 - time (sec): 46.83 - samples/sec: 263.16 - lr: 0.000092 - momentum: 0.000000
2023-10-08 21:56:37,072 epoch 5 - iter 144/242 - loss 0.11466138 - time (sec): 56.20 - samples/sec: 262.65 - lr: 0.000090 - momentum: 0.000000
2023-10-08 21:56:46,589 epoch 5 - iter 168/242 - loss 0.12088344 - time (sec): 65.72 - samples/sec: 263.34 - lr: 0.000089 - momentum: 0.000000
2023-10-08 21:56:56,269 epoch 5 - iter 192/242 - loss 0.12137348 - time (sec): 75.40 - samples/sec: 262.53 - lr: 0.000087 - momentum: 0.000000
2023-10-08 21:57:04,904 epoch 5 - iter 216/242 - loss 0.11826361 - time (sec): 84.03 - samples/sec: 260.83 - lr: 0.000085 - momentum: 0.000000
2023-10-08 21:57:14,393 epoch 5 - iter 240/242 - loss 0.11764155 - time (sec): 93.52 - samples/sec: 261.72 - lr: 0.000084 - momentum: 0.000000
2023-10-08 21:57:15,315 ----------------------------------------------------------------------------------------------------
2023-10-08 21:57:15,315 EPOCH 5 done: loss 0.1168 - lr: 0.000084
2023-10-08 21:57:21,153 DEV : loss 0.1525898575782776 - f1-score (micro avg) 0.8204
2023-10-08 21:57:21,159 saving best model
2023-10-08 21:57:25,522 ----------------------------------------------------------------------------------------------------
2023-10-08 21:57:34,579 epoch 6 - iter 24/242 - loss 0.09059692 - time (sec): 9.06 - samples/sec: 263.15 - lr: 0.000082 - momentum: 0.000000
2023-10-08 21:57:44,205 epoch 6 - iter 48/242 - loss 0.08537644 - time (sec): 18.68 - samples/sec: 264.17 - lr: 0.000080 - momentum: 0.000000
2023-10-08 21:57:53,875 epoch 6 - iter 72/242 - loss 0.09074220 - time (sec): 28.35 - samples/sec: 265.31 - lr: 0.000079 - momentum: 0.000000
2023-10-08 21:58:02,944 epoch 6 - iter 96/242 - loss 0.08440456 - time (sec): 37.42 - samples/sec: 260.39 - lr: 0.000077 - momentum: 0.000000
2023-10-08 21:58:12,425 epoch 6 - iter 120/242 - loss 0.08569440 - time (sec): 46.90 - samples/sec: 260.53 - lr: 0.000075 - momentum: 0.000000
2023-10-08 21:58:21,866 epoch 6 - iter 144/242 - loss 0.08689621 - time (sec): 56.34 - samples/sec: 259.68 - lr: 0.000074 - momentum: 0.000000
2023-10-08 21:58:31,346 epoch 6 - iter 168/242 - loss 0.08364927 - time (sec): 65.82 - samples/sec: 260.12 - lr: 0.000072 - momentum: 0.000000
2023-10-08 21:58:41,144 epoch 6 - iter 192/242 - loss 0.08216564 - time (sec): 75.62 - samples/sec: 260.71 - lr: 0.000070 - momentum: 0.000000
2023-10-08 21:58:50,301 epoch 6 - iter 216/242 - loss 0.08163705 - time (sec): 84.78 - samples/sec: 259.85 - lr: 0.000069 - momentum: 0.000000
2023-10-08 21:58:59,914 epoch 6 - iter 240/242 - loss 0.08225899 - time (sec): 94.39 - samples/sec: 259.36 - lr: 0.000067 - momentum: 0.000000
2023-10-08 21:59:00,801 ----------------------------------------------------------------------------------------------------
2023-10-08 21:59:00,801 EPOCH 6 done: loss 0.0823 - lr: 0.000067
2023-10-08 21:59:06,845 DEV : loss 0.13395951688289642 - f1-score (micro avg) 0.8059
2023-10-08 21:59:06,851 ----------------------------------------------------------------------------------------------------
2023-10-08 21:59:17,232 epoch 7 - iter 24/242 - loss 0.04308971 - time (sec): 10.38 - samples/sec: 256.84 - lr: 0.000065 - momentum: 0.000000
2023-10-08 21:59:26,196 epoch 7 - iter 48/242 - loss 0.06023306 - time (sec): 19.34 - samples/sec: 248.86 - lr: 0.000064 - momentum: 0.000000
2023-10-08 21:59:35,484 epoch 7 - iter 72/242 - loss 0.05954315 - time (sec): 28.63 - samples/sec: 248.39 - lr: 0.000062 - momentum: 0.000000
2023-10-08 21:59:45,423 epoch 7 - iter 96/242 - loss 0.05624249 - time (sec): 38.57 - samples/sec: 247.16 - lr: 0.000060 - momentum: 0.000000
2023-10-08 21:59:54,968 epoch 7 - iter 120/242 - loss 0.05760461 - time (sec): 48.12 - samples/sec: 248.34 - lr: 0.000059 - momentum: 0.000000
2023-10-08 22:00:04,706 epoch 7 - iter 144/242 - loss 0.05798252 - time (sec): 57.85 - samples/sec: 248.27 - lr: 0.000057 - momentum: 0.000000
2023-10-08 22:00:14,766 epoch 7 - iter 168/242 - loss 0.05678083 - time (sec): 67.91 - samples/sec: 247.73 - lr: 0.000055 - momentum: 0.000000
2023-10-08 22:00:24,261 epoch 7 - iter 192/242 - loss 0.05577012 - time (sec): 77.41 - samples/sec: 247.08 - lr: 0.000054 - momentum: 0.000000
2023-10-08 22:00:35,124 epoch 7 - iter 216/242 - loss 0.06009719 - time (sec): 88.27 - samples/sec: 248.94 - lr: 0.000052 - momentum: 0.000000
2023-10-08 22:00:45,308 epoch 7 - iter 240/242 - loss 0.05937563 - time (sec): 98.46 - samples/sec: 250.43 - lr: 0.000050 - momentum: 0.000000
2023-10-08 22:00:45,842 ----------------------------------------------------------------------------------------------------
2023-10-08 22:00:45,842 EPOCH 7 done: loss 0.0594 - lr: 0.000050
2023-10-08 22:00:51,855 DEV : loss 0.15543989837169647 - f1-score (micro avg) 0.8116
2023-10-08 22:00:51,861 ----------------------------------------------------------------------------------------------------
2023-10-08 22:01:01,652 epoch 8 - iter 24/242 - loss 0.03467015 - time (sec): 9.79 - samples/sec: 265.70 - lr: 0.000049 - momentum: 0.000000
2023-10-08 22:01:10,803 epoch 8 - iter 48/242 - loss 0.04028766 - time (sec): 18.94 - samples/sec: 251.68 - lr: 0.000047 - momentum: 0.000000
2023-10-08 22:01:20,474 epoch 8 - iter 72/242 - loss 0.04085905 - time (sec): 28.61 - samples/sec: 258.92 - lr: 0.000045 - momentum: 0.000000
2023-10-08 22:01:29,605 epoch 8 - iter 96/242 - loss 0.04660420 - time (sec): 37.74 - samples/sec: 258.75 - lr: 0.000044 - momentum: 0.000000
2023-10-08 22:01:38,566 epoch 8 - iter 120/242 - loss 0.04973784 - time (sec): 46.70 - samples/sec: 256.38 - lr: 0.000042 - momentum: 0.000000
2023-10-08 22:01:48,046 epoch 8 - iter 144/242 - loss 0.05222928 - time (sec): 56.18 - samples/sec: 258.17 - lr: 0.000040 - momentum: 0.000000
2023-10-08 22:01:57,321 epoch 8 - iter 168/242 - loss 0.05142681 - time (sec): 65.46 - samples/sec: 258.20 - lr: 0.000039 - momentum: 0.000000
2023-10-08 22:02:06,790 epoch 8 - iter 192/242 - loss 0.04961141 - time (sec): 74.93 - samples/sec: 259.95 - lr: 0.000037 - momentum: 0.000000
2023-10-08 22:02:16,400 epoch 8 - iter 216/242 - loss 0.05054334 - time (sec): 84.54 - samples/sec: 260.36 - lr: 0.000035 - momentum: 0.000000
2023-10-08 22:02:25,934 epoch 8 - iter 240/242 - loss 0.04947282 - time (sec): 94.07 - samples/sec: 260.69 - lr: 0.000034 - momentum: 0.000000
2023-10-08 22:02:26,663 ----------------------------------------------------------------------------------------------------
2023-10-08 22:02:26,663 EPOCH 8 done: loss 0.0493 - lr: 0.000034
2023-10-08 22:02:32,428 DEV : loss 0.14887449145317078 - f1-score (micro avg) 0.8124
2023-10-08 22:02:32,434 ----------------------------------------------------------------------------------------------------
2023-10-08 22:02:43,185 epoch 9 - iter 24/242 - loss 0.04471556 - time (sec): 10.75 - samples/sec: 276.10 - lr: 0.000032 - momentum: 0.000000
2023-10-08 22:02:52,172 epoch 9 - iter 48/242 - loss 0.04892032 - time (sec): 19.74 - samples/sec: 269.49 - lr: 0.000030 - momentum: 0.000000
2023-10-08 22:03:00,937 epoch 9 - iter 72/242 - loss 0.04441681 - time (sec): 28.50 - samples/sec: 267.32 - lr: 0.000029 - momentum: 0.000000
2023-10-08 22:03:10,117 epoch 9 - iter 96/242 - loss 0.04682017 - time (sec): 37.68 - samples/sec: 265.64 - lr: 0.000027 - momentum: 0.000000
2023-10-08 22:03:19,439 epoch 9 - iter 120/242 - loss 0.04473883 - time (sec): 47.00 - samples/sec: 265.00 - lr: 0.000025 - momentum: 0.000000
2023-10-08 22:03:28,177 epoch 9 - iter 144/242 - loss 0.04405006 - time (sec): 55.74 - samples/sec: 263.03 - lr: 0.000024 - momentum: 0.000000
2023-10-08 22:03:38,026 epoch 9 - iter 168/242 - loss 0.04346054 - time (sec): 65.59 - samples/sec: 263.00 - lr: 0.000022 - momentum: 0.000000
2023-10-08 22:03:47,375 epoch 9 - iter 192/242 - loss 0.04284052 - time (sec): 74.94 - samples/sec: 263.32 - lr: 0.000020 - momentum: 0.000000
2023-10-08 22:03:56,583 epoch 9 - iter 216/242 - loss 0.04282043 - time (sec): 84.15 - samples/sec: 263.05 - lr: 0.000019 - momentum: 0.000000
2023-10-08 22:04:06,057 epoch 9 - iter 240/242 - loss 0.04294806 - time (sec): 93.62 - samples/sec: 262.79 - lr: 0.000017 - momentum: 0.000000
2023-10-08 22:04:06,655 ----------------------------------------------------------------------------------------------------
2023-10-08 22:04:06,655 EPOCH 9 done: loss 0.0427 - lr: 0.000017
2023-10-08 22:04:12,490 DEV : loss 0.15138302743434906 - f1-score (micro avg) 0.8165
2023-10-08 22:04:12,496 ----------------------------------------------------------------------------------------------------
2023-10-08 22:04:22,629 epoch 10 - iter 24/242 - loss 0.04445960 - time (sec): 10.13 - samples/sec: 277.85 - lr: 0.000015 - momentum: 0.000000
2023-10-08 22:04:31,391 epoch 10 - iter 48/242 - loss 0.03988212 - time (sec): 18.89 - samples/sec: 259.34 - lr: 0.000014 - momentum: 0.000000
2023-10-08 22:04:40,645 epoch 10 - iter 72/242 - loss 0.03622270 - time (sec): 28.15 - samples/sec: 260.27 - lr: 0.000012 - momentum: 0.000000
2023-10-08 22:04:49,980 epoch 10 - iter 96/242 - loss 0.03801687 - time (sec): 37.48 - samples/sec: 259.62 - lr: 0.000010 - momentum: 0.000000
2023-10-08 22:04:59,614 epoch 10 - iter 120/242 - loss 0.03744072 - time (sec): 47.12 - samples/sec: 257.94 - lr: 0.000009 - momentum: 0.000000
2023-10-08 22:05:09,252 epoch 10 - iter 144/242 - loss 0.03611060 - time (sec): 56.75 - samples/sec: 259.94 - lr: 0.000007 - momentum: 0.000000
2023-10-08 22:05:18,478 epoch 10 - iter 168/242 - loss 0.03672125 - time (sec): 65.98 - samples/sec: 260.53 - lr: 0.000005 - momentum: 0.000000
2023-10-08 22:05:27,822 epoch 10 - iter 192/242 - loss 0.03701310 - time (sec): 75.32 - samples/sec: 259.52 - lr: 0.000004 - momentum: 0.000000
2023-10-08 22:05:37,777 epoch 10 - iter 216/242 - loss 0.03824803 - time (sec): 85.28 - samples/sec: 260.66 - lr: 0.000002 - momentum: 0.000000
2023-10-08 22:05:46,925 epoch 10 - iter 240/242 - loss 0.03663811 - time (sec): 94.43 - samples/sec: 259.78 - lr: 0.000000 - momentum: 0.000000
2023-10-08 22:05:47,692 ----------------------------------------------------------------------------------------------------
2023-10-08 22:05:47,692 EPOCH 10 done: loss 0.0373 - lr: 0.000000
2023-10-08 22:05:53,853 DEV : loss 0.15766310691833496 - f1-score (micro avg) 0.8115
2023-10-08 22:05:54,761 ----------------------------------------------------------------------------------------------------
2023-10-08 22:05:54,763 Loading model from best epoch ...
2023-10-08 22:05:58,830 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-08 22:06:04,706
Results:
- F-score (micro) 0.7862
- F-score (macro) 0.4709
- Accuracy 0.6835
By class:
              precision    recall  f1-score   support

        pers     0.8346    0.7986    0.8162       139
       scope     0.8456    0.8915    0.8679       129
        work     0.6146    0.7375    0.6705        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7808    0.7917    0.7862       360
   macro avg     0.4590    0.4855    0.4709       360
weighted avg     0.7618    0.7917    0.7751       360
2023-10-08 22:06:04,706 ----------------------------------------------------------------------------------------------------