2023-10-27 20:07:01,765 ----------------------------------------------------------------------------------------------------
2023-10-27 20:07:01,767 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-27 20:07:01,767 ----------------------------------------------------------------------------------------------------
2023-10-27 20:07:01,767 Corpus: 14903 train + 3449 dev + 3658 test sentences
2023-10-27 20:07:01,767 ----------------------------------------------------------------------------------------------------
2023-10-27 20:07:01,767 Train: 14903 sentences
2023-10-27 20:07:01,767 (train_with_dev=False, train_with_test=False)
2023-10-27 20:07:01,767 ----------------------------------------------------------------------------------------------------
2023-10-27 20:07:01,767 Training Params:
2023-10-27 20:07:01,767 - learning_rate: "5e-06"
2023-10-27 20:07:01,767 - mini_batch_size: "4"
2023-10-27 20:07:01,767 - max_epochs: "10"
2023-10-27 20:07:01,767 - shuffle: "True"
2023-10-27 20:07:01,767 ----------------------------------------------------------------------------------------------------
2023-10-27 20:07:01,768 Plugins:
2023-10-27 20:07:01,768 - TensorboardLogger
2023-10-27 20:07:01,768 - LinearScheduler | warmup_fraction: '0.1'
2023-10-27 20:07:01,768 ----------------------------------------------------------------------------------------------------
2023-10-27 20:07:01,768 Final evaluation on model from best epoch (best-model.pt)
2023-10-27 20:07:01,768 - metric: "('micro avg', 'f1-score')"
2023-10-27 20:07:01,768 ----------------------------------------------------------------------------------------------------
2023-10-27 20:07:01,768 Computation:
2023-10-27 20:07:01,768 - compute on device: cuda:0
2023-10-27 20:07:01,768 - embedding storage: none
2023-10-27 20:07:01,768 ----------------------------------------------------------------------------------------------------
2023-10-27 20:07:01,768 Model training base path: "flair-clean-conll-lr5e-06-bs4-5"
2023-10-27 20:07:01,768 ----------------------------------------------------------------------------------------------------
2023-10-27 20:07:01,768 ----------------------------------------------------------------------------------------------------
2023-10-27 20:07:01,768 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-27 20:07:47,370 epoch 1 - iter 372/3726 - loss 2.80593129 - time (sec): 45.60 - samples/sec: 438.05 - lr: 0.000000 - momentum: 0.000000
2023-10-27 20:08:32,644 epoch 1 - iter 744/3726 - loss 1.85094677 - time (sec): 90.87 - samples/sec: 444.69 - lr: 0.000001 - momentum: 0.000000
2023-10-27 20:09:17,954 epoch 1 - iter 1116/3726 - loss 1.42058830 - time (sec): 136.18 - samples/sec: 444.93 - lr: 0.000001 - momentum: 0.000000
2023-10-27 20:10:03,236 epoch 1 - iter 1488/3726 - loss 1.16393809 - time (sec): 181.47 - samples/sec: 444.38 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:10:48,524 epoch 1 - iter 1860/3726 - loss 0.98551923 - time (sec): 226.75 - samples/sec: 446.04 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:11:33,918 epoch 1 - iter 2232/3726 - loss 0.84790959 - time (sec): 272.15 - samples/sec: 450.27 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:12:19,596 epoch 1 - iter 2604/3726 - loss 0.74598188 - time (sec): 317.83 - samples/sec: 451.02 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:13:05,071 epoch 1 - iter 2976/3726 - loss 0.66540477 - time (sec): 363.30 - samples/sec: 451.42 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:13:50,701 epoch 1 - iter 3348/3726 - loss 0.60748783 - time (sec): 408.93 - samples/sec: 449.21 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:14:35,944 epoch 1 - iter 3720/3726 - loss 0.55800133 - time (sec): 454.17 - samples/sec: 449.55 - lr: 0.000005 - momentum: 0.000000
2023-10-27 20:14:36,662 ----------------------------------------------------------------------------------------------------
2023-10-27 20:14:36,663 EPOCH 1 done: loss 0.5571 - lr: 0.000005
2023-10-27 20:15:00,976 DEV : loss 0.08272701501846313 - f1-score (micro avg) 0.9305
2023-10-27 20:15:01,029 saving best model
2023-10-27 20:15:02,837 ----------------------------------------------------------------------------------------------------
2023-10-27 20:15:49,496 epoch 2 - iter 372/3726 - loss 0.09075236 - time (sec): 46.66 - samples/sec: 446.77 - lr: 0.000005 - momentum: 0.000000
2023-10-27 20:16:35,366 epoch 2 - iter 744/3726 - loss 0.09661752 - time (sec): 92.53 - samples/sec: 440.88 - lr: 0.000005 - momentum: 0.000000
2023-10-27 20:17:20,750 epoch 2 - iter 1116/3726 - loss 0.09592189 - time (sec): 137.91 - samples/sec: 442.98 - lr: 0.000005 - momentum: 0.000000
2023-10-27 20:18:05,959 epoch 2 - iter 1488/3726 - loss 0.09322980 - time (sec): 183.12 - samples/sec: 443.25 - lr: 0.000005 - momentum: 0.000000
2023-10-27 20:18:51,330 epoch 2 - iter 1860/3726 - loss 0.08941354 - time (sec): 228.49 - samples/sec: 443.63 - lr: 0.000005 - momentum: 0.000000
2023-10-27 20:19:36,865 epoch 2 - iter 2232/3726 - loss 0.08782526 - time (sec): 274.03 - samples/sec: 444.12 - lr: 0.000005 - momentum: 0.000000
2023-10-27 20:20:21,760 epoch 2 - iter 2604/3726 - loss 0.08763755 - time (sec): 318.92 - samples/sec: 447.07 - lr: 0.000005 - momentum: 0.000000
2023-10-27 20:21:07,307 epoch 2 - iter 2976/3726 - loss 0.08459677 - time (sec): 364.47 - samples/sec: 449.80 - lr: 0.000005 - momentum: 0.000000
2023-10-27 20:21:52,612 epoch 2 - iter 3348/3726 - loss 0.08200724 - time (sec): 409.77 - samples/sec: 449.39 - lr: 0.000005 - momentum: 0.000000
2023-10-27 20:22:37,887 epoch 2 - iter 3720/3726 - loss 0.08090953 - time (sec): 455.05 - samples/sec: 448.96 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:22:38,602 ----------------------------------------------------------------------------------------------------
2023-10-27 20:22:38,603 EPOCH 2 done: loss 0.0808 - lr: 0.000004
2023-10-27 20:23:01,816 DEV : loss 0.0558977946639061 - f1-score (micro avg) 0.9643
2023-10-27 20:23:01,871 saving best model
2023-10-27 20:23:04,562 ----------------------------------------------------------------------------------------------------
2023-10-27 20:23:50,075 epoch 3 - iter 372/3726 - loss 0.04925132 - time (sec): 45.51 - samples/sec: 436.29 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:24:35,505 epoch 3 - iter 744/3726 - loss 0.05096433 - time (sec): 90.94 - samples/sec: 441.79 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:25:21,376 epoch 3 - iter 1116/3726 - loss 0.05345821 - time (sec): 136.81 - samples/sec: 444.14 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:26:06,796 epoch 3 - iter 1488/3726 - loss 0.05364040 - time (sec): 182.23 - samples/sec: 444.47 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:26:52,221 epoch 3 - iter 1860/3726 - loss 0.05380637 - time (sec): 227.66 - samples/sec: 447.62 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:27:37,903 epoch 3 - iter 2232/3726 - loss 0.05332169 - time (sec): 273.34 - samples/sec: 448.66 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:28:24,443 epoch 3 - iter 2604/3726 - loss 0.05365144 - time (sec): 319.88 - samples/sec: 446.15 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:29:09,620 epoch 3 - iter 2976/3726 - loss 0.05262140 - time (sec): 365.06 - samples/sec: 447.36 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:29:55,368 epoch 3 - iter 3348/3726 - loss 0.05205947 - time (sec): 410.80 - samples/sec: 448.23 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:30:40,872 epoch 3 - iter 3720/3726 - loss 0.05122877 - time (sec): 456.31 - samples/sec: 447.79 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:30:41,552 ----------------------------------------------------------------------------------------------------
2023-10-27 20:30:41,553 EPOCH 3 done: loss 0.0512 - lr: 0.000004
2023-10-27 20:31:04,451 DEV : loss 0.04910625144839287 - f1-score (micro avg) 0.969
2023-10-27 20:31:04,505 saving best model
2023-10-27 20:31:07,070 ----------------------------------------------------------------------------------------------------
2023-10-27 20:31:52,530 epoch 4 - iter 372/3726 - loss 0.03342383 - time (sec): 45.46 - samples/sec: 455.54 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:32:37,688 epoch 4 - iter 744/3726 - loss 0.03205166 - time (sec): 90.62 - samples/sec: 456.58 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:33:24,378 epoch 4 - iter 1116/3726 - loss 0.03357460 - time (sec): 137.31 - samples/sec: 451.87 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:34:10,145 epoch 4 - iter 1488/3726 - loss 0.03617981 - time (sec): 183.07 - samples/sec: 454.34 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:34:56,203 epoch 4 - iter 1860/3726 - loss 0.03561724 - time (sec): 229.13 - samples/sec: 450.05 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:35:42,349 epoch 4 - iter 2232/3726 - loss 0.03513961 - time (sec): 275.28 - samples/sec: 447.95 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:36:27,872 epoch 4 - iter 2604/3726 - loss 0.03560568 - time (sec): 320.80 - samples/sec: 446.14 - lr: 0.000004 - momentum: 0.000000
2023-10-27 20:37:13,373 epoch 4 - iter 2976/3726 - loss 0.03572121 - time (sec): 366.30 - samples/sec: 445.85 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:37:59,045 epoch 4 - iter 3348/3726 - loss 0.03483182 - time (sec): 411.97 - samples/sec: 446.11 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:38:44,300 epoch 4 - iter 3720/3726 - loss 0.03483957 - time (sec): 457.23 - samples/sec: 447.06 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:38:44,988 ----------------------------------------------------------------------------------------------------
2023-10-27 20:38:44,988 EPOCH 4 done: loss 0.0348 - lr: 0.000003
2023-10-27 20:39:07,891 DEV : loss 0.04652674123644829 - f1-score (micro avg) 0.9705
2023-10-27 20:39:07,943 saving best model
2023-10-27 20:39:10,583 ----------------------------------------------------------------------------------------------------
2023-10-27 20:39:56,189 epoch 5 - iter 372/3726 - loss 0.03173966 - time (sec): 45.60 - samples/sec: 442.22 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:40:42,588 epoch 5 - iter 744/3726 - loss 0.03324355 - time (sec): 92.00 - samples/sec: 441.81 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:41:28,491 epoch 5 - iter 1116/3726 - loss 0.03176114 - time (sec): 137.91 - samples/sec: 446.35 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:42:14,724 epoch 5 - iter 1488/3726 - loss 0.02967819 - time (sec): 184.14 - samples/sec: 446.19 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:42:59,527 epoch 5 - iter 1860/3726 - loss 0.03079490 - time (sec): 228.94 - samples/sec: 446.65 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:43:44,886 epoch 5 - iter 2232/3726 - loss 0.02966149 - time (sec): 274.30 - samples/sec: 445.62 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:44:30,379 epoch 5 - iter 2604/3726 - loss 0.03065570 - time (sec): 319.79 - samples/sec: 446.65 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:45:15,785 epoch 5 - iter 2976/3726 - loss 0.03042225 - time (sec): 365.20 - samples/sec: 446.89 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:46:01,755 epoch 5 - iter 3348/3726 - loss 0.03020845 - time (sec): 411.17 - samples/sec: 446.33 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:46:47,177 epoch 5 - iter 3720/3726 - loss 0.02983678 - time (sec): 456.59 - samples/sec: 447.55 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:46:47,923 ----------------------------------------------------------------------------------------------------
2023-10-27 20:46:47,924 EPOCH 5 done: loss 0.0299 - lr: 0.000003
2023-10-27 20:47:10,884 DEV : loss 0.050089359283447266 - f1-score (micro avg) 0.9712
2023-10-27 20:47:10,938 saving best model
2023-10-27 20:47:13,597 ----------------------------------------------------------------------------------------------------
2023-10-27 20:47:59,020 epoch 6 - iter 372/3726 - loss 0.02248010 - time (sec): 45.42 - samples/sec: 453.45 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:48:44,494 epoch 6 - iter 744/3726 - loss 0.01889729 - time (sec): 90.89 - samples/sec: 449.45 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:49:30,409 epoch 6 - iter 1116/3726 - loss 0.01902198 - time (sec): 136.81 - samples/sec: 446.61 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:50:16,297 epoch 6 - iter 1488/3726 - loss 0.02004643 - time (sec): 182.70 - samples/sec: 443.97 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:51:01,691 epoch 6 - iter 1860/3726 - loss 0.01983142 - time (sec): 228.09 - samples/sec: 444.93 - lr: 0.000003 - momentum: 0.000000
2023-10-27 20:51:47,098 epoch 6 - iter 2232/3726 - loss 0.02028738 - time (sec): 273.50 - samples/sec: 446.24 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:52:33,330 epoch 6 - iter 2604/3726 - loss 0.02002001 - time (sec): 319.73 - samples/sec: 446.18 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:53:19,605 epoch 6 - iter 2976/3726 - loss 0.02081204 - time (sec): 366.01 - samples/sec: 445.36 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:54:05,958 epoch 6 - iter 3348/3726 - loss 0.02032741 - time (sec): 412.36 - samples/sec: 445.83 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:54:52,761 epoch 6 - iter 3720/3726 - loss 0.02044719 - time (sec): 459.16 - samples/sec: 444.72 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:54:53,517 ----------------------------------------------------------------------------------------------------
2023-10-27 20:54:53,517 EPOCH 6 done: loss 0.0205 - lr: 0.000002
2023-10-27 20:55:17,099 DEV : loss 0.04764683172106743 - f1-score (micro avg) 0.9742
2023-10-27 20:55:17,154 saving best model
2023-10-27 20:55:19,755 ----------------------------------------------------------------------------------------------------
2023-10-27 20:56:05,378 epoch 7 - iter 372/3726 - loss 0.02366929 - time (sec): 45.62 - samples/sec: 447.27 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:56:51,118 epoch 7 - iter 744/3726 - loss 0.02311710 - time (sec): 91.36 - samples/sec: 439.82 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:57:36,527 epoch 7 - iter 1116/3726 - loss 0.02129467 - time (sec): 136.77 - samples/sec: 445.39 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:58:21,642 epoch 7 - iter 1488/3726 - loss 0.02001426 - time (sec): 181.88 - samples/sec: 447.89 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:59:08,061 epoch 7 - iter 1860/3726 - loss 0.01894813 - time (sec): 228.30 - samples/sec: 445.16 - lr: 0.000002 - momentum: 0.000000
2023-10-27 20:59:53,051 epoch 7 - iter 2232/3726 - loss 0.01829151 - time (sec): 273.29 - samples/sec: 443.22 - lr: 0.000002 - momentum: 0.000000
2023-10-27 21:00:38,919 epoch 7 - iter 2604/3726 - loss 0.01783981 - time (sec): 319.16 - samples/sec: 442.58 - lr: 0.000002 - momentum: 0.000000
2023-10-27 21:01:26,075 epoch 7 - iter 2976/3726 - loss 0.01776618 - time (sec): 366.32 - samples/sec: 442.68 - lr: 0.000002 - momentum: 0.000000
2023-10-27 21:02:13,955 epoch 7 - iter 3348/3726 - loss 0.01772398 - time (sec): 414.20 - samples/sec: 442.24 - lr: 0.000002 - momentum: 0.000000
2023-10-27 21:03:01,418 epoch 7 - iter 3720/3726 - loss 0.01723102 - time (sec): 461.66 - samples/sec: 442.49 - lr: 0.000002 - momentum: 0.000000
2023-10-27 21:03:02,183 ----------------------------------------------------------------------------------------------------
2023-10-27 21:03:02,184 EPOCH 7 done: loss 0.0177 - lr: 0.000002
2023-10-27 21:03:26,361 DEV : loss 0.05419960245490074 - f1-score (micro avg) 0.9746
2023-10-27 21:03:26,416 saving best model
2023-10-27 21:03:29,497 ----------------------------------------------------------------------------------------------------
2023-10-27 21:04:16,746 epoch 8 - iter 372/3726 - loss 0.01736122 - time (sec): 47.25 - samples/sec: 425.05 - lr: 0.000002 - momentum: 0.000000
2023-10-27 21:05:04,978 epoch 8 - iter 744/3726 - loss 0.01398385 - time (sec): 95.48 - samples/sec: 422.25 - lr: 0.000002 - momentum: 0.000000
2023-10-27 21:05:52,318 epoch 8 - iter 1116/3726 - loss 0.01274088 - time (sec): 142.82 - samples/sec: 424.69 - lr: 0.000002 - momentum: 0.000000
2023-10-27 21:06:39,260 epoch 8 - iter 1488/3726 - loss 0.01328050 - time (sec): 189.76 - samples/sec: 424.24 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:07:26,410 epoch 8 - iter 1860/3726 - loss 0.01227844 - time (sec): 236.91 - samples/sec: 427.47 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:08:13,172 epoch 8 - iter 2232/3726 - loss 0.01171643 - time (sec): 283.67 - samples/sec: 428.79 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:09:00,982 epoch 8 - iter 2604/3726 - loss 0.01235731 - time (sec): 331.48 - samples/sec: 428.53 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:09:51,061 epoch 8 - iter 2976/3726 - loss 0.01221098 - time (sec): 381.56 - samples/sec: 426.25 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:10:41,073 epoch 8 - iter 3348/3726 - loss 0.01227301 - time (sec): 431.57 - samples/sec: 426.32 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:11:30,225 epoch 8 - iter 3720/3726 - loss 0.01200497 - time (sec): 480.72 - samples/sec: 424.99 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:11:31,013 ----------------------------------------------------------------------------------------------------
2023-10-27 21:11:31,013 EPOCH 8 done: loss 0.0120 - lr: 0.000001
2023-10-27 21:11:56,731 DEV : loss 0.05550903454422951 - f1-score (micro avg) 0.9746
2023-10-27 21:11:56,806 ----------------------------------------------------------------------------------------------------
2023-10-27 21:12:46,264 epoch 9 - iter 372/3726 - loss 0.00563766 - time (sec): 49.46 - samples/sec: 405.90 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:13:36,129 epoch 9 - iter 744/3726 - loss 0.00454582 - time (sec): 99.32 - samples/sec: 411.65 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:14:26,227 epoch 9 - iter 1116/3726 - loss 0.00553718 - time (sec): 149.42 - samples/sec: 408.79 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:15:15,608 epoch 9 - iter 1488/3726 - loss 0.00675128 - time (sec): 198.80 - samples/sec: 409.19 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:16:05,355 epoch 9 - iter 1860/3726 - loss 0.00722006 - time (sec): 248.55 - samples/sec: 412.39 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:16:54,643 epoch 9 - iter 2232/3726 - loss 0.00736249 - time (sec): 297.83 - samples/sec: 411.62 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:17:45,323 epoch 9 - iter 2604/3726 - loss 0.00786494 - time (sec): 348.51 - samples/sec: 410.57 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:18:35,586 epoch 9 - iter 2976/3726 - loss 0.00784383 - time (sec): 398.78 - samples/sec: 410.83 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:19:25,103 epoch 9 - iter 3348/3726 - loss 0.00763218 - time (sec): 448.29 - samples/sec: 409.68 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:20:15,011 epoch 9 - iter 3720/3726 - loss 0.00729659 - time (sec): 498.20 - samples/sec: 409.86 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:20:15,793 ----------------------------------------------------------------------------------------------------
2023-10-27 21:20:15,793 EPOCH 9 done: loss 0.0073 - lr: 0.000001
2023-10-27 21:20:41,491 DEV : loss 0.056521423161029816 - f1-score (micro avg) 0.9737
2023-10-27 21:20:41,559 ----------------------------------------------------------------------------------------------------
2023-10-27 21:21:31,209 epoch 10 - iter 372/3726 - loss 0.00974983 - time (sec): 49.65 - samples/sec: 405.18 - lr: 0.000001 - momentum: 0.000000
2023-10-27 21:22:20,871 epoch 10 - iter 744/3726 - loss 0.00598632 - time (sec): 99.31 - samples/sec: 409.05 - lr: 0.000000 - momentum: 0.000000
2023-10-27 21:23:10,531 epoch 10 - iter 1116/3726 - loss 0.00650431 - time (sec): 148.97 - samples/sec: 415.14 - lr: 0.000000 - momentum: 0.000000
2023-10-27 21:24:00,035 epoch 10 - iter 1488/3726 - loss 0.00622288 - time (sec): 198.47 - samples/sec: 415.10 - lr: 0.000000 - momentum: 0.000000
2023-10-27 21:24:50,574 epoch 10 - iter 1860/3726 - loss 0.00650968 - time (sec): 249.01 - samples/sec: 412.96 - lr: 0.000000 - momentum: 0.000000
2023-10-27 21:25:40,130 epoch 10 - iter 2232/3726 - loss 0.00707901 - time (sec): 298.57 - samples/sec: 412.19 - lr: 0.000000 - momentum: 0.000000
2023-10-27 21:26:30,069 epoch 10 - iter 2604/3726 - loss 0.00707633 - time (sec): 348.51 - samples/sec: 408.38 - lr: 0.000000 - momentum: 0.000000
2023-10-27 21:27:20,314 epoch 10 - iter 2976/3726 - loss 0.00672126 - time (sec): 398.75 - samples/sec: 409.67 - lr: 0.000000 - momentum: 0.000000
2023-10-27 21:28:09,594 epoch 10 - iter 3348/3726 - loss 0.00659107 - time (sec): 448.03 - samples/sec: 408.65 - lr: 0.000000 - momentum: 0.000000
2023-10-27 21:28:58,954 epoch 10 - iter 3720/3726 - loss 0.00650167 - time (sec): 497.39 - samples/sec: 410.70 - lr: 0.000000 - momentum: 0.000000
2023-10-27 21:28:59,752 ----------------------------------------------------------------------------------------------------
2023-10-27 21:28:59,752 EPOCH 10 done: loss 0.0065 - lr: 0.000000
2023-10-27 21:29:25,442 DEV : loss 0.05730742961168289 - f1-score (micro avg) 0.9744
2023-10-27 21:29:28,531 ----------------------------------------------------------------------------------------------------
2023-10-27 21:29:28,534 Loading model from best epoch ...
2023-10-27 21:29:38,713 SequenceTagger predicts: Dictionary with 17 tags: O, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-MISC, B-MISC, E-MISC, I-MISC
2023-10-27 21:30:03,801
Results:
- F-score (micro) 0.9699
- F-score (macro) 0.9647
- Accuracy 0.9567
By class:
              precision    recall  f1-score   support

         ORG     0.9662    0.9738    0.9700      1909
         PER     0.9956    0.9937    0.9947      1591
         LOC     0.9701    0.9632    0.9666      1413
        MISC     0.9170    0.9384    0.9276       812

   micro avg     0.9682    0.9717    0.9699      5725
   macro avg     0.9622    0.9673    0.9647      5725
weighted avg     0.9683    0.9717    0.9700      5725
2023-10-27 21:30:03,801 ----------------------------------------------------------------------------------------------------