2023-10-17 10:05:15,184 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:15,185 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:05:15,185 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:15,185 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-17 10:05:15,185 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:15,185 Train: 1214 sentences
2023-10-17 10:05:15,185 (train_with_dev=False, train_with_test=False)
2023-10-17 10:05:15,185 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:15,185 Training Params:
2023-10-17 10:05:15,185 - learning_rate: "5e-05"
2023-10-17 10:05:15,185 - mini_batch_size: "4"
2023-10-17 10:05:15,185 - max_epochs: "10"
2023-10-17 10:05:15,186 - shuffle: "True"
2023-10-17 10:05:15,186 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:15,186 Plugins:
2023-10-17 10:05:15,186 - TensorboardLogger
2023-10-17 10:05:15,186 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:05:15,186 ----------------------------------------------------------------------------------------------------
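Note on the `lr` column: the values logged during training are consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The sketch below is an illustration of that shape, not Flair's own scheduler code. The function name `linear_schedule_lr` is hypothetical, and the step counts are assumptions derived from this log (304 iterations per epoch, i.e. 1214 sentences at mini-batch size 4, times 10 epochs).

```python
def linear_schedule_lr(step, peak_lr=5e-5, total_steps=304 * 10, warmup_fraction=0.1):
    """Learning rate at a given global step (hypothetical helper).

    Assumes linear warmup from 0 to peak_lr over the first
    warmup_fraction of steps, then linear decay back to 0.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: ramp up proportionally to the step index.
        return peak_lr * step / warmup_steps
    # Decay phase: fall linearly from peak_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Global step = (epoch - 1) * 304 + iteration within the epoch.
for epoch, it in [(1, 30), (1, 300), (2, 300), (10, 300)]:
    step = (epoch - 1) * 304 + it
    print(f"epoch {epoch} iter {it}: lr {linear_schedule_lr(step):.6f}")
```

With these assumptions the warmup ends exactly at the end of epoch 1, which matches the logged rate peaking near 0.000049 and then falling steadily. The logged values may differ by one step depending on whether the rate is read before or after the optimizer update.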
2023-10-17 10:05:15,186 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:05:15,186 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:05:15,186 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:15,186 Computation:
2023-10-17 10:05:15,186 - compute on device: cuda:0
2023-10-17 10:05:15,186 - embedding storage: none
2023-10-17 10:05:15,186 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:15,186 Model training base path: "hmbench-ajmc/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 10:05:15,186 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:15,186 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:15,186 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:05:16,663 epoch 1 - iter 30/304 - loss 3.78710679 - time (sec): 1.48 - samples/sec: 1925.88 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:05:18,116 epoch 1 - iter 60/304 - loss 2.73070745 - time (sec): 2.93 - samples/sec: 2052.26 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:05:19,510 epoch 1 - iter 90/304 - loss 2.06124605 - time (sec): 4.32 - samples/sec: 2082.46 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:05:20,933 epoch 1 - iter 120/304 - loss 1.63016199 - time (sec): 5.75 - samples/sec: 2108.39 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:05:22,307 epoch 1 - iter 150/304 - loss 1.38162276 - time (sec): 7.12 - samples/sec: 2111.78 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:05:23,699 epoch 1 - iter 180/304 - loss 1.20050216 - time (sec): 8.51 - samples/sec: 2136.27 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:05:25,113 epoch 1 - iter 210/304 - loss 1.05612794 - time (sec): 9.93 - samples/sec: 2177.89 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:05:26,524 epoch 1 - iter 240/304 - loss 0.95025081 - time (sec): 11.34 - samples/sec: 2179.86 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:05:27,891 epoch 1 - iter 270/304 - loss 0.86773041 - time (sec): 12.70 - samples/sec: 2194.21 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:05:29,262 epoch 1 - iter 300/304 - loss 0.80247575 - time (sec): 14.08 - samples/sec: 2180.36 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:05:29,450 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:29,451 EPOCH 1 done: loss 0.7956 - lr: 0.000049
2023-10-17 10:05:30,543 DEV : loss 0.18940171599388123 - f1-score (micro avg) 0.6293
2023-10-17 10:05:30,552 saving best model
2023-10-17 10:05:30,950 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:32,505 epoch 2 - iter 30/304 - loss 0.18278762 - time (sec): 1.55 - samples/sec: 2122.52 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:05:34,094 epoch 2 - iter 60/304 - loss 0.18645643 - time (sec): 3.14 - samples/sec: 1998.47 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:05:35,698 epoch 2 - iter 90/304 - loss 0.15996554 - time (sec): 4.75 - samples/sec: 1964.82 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:05:37,285 epoch 2 - iter 120/304 - loss 0.15490686 - time (sec): 6.33 - samples/sec: 1966.73 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:05:38,850 epoch 2 - iter 150/304 - loss 0.15786215 - time (sec): 7.90 - samples/sec: 1969.34 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:05:40,407 epoch 2 - iter 180/304 - loss 0.15644962 - time (sec): 9.46 - samples/sec: 1992.69 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:05:41,954 epoch 2 - iter 210/304 - loss 0.14967596 - time (sec): 11.00 - samples/sec: 1955.64 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:05:43,498 epoch 2 - iter 240/304 - loss 0.14580967 - time (sec): 12.55 - samples/sec: 1982.75 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:05:45,087 epoch 2 - iter 270/304 - loss 0.14222833 - time (sec): 14.14 - samples/sec: 1982.37 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:05:46,670 epoch 2 - iter 300/304 - loss 0.14333843 - time (sec): 15.72 - samples/sec: 1951.35 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:05:46,880 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:46,880 EPOCH 2 done: loss 0.1420 - lr: 0.000045
2023-10-17 10:05:47,853 DEV : loss 0.16976089775562286 - f1-score (micro avg) 0.7528
2023-10-17 10:05:47,861 saving best model
2023-10-17 10:05:48,445 ----------------------------------------------------------------------------------------------------
2023-10-17 10:05:50,000 epoch 3 - iter 30/304 - loss 0.09322033 - time (sec): 1.55 - samples/sec: 1975.53 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:05:51,580 epoch 3 - iter 60/304 - loss 0.09273253 - time (sec): 3.13 - samples/sec: 1950.78 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:05:53,014 epoch 3 - iter 90/304 - loss 0.09363777 - time (sec): 4.57 - samples/sec: 2034.82 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:05:54,412 epoch 3 - iter 120/304 - loss 0.09298961 - time (sec): 5.97 - samples/sec: 2059.56 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:05:55,819 epoch 3 - iter 150/304 - loss 0.08922597 - time (sec): 7.37 - samples/sec: 2050.07 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:05:57,231 epoch 3 - iter 180/304 - loss 0.08730806 - time (sec): 8.78 - samples/sec: 2042.21 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:05:58,620 epoch 3 - iter 210/304 - loss 0.09127310 - time (sec): 10.17 - samples/sec: 2060.70 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:05:59,997 epoch 3 - iter 240/304 - loss 0.09364956 - time (sec): 11.55 - samples/sec: 2105.83 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:06:01,421 epoch 3 - iter 270/304 - loss 0.09589965 - time (sec): 12.97 - samples/sec: 2123.72 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:06:02,803 epoch 3 - iter 300/304 - loss 0.09784084 - time (sec): 14.36 - samples/sec: 2129.08 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:06:02,981 ----------------------------------------------------------------------------------------------------
2023-10-17 10:06:02,981 EPOCH 3 done: loss 0.0969 - lr: 0.000039
2023-10-17 10:06:03,949 DEV : loss 0.17592370510101318 - f1-score (micro avg) 0.8123
2023-10-17 10:06:03,957 saving best model
2023-10-17 10:06:04,402 ----------------------------------------------------------------------------------------------------
2023-10-17 10:06:05,997 epoch 4 - iter 30/304 - loss 0.05487273 - time (sec): 1.59 - samples/sec: 2222.32 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:06:07,579 epoch 4 - iter 60/304 - loss 0.06998359 - time (sec): 3.17 - samples/sec: 2065.77 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:06:09,156 epoch 4 - iter 90/304 - loss 0.06857176 - time (sec): 4.75 - samples/sec: 2008.65 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:06:10,820 epoch 4 - iter 120/304 - loss 0.06728517 - time (sec): 6.42 - samples/sec: 1935.14 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:06:12,399 epoch 4 - iter 150/304 - loss 0.06480906 - time (sec): 7.99 - samples/sec: 1943.06 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:06:14,047 epoch 4 - iter 180/304 - loss 0.07017801 - time (sec): 9.64 - samples/sec: 1941.32 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:06:15,633 epoch 4 - iter 210/304 - loss 0.06745500 - time (sec): 11.23 - samples/sec: 1939.28 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:06:17,187 epoch 4 - iter 240/304 - loss 0.06506326 - time (sec): 12.78 - samples/sec: 1940.37 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:06:18,833 epoch 4 - iter 270/304 - loss 0.06434291 - time (sec): 14.43 - samples/sec: 1924.82 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:06:20,490 epoch 4 - iter 300/304 - loss 0.06777296 - time (sec): 16.09 - samples/sec: 1899.81 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:06:20,700 ----------------------------------------------------------------------------------------------------
2023-10-17 10:06:20,700 EPOCH 4 done: loss 0.0689 - lr: 0.000033
2023-10-17 10:06:21,695 DEV : loss 0.20245088636875153 - f1-score (micro avg) 0.8253
2023-10-17 10:06:21,702 saving best model
2023-10-17 10:06:22,162 ----------------------------------------------------------------------------------------------------
2023-10-17 10:06:23,745 epoch 5 - iter 30/304 - loss 0.04992004 - time (sec): 1.58 - samples/sec: 2010.11 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:06:25,299 epoch 5 - iter 60/304 - loss 0.05014883 - time (sec): 3.14 - samples/sec: 1944.68 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:06:26,951 epoch 5 - iter 90/304 - loss 0.05743268 - time (sec): 4.79 - samples/sec: 1976.34 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:06:28,553 epoch 5 - iter 120/304 - loss 0.05409120 - time (sec): 6.39 - samples/sec: 1965.05 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:06:30,078 epoch 5 - iter 150/304 - loss 0.05061556 - time (sec): 7.91 - samples/sec: 1980.26 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:06:31,630 epoch 5 - iter 180/304 - loss 0.05270231 - time (sec): 9.47 - samples/sec: 1993.33 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:06:33,168 epoch 5 - iter 210/304 - loss 0.05109864 - time (sec): 11.00 - samples/sec: 1972.62 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:06:34,744 epoch 5 - iter 240/304 - loss 0.05359537 - time (sec): 12.58 - samples/sec: 1985.50 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:06:36,266 epoch 5 - iter 270/304 - loss 0.05236408 - time (sec): 14.10 - samples/sec: 1973.88 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:06:37,793 epoch 5 - iter 300/304 - loss 0.04758594 - time (sec): 15.63 - samples/sec: 1964.99 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:06:37,991 ----------------------------------------------------------------------------------------------------
2023-10-17 10:06:37,991 EPOCH 5 done: loss 0.0475 - lr: 0.000028
2023-10-17 10:06:38,998 DEV : loss 0.19856637716293335 - f1-score (micro avg) 0.8415
2023-10-17 10:06:39,007 saving best model
2023-10-17 10:06:39,447 ----------------------------------------------------------------------------------------------------
2023-10-17 10:06:40,994 epoch 6 - iter 30/304 - loss 0.03401226 - time (sec): 1.54 - samples/sec: 1974.87 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:06:42,386 epoch 6 - iter 60/304 - loss 0.02475800 - time (sec): 2.94 - samples/sec: 2120.75 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:06:43,768 epoch 6 - iter 90/304 - loss 0.02273325 - time (sec): 4.32 - samples/sec: 2083.45 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:06:45,186 epoch 6 - iter 120/304 - loss 0.02585818 - time (sec): 5.74 - samples/sec: 2156.94 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:06:46,722 epoch 6 - iter 150/304 - loss 0.02334142 - time (sec): 7.27 - samples/sec: 2090.79 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:06:48,285 epoch 6 - iter 180/304 - loss 0.03260567 - time (sec): 8.84 - samples/sec: 2085.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:06:49,839 epoch 6 - iter 210/304 - loss 0.03249194 - time (sec): 10.39 - samples/sec: 2031.52 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:06:51,420 epoch 6 - iter 240/304 - loss 0.03342660 - time (sec): 11.97 - samples/sec: 2058.44 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:06:52,951 epoch 6 - iter 270/304 - loss 0.03313726 - time (sec): 13.50 - samples/sec: 2057.54 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:06:54,477 epoch 6 - iter 300/304 - loss 0.03388454 - time (sec): 15.03 - samples/sec: 2047.16 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:06:54,690 ----------------------------------------------------------------------------------------------------
2023-10-17 10:06:54,690 EPOCH 6 done: loss 0.0336 - lr: 0.000022
2023-10-17 10:06:55,645 DEV : loss 0.22854739427566528 - f1-score (micro avg) 0.8323
2023-10-17 10:06:55,653 ----------------------------------------------------------------------------------------------------
2023-10-17 10:06:57,180 epoch 7 - iter 30/304 - loss 0.02727526 - time (sec): 1.53 - samples/sec: 1960.61 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:06:58,727 epoch 7 - iter 60/304 - loss 0.01954119 - time (sec): 3.07 - samples/sec: 2039.41 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:07:00,245 epoch 7 - iter 90/304 - loss 0.01949465 - time (sec): 4.59 - samples/sec: 1945.84 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:07:01,757 epoch 7 - iter 120/304 - loss 0.02626849 - time (sec): 6.10 - samples/sec: 1989.20 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:07:03,297 epoch 7 - iter 150/304 - loss 0.03160573 - time (sec): 7.64 - samples/sec: 2000.29 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:07:04,812 epoch 7 - iter 180/304 - loss 0.03165944 - time (sec): 9.16 - samples/sec: 2008.54 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:07:06,357 epoch 7 - iter 210/304 - loss 0.02809873 - time (sec): 10.70 - samples/sec: 2013.64 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:07:07,907 epoch 7 - iter 240/304 - loss 0.02841292 - time (sec): 12.25 - samples/sec: 2014.18 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:07:09,413 epoch 7 - iter 270/304 - loss 0.02688111 - time (sec): 13.76 - samples/sec: 2005.89 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:07:10,930 epoch 7 - iter 300/304 - loss 0.02676099 - time (sec): 15.28 - samples/sec: 2008.25 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:07:11,141 ----------------------------------------------------------------------------------------------------
2023-10-17 10:07:11,141 EPOCH 7 done: loss 0.0265 - lr: 0.000017
2023-10-17 10:07:12,131 DEV : loss 0.22182397544384003 - f1-score (micro avg) 0.8469
2023-10-17 10:07:12,139 saving best model
2023-10-17 10:07:12,610 ----------------------------------------------------------------------------------------------------
2023-10-17 10:07:14,097 epoch 8 - iter 30/304 - loss 0.03427352 - time (sec): 1.49 - samples/sec: 1921.31 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:07:15,659 epoch 8 - iter 60/304 - loss 0.02059897 - time (sec): 3.05 - samples/sec: 1907.52 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:07:17,222 epoch 8 - iter 90/304 - loss 0.01370743 - time (sec): 4.61 - samples/sec: 1942.25 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:07:18,786 epoch 8 - iter 120/304 - loss 0.01479090 - time (sec): 6.17 - samples/sec: 1930.02 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:07:20,347 epoch 8 - iter 150/304 - loss 0.01431101 - time (sec): 7.73 - samples/sec: 1955.61 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:07:21,881 epoch 8 - iter 180/304 - loss 0.01225557 - time (sec): 9.27 - samples/sec: 1962.45 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:07:23,413 epoch 8 - iter 210/304 - loss 0.01286757 - time (sec): 10.80 - samples/sec: 1966.89 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:07:24,911 epoch 8 - iter 240/304 - loss 0.01251269 - time (sec): 12.30 - samples/sec: 1994.09 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:07:26,433 epoch 8 - iter 270/304 - loss 0.01859633 - time (sec): 13.82 - samples/sec: 1996.49 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:07:27,952 epoch 8 - iter 300/304 - loss 0.02160579 - time (sec): 15.34 - samples/sec: 1994.18 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:07:28,145 ----------------------------------------------------------------------------------------------------
2023-10-17 10:07:28,145 EPOCH 8 done: loss 0.0213 - lr: 0.000011
2023-10-17 10:07:29,140 DEV : loss 0.20801495015621185 - f1-score (micro avg) 0.8656
2023-10-17 10:07:29,147 saving best model
2023-10-17 10:07:29,614 ----------------------------------------------------------------------------------------------------
2023-10-17 10:07:31,111 epoch 9 - iter 30/304 - loss 0.01053882 - time (sec): 1.50 - samples/sec: 2019.53 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:07:32,684 epoch 9 - iter 60/304 - loss 0.01047128 - time (sec): 3.07 - samples/sec: 2017.31 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:07:34,224 epoch 9 - iter 90/304 - loss 0.00969806 - time (sec): 4.61 - samples/sec: 2031.08 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:07:35,758 epoch 9 - iter 120/304 - loss 0.01549194 - time (sec): 6.14 - samples/sec: 2010.30 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:07:37,316 epoch 9 - iter 150/304 - loss 0.01484816 - time (sec): 7.70 - samples/sec: 1969.53 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:07:38,887 epoch 9 - iter 180/304 - loss 0.01706246 - time (sec): 9.27 - samples/sec: 1980.43 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:07:40,550 epoch 9 - iter 210/304 - loss 0.01557972 - time (sec): 10.93 - samples/sec: 1966.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:07:42,090 epoch 9 - iter 240/304 - loss 0.01455481 - time (sec): 12.47 - samples/sec: 1960.58 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:07:43,704 epoch 9 - iter 270/304 - loss 0.01487115 - time (sec): 14.09 - samples/sec: 1954.38 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:07:45,290 epoch 9 - iter 300/304 - loss 0.01407558 - time (sec): 15.67 - samples/sec: 1953.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:07:45,480 ----------------------------------------------------------------------------------------------------
2023-10-17 10:07:45,480 EPOCH 9 done: loss 0.0139 - lr: 0.000006
2023-10-17 10:07:46,449 DEV : loss 0.2179786115884781 - f1-score (micro avg) 0.8626
2023-10-17 10:07:46,456 ----------------------------------------------------------------------------------------------------
2023-10-17 10:07:47,970 epoch 10 - iter 30/304 - loss 0.00340291 - time (sec): 1.51 - samples/sec: 2129.64 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:07:49,551 epoch 10 - iter 60/304 - loss 0.01231855 - time (sec): 3.09 - samples/sec: 1994.84 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:07:51,112 epoch 10 - iter 90/304 - loss 0.00844573 - time (sec): 4.65 - samples/sec: 1944.47 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:07:52,670 epoch 10 - iter 120/304 - loss 0.00791771 - time (sec): 6.21 - samples/sec: 1996.50 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:07:54,261 epoch 10 - iter 150/304 - loss 0.00628740 - time (sec): 7.80 - samples/sec: 2006.65 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:07:55,860 epoch 10 - iter 180/304 - loss 0.00646574 - time (sec): 9.40 - samples/sec: 1989.37 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:07:57,418 epoch 10 - iter 210/304 - loss 0.00841620 - time (sec): 10.96 - samples/sec: 1968.42 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:07:58,983 epoch 10 - iter 240/304 - loss 0.01040953 - time (sec): 12.53 - samples/sec: 1958.76 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:08:00,532 epoch 10 - iter 270/304 - loss 0.01133920 - time (sec): 14.07 - samples/sec: 1956.54 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:08:02,099 epoch 10 - iter 300/304 - loss 0.01129145 - time (sec): 15.64 - samples/sec: 1953.92 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:08:02,307 ----------------------------------------------------------------------------------------------------
2023-10-17 10:08:02,307 EPOCH 10 done: loss 0.0113 - lr: 0.000000
2023-10-17 10:08:03,294 DEV : loss 0.21762032806873322 - f1-score (micro avg) 0.8616
2023-10-17 10:08:03,629 ----------------------------------------------------------------------------------------------------
2023-10-17 10:08:03,631 Loading model from best epoch ...
2023-10-17 10:08:05,420 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-17 10:08:06,625
Results:
- F-score (micro) 0.8242
- F-score (macro) 0.6634
- Accuracy 0.7059
By class:
              precision    recall  f1-score   support

       scope     0.7771    0.8079    0.7922       151
        pers     0.8142    0.9583    0.8804        96
        work     0.8077    0.8842    0.8442        95
        date     0.0000    0.0000    0.0000         3
         loc     1.0000    0.6667    0.8000         3

   micro avg     0.7895    0.8621    0.8242       348
   macro avg     0.6798    0.6634    0.6634       348
weighted avg     0.7909    0.8621    0.8240       348
2023-10-17 10:08:06,625 ----------------------------------------------------------------------------------------------------
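Note on the final scores: the gap between micro F1 (0.8242) and macro F1 (0.6634) can be reproduced from the per-class rows. The sketch below is a plain-Python sanity check, not the evaluation code that produced the log. It assumes the usual definitions: F1 is the harmonic mean of precision and recall, macro F1 is the unweighted mean of per-class F1, and weighted F1 is the support-weighted mean. The names `report` and `f1` are hypothetical.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
report = {
    "scope": (0.7771, 0.8079, 0.7922, 151),
    "pers": (0.8142, 0.9583, 0.8804, 96),
    "work": (0.8077, 0.8842, 0.8442, 95),
    "date": (0.0000, 0.0000, 0.0000, 3),
    "loc": (1.0000, 0.6667, 0.8000, 3),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(s for _, _, _, s in report.values())

# Macro: every class counts equally, so the 3-sentence "date" class
# with F1 = 0 pulls the average down sharply.
macro_f1 = sum(f for _, _, f, _ in report.values()) / len(report)

# Weighted: classes count in proportion to their support.
weighted_f1 = sum(f * s for _, _, f, s in report.values()) / total_support

# Micro: computed from the pooled precision and recall in the "micro avg" row.
micro_f1 = f1(0.7895, 0.8621)

print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

The `date` class has only 3 test entities and none were recovered, which accounts for most of the difference between the micro and macro figures.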