Upload folder using huggingface_hub

Files changed:
- best-model.pt (+3 lines)
- dev.tsv
- loss.tsv (+11 lines)
- test.tsv
- training.log (+241 lines)
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b2a46f5dfe07a3828078a6ad589a70c820d9ac64ebd0f0828defb65c273f7e4b
+size 443311175
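The three added lines are a Git LFS pointer, not the model weights themselves: the ~443 MB checkpoint lives in LFS storage and is resolved by its sha256 OID. A minimal sketch of parsing such a pointer file (the `parse_lfs_pointer` helper is illustrative, not part of any library; the field names follow the git-lfs pointer spec):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a git-lfs pointer file into its space-separated key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b2a46f5dfe07a3828078a6ad589a70c820d9ac64ebd0f0828defb65c273f7e4b
size 443311175
"""

info = parse_lfs_pointer(pointer)
algo, digest = info["oid"].split(":", 1)
print(algo, int(info["size"]))  # hash algorithm and size in bytes (~443 MB)
```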
dev.tsv
ADDED
The diff for this file is too large to render; see the raw file.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      23:38:25   0.0000         0.3270      0.0676    0.7342         0.6878      0.7102  0.5621
+2      23:39:37   0.0000         0.0833      0.0614    0.7410         0.7848      0.7623  0.6242
+3      23:40:49   0.0000         0.0563      0.0841    0.7229         0.7595      0.7407  0.6102
+4      23:41:59   0.0000         0.0376      0.0839    0.7158         0.8397      0.7728  0.6482
+5      23:43:11   0.0000         0.0278      0.1024    0.7519         0.8312      0.7896  0.6633
+6      23:44:22   0.0000         0.0200      0.1068    0.7424         0.8270      0.7824  0.6577
+7      23:45:33   0.0000         0.0128      0.1083    0.7654         0.8397      0.8008  0.6792
+8      23:46:46   0.0000         0.0088      0.1109    0.7647         0.8228      0.7927  0.6747
+9      23:47:58   0.0000         0.0048      0.1213    0.7751         0.8143      0.7942  0.6772
+10     23:49:10   0.0000         0.0033      0.1214    0.7795         0.8354      0.8065  0.6899
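The checkpoint saved as best-model.pt is the epoch with the highest DEV_F1, which is also the last row of this table. A small sketch of selecting it with the standard-library `csv` module (only three of the eleven TSV lines are reproduced inline; in practice you would open loss.tsv itself):

```python
import csv
import io

# Inline excerpt of loss.tsv (tab-separated); rows 1, 7, and 10 shown.
rows = [
    "EPOCH\tTIMESTAMP\tLEARNING_RATE\tTRAIN_LOSS\tDEV_LOSS\tDEV_PRECISION\tDEV_RECALL\tDEV_F1\tDEV_ACCURACY",
    "1\t23:38:25\t0.0000\t0.3270\t0.0676\t0.7342\t0.6878\t0.7102\t0.5621",
    "7\t23:45:33\t0.0000\t0.0128\t0.1083\t0.7654\t0.8397\t0.8008\t0.6792",
    "10\t23:49:10\t0.0000\t0.0033\t0.1214\t0.7795\t0.8354\t0.8065\t0.6899",
]
reader = csv.DictReader(io.StringIO("\n".join(rows)), delimiter="\t")
best = max(reader, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # epoch 10, dev F1 0.8065
```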
test.tsv
ADDED
The diff for this file is too large to render; see the raw file.
training.log
ADDED
@@ -0,0 +1,241 @@
+2023-10-16 23:37:15,287 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:37:15,288 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=768, out_features=768, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:37:15,288 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
+ - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
+2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:37:15,288 Train:  6183 sentences
+2023-10-16 23:37:15,288         (train_with_dev=False, train_with_test=False)
+2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:37:15,288 Training Params:
+2023-10-16 23:37:15,288  - learning_rate: "3e-05"
+2023-10-16 23:37:15,288  - mini_batch_size: "4"
+2023-10-16 23:37:15,288  - max_epochs: "10"
+2023-10-16 23:37:15,288  - shuffle: "True"
+2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:37:15,288 Plugins:
+2023-10-16 23:37:15,289  - LinearScheduler | warmup_fraction: '0.1'
+2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:37:15,289 Final evaluation on model from best epoch (best-model.pt)
+2023-10-16 23:37:15,289  - metric: "('micro avg', 'f1-score')"
+2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:37:15,289 Computation:
+2023-10-16 23:37:15,289  - compute on device: cuda:0
+2023-10-16 23:37:15,289  - embedding storage: none
+2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:37:15,289 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
+2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
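The LinearScheduler plugin with warmup_fraction 0.1 warms the learning rate up linearly over the first 10% of iterations and then decays it linearly to zero, which matches the lr column in the per-iteration lines below (rising to 3e-05 by the end of epoch 1, then falling to 0.000000 by epoch 10). A minimal sketch of that schedule (pure Python; the function name is my own, and the step count 15460 is inferred from 1546 iterations × 10 epochs in this log):

```python
def linear_warmup_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """LR at a given step: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Decay phase: linearly from peak_lr down to 0 over the remaining steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 15460  # 1546 iterations/epoch * 10 epochs (from this log)
print(linear_warmup_lr(0, total))      # 0.0 at the first step
print(linear_warmup_lr(1546, total))   # peak of 3e-05 at the end of warmup
print(linear_warmup_lr(total, total))  # back to 0.0 at the final step
```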
+2023-10-16 23:37:22,073 epoch 1 - iter 154/1546 - loss 1.92565161 - time (sec): 6.78 - samples/sec: 1898.22 - lr: 0.000003 - momentum: 0.000000
+2023-10-16 23:37:28,810 epoch 1 - iter 308/1546 - loss 1.09983562 - time (sec): 13.52 - samples/sec: 1886.82 - lr: 0.000006 - momentum: 0.000000
+2023-10-16 23:37:35,629 epoch 1 - iter 462/1546 - loss 0.79988559 - time (sec): 20.34 - samples/sec: 1855.62 - lr: 0.000009 - momentum: 0.000000
+2023-10-16 23:37:42,402 epoch 1 - iter 616/1546 - loss 0.64149237 - time (sec): 27.11 - samples/sec: 1841.30 - lr: 0.000012 - momentum: 0.000000
+2023-10-16 23:37:49,223 epoch 1 - iter 770/1546 - loss 0.53678387 - time (sec): 33.93 - samples/sec: 1842.84 - lr: 0.000015 - momentum: 0.000000
+2023-10-16 23:37:56,159 epoch 1 - iter 924/1546 - loss 0.47121079 - time (sec): 40.87 - samples/sec: 1817.10 - lr: 0.000018 - momentum: 0.000000
+2023-10-16 23:38:02,949 epoch 1 - iter 1078/1546 - loss 0.42227833 - time (sec): 47.66 - samples/sec: 1810.49 - lr: 0.000021 - momentum: 0.000000
+2023-10-16 23:38:09,776 epoch 1 - iter 1232/1546 - loss 0.38574946 - time (sec): 54.49 - samples/sec: 1807.68 - lr: 0.000024 - momentum: 0.000000
+2023-10-16 23:38:16,772 epoch 1 - iter 1386/1546 - loss 0.35423814 - time (sec): 61.48 - samples/sec: 1809.07 - lr: 0.000027 - momentum: 0.000000
+2023-10-16 23:38:23,585 epoch 1 - iter 1540/1546 - loss 0.32780022 - time (sec): 68.30 - samples/sec: 1814.47 - lr: 0.000030 - momentum: 0.000000
+2023-10-16 23:38:23,847 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:38:23,847 EPOCH 1 done: loss 0.3270 - lr: 0.000030
+2023-10-16 23:38:25,866 DEV : loss 0.06755758821964264 - f1-score (micro avg)  0.7102
+2023-10-16 23:38:25,894 saving best model
+2023-10-16 23:38:26,224 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:38:33,203 epoch 2 - iter 154/1546 - loss 0.09083964 - time (sec): 6.98 - samples/sec: 1894.92 - lr: 0.000030 - momentum: 0.000000
+2023-10-16 23:38:40,075 epoch 2 - iter 308/1546 - loss 0.08611689 - time (sec): 13.85 - samples/sec: 1864.36 - lr: 0.000029 - momentum: 0.000000
+2023-10-16 23:38:46,893 epoch 2 - iter 462/1546 - loss 0.08435768 - time (sec): 20.67 - samples/sec: 1843.68 - lr: 0.000029 - momentum: 0.000000
+2023-10-16 23:38:53,775 epoch 2 - iter 616/1546 - loss 0.08829396 - time (sec): 27.55 - samples/sec: 1815.83 - lr: 0.000029 - momentum: 0.000000
+2023-10-16 23:39:00,586 epoch 2 - iter 770/1546 - loss 0.08874921 - time (sec): 34.36 - samples/sec: 1790.29 - lr: 0.000028 - momentum: 0.000000
+2023-10-16 23:39:07,326 epoch 2 - iter 924/1546 - loss 0.08870459 - time (sec): 41.10 - samples/sec: 1801.42 - lr: 0.000028 - momentum: 0.000000
+2023-10-16 23:39:14,168 epoch 2 - iter 1078/1546 - loss 0.08797980 - time (sec): 47.94 - samples/sec: 1809.00 - lr: 0.000028 - momentum: 0.000000
+2023-10-16 23:39:21,230 epoch 2 - iter 1232/1546 - loss 0.08446684 - time (sec): 55.00 - samples/sec: 1812.34 - lr: 0.000027 - momentum: 0.000000
+2023-10-16 23:39:28,023 epoch 2 - iter 1386/1546 - loss 0.08329303 - time (sec): 61.80 - samples/sec: 1803.61 - lr: 0.000027 - momentum: 0.000000
+2023-10-16 23:39:34,880 epoch 2 - iter 1540/1546 - loss 0.08339448 - time (sec): 68.65 - samples/sec: 1805.87 - lr: 0.000027 - momentum: 0.000000
+2023-10-16 23:39:35,138 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:39:35,138 EPOCH 2 done: loss 0.0833 - lr: 0.000027
+2023-10-16 23:39:37,244 DEV : loss 0.06139129400253296 - f1-score (micro avg)  0.7623
+2023-10-16 23:39:37,257 saving best model
+2023-10-16 23:39:37,686 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:39:44,558 epoch 3 - iter 154/1546 - loss 0.04155378 - time (sec): 6.87 - samples/sec: 1859.19 - lr: 0.000026 - momentum: 0.000000
+2023-10-16 23:39:51,450 epoch 3 - iter 308/1546 - loss 0.05960039 - time (sec): 13.76 - samples/sec: 1877.23 - lr: 0.000026 - momentum: 0.000000
+2023-10-16 23:39:58,368 epoch 3 - iter 462/1546 - loss 0.05708094 - time (sec): 20.68 - samples/sec: 1894.00 - lr: 0.000026 - momentum: 0.000000
+2023-10-16 23:40:05,183 epoch 3 - iter 616/1546 - loss 0.05432285 - time (sec): 27.50 - samples/sec: 1847.87 - lr: 0.000025 - momentum: 0.000000
+2023-10-16 23:40:12,029 epoch 3 - iter 770/1546 - loss 0.05492115 - time (sec): 34.34 - samples/sec: 1830.11 - lr: 0.000025 - momentum: 0.000000
+2023-10-16 23:40:18,815 epoch 3 - iter 924/1546 - loss 0.05565481 - time (sec): 41.13 - samples/sec: 1815.96 - lr: 0.000025 - momentum: 0.000000
+2023-10-16 23:40:25,763 epoch 3 - iter 1078/1546 - loss 0.05690822 - time (sec): 48.08 - samples/sec: 1824.49 - lr: 0.000024 - momentum: 0.000000
+2023-10-16 23:40:32,745 epoch 3 - iter 1232/1546 - loss 0.05538883 - time (sec): 55.06 - samples/sec: 1813.92 - lr: 0.000024 - momentum: 0.000000
+2023-10-16 23:40:39,610 epoch 3 - iter 1386/1546 - loss 0.05683208 - time (sec): 61.92 - samples/sec: 1799.53 - lr: 0.000024 - momentum: 0.000000
+2023-10-16 23:40:46,465 epoch 3 - iter 1540/1546 - loss 0.05646703 - time (sec): 68.78 - samples/sec: 1801.06 - lr: 0.000023 - momentum: 0.000000
+2023-10-16 23:40:46,726 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:40:46,727 EPOCH 3 done: loss 0.0563 - lr: 0.000023
+2023-10-16 23:40:49,096 DEV : loss 0.08413656055927277 - f1-score (micro avg)  0.7407
+2023-10-16 23:40:49,109 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:40:56,043 epoch 4 - iter 154/1546 - loss 0.03911177 - time (sec): 6.93 - samples/sec: 1680.71 - lr: 0.000023 - momentum: 0.000000
+2023-10-16 23:41:03,048 epoch 4 - iter 308/1546 - loss 0.03409785 - time (sec): 13.94 - samples/sec: 1698.01 - lr: 0.000023 - momentum: 0.000000
+2023-10-16 23:41:09,927 epoch 4 - iter 462/1546 - loss 0.03599079 - time (sec): 20.82 - samples/sec: 1745.38 - lr: 0.000022 - momentum: 0.000000
+2023-10-16 23:41:16,679 epoch 4 - iter 616/1546 - loss 0.03396381 - time (sec): 27.57 - samples/sec: 1762.13 - lr: 0.000022 - momentum: 0.000000
+2023-10-16 23:41:23,254 epoch 4 - iter 770/1546 - loss 0.03546867 - time (sec): 34.14 - samples/sec: 1781.08 - lr: 0.000022 - momentum: 0.000000
+2023-10-16 23:41:30,011 epoch 4 - iter 924/1546 - loss 0.03515207 - time (sec): 40.90 - samples/sec: 1780.02 - lr: 0.000021 - momentum: 0.000000
+2023-10-16 23:41:36,908 epoch 4 - iter 1078/1546 - loss 0.03640270 - time (sec): 47.80 - samples/sec: 1787.74 - lr: 0.000021 - momentum: 0.000000
+2023-10-16 23:41:43,799 epoch 4 - iter 1232/1546 - loss 0.03764293 - time (sec): 54.69 - samples/sec: 1786.11 - lr: 0.000021 - momentum: 0.000000
+2023-10-16 23:41:50,658 epoch 4 - iter 1386/1546 - loss 0.03780300 - time (sec): 61.55 - samples/sec: 1794.32 - lr: 0.000020 - momentum: 0.000000
+2023-10-16 23:41:57,608 epoch 4 - iter 1540/1546 - loss 0.03770330 - time (sec): 68.50 - samples/sec: 1805.56 - lr: 0.000020 - momentum: 0.000000
+2023-10-16 23:41:57,872 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:41:57,872 EPOCH 4 done: loss 0.0376 - lr: 0.000020
+2023-10-16 23:41:59,933 DEV : loss 0.08385952562093735 - f1-score (micro avg)  0.7728
+2023-10-16 23:41:59,945 saving best model
+2023-10-16 23:42:00,364 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:42:06,961 epoch 5 - iter 154/1546 - loss 0.01723144 - time (sec): 6.59 - samples/sec: 1877.96 - lr: 0.000020 - momentum: 0.000000
+2023-10-16 23:42:13,855 epoch 5 - iter 308/1546 - loss 0.02130077 - time (sec): 13.49 - samples/sec: 1808.30 - lr: 0.000019 - momentum: 0.000000
+2023-10-16 23:42:20,643 epoch 5 - iter 462/1546 - loss 0.02384876 - time (sec): 20.28 - samples/sec: 1816.13 - lr: 0.000019 - momentum: 0.000000
+2023-10-16 23:42:27,408 epoch 5 - iter 616/1546 - loss 0.02637631 - time (sec): 27.04 - samples/sec: 1818.43 - lr: 0.000019 - momentum: 0.000000
+2023-10-16 23:42:34,374 epoch 5 - iter 770/1546 - loss 0.02693432 - time (sec): 34.01 - samples/sec: 1824.58 - lr: 0.000018 - momentum: 0.000000
+2023-10-16 23:42:41,338 epoch 5 - iter 924/1546 - loss 0.02773812 - time (sec): 40.97 - samples/sec: 1806.78 - lr: 0.000018 - momentum: 0.000000
+2023-10-16 23:42:48,253 epoch 5 - iter 1078/1546 - loss 0.02756016 - time (sec): 47.89 - samples/sec: 1828.54 - lr: 0.000018 - momentum: 0.000000
+2023-10-16 23:42:55,133 epoch 5 - iter 1232/1546 - loss 0.02690068 - time (sec): 54.77 - samples/sec: 1815.36 - lr: 0.000017 - momentum: 0.000000
+2023-10-16 23:43:01,995 epoch 5 - iter 1386/1546 - loss 0.02760696 - time (sec): 61.63 - samples/sec: 1813.32 - lr: 0.000017 - momentum: 0.000000
+2023-10-16 23:43:08,871 epoch 5 - iter 1540/1546 - loss 0.02778396 - time (sec): 68.50 - samples/sec: 1809.87 - lr: 0.000017 - momentum: 0.000000
+2023-10-16 23:43:09,124 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:43:09,125 EPOCH 5 done: loss 0.0278 - lr: 0.000017
+2023-10-16 23:43:11,166 DEV : loss 0.10244771093130112 - f1-score (micro avg)  0.7896
+2023-10-16 23:43:11,178 saving best model
+2023-10-16 23:43:11,591 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:43:18,310 epoch 6 - iter 154/1546 - loss 0.01205148 - time (sec): 6.72 - samples/sec: 1874.61 - lr: 0.000016 - momentum: 0.000000
+2023-10-16 23:43:25,177 epoch 6 - iter 308/1546 - loss 0.01547760 - time (sec): 13.58 - samples/sec: 1868.15 - lr: 0.000016 - momentum: 0.000000
+2023-10-16 23:43:32,027 epoch 6 - iter 462/1546 - loss 0.02030563 - time (sec): 20.44 - samples/sec: 1818.03 - lr: 0.000016 - momentum: 0.000000
+2023-10-16 23:43:38,895 epoch 6 - iter 616/1546 - loss 0.02101298 - time (sec): 27.30 - samples/sec: 1818.55 - lr: 0.000015 - momentum: 0.000000
+2023-10-16 23:43:45,757 epoch 6 - iter 770/1546 - loss 0.02068268 - time (sec): 34.16 - samples/sec: 1826.89 - lr: 0.000015 - momentum: 0.000000
+2023-10-16 23:43:52,699 epoch 6 - iter 924/1546 - loss 0.02111511 - time (sec): 41.11 - samples/sec: 1806.61 - lr: 0.000015 - momentum: 0.000000
+2023-10-16 23:43:59,574 epoch 6 - iter 1078/1546 - loss 0.01946876 - time (sec): 47.98 - samples/sec: 1811.25 - lr: 0.000014 - momentum: 0.000000
+2023-10-16 23:44:06,428 epoch 6 - iter 1232/1546 - loss 0.01960139 - time (sec): 54.84 - samples/sec: 1782.03 - lr: 0.000014 - momentum: 0.000000
+2023-10-16 23:44:13,375 epoch 6 - iter 1386/1546 - loss 0.01983819 - time (sec): 61.78 - samples/sec: 1791.03 - lr: 0.000014 - momentum: 0.000000
+2023-10-16 23:44:20,351 epoch 6 - iter 1540/1546 - loss 0.01996004 - time (sec): 68.76 - samples/sec: 1802.90 - lr: 0.000013 - momentum: 0.000000
+2023-10-16 23:44:20,625 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:44:20,625 EPOCH 6 done: loss 0.0200 - lr: 0.000013
+2023-10-16 23:44:22,734 DEV : loss 0.10681257396936417 - f1-score (micro avg)  0.7824
+2023-10-16 23:44:22,747 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:44:29,520 epoch 7 - iter 154/1546 - loss 0.01882747 - time (sec): 6.77 - samples/sec: 1707.23 - lr: 0.000013 - momentum: 0.000000
+2023-10-16 23:44:36,364 epoch 7 - iter 308/1546 - loss 0.01786356 - time (sec): 13.62 - samples/sec: 1706.79 - lr: 0.000013 - momentum: 0.000000
+2023-10-16 23:44:43,172 epoch 7 - iter 462/1546 - loss 0.01399142 - time (sec): 20.42 - samples/sec: 1720.07 - lr: 0.000012 - momentum: 0.000000
+2023-10-16 23:44:50,128 epoch 7 - iter 616/1546 - loss 0.01287695 - time (sec): 27.38 - samples/sec: 1754.57 - lr: 0.000012 - momentum: 0.000000
+2023-10-16 23:44:57,096 epoch 7 - iter 770/1546 - loss 0.01298667 - time (sec): 34.35 - samples/sec: 1771.84 - lr: 0.000012 - momentum: 0.000000
+2023-10-16 23:45:04,180 epoch 7 - iter 924/1546 - loss 0.01203834 - time (sec): 41.43 - samples/sec: 1779.29 - lr: 0.000011 - momentum: 0.000000
+2023-10-16 23:45:11,016 epoch 7 - iter 1078/1546 - loss 0.01282603 - time (sec): 48.27 - samples/sec: 1800.52 - lr: 0.000011 - momentum: 0.000000
+2023-10-16 23:45:17,812 epoch 7 - iter 1232/1546 - loss 0.01248353 - time (sec): 55.06 - samples/sec: 1805.09 - lr: 0.000011 - momentum: 0.000000
+2023-10-16 23:45:24,596 epoch 7 - iter 1386/1546 - loss 0.01265255 - time (sec): 61.85 - samples/sec: 1798.93 - lr: 0.000010 - momentum: 0.000000
+2023-10-16 23:45:31,426 epoch 7 - iter 1540/1546 - loss 0.01287871 - time (sec): 68.68 - samples/sec: 1803.61 - lr: 0.000010 - momentum: 0.000000
+2023-10-16 23:45:31,687 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:45:31,687 EPOCH 7 done: loss 0.0128 - lr: 0.000010
+2023-10-16 23:45:33,866 DEV : loss 0.10830121487379074 - f1-score (micro avg)  0.8008
+2023-10-16 23:45:33,880 saving best model
+2023-10-16 23:45:34,323 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:45:41,641 epoch 8 - iter 154/1546 - loss 0.01179072 - time (sec): 7.32 - samples/sec: 1690.40 - lr: 0.000010 - momentum: 0.000000
+2023-10-16 23:45:48,999 epoch 8 - iter 308/1546 - loss 0.01016285 - time (sec): 14.67 - samples/sec: 1759.76 - lr: 0.000009 - momentum: 0.000000
+2023-10-16 23:45:55,984 epoch 8 - iter 462/1546 - loss 0.01018965 - time (sec): 21.66 - samples/sec: 1748.65 - lr: 0.000009 - momentum: 0.000000
+2023-10-16 23:46:02,875 epoch 8 - iter 616/1546 - loss 0.00940054 - time (sec): 28.55 - samples/sec: 1776.30 - lr: 0.000009 - momentum: 0.000000
+2023-10-16 23:46:09,813 epoch 8 - iter 770/1546 - loss 0.00901390 - time (sec): 35.49 - samples/sec: 1789.22 - lr: 0.000008 - momentum: 0.000000
+2023-10-16 23:46:17,225 epoch 8 - iter 924/1546 - loss 0.00873861 - time (sec): 42.90 - samples/sec: 1782.49 - lr: 0.000008 - momentum: 0.000000
+2023-10-16 23:46:24,038 epoch 8 - iter 1078/1546 - loss 0.00860591 - time (sec): 49.71 - samples/sec: 1770.95 - lr: 0.000008 - momentum: 0.000000
+2023-10-16 23:46:30,861 epoch 8 - iter 1232/1546 - loss 0.00864071 - time (sec): 56.54 - samples/sec: 1759.08 - lr: 0.000007 - momentum: 0.000000
+2023-10-16 23:46:37,745 epoch 8 - iter 1386/1546 - loss 0.00860292 - time (sec): 63.42 - samples/sec: 1768.59 - lr: 0.000007 - momentum: 0.000000
+2023-10-16 23:46:44,604 epoch 8 - iter 1540/1546 - loss 0.00880895 - time (sec): 70.28 - samples/sec: 1762.21 - lr: 0.000007 - momentum: 0.000000
+2023-10-16 23:46:44,874 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:46:44,875 EPOCH 8 done: loss 0.0088 - lr: 0.000007
+2023-10-16 23:46:46,985 DEV : loss 0.11089599132537842 - f1-score (micro avg)  0.7927
+2023-10-16 23:46:46,998 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:46:53,836 epoch 9 - iter 154/1546 - loss 0.01019047 - time (sec): 6.84 - samples/sec: 1789.49 - lr: 0.000006 - momentum: 0.000000
+2023-10-16 23:47:00,712 epoch 9 - iter 308/1546 - loss 0.00780744 - time (sec): 13.71 - samples/sec: 1843.01 - lr: 0.000006 - momentum: 0.000000
+2023-10-16 23:47:07,681 epoch 9 - iter 462/1546 - loss 0.00633308 - time (sec): 20.68 - samples/sec: 1856.35 - lr: 0.000006 - momentum: 0.000000
+2023-10-16 23:47:14,504 epoch 9 - iter 616/1546 - loss 0.00589385 - time (sec): 27.51 - samples/sec: 1833.30 - lr: 0.000005 - momentum: 0.000000
+2023-10-16 23:47:21,527 epoch 9 - iter 770/1546 - loss 0.00517419 - time (sec): 34.53 - samples/sec: 1816.10 - lr: 0.000005 - momentum: 0.000000
+2023-10-16 23:47:28,487 epoch 9 - iter 924/1546 - loss 0.00498572 - time (sec): 41.49 - samples/sec: 1806.16 - lr: 0.000005 - momentum: 0.000000
+2023-10-16 23:47:35,321 epoch 9 - iter 1078/1546 - loss 0.00495726 - time (sec): 48.32 - samples/sec: 1788.16 - lr: 0.000004 - momentum: 0.000000
+2023-10-16 23:47:42,255 epoch 9 - iter 1232/1546 - loss 0.00485612 - time (sec): 55.26 - samples/sec: 1794.97 - lr: 0.000004 - momentum: 0.000000
+2023-10-16 23:47:49,207 epoch 9 - iter 1386/1546 - loss 0.00481666 - time (sec): 62.21 - samples/sec: 1795.14 - lr: 0.000004 - momentum: 0.000000
+2023-10-16 23:47:56,096 epoch 9 - iter 1540/1546 - loss 0.00484866 - time (sec): 69.10 - samples/sec: 1790.64 - lr: 0.000003 - momentum: 0.000000
+2023-10-16 23:47:56,367 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:47:56,367 EPOCH 9 done: loss 0.0048 - lr: 0.000003
+2023-10-16 23:47:58,475 DEV : loss 0.12125992029905319 - f1-score (micro avg)  0.7942
+2023-10-16 23:47:58,488 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:48:05,559 epoch 10 - iter 154/1546 - loss 0.00554820 - time (sec): 7.07 - samples/sec: 1779.17 - lr: 0.000003 - momentum: 0.000000
+2023-10-16 23:48:12,504 epoch 10 - iter 308/1546 - loss 0.00532913 - time (sec): 14.01 - samples/sec: 1782.70 - lr: 0.000003 - momentum: 0.000000
+2023-10-16 23:48:19,380 epoch 10 - iter 462/1546 - loss 0.00538433 - time (sec): 20.89 - samples/sec: 1740.74 - lr: 0.000002 - momentum: 0.000000
+2023-10-16 23:48:26,473 epoch 10 - iter 616/1546 - loss 0.00450212 - time (sec): 27.98 - samples/sec: 1758.70 - lr: 0.000002 - momentum: 0.000000
+2023-10-16 23:48:33,548 epoch 10 - iter 770/1546 - loss 0.00405722 - time (sec): 35.06 - samples/sec: 1785.83 - lr: 0.000002 - momentum: 0.000000
+2023-10-16 23:48:40,558 epoch 10 - iter 924/1546 - loss 0.00351806 - time (sec): 42.07 - samples/sec: 1799.92 - lr: 0.000001 - momentum: 0.000000
+2023-10-16 23:48:47,503 epoch 10 - iter 1078/1546 - loss 0.00321229 - time (sec): 49.01 - samples/sec: 1795.61 - lr: 0.000001 - momentum: 0.000000
+2023-10-16 23:48:54,315 epoch 10 - iter 1232/1546 - loss 0.00327509 - time (sec): 55.83 - samples/sec: 1784.03 - lr: 0.000001 - momentum: 0.000000
+2023-10-16 23:49:01,187 epoch 10 - iter 1386/1546 - loss 0.00338661 - time (sec): 62.70 - samples/sec: 1784.67 - lr: 0.000000 - momentum: 0.000000
+2023-10-16 23:49:08,070 epoch 10 - iter 1540/1546 - loss 0.00333392 - time (sec): 69.58 - samples/sec: 1779.95 - lr: 0.000000 - momentum: 0.000000
+2023-10-16 23:49:08,338 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:49:08,338 EPOCH 10 done: loss 0.0033 - lr: 0.000000
+2023-10-16 23:49:10,368 DEV : loss 0.12140633165836334 - f1-score (micro avg)  0.8065
+2023-10-16 23:49:10,380 saving best model
+2023-10-16 23:49:11,227 ----------------------------------------------------------------------------------------------------
+2023-10-16 23:49:11,228 Loading model from best epoch ...
+2023-10-16 23:49:12,839 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
+2023-10-16 23:49:18,915
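The tag dictionary uses the BIOES scheme over three entity types (S = single-token entity, B = begin, I = inside, E = end, O = outside). A minimal sketch of decoding a BIOES tag sequence into entity spans (this helper is illustrative, not Flair's internal decoder; end indices are exclusive):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":
            spans.append((label, i, i + 1))  # single-token entity
            start = None
        elif prefix == "B":
            start = i                        # remember where the entity opened
        elif prefix == "E" and start is not None:
            spans.append((label, start, i + 1))
            start = None
    return spans

print(bioes_to_spans(["O", "S-LOC", "B-STREET", "E-STREET", "O"]))
# [('LOC', 1, 2), ('STREET', 2, 4)]
```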
+Results:
+- F-score (micro) 0.798
+- F-score (macro) 0.6998
+- Accuracy 0.6823
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.8416    0.8647    0.8530       946
+    BUILDING     0.5440    0.5351    0.5395       185
+      STREET     0.6833    0.7321    0.7069        56
+
+   micro avg     0.7891    0.8071    0.7980      1187
+   macro avg     0.6896    0.7107    0.6998      1187
+weighted avg     0.7877    0.8071    0.7972      1187
+
+2023-10-16 23:49:18,915 ----------------------------------------------------------------------------------------------------
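The reported F-scores follow directly from the precision/recall columns via the harmonic mean, F1 = 2PR / (P + R), and the macro average is the unweighted mean of the per-class F1 values. A quick sketch re-deriving the micro and macro figures from the table above:

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Per-class (precision, recall) from the "By class" table above.
per_class = {
    "LOC": (0.8416, 0.8647),
    "BUILDING": (0.5440, 0.5351),
    "STREET": (0.6833, 0.7321),
}

micro = f1(0.7891, 0.8071)  # from the micro-avg precision/recall row
macro = sum(f1(p, r) for p, r in per_class.values()) / len(per_class)
print(round(micro, 3), round(macro, 4))  # matches 0.798 / 0.6998 in the log
```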