Bill Psomas committed
Commit 3f47daa
Parent(s): 848db31
update readme
Files changed:
- README.md +54 -0
- checkpoint.pth +3 -0
- configs.yaml +45 -0
- log.txt +100 -0
README.md
CHANGED
@@ -1,3 +1,57 @@
 ---
 license: cc-by-4.0
+datasets:
+- imagenet-1k
+metrics:
+- accuracy
+pipeline_tag: image-classification
+language:
+- en
+tags:
+- resnet
+- convolutional neural network
+- simpool
+- dino
+- computer vision
+- deep learning
 ---
+
+# Self-supervised ResNet-50 model with SimPool
+
+ResNet-50 model with SimPool (gamma=2.0) trained on ImageNet-1k for 100 epochs. Self-supervision with [DINO](https://arxiv.org/abs/2104.14294).
+
+SimPool is a simple attention-based pooling method applied at the end of the network, introduced in this ICCV 2023 [paper](https://arxiv.org/pdf/2309.06891.pdf) and released in this [repository](https://github.com/billpsomas/simpool/).
+Disclaimer: This model card is written by the author of SimPool, i.e. [Bill Psomas](http://users.ntua.gr/psomasbill/).
+
+## Motivation
+
+Convolutional networks and vision transformers have different forms of pairwise interactions, pooling across layers and pooling at the end of the network. Does the latter really need to be different?
+As a by-product of pooling, vision transformers provide spatial attention for free, but it is most often of low quality unless self-supervised, which is not well studied. Is supervision really the problem?
+
+## Method
+
+SimPool is a simple attention-based pooling mechanism that replaces the default pooling in both convolutional and transformer encoders. For transformers, we completely discard the [CLS] token.
+Interestingly, we find that, whether supervised or self-supervised, SimPool improves performance on pre-training and downstream tasks, and provides attention maps delineating object boundaries in all cases.
+One could thus call SimPool universal.
+
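The single-query attention pooling idea described in the Method section can be sketched in a few lines. This is an illustrative NumPy sketch under our own naming, not the official implementation (which also applies the gamma exponent and lives in the linked repository):

```python
import numpy as np

def attention_pool(x, Wq, Wk):
    """Single-query attention pooling in the spirit of SimPool (illustrative only;
    the official code, including the gamma exponent, is in the linked repository).
    x: (N, d) spatial features from the encoder; Wq, Wk: (d, d) learned projections."""
    q = x.mean(axis=0) @ Wq                  # query initialized from global average pooling
    k = x @ Wk                               # one key per spatial location, shape (N, d)
    scores = k @ q / np.sqrt(x.shape[1])     # (N,) attention logits
    a = np.exp(scores - scores.max())
    a /= a.sum()                             # softmax over locations
    return a @ x                             # (d,) attention-weighted pooled vector

# With uniform features, attention is uniform and pooling reduces to the mean.
x = np.ones((4, 8))
pooled = attention_pool(x, np.eye(8), np.eye(8))
```

For a convolutional encoder like this ResNet-50, `x` would be the final feature map flattened over its spatial positions; for a transformer, the patch tokens.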
+## Evaluation with k-NN
+
+| k   | top1   | top5   |
+| --- | ------ | ------ |
+| 10  | 63.828 | 81.82  |
+| 20  | 63.502 | 83.824 |
+| 100 | 60.658 | 84.716 |
+| 200 | 58.66  | 83.846 |
+
+## BibTeX entry and citation info
+
+```
+@misc{psomas2023simpool,
+  title={Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?},
+  author={Bill Psomas and Ioannis Kakogeorgiou and Konstantinos Karantzalos and Yannis Avrithis},
+  year={2023},
+  eprint={2309.06891},
+  archivePrefix={arXiv},
+  primaryClass={cs.CV}
+}
+```
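The k-NN table above comes from DINO-style weighted k-NN evaluation on frozen features (note `nb_knn: [10, 20, 100, 200]` and `temperature: 0.07` in configs.yaml). A minimal NumPy sketch of that protocol, with our own variable names rather than the repository's:

```python
import numpy as np

def knn_predict(train_feats, train_labels, test_feats, k=20, T=0.07, num_classes=1000):
    """Weighted k-NN on L2-normalized features, in the style of the DINO evaluation
    (sketch under assumed conventions; not the repository's exact code)."""
    tr = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    te = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    sim = te @ tr.T                               # cosine similarity to every train sample
    preds = []
    for row in sim:
        idx = np.argsort(row)[-k:]                # indices of the k nearest neighbors
        w = np.exp(row[idx] / T)                  # temperature-sharpened vote weights
        votes = np.zeros(num_classes)
        np.add.at(votes, train_labels[idx], w)    # accumulate weighted class votes
        preds.append(int(votes.argmax()))
    return np.array(preds)
```

Top-1 accuracy is then the fraction of test labels matching `preds`; top-5 instead checks membership among the five highest-voted classes.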
checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1b3d901d0f9cd5582abed27c89f41a2968cb5f9a7dc0f58f96c7a32351367430
size 675750049
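The block above is a Git LFS pointer, not the weights themselves; the actual ~676 MB file is fetched on `git lfs pull`. Once downloaded, the file can be checked against the pointer's sha256 oid and size with only the standard library:

```python
import hashlib

def verify_lfs_pointer(path, expected_oid, expected_size):
    """Return True iff the file at `path` matches a Git LFS pointer's sha256 oid and size."""
    h, size = hashlib.sha256(), 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # stream in 1 MiB chunks
            h.update(chunk)
            size += len(chunk)
    return h.hexdigest() == expected_oid and size == expected_size
```

Called with `"checkpoint.pth"` and the oid and size shown above, a `True` result confirms the LFS download completed intact.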
configs.yaml
ADDED
@@ -0,0 +1,45 @@
arch: resnet50
backend: nccl
batch_size_per_gpu: 80
clip_grad: 0.0
data_path: /path/to/imagenet/
dist_url: env://
drop_path_rate: 0.1
epochs: 100
eval_every: 10
freeze_last_layer: 1
global_crops_scale:
- 0.14
- 1.0
local_crops_number: 6
local_crops_scale:
- 0.05
- 0.14
local_rank: 0
lr: 0.3
min_lr: 0.0048
mode: simpool
momentum_teacher: 0.996
nb_knn:
- 10
- 20
- 100
- 200
norm_last_layer: true
num_workers: 10
optimizer: lars
out_dim: 60000
output_dir: /path/to/output/
patch_size: 16
saveckp_freq: 20
seed: 0
subset: -1
teacher_temp: 0.07
temperature: 0.07
use_bn_in_head: true
use_fp16: false
warmup_epochs: 10
warmup_teacher_temp: 0.04
warmup_teacher_temp_epochs: 50
weight_decay: 1.0e-06
weight_decay_end: 1.0e-06
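One derived quantity worth noting: in the DINO codebase the configured `lr` is a base rate defined for a total batch size of 256 and is scaled linearly with the actual batch size before the cosine decay down to `min_lr`. The world size below is our assumption (it is not recorded in configs.yaml); 16 GPUs is what makes the scaling consistent with the peak learning rate of ≈1.5 seen in log.txt:

```python
# Linear lr scaling as in the DINO codebase (base lr defined for batch size 256).
base_lr = 0.3              # lr from configs.yaml
batch_size_per_gpu = 80    # from configs.yaml
world_size = 16            # assumption: not recorded in configs.yaml
peak_lr = base_lr * batch_size_per_gpu * world_size / 256
print(peak_lr)             # 1.5
```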
log.txt
ADDED
@@ -0,0 +1,100 @@
{"train_loss": 10.449837056398392, "train_entropy": 9.389393039047718, "train_KL_div": 1.0604439457431436, "train_lr": 0.07493249324932494, "train_wd": 1.000000000000015e-06, "epoch": 0, "k-NN": {"10": {"top1": 2.36, "top5": 5.818}, "20": {"top1": 2.73, "top5": 6.41}, "100": {"top1": 3.178, "top5": 8.764}, "200": {"top1": 3.292, "top5": 9.318}}}
{"train_loss": 10.167079906046391, "train_entropy": 9.65371017241478, "train_KL_div": 0.5133696808293462, "train_lr": 0.22494749474947498, "train_wd": 1.000000000000015e-06, "epoch": 1}
{"train_loss": 9.468380255460739, "train_entropy": 8.58747710877657, "train_KL_div": 0.880903076402843, "train_lr": 0.37496249624962497, "train_wd": 1.000000000000015e-06, "epoch": 2}
{"train_loss": 8.479013512432575, "train_entropy": 7.222656021177769, "train_KL_div": 1.256357355952263, "train_lr": 0.524977497749775, "train_wd": 1.000000000000015e-06, "epoch": 3}
{"train_loss": 7.515101688563823, "train_entropy": 5.9015126974582675, "train_KL_div": 1.6135889759212731, "train_lr": 0.674992499249925, "train_wd": 1.000000000000015e-06, "epoch": 4}
{"train_loss": 6.680009725153446, "train_entropy": 4.778078202843666, "train_KL_div": 1.9019315302670001, "train_lr": 0.8250075007500748, "train_wd": 1.000000000000015e-06, "epoch": 5}
{"train_loss": 6.12774421530962, "train_entropy": 4.047231483012438, "train_KL_div": 2.0805127462893727, "train_lr": 0.9750225022502252, "train_wd": 1.000000000000015e-06, "epoch": 6}
{"train_loss": 5.788710211277008, "train_entropy": 3.603530557900667, "train_KL_div": 2.1851796337515115, "train_lr": 1.1250375037503753, "train_wd": 1.000000000000015e-06, "epoch": 7}
{"train_loss": 5.561759056508541, "train_entropy": 3.325290210336447, "train_KL_div": 2.2364688062518834, "train_lr": 1.2750525052505253, "train_wd": 1.000000000000015e-06, "epoch": 8}
{"train_loss": 5.391559284448624, "train_entropy": 3.138916272819042, "train_KL_div": 2.2526429802775385, "train_lr": 1.4250675067506746, "train_wd": 1.000000000000015e-06, "epoch": 9}
{"train_loss": 5.2455336748361585, "train_entropy": 3.003100367337465, "train_KL_div": 2.2424333059489725, "train_lr": 1.4998484155601604, "train_wd": 1.000000000000015e-06, "epoch": 10, "k-NN": {"10": {"top1": 40.368, "top5": 60.464}, "20": {"top1": 40.57, "top5": 63.354}, "100": {"top1": 37.946, "top5": 64.804}, "200": {"top1": 36.05, "top5": 63.766}}}
{"train_loss": 5.124244156062603, "train_entropy": 2.903090033352375, "train_KL_div": 2.2211541321873667, "train_lr": 1.4989382202191224, "train_wd": 1.000000000000015e-06, "epoch": 11}
{"train_loss": 5.0256589850187305, "train_entropy": 2.8311997632086277, "train_KL_div": 2.1944592456817626, "train_lr": 1.4971184830521438, "train_wd": 1.000000000000015e-06, "epoch": 12}
{"train_loss": 4.943505420267582, "train_entropy": 2.7756613405048847, "train_KL_div": 2.1678441066741945, "train_lr": 1.494391421128656, "train_wd": 1.000000000000015e-06, "epoch": 13}
{"train_loss": 4.879734697937965, "train_entropy": 2.7337953796088694, "train_KL_div": 2.1459393372386693, "train_lr": 1.4907603569535401, "train_wd": 1.000000000000015e-06, "epoch": 14}
{"train_loss": 4.82360738325119, "train_entropy": 2.7040232205688954, "train_KL_div": 2.119584177866578, "train_lr": 1.4862297144191705, "train_wd": 1.000000000000015e-06, "epoch": 15}
{"train_loss": 4.777354749560356, "train_entropy": 2.681550843566656, "train_KL_div": 2.095803919225931, "train_lr": 1.4808050134155832, "train_wd": 1.000000000000015e-06, "epoch": 16}
{"train_loss": 4.740407183349133, "train_entropy": 2.666192293405533, "train_KL_div": 2.074214899018407, "train_lr": 1.4744928631053371, "train_wd": 1.000000000000015e-06, "epoch": 17}
{"train_loss": 4.711565686523914, "train_entropy": 2.658193520128727, "train_KL_div": 2.0533721644580365, "train_lr": 1.4673009538712782, "train_wd": 1.000000000000015e-06, "epoch": 18}
{"train_loss": 4.689444116234779, "train_entropy": 2.657263245970011, "train_KL_div": 2.0321808568388224, "train_lr": 1.4592380479469773, "train_wd": 1.000000000000015e-06, "epoch": 19}
{"train_loss": 4.6621909416913985, "train_entropy": 2.6582521418631075, "train_KL_div": 2.0039387890696525, "train_lr": 1.450313968741308, "train_wd": 1.000000000000015e-06, "epoch": 20, "k-NN": {"10": {"top1": 53.63, "top5": 73.67}, "20": {"top1": 53.372, "top5": 76.126}, "100": {"top1": 50.418, "top5": 77.084}, "200": {"top1": 48.432, "top5": 76.132}}}
{"train_loss": 4.647263663291931, "train_entropy": 2.6619999133348466, "train_KL_div": 1.985263742864132, "train_lr": 1.4405395888701316, "train_wd": 1.000000000000015e-06, "epoch": 21}
{"train_loss": 4.631990493118763, "train_entropy": 2.669117329120636, "train_KL_div": 1.9628731496036054, "train_lr": 1.4299268169096957, "train_wd": 1.000000000000015e-06, "epoch": 22}
{"train_loss": 4.619528388857842, "train_entropy": 2.6794597724974154, "train_KL_div": 1.9400685989707709, "train_lr": 1.4184885828878597, "train_wd": 1.000000000000015e-06, "epoch": 23}
{"train_loss": 4.611837514638901, "train_entropy": 2.6918885149657727, "train_KL_div": 1.919948979064822, "train_lr": 1.4062388225308553, "train_wd": 1.000000000000015e-06, "epoch": 24}
{"train_loss": 4.603427241146565, "train_entropy": 2.7055979170501234, "train_KL_div": 1.8978293050229549, "train_lr": 1.39319246028475, "train_wd": 1.000000000000015e-06, "epoch": 25}
{"train_loss": 4.5979758430719375, "train_entropy": 2.72082040977478, "train_KL_div": 1.87715540663898, "train_lr": 1.3793653911322932, "train_wd": 1.000000000000015e-06, "epoch": 26}
{"train_loss": 4.593641734242439, "train_entropy": 2.7375692039728166, "train_KL_div": 1.8560725199133157, "train_lr": 1.3647744612273618, "train_wd": 1.000000000000015e-06, "epoch": 27}
{"train_loss": 4.595699313998223, "train_entropy": 2.757804524928331, "train_KL_div": 1.8378947650045157, "train_lr": 1.3494374473704784, "train_wd": 1.000000000000015e-06, "epoch": 28}
{"train_loss": 4.596212774693966, "train_entropy": 2.7804619541168214, "train_KL_div": 1.815750805452466, "train_lr": 1.3333730353505442, "train_wd": 1.000000000000015e-06, "epoch": 29}
{"train_loss": 4.598976962089538, "train_entropy": 2.8030542490184307, "train_KL_div": 1.795922696441412, "train_lr": 1.3166007971790659, "train_wd": 1.000000000000015e-06, "epoch": 30, "k-NN": {"10": {"top1": 58.68, "top5": 78.158}, "20": {"top1": 58.574, "top5": 80.422}, "100": {"top1": 55.572, "top5": 81.364}, "200": {"top1": 53.522, "top5": 80.32}}}
{"train_loss": 4.601598667144775, "train_entropy": 2.828574624478817, "train_KL_div": 1.7730240271687507, "train_lr": 1.299141167244701, "train_wd": 1.000000000000015e-06, "epoch": 31}
{"train_loss": 4.606501205146313, "train_entropy": 2.8556448546946047, "train_KL_div": 1.750856324851513, "train_lr": 1.2810154174170678, "train_wd": 1.000000000000015e-06, "epoch": 32}
{"train_loss": 4.618217970252037, "train_entropy": 2.884929783701897, "train_KL_div": 1.7332881642729043, "train_lr": 1.2622456311302719, "train_wd": 1.000000000000015e-06, "epoch": 33}
{"train_loss": 4.624960979044437, "train_entropy": 2.9153065445423127, "train_KL_div": 1.7096543975025416, "train_lr": 1.242854676477644, "train_wd": 1.000000000000015e-06, "epoch": 34}
{"train_loss": 4.634613274753094, "train_entropy": 2.946100403189659, "train_KL_div": 1.6885128435194492, "train_lr": 1.222866178350482, "train_wd": 1.000000000000015e-06, "epoch": 35}
{"train_loss": 4.647865023553371, "train_entropy": 2.9811553439497946, "train_KL_div": 1.6667096441090108, "train_lr": 1.2023044896547554, "train_wd": 1.000000000000015e-06, "epoch": 36}
{"train_loss": 4.661201903760433, "train_entropy": 3.016474710315466, "train_KL_div": 1.6447271532565355, "train_lr": 1.181194661640857, "train_wd": 1.000000000000015e-06, "epoch": 37}
{"train_loss": 4.6744207764863965, "train_entropy": 3.0524465629458426, "train_KL_div": 1.6219741736352444, "train_lr": 1.1595624133825075, "train_wd": 1.000000000000015e-06, "epoch": 38}
{"train_loss": 4.693183571636677, "train_entropy": 3.09200253623724, "train_KL_div": 1.6011809964329005, "train_lr": 1.1374341004420114, "train_wd": 1.000000000000015e-06, "epoch": 39}
{"train_loss": 4.711371740698814, "train_entropy": 3.1325233362019063, "train_KL_div": 1.578848370999098, "train_lr": 1.114836682760085, "train_wd": 1.000000000000015e-06, "epoch": 40, "k-NN": {"10": {"top1": 61.32, "top5": 80.242}, "20": {"top1": 61.088, "top5": 82.392}, "100": {"top1": 58.196, "top5": 83.298}, "200": {"top1": 56.278, "top5": 82.364}}}
{"train_loss": 4.730860756337643, "train_entropy": 3.175362041980028, "train_KL_div": 1.55549867272377, "train_lr": 1.0917976918093049, "train_wd": 1.000000000000015e-06, "epoch": 41}
{"train_loss": 4.754233127295971, "train_entropy": 3.220520591288805, "train_KL_div": 1.533712494507432, "train_lr": 1.0683451970512654, "train_wd": 1.000000000000015e-06, "epoch": 42}
{"train_loss": 4.78042931753397, "train_entropy": 3.269156540900469, "train_KL_div": 1.5112727368921042, "train_lr": 1.0445077717382412, "train_wd": 1.000000000000015e-06, "epoch": 43}
{"train_loss": 4.8067700149416925, "train_entropy": 3.3187265184819696, "train_KL_div": 1.4880434669405223, "train_lr": 1.0203144581011085, "train_wd": 1.000000000000015e-06, "epoch": 44}
{"train_loss": 4.837271986544132, "train_entropy": 3.3728389540314674, "train_KL_div": 1.4644330016374587, "train_lr": 0.9957947319658386, "train_wd": 1.000000000000015e-06, "epoch": 45}
{"train_loss": 4.867743356406689, "train_entropy": 3.427095182299614, "train_KL_div": 1.440648139283061, "train_lr": 0.9709784668417526, "train_wd": 1.000000000000015e-06, "epoch": 46}
{"train_loss": 4.902647614359855, "train_entropy": 3.485941599428654, "train_KL_div": 1.4167059880495072, "train_lr": 0.9458958975252506, "train_wd": 1.000000000000015e-06, "epoch": 47}
{"train_loss": 4.936469689190388, "train_entropy": 3.544396564155817, "train_KL_div": 1.392073102414608, "train_lr": 0.9205775832633725, "train_wd": 1.000000000000015e-06, "epoch": 48}
{"train_loss": 4.9753448957204816, "train_entropy": 3.606009334564209, "train_KL_div": 1.3693355347663163, "train_lr": 0.8950543705220573, "train_wd": 1.000000000000015e-06, "epoch": 49}
{"train_loss": 4.968914308726788, "train_entropy": 3.60816898432374, "train_KL_div": 1.3607453027963639, "train_lr": 0.8693573554044859, "train_wd": 1.000000000000015e-06, "epoch": 50, "k-NN": {"10": {"top1": 62.606, "top5": 81.258}, "20": {"top1": 62.51, "top5": 83.434}, "100": {"top1": 59.706, "top5": 84.204}, "200": {"top1": 57.664, "top5": 83.348}}}
{"train_loss": 4.963112980544567, "train_entropy": 3.6079727408289908, "train_KL_div": 1.3551402306109668, "train_lr": 0.8435178457652498, "train_wd": 1.000000000000015e-06, "epoch": 51}
{"train_loss": 4.955107469856739, "train_entropy": 3.605429495513439, "train_KL_div": 1.3496779587417842, "train_lr": 0.81756732306658, "train_wd": 1.000000000000015e-06, "epoch": 52}
{"train_loss": 4.947526503682137, "train_entropy": 3.6032412466406822, "train_KL_div": 1.3442852396517992, "train_lr": 0.7915374040230085, "train_wd": 1.000000000000015e-06, "epoch": 53}
{"train_loss": 4.939839230179786, "train_entropy": 3.5994246728420256, "train_KL_div": 1.340414534404874, "train_lr": 0.7654598020812908, "train_wd": 1.000000000000015e-06, "epoch": 54}
{"train_loss": 4.929350660145283, "train_entropy": 3.5949827539026735, "train_KL_div": 1.334367891728878, "train_lr": 0.739366288782445, "train_wd": 1.000000000000015e-06, "epoch": 55}
{"train_loss": 4.918193859159946, "train_entropy": 3.589451118350029, "train_KL_div": 1.3287427161484957, "train_lr": 0.7132886550530276, "train_wd": 1.000000000000015e-06, "epoch": 56}
{"train_loss": 4.908644145309925, "train_entropy": 3.5840909039676188, "train_KL_div": 1.324553216382861, "train_lr": 0.6872586724727882, "train_wd": 1.000000000000015e-06, "epoch": 57}
{"train_loss": 4.898883959114552, "train_entropy": 3.5798143591284752, "train_KL_div": 1.3190695780962705, "train_lr": 0.661308054565888, "train_wd": 1.000000000000015e-06, "epoch": 58}
{"train_loss": 4.887877831161022, "train_entropy": 3.5738284061849117, "train_KL_div": 1.3140494076013565, "train_lr": 0.6354684181628609, "train_wd": 1.000000000000015e-06, "epoch": 59}
{"train_loss": 4.876774698376655, "train_entropy": 3.5685866465270517, "train_KL_div": 1.308188029706478, "train_lr": 0.6097712448803728, "train_wd": 1.000000000000015e-06, "epoch": 60, "k-NN": {"10": {"top1": 63.514, "top5": 81.712}, "20": {"top1": 63.278, "top5": 83.788}, "100": {"top1": 60.484, "top5": 84.738}, "200": {"top1": 58.544, "top5": 83.858}}}
{"train_loss": 4.867280040442943, "train_entropy": 3.5640356072187425, "train_KL_div": 1.3032444142103194, "train_lr": 0.5842478427657233, "train_wd": 1.000000000000015e-06, "epoch": 61}
{"train_loss": 4.855790178358554, "train_entropy": 3.557568785637617, "train_KL_div": 1.2982213801592588, "train_lr": 0.5589293081528082, "train_wd": 1.000000000000015e-06, "epoch": 62}
{"train_loss": 4.847070850729942, "train_entropy": 3.553624374985695, "train_KL_div": 1.293446442693472, "train_lr": 0.5338464877760347, "train_wd": 1.000000000000015e-06, "epoch": 63}
{"train_loss": 4.8365606622695925, "train_entropy": 3.548783615887165, "train_KL_div": 1.2877770300209521, "train_lr": 0.5090299411883176, "train_wd": 1.000000000000015e-06, "epoch": 64}
{"train_loss": 4.826163962960243, "train_entropy": 3.54379882606864, "train_KL_div": 1.2823651144653558, "train_lr": 0.48450990352897855, "train_wd": 1.000000000000015e-06, "epoch": 65}
{"train_loss": 4.8161005507111545, "train_entropy": 3.538640930235386, "train_KL_div": 1.2774595837146043, "train_lr": 0.4603162486868846, "train_wd": 1.000000000000015e-06, "epoch": 66}
{"train_loss": 4.804622949123383, "train_entropy": 3.533218850046396, "train_KL_div": 1.2714040691405535, "train_lr": 0.4364784529037116, "train_wd": 1.000000000000015e-06, "epoch": 67}
{"train_loss": 4.795523640930653, "train_entropy": 3.529409927397966, "train_KL_div": 1.2661136866286398, "train_lr": 0.41302555886169284, "train_wd": 1.000000000000015e-06, "epoch": 68}
{"train_loss": 4.785011376559734, "train_entropy": 3.525017201423645, "train_KL_div": 1.2599941370785237, "train_lr": 0.38998614029957523, "train_wd": 1.000000000000015e-06, "epoch": 69}
{"train_loss": 4.775844187557698, "train_entropy": 3.5197538111507893, "train_KL_div": 1.2560903428643941, "train_lr": 0.36738826719992773, "train_wd": 1.000000000000015e-06, "epoch": 70, "k-NN": {"10": {"top1": 63.702, "top5": 81.938}, "20": {"top1": 63.478, "top5": 83.982}, "100": {"top1": 60.828, "top5": 84.75}, "200": {"top1": 58.816, "top5": 83.878}}}
{"train_loss": 4.766067259669304, "train_entropy": 3.5165095616281032, "train_KL_div": 1.2495576706379652, "train_lr": 0.34525947159018555, "train_wd": 1.000000000000015e-06, "epoch": 71}
{"train_loss": 4.757967102885246, "train_entropy": 3.511660830795765, "train_KL_div": 1.24630623729527, "train_lr": 0.32362671399911985, "train_wd": 1.000000000000015e-06, "epoch": 72}
{"train_loss": 4.747447923123836, "train_entropy": 3.50767432564497, "train_KL_div": 1.2397735749036074, "train_lr": 0.30251635060958443, "train_wd": 1.000000000000015e-06, "epoch": 73}
{"train_loss": 4.7385219835639, "train_entropy": 3.503481776356697, "train_KL_div": 1.2350401680916547, "train_lr": 0.2819541011475686, "train_wd": 1.000000000000015e-06, "epoch": 74}
{"train_loss": 4.729638677358627, "train_entropy": 3.500288457751274, "train_KL_div": 1.2293501847088337, "train_lr": 0.26196501754666823, "train_wd": 1.000000000000015e-06, "epoch": 75}
{"train_loss": 4.721902140021324, "train_entropy": 3.497050025075674, "train_KL_div": 1.224852084979415, "train_lr": 0.24257345342617007, "train_wd": 1.000000000000015e-06, "epoch": 76}
{"train_loss": 4.715184380650521, "train_entropy": 3.49575875338912, "train_KL_div": 1.2194255896955728, "train_lr": 0.2238030344199124, "train_wd": 1.000000000000015e-06, "epoch": 77}
{"train_loss": 4.707324535787105, "train_entropy": 3.492444661796093, "train_KL_div": 1.2148798391669988, "train_lr": 0.20567662939209372, "train_wd": 1.000000000000015e-06, "epoch": 78}
{"train_loss": 4.701352820754051, "train_entropy": 3.491047471255064, "train_KL_div": 1.2103053124994039, "train_lr": 0.18821632257508172, "train_wd": 1.000000000000015e-06, "epoch": 79}
{"train_loss": 4.694893266558648, "train_entropy": 3.489258858293295, "train_KL_div": 1.2056343666762113, "train_lr": 0.1714433866631785, "train_wd": 1.000000000000015e-06, "epoch": 80, "k-NN": {"10": {"top1": 63.842, "top5": 82.0}, "20": {"top1": 63.654, "top5": 83.898}, "100": {"top1": 60.766, "top5": 84.776}, "200": {"top1": 58.784, "top5": 83.946}}}
{"train_loss": 4.689299988627433, "train_entropy": 3.48850383323431, "train_KL_div": 1.2007961137667298, "train_lr": 0.1553782568951199, "train_wd": 1.000000000000015e-06, "epoch": 81}
{"train_loss": 4.684984575450421, "train_entropy": 3.488232976526022, "train_KL_div": 1.196751569621265, "train_lr": 0.14004050615688546, "train_wd": 1.000000000000015e-06, "epoch": 82}
{"train_loss": 4.67952576893568, "train_entropy": 3.4863946334719658, "train_KL_div": 1.1931311101168394, "train_lr": 0.1254488211351497, "train_wd": 1.000000000000015e-06, "epoch": 83}
{"train_loss": 4.674294273018837, "train_entropy": 3.486509329199791, "train_KL_div": 1.1877849027067422, "train_lr": 0.11162097955043498, "train_wd": 1.000000000000015e-06, "epoch": 84}
{"train_loss": 4.67085353410244, "train_entropy": 3.4866335148513317, "train_KL_div": 1.1842199771180748, "train_lr": 0.09857382849769691, "train_wd": 1.000000000000015e-06, "epoch": 85}
{"train_loss": 4.667921773612499, "train_entropy": 3.486697638005018, "train_KL_div": 1.1812240997850896, "train_lr": 0.0863232639207329, "train_wd": 1.000000000000015e-06, "epoch": 86}
{"train_loss": 4.663681262671948, "train_entropy": 3.486406087011099, "train_KL_div": 1.177275151245296, "train_lr": 0.07488421124542628, "train_wd": 1.000000000000015e-06, "epoch": 87}
{"train_loss": 4.6620805332064625, "train_entropy": 3.4878364756703375, "train_KL_div": 1.1742440226003528, "train_lr": 0.06427060719540977, "train_wd": 1.000000000000015e-06, "epoch": 88}
{"train_loss": 4.658433960437774, "train_entropy": 3.486937234252691, "train_KL_div": 1.1714966863766312, "train_lr": 0.054495382812318714, "train_wd": 1.000000000000015e-06, "epoch": 89}
{"train_loss": 4.655751956164837, "train_entropy": 3.487239393979311, "train_KL_div": 1.1685125179886817, "train_lr": 0.04557044770130586, "train_wd": 1.000000000000015e-06, "epoch": 90, "k-NN": {"10": {"top1": 63.758, "top5": 81.838}, "20": {"top1": 63.522, "top5": 83.82}, "100": {"top1": 60.678, "top5": 84.75}, "200": {"top1": 58.682, "top5": 83.844}}}
{"train_loss": 4.654251498639583, "train_entropy": 3.488241710752249, "train_KL_div": 1.1660097463428973, "train_lr": 0.03750667552102294, "train_wd": 1.000000000000015e-06, "epoch": 91}
{"train_loss": 4.653276021361351, "train_entropy": 3.4884550351798533, "train_KL_div": 1.1648209442570805, "train_lr": 0.030313890735742997, "train_wd": 1.000000000000015e-06, "epoch": 92}
{"train_loss": 4.651876353561878, "train_entropy": 3.488577649265528, "train_KL_div": 1.1632986719980836, "train_lr": 0.024000856645763145, "train_wd": 1.000000000000015e-06, "epoch": 93}
{"train_loss": 4.649052367269993, "train_entropy": 3.488668581187725, "train_KL_div": 1.1603837516382336, "train_lr": 0.018575264710673736, "train_wd": 1.000000000000015e-06, "epoch": 94}
{"train_loss": 4.6480642236471175, "train_entropy": 3.489142097502947, "train_KL_div": 1.1589220889136196, "train_lr": 0.014043725178499274, "train_wd": 1.000000000000015e-06, "epoch": 95}
{"train_loss": 4.64655362045765, "train_entropy": 3.4882131513655183, "train_KL_div": 1.158340444765985, "train_lr": 0.01041175903212961, "train_wd": 1.000000000000015e-06, "epoch": 96}
{"train_loss": 4.646539884030819, "train_entropy": 3.489233550876379, "train_KL_div": 1.1573062996640802, "train_lr": 0.007683791262852558, "train_wd": 1.000000000000015e-06, "epoch": 97}
{"train_loss": 4.645518612086773, "train_entropy": 3.4883745178878307, "train_KL_div": 1.1571440563127398, "train_lr": 0.005863145479183758, "train_wd": 1.000000000000015e-06, "epoch": 98}
{"train_loss": 4.644806251823902, "train_entropy": 3.488529181241989, "train_KL_div": 1.1562770416736603, "train_lr": 0.004952039857561666, "train_wd": 1.000000000000015e-06, "epoch": 99, "k-NN": {"10": {"top1": 63.828, "top5": 81.82}, "20": {"top1": 63.502, "top5": 83.824}, "100": {"top1": 60.658, "top5": 84.716}, "200": {"top1": 58.66, "top5": 83.846}}}
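log.txt is a JSON-lines file with one object per epoch, and a "k-NN" block every `eval_every` = 10 epochs. A small stdlib helper to pull the best top-1 accuracy out of it (field names taken from the entries above):

```python
import json

def best_knn_top1(lines, k="10"):
    """Return (epoch, top1) for the best k-NN top-1 accuracy in a JSON-lines log.
    Field names ('k-NN', 'top1', 'epoch') match the log entries above."""
    best = (None, float("-inf"))
    for line in lines:
        entry = json.loads(line)
        knn = entry.get("k-NN", {})
        if k in knn and knn[k]["top1"] > best[1]:
            best = (entry["epoch"], knn[k]["top1"])
    return best

# Usage: with open("log.txt") as f: print(best_knn_top1(f))
```

Note that on this log the final epoch is not the best one: k=10 top-1 peaks at 63.842 around epoch 80, slightly above the 63.828 reported for epoch 99.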