Trick or ResNet Treat
A small 🎃 treat: I just uploaded a few small ResNets trained like they've never been trained before. At a user's request, I threw a recent hparam set (MobileNet-v4 Conv Small x ResNet Strikes Back / timm, `ra4` in the tables) at the 'Basic Block' ResNet-18 & 34, including the V2 (pre-activation) variants.
The results were good! ResNet-18s at 73-74% and 34s at 77-78%, oh my!
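If you want to see exactly what landed, here is a quick sketch for enumerating the new weights. It assumes a recent `timm` release that lists pretrained tags alongside model names; the `resnet*ra4*` pattern is just my own filter for illustration.

```python
import timm

# List pretrained ResNet weights whose tag contains 'ra4' (the new recipe).
# Assumes a recent timm release with these weights registered on the HF Hub.
for name in timm.list_models('resnet*ra4*', pretrained=True):
    print(name)  # e.g. resnet18d.ra4_e3600_r224_in1k, resnetv2_34d.ra4_e3600_r224_in1k, ...
```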
See the table below for some context. I included some past 'best' ResNet results:
- ResNet Strikes Back (`a1`, `a1h`)
- torchvision 'Batteries Included' ResNets (`tv2`) that followed RSB
- O.G. torchvision (`tv`) ResNets in the 18-50 range
I did actually train a 'D' variant ResNet-50 with similar `ra4` hparams, but it didn't improve on past results by quite as much; it likely needs further hparam tweaks and more augreg.
model | img_size | top1 | top5 | param_count (M) |
---|---|---|---|---|
resnet50d.ra4_e3600_r224_in1k | 224 | 80.958 | 95.372 | 25.58 |
resnet50.tv2_in1k | 224 | 80.856 | 95.43 | 25.56 |
resnet50d.a1_in1k | 224 | 80.686 | 94.712 | 25.58 |
resnet50.a1h_in1k | 224 | 80.662 | 95.306 | 25.56 |
resnet50.a1_in1k | 224 | 80.368 | 94.59 | 25.56 |
resnetv2_34d.ra4_e3600_r224_in1k | 224 | 78.268 | 93.956 | 21.82 |
resnetv2_34.ra4_e3600_r224_in1k | 224 | 77.636 | 93.528 | 21.8 |
resnet34.ra4_e3600_r224_in1k | 224 | 77.448 | 93.502 | 21.8 |
resnet34.a1_in1k | 224 | 76.428 | 92.88 | 21.8 |
resnet50.tv_in1k | 224 | 76.128 | 92.858 | 25.56 |
resnetv2_18d.ra4_e3600_r224_in1k | 224 | 74.412 | 91.928 | 11.71 |
resnet18d.ra4_e3600_r224_in1k | 224 | 74.324 | 91.832 | 11.71 |
resnetv2_18.ra4_e3600_r224_in1k | 224 | 73.578 | 91.352 | 11.69 |
resnet34.tv_in1k | 224 | 73.316 | 91.422 | 21.8 |
resnet18.a1_in1k | 224 | 71.49 | 90.076 | 11.69 |
resnet18.tv_in1k | 224 | 69.758 | 89.074 | 11.69 |
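If you want to kick the tires on one of these, here is a minimal inference sketch using the standard `timm` pretrained-config flow. The model name is taken from the table above; `cat.jpg` is a placeholder path for your own image.

```python
import torch
import timm
from PIL import Image

# Load one of the new ra4 weights from the HF Hub via timm.
model = timm.create_model('resnet18d.ra4_e3600_r224_in1k', pretrained=True).eval()

# Build the eval transform from the model's pretrained data config (224x224 here).
data_cfg = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**data_cfg, is_training=False)

img = Image.open('cat.jpg').convert('RGB')  # placeholder image path
with torch.no_grad():
    logits = model(transform(img).unsqueeze(0))
top5_prob, top5_idx = logits.softmax(dim=-1).topk(5)
print(top5_idx, top5_prob)
```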
The new weights all scale quite nicely to higher resolutions at inference time. Some points of interest here: the `tv2` and `a1h` ResNet-50s were trained at 176x176 resolution, so by evaluating at 224x224 they were attempting to hit the 'peak' of the train-test resolution discrepancy (https://arxiv.org/abs/1906.06423). When I was working on the RSB recipe I did not want to sacrifice higher-res scaling by trying to bag that peak for 224x224 eval; only the low-cost `a3` recipe trained at a lower res. You can see that in this 288x288 table: the `a1` RSB weights have more to give, the `tv2` weights are already on the downslope, and `a1h` is just past its peak. The new `ra4` weights are res-scaling champs and still have a bit more to give.
model | img_size | top1 | top5 | param_count (M) |
---|---|---|---|---|
resnet50d.ra4_e3600_r224_in1k | 288 | 81.812 | 95.91 | 25.58 |
resnet50d.a1_in1k | 288 | 81.45 | 95.216 | 25.58 |
resnet50.a1_in1k | 288 | 81.232 | 95.108 | 25.56 |
resnet50.a1h_in1k | 288 | 80.914 | 95.516 | 25.56 |
resnet50.tv2_in1k | 288 | 80.87 | 95.646 | 25.56 |
resnetv2_34d.ra4_e3600_r224_in1k | 288 | 79.59 | 94.77 | 21.82 |
resnetv2_34.ra4_e3600_r224_in1k | 288 | 79.072 | 94.566 | 21.8 |
resnet34.ra4_e3600_r224_in1k | 288 | 78.952 | 94.45 | 21.8 |
resnet34.a1_in1k | 288 | 77.91 | 93.768 | 21.8 |
resnet50.tv_in1k | 288 | 77.252 | 93.606 | 25.56 |
resnetv2_18d.ra4_e3600_r224_in1k | 288 | 76.044 | 93.02 | 11.71 |
resnet18d.ra4_e3600_r224_in1k | 288 | 76.024 | 92.78 | 11.71 |
resnetv2_18.ra4_e3600_r224_in1k | 288 | 75.34 | 92.678 | 11.69 |
resnet34.tv_in1k | 288 | 74.8 | 92.356 | 21.8 |
resnet18.a1_in1k | 288 | 73.152 | 91.036 | 11.69 |
resnet18.tv_in1k | 288 | 71.274 | 90.244 | 11.69 |
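For the higher-res numbers, nothing about the weights changes; these ResNets are fully convolutional with global pooling, so only the eval transform needs to change. A minimal sketch follows; the 288 override and `crop_pct = 1.0` are my assumptions for illustration, not necessarily the exact eval settings behind the table.

```python
import torch
import timm
from PIL import Image

# Same 224-trained weights, just evaluated with a larger input transform.
model = timm.create_model('resnet50d.ra4_e3600_r224_in1k', pretrained=True).eval()

data_cfg = timm.data.resolve_model_data_config(model)
data_cfg['input_size'] = (3, 288, 288)  # override the pretrained 224x224 config
data_cfg['crop_pct'] = 1.0              # assumption: use the full image at the higher res
transform = timm.data.create_transform(**data_cfg, is_training=False)

x = transform(Image.open('cat.jpg').convert('RGB')).unsqueeze(0)  # placeholder image path
with torch.no_grad():
    probs = model(x).softmax(dim=-1)
print(probs.topk(5))
```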