Model card for regnety_320.seer_ft_in1k

A RegNetY-32GF image classification model, pretrained with the SEER recipe (self-supervised learning with SwAV on "2B random internet images") and fine-tuned on ImageNet-1k by the paper authors.

SEER is licensed under the SEER license, Copyright (c) Meta Platforms, Inc. All Rights Reserved. It is a non-commercial license with usage and distribution restrictions.

The timm RegNet implementation includes a number of enhancements not present in other implementations, including:

  • stochastic depth
  • gradient checkpointing
  • layer-wise LR decay
  • configurable output stride (dilation)
  • configurable activation and norm layers
  • option for a pre-activation bottleneck block, used in the RegNetV variant
  • only known RegNetZ model definitions with pretrained weights
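Of the features listed, layer-wise LR decay is the easiest to show in isolation: layers nearer the input train with a smaller learning rate than layers nearer the classifier head. The sketch below shows only the general idea. The `layer_lrs` helper, the base rate, and the decay factor are illustrative assumptions, and timm's own parameter grouping may differ.

```python
def layer_lrs(base_lr, decay, num_layers):
    """Learning rate per layer, index 0 = closest to the input.

    The last layer keeps base_lr; each step toward the input
    multiplies the rate by `decay` once more.
    """
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


# Hypothetical settings: 4 layer groups, each earlier group at half the rate.
lrs = layer_lrs(base_lr=1e-3, decay=0.5, num_layers=4)
for i, lr in enumerate(lrs):
    print(f"layer {i}: lr={lr:g}")
```

A decay of 1.0 gives every layer the same rate, which is the usual fine-tuning setup without layer-wise decay.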

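Model Details

  • Model type: image classification / feature backbone
  • Params (M): 145.05
  • GMACs: 95.0
  • Activations (M): 88.87
  • Image size: 384 x 384
  • Papers: Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision (arXiv:2202.08360); Designing Network Design Spaces (CVPR 2020)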

Model Usage

Image Classification

from urllib.request import urlopen
from PIL import Image
import timm
import torch  # needed for torch.topk below

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('regnety_320.seer_ft_in1k', pretrained=True)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
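
The last line turns raw logits into percentages and keeps the five largest. To make that step concrete without downloading weights, here is a minimal standard-library sketch of the same arithmetic. The `logits` values and the `softmax_percent` and `top_k` helpers are invented for illustration and are not part of timm or PyTorch.

```python
import math


def softmax_percent(logits):
    """Convert raw scores to percentages that sum to 100."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [100.0 * e / total for e in exps]


def top_k(values, k):
    """Return (values, indices) of the k largest entries, largest first."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in order], order


# Hypothetical logits for a 6-class toy problem (the real model emits 1000).
logits = [0.5, 2.0, -1.0, 3.0, 0.0, 1.0]
probs = softmax_percent(logits)
top_probs, top_indices = top_k(probs, k=5)

for idx, p in zip(top_indices, top_probs):
    print(f"class {idx}: {p:.2f}%")
```

With the real model, the indices refer to ImageNet-1k classes, so a label lookup is still needed to get human-readable names.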

Feature Map Extraction

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_320.seer_ft_in1k',
    pretrained=True,
    features_only=True,
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

for o in output:
    # print shape of each feature map in output
    # e.g.:
    #  torch.Size([1, 32, 192, 192])
    #  torch.Size([1, 232, 96, 96])
    #  torch.Size([1, 696, 48, 48])
    #  torch.Size([1, 1392, 24, 24])
    #  torch.Size([1, 3712, 12, 12])

    print(o.shape)
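
The example shapes above are consistent with a 384 x 384 input reduced by factors of 2, 4, 8, 16, and 32. The sketch below only reproduces that arithmetic. The reduction list is inferred from the printed shapes rather than read from the model, and `expected_shapes` is a hypothetical helper.

```python
# Channel counts copied from the example shapes above.
channels = [32, 232, 696, 1392, 3712]
# Assumed reduction factors, inferred from 384 -> 192 -> 96 -> 48 -> 24 -> 12.
reductions = [2, 4, 8, 16, 32]


def expected_shapes(img_size, batch_size=1):
    """Predict (N, C, H, W) for each feature map of a square input."""
    return [
        (batch_size, c, img_size // r, img_size // r)
        for c, r in zip(channels, reductions)
    ]


for shape in expected_shapes(384):
    print(shape)
```

In timm, models created with `features_only=True` also expose `model.feature_info.channels()` and `model.feature_info.reduction()`, which report these values directly.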

Image Embeddings

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_320.seer_ft_in1k',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor

# or equivalently (without needing to set num_classes=0)

output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, 3712, 12, 12) shaped tensor

output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor
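
One common use of pooled embeddings is comparing images by cosine similarity. The following is a minimal standard-library sketch, assuming each embedding has already been converted to a plain list of floats. The short vectors and the `cosine_similarity` helper are made up for illustration; real embeddings from this model would be much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional embeddings standing in for model outputs.
emb_a = [1.0, 2.0, 0.0, 1.0]
emb_b = [2.0, 4.0, 0.0, 2.0]  # same direction as emb_a
emb_c = [0.0, 0.0, 3.0, 0.0]  # orthogonal to emb_a

print(cosine_similarity(emb_a, emb_b))
print(cosine_similarity(emb_a, emb_c))
```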

Model Comparison

Explore the dataset and runtime metrics of this model in timm model results.

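For the comparison summary below, the ra_in1k, ra3_in1k, ch_in1k, sw_*, and lion_* tagged weights are trained in timm. The top1 and top5 columns are ImageNet-1k accuracy in percent, param_count and macts are in millions, and gmacs is in billions of multiply-accumulate operations.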

model img_size top1 top5 param_count gmacs macts
regnety_1280.swag_ft_in1k 384 88.228 98.684 644.81 374.99 210.2
regnety_320.swag_ft_in1k 384 86.84 98.364 145.05 95.0 88.87
regnety_160.swag_ft_in1k 384 86.024 98.05 83.59 46.87 67.67
regnety_160.sw_in12k_ft_in1k 288 86.004 97.83 83.59 26.37 38.07
regnety_1280.swag_lc_in1k 224 85.996 97.848 644.81 127.66 71.58
regnety_160.lion_in12k_ft_in1k 288 85.982 97.844 83.59 26.37 38.07
regnety_160.sw_in12k_ft_in1k 224 85.574 97.666 83.59 15.96 23.04
regnety_160.lion_in12k_ft_in1k 224 85.564 97.674 83.59 15.96 23.04
regnety_120.sw_in12k_ft_in1k 288 85.398 97.584 51.82 20.06 35.34
regnety_2560.seer_ft_in1k 384 85.15 97.436 1282.6 747.83 296.49
regnetz_e8.ra3_in1k 320 85.036 97.268 57.7 15.46 63.94
regnety_120.sw_in12k_ft_in1k 224 84.976 97.416 51.82 12.14 21.38
regnety_320.swag_lc_in1k 224 84.56 97.446 145.05 32.34 30.26
regnetz_040_h.ra3_in1k 320 84.496 97.004 28.94 6.43 37.94
regnetz_e8.ra3_in1k 256 84.436 97.02 57.7 9.91 40.94
regnety_1280.seer_ft_in1k 384 84.432 97.092 644.81 374.99 210.2
regnetz_040.ra3_in1k 320 84.246 96.93 27.12 6.35 37.78
regnetz_d8.ra3_in1k 320 84.054 96.992 23.37 6.19 37.08
regnetz_d8_evos.ch_in1k 320 84.038 96.992 23.46 7.03 38.92
regnetz_d32.ra3_in1k 320 84.022 96.866 27.58 9.33 37.08
regnety_080.ra3_in1k 288 83.932 96.888 39.18 13.22 29.69
regnety_640.seer_ft_in1k 384 83.912 96.924 281.38 188.47 124.83
regnety_160.swag_lc_in1k 224 83.778 97.286 83.59 15.96 23.04
regnetz_040_h.ra3_in1k 256 83.776 96.704 28.94 4.12 24.29
regnetv_064.ra3_in1k 288 83.72 96.75 30.58 10.55 27.11
regnety_064.ra3_in1k 288 83.718 96.724 30.58 10.56 27.11
regnety_160.deit_in1k 288 83.69 96.778 83.59 26.37 38.07
regnetz_040.ra3_in1k 256 83.62 96.704 27.12 4.06 24.19
regnetz_d8.ra3_in1k 256 83.438 96.776 23.37 3.97 23.74
regnetz_d32.ra3_in1k 256 83.424 96.632 27.58 5.98 23.74
regnetz_d8_evos.ch_in1k 256 83.36 96.636 23.46 4.5 24.92
regnety_320.seer_ft_in1k 384 83.35 96.71 145.05 95.0 88.87
regnetv_040.ra3_in1k 288 83.204 96.66 20.64 6.6 20.3
regnety_320.tv2_in1k 224 83.162 96.42 145.05 32.34 30.26
regnety_080.ra3_in1k 224 83.16 96.486 39.18 8.0 17.97
regnetv_064.ra3_in1k 224 83.108 96.458 30.58 6.39 16.41
regnety_040.ra3_in1k 288 83.044 96.5 20.65 6.61 20.3
regnety_064.ra3_in1k 224 83.02 96.292 30.58 6.39 16.41
regnety_160.deit_in1k 224 82.974 96.502 83.59 15.96 23.04
regnetx_320.tv2_in1k 224 82.816 96.208 107.81 31.81 36.3
regnety_032.ra_in1k 288 82.742 96.418 19.44 5.29 18.61
regnety_160.tv2_in1k 224 82.634 96.22 83.59 15.96 23.04
regnetz_c16_evos.ch_in1k 320 82.634 96.472 13.49 3.86 25.88
regnety_080_tv.tv2_in1k 224 82.592 96.246 39.38 8.51 19.73
regnetx_160.tv2_in1k 224 82.564 96.052 54.28 15.99 25.52
regnetz_c16.ra3_in1k 320 82.51 96.358 13.46 3.92 25.88
regnetv_040.ra3_in1k 224 82.44 96.198 20.64 4.0 12.29
regnety_040.ra3_in1k 224 82.304 96.078 20.65 4.0 12.29
regnetz_c16.ra3_in1k 256 82.16 96.048 13.46 2.51 16.57
regnetz_c16_evos.ch_in1k 256 81.936 96.15 13.49 2.48 16.57
regnety_032.ra_in1k 224 81.924 95.988 19.44 3.2 11.26
regnety_032.tv2_in1k 224 81.77 95.842 19.44 3.2 11.26
regnetx_080.tv2_in1k 224 81.552 95.544 39.57 8.02 14.06
regnetx_032.tv2_in1k 224 80.924 95.27 15.3 3.2 11.37
regnety_320.pycls_in1k 224 80.804 95.246 145.05 32.34 30.26
regnetz_b16.ra3_in1k 288 80.712 95.47 9.72 2.39 16.43
regnety_016.tv2_in1k 224 80.66 95.334 11.2 1.63 8.04
regnety_120.pycls_in1k 224 80.37 95.12 51.82 12.14 21.38
regnety_160.pycls_in1k 224 80.288 94.964 83.59 15.96 23.04
regnetx_320.pycls_in1k 224 80.246 95.01 107.81 31.81 36.3
regnety_080.pycls_in1k 224 79.882 94.834 39.18 8.0 17.97
regnetz_b16.ra3_in1k 224 79.872 94.974 9.72 1.45 9.95
regnetx_160.pycls_in1k 224 79.862 94.828 54.28 15.99 25.52
regnety_064.pycls_in1k 224 79.716 94.772 30.58 6.39 16.41
regnetx_120.pycls_in1k 224 79.592 94.738 46.11 12.13 21.37
regnetx_016.tv2_in1k 224 79.44 94.772 9.19 1.62 7.93
regnety_040.pycls_in1k 224 79.23 94.654 20.65 4.0 12.29
regnetx_080.pycls_in1k 224 79.198 94.55 39.57 8.02 14.06
regnetx_064.pycls_in1k 224 79.064 94.454 26.21 6.49 16.37
regnety_032.pycls_in1k 224 78.884 94.412 19.44 3.2 11.26
regnety_008_tv.tv2_in1k 224 78.654 94.388 6.43 0.84 5.42
regnetx_040.pycls_in1k 224 78.482 94.24 22.12 3.99 12.2
regnetx_032.pycls_in1k 224 78.178 94.08 15.3 3.2 11.37
regnety_016.pycls_in1k 224 77.862 93.73 11.2 1.63 8.04
regnetx_008.tv2_in1k 224 77.302 93.672 7.26 0.81 5.15
regnetx_016.pycls_in1k 224 76.908 93.418 9.19 1.62 7.93
regnety_008.pycls_in1k 224 76.296 93.05 6.26 0.81 5.25
regnety_004.tv2_in1k 224 75.592 92.712 4.34 0.41 3.89
regnety_006.pycls_in1k 224 75.244 92.518 6.06 0.61 4.33
regnetx_008.pycls_in1k 224 75.042 92.342 7.26 0.81 5.15
regnetx_004_tv.tv2_in1k 224 74.57 92.184 5.5 0.42 3.17
regnety_004.pycls_in1k 224 74.018 91.764 4.34 0.41 3.89
regnetx_006.pycls_in1k 224 73.862 91.67 6.2 0.61 3.98
regnetx_004.pycls_in1k 224 72.38 90.832 5.16 0.4 3.14
regnety_002.pycls_in1k 224 70.282 89.534 3.16 0.2 2.17
regnetx_002.pycls_in1k 224 68.752 88.556 2.68 0.2 2.16
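
Every regnety_320 row in the table shares the same architecture and parameter count, so those rows differ only in training recipe and evaluation resolution. The sketch below copies those five rows and ranks them by top-1 accuracy relative to this model. The tuple layout and variable names are illustrative choices, not a timm API.

```python
# (model, img_size, top1, top5, param_count, gmacs, macts) copied from the table.
rows = [
    ("regnety_320.swag_ft_in1k", 384, 86.84, 98.364, 145.05, 95.0, 88.87),
    ("regnety_320.swag_lc_in1k", 224, 84.56, 97.446, 145.05, 32.34, 30.26),
    ("regnety_320.seer_ft_in1k", 384, 83.35, 96.71, 145.05, 95.0, 88.87),
    ("regnety_320.tv2_in1k", 224, 83.162, 96.42, 145.05, 32.34, 30.26),
    ("regnety_320.pycls_in1k", 224, 80.804, 95.246, 145.05, 32.34, 30.26),
]

# The row for the model this card describes.
seer = next(r for r in rows if r[0] == "regnety_320.seer_ft_in1k")

# Rank by top-1 and show each row's gap to the SEER fine-tuned weights.
ranked = sorted(rows, key=lambda r: r[2], reverse=True)
for name, img_size, top1, *_ in ranked:
    print(f"{name:28s} {img_size:4d} {top1:7.3f} {top1 - seer[2]:+.3f}")
```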

Citation

@article{goyal2022vision,
  title={Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision}, 
  author={Priya Goyal and Quentin Duval and Isaac Seessel and Mathilde Caron and Ishan Misra and Levent Sagun and Armand Joulin and Piotr Bojanowski},
  year={2022},
  eprint={2202.08360},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
@InProceedings{Radosavovic2020,
  title = {Designing Network Design Spaces},
  author = {Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Doll{\'a}r},
  booktitle = {CVPR},
  year = {2020}
}
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}