metadata

license: cc-by-nc-4.0
library_name: transformers
datasets:
  - imagenet-1k

Hiera mae_in1k_ft_in1k

This is the Hiera model mae_in1k_ft_in1k (https://github.com/facebookresearch/hiera), converted to the transformers format.

Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

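from PIL import Image
import torch
from transformers import AutoModelForImageClassification, AutoImageProcessor

REPO = "p1atdev/hiera_mae_in1k_ft_in1k"

# trust_remote_code=True lets transformers run the model code shipped in the repository.
processor = AutoImageProcessor.from_pretrained(REPO)
model = AutoModelForImageClassification.from_pretrained(REPO, trust_remote_code=True)

# Convert to RGB so that images with an alpha channel or a palette also work.
image = Image.open("image.png").convert("RGB")
inputs = processor(image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Index of the highest-scoring ImageNet-1k class for the single input image.
print(outputs.logits.argmax(dim=-1).item())
# e.g. 207 (golden retriever in ImageNet-1k)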
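
The snippet above prints only the single best class index. To inspect several candidates with their probabilities, the logits can be passed through a softmax and ranked. The following is a minimal sketch that uses only the standard library. The helper name `top_k` and the toy logits and labels are illustrative assumptions, not part of this repository.

```python
import math


def top_k(logits, k=5, id2label=None):
    """Return the k most probable (class_id, label, probability) triples.

    `logits` is a flat list of floats for one image. `id2label` is an
    optional mapping from class index to a readable name (hypothetical here).
    """
    # Numerically stable softmax: subtract the maximum before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank class indices by descending probability and keep the first k.
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    id2label = id2label or {}
    return [(i, id2label.get(i, str(i)), probs[i]) for i in ranked]


# Toy values standing in for `outputs.logits[0].tolist()` (assumption).
toy_logits = [0.1, 2.0, -1.0, 3.5]
toy_labels = {1: "class_b", 3: "class_d"}

result = top_k(toy_logits, k=2, id2label=toy_labels)
for class_id, label, prob in result:
    print(class_id, label, round(prob, 3))
```

With the real model, one would call `top_k(outputs.logits[0].tolist())`. If the checkpoint's config happens to include label names, `model.config.id2label` could be passed as `id2label`; whether it does for this repository is not stated here, so the helper falls back to the numeric index.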