
How to Use this Model for Zero-Shot Image Classification?

#2 opened by eclipticwonder

Hi,

How can I use this model for zero-shot image classification? Could you provide sample code?

Hi,
Thanks for your interest. Here is an example of loading the model and running zero-shot classification:

import torch
from PIL import Image

import open_clip
from huggingface_hub import hf_hub_download

# Download a TiC-CLIP checkpoint and load it into an open_clip ViT-B-16 model
filename = hf_hub_download(repo_id="apple/TiC-CLIP-bestpool-cumulative", filename="checkpoints/2016.pt")
model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-16', filename)
tokenizer = open_clip.get_tokenizer('ViT-B-16')

# Preprocess one image and tokenize the candidate class prompts
image = preprocess(Image.open("image.png").convert('RGB')).unsqueeze(0)
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad(), torch.cuda.amp.autocast():
    # Embed the image and the prompts, then L2-normalize both
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    # Cosine similarities scaled by 100, softmaxed over the candidate prompts
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)
fartashf changed discussion status to closed

Please note that these models are released to facilitate research on continual learning. Refer to the model card for more code examples.
