---
license: mit
library_name: open_clip
pipeline_tag: zero-shot-image-classification
---

[[Paper]](https://openreview.net/forum?id=e3scLKNiNg&noteId=e3scLKNiNg) [[GitHub]](https://github.com/fra31/perceptual-metrics)

Robust perceptual metric based on the CLIP model `laion/CLIP-ViT-B-16-laion2B-s34B-b88K`.

Adversarially fine-tuned with FARE ([Schlarmann et al. (2024)](https://arxiv.org/abs/2402.12336)) on ImageNet, using the infinity norm with radius 4/255.

Performance on the perceptual similarity task [NIGHTS](https://dreamsim-nights.github.io):

```
Clean    L-inf, eps=4/255    L2, eps=3
90.6     71.5                65.5
```

## Usage

```python
import open_clip

# Load the fine-tuned model and its image preprocessing transform from the Hugging Face Hub.
model, _, image_processor = open_clip.create_model_and_transforms('hf-hub:chs20/FARE4-ViT-B-16-laion2B-s34B-b88K')
```

## Citation

If you find this model useful, please consider citing our papers:

```bibtex
@inproceedings{croce2024adversarially,
    title={Adversarially Robust CLIP Models Induce Better (Robust) Perceptual Metrics},
    author={Croce, Francesco and Schlarmann, Christian and Singh, Naman Deep and Hein, Matthias},
    year={2024},
    booktitle={{ICML Workshop on Foundation Models in the Wild}}
}
```

```bibtex
@inproceedings{schlarmann2024robustclip,
    title={Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models},
    author={Schlarmann, Christian and Singh, Naman Deep and Croce, Francesco and Hein, Matthias},
    year={2024},
    booktitle={{ICML}}
}
```
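
## Example

The snippet under Usage only loads the model. As a rough sketch of how it might be used as a perceptual metric, the code below embeds three images and decides which of two candidates is closer to a reference, in the style of a two-alternative forced choice task such as NIGHTS. It assumes that similarity is measured as the cosine similarity between image embeddings; the helper names (`load_model`, `embed`, `two_afc_choice`) and the file names `ref.png`, `a.png`, `b.png` are hypothetical and not part of the released code.

```python
import torch
import torch.nn.functional as F

MODEL_ID = 'hf-hub:chs20/FARE4-ViT-B-16-laion2B-s34B-b88K'


def load_model():
    """Load the model and its image transform (hypothetical helper)."""
    # Imported lazily so the pure-tensor helper below works without open_clip.
    import open_clip
    model, _, image_processor = open_clip.create_model_and_transforms(MODEL_ID)
    model.eval()
    return model, image_processor


@torch.no_grad()
def embed(model, image_processor, path):
    """Return the L2-normalized embedding of the image at `path`, shape (1, d)."""
    from PIL import Image
    image = image_processor(Image.open(path).convert('RGB')).unsqueeze(0)
    return F.normalize(model.encode_image(image), dim=-1)


def two_afc_choice(ref, emb_a, emb_b):
    """Return 0 where A is at least as similar to the reference as B, else 1.

    Assumption: similarity is the cosine similarity of the embeddings.
    """
    sim_a = F.cosine_similarity(ref, emb_a, dim=-1)
    sim_b = F.cosine_similarity(ref, emb_b, dim=-1)
    return (sim_b > sim_a).long()


if __name__ == '__main__':
    model, image_processor = load_model()
    # Placeholder file names; replace with your own images.
    ref, a, b = (embed(model, image_processor, p) for p in ('ref.png', 'a.png', 'b.png'))
    print(two_afc_choice(ref, a, b).item())
```

Keeping `two_afc_choice` independent of the model means it can be checked on plain tensors before any weights are downloaded.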