---
license: mit
datasets:
- HuggingFaceM4/flickr30k
language:
- en
library_name: transformers
pipeline_tag: image-to-text
---

# CLIP

In the paper "Learning Transferable Visual Models From Natural Language Supervision," OpenAI introduces CLIP, short for Contrastive Language-Image Pre-training. The model learns how sentences and images are related: during training it must match each image in a batch with the caption that actually describes it. What sets CLIP apart is that it is trained on complete sentences rather than single category labels such as "car" or "dog," which lets it pick up richer patterns linking images and text. Trained on a large dataset of image-text pairs, CLIP can also act as a zero-shot classifier, reaching accuracy competitive with models trained directly on ImageNet without using any of their labeled examples. The paper covers the details and results in depth.
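
To make the matching objective concrete, the snippet below is a minimal sketch of the symmetric contrastive loss described in the paper, not this checkpoint's actual training code; `image_embeds`, `text_embeds`, and `temperature` are illustrative placeholders.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_embeds, text_embeds, temperature=0.07):
    # Assumes L2-normalized embeddings of shape (batch, dim), so the
    # dot product is cosine similarity between every image and every text.
    logits = image_embeds @ text_embeds.t() / temperature
    # The matching caption for image i sits at column i.
    targets = torch.arange(logits.size(0))
    # Symmetric cross-entropy: classify texts per image and images per text.
    loss_i = F.cross_entropy(logits, targets)
    loss_t = F.cross_entropy(logits.t(), targets)
    return (loss_i + loss_t) / 2
```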

## Usage

```python
from PIL import Image
import requests

from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("SRDdev/CLIP")
processor = CLIPProcessor.from_pretrained("SRDdev/CLIP")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1)  # we can take the softmax to get the label probabilities
```
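
Beyond classification, the same checkpoint can embed images and text into the shared space for retrieval. A small sketch, assuming `model`, `processor`, and `image` are loaded as above; the captions are illustrative.

```python
import torch

# Embed a batch of captions and one image, then rank the captions
# by cosine similarity to the image.
captions = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
text_inputs = processor(text=captions, return_tensors="pt", padding=True)
image_inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    text_embeds = model.get_text_features(**text_inputs)
    image_embeds = model.get_image_features(**image_inputs)

# Normalize so the dot product equals cosine similarity.
text_embeds = text_embeds / text_embeds.norm(dim=-1, keepdim=True)
image_embeds = image_embeds / image_embeds.norm(dim=-1, keepdim=True)

similarity = (image_embeds @ text_embeds.T).squeeze(0)
print(captions[similarity.argmax().item()])  # best-matching caption
```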