gokaygokay committed 07af1ac (parent: a73fee0): Update README.md
---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: image-text-to-text
tags:
- art
---

A fine-tuned version of PaliGemma (224x224) trained on image-prompt pairs.

```
pip install git+https://github.com/huggingface/transformers
```

```python
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from PIL import Image
import requests
import torch

model_id = "gokaygokay/SDXL-Captioner"

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true"
image = Image.open(requests.get(url, stream=True).raw)

model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).to('cuda').eval()
processor = AutoProcessor.from_pretrained(model_id)

# PaliGemma expects a task prefix; "caption en" requests an English caption.
prompt = "caption en"
model_inputs = processor(text=prompt, images=image, return_tensors="pt").to('cuda')
input_len = model_inputs["input_ids"].shape[-1]

with torch.inference_mode():
    generation = model.generate(**model_inputs, max_new_tokens=256, do_sample=False)
    # Drop the echoed prompt tokens so only the newly generated caption is decoded.
    generation = generation[0][input_len:]
    decoded = processor.decode(generation, skip_special_tokens=True)
    print(decoded)
```