maxiw committed
Commit: 45a615f
Parent: 1570003

Update README.md

Files changed (1):
  1. README.md +34 -1
README.md CHANGED
@@ -73,7 +73,40 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 
 Use the code below to get started with the model.
 
-[More Information Needed]
+
+```python
+import torch
+from PIL import Image
+from transformers import AutoProcessor, AutoModelForCausalLM
+
+device = "cuda:0" if torch.cuda.is_available() else "cpu"
+torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
+
+model = AutoModelForCausalLM.from_pretrained("AskUI/PTA-1", torch_dtype=torch_dtype, trust_remote_code=True).to(device)
+processor = AutoProcessor.from_pretrained("AskUI/PTA-1", trust_remote_code=True)
+
+task_prompt = "<OPEN_VOCABULARY_DETECTION>"
+prompt = task_prompt + "description of the target element"
+
+image = Image.open("path to screenshot")
+
+inputs = processor(text=prompt, images=image, return_tensors="pt").to(device, torch_dtype)
+
+generated_ids = model.generate(
+    input_ids=inputs["input_ids"],
+    pixel_values=inputs["pixel_values"],
+    max_new_tokens=1024,
+    do_sample=False,
+    num_beams=3,
+)
+generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
+
+parsed_answer = processor.post_process_generation(generated_text, task="<OPEN_VOCABULARY_DETECTION>", image_size=(image.width, image.height))
+
+print(parsed_answer)
+```
+
+
+
 
 ## Training Details
 
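
For orientation, the `parsed_answer` produced by `processor.post_process_generation` for the `<OPEN_VOCABULARY_DETECTION>` task is typically a dict keyed by the task token. The sketch below, continuing from the snippet in the diff above, shows one way to pull out the first detected bounding box; the `bboxes` and `bboxes_labels` keys are an assumption based on Florence-2-style post-processing and may differ for this model.

```python
# Minimal sketch, assuming a Florence-2-style result such as
# {"<OPEN_VOCABULARY_DETECTION>": {"bboxes": [[x1, y1, x2, y2], ...], "bboxes_labels": ["...", ...]}}
result = parsed_answer.get("<OPEN_VOCABULARY_DETECTION>", {})
bboxes = result.get("bboxes", [])
labels = result.get("bboxes_labels", [])

if bboxes:
    x1, y1, x2, y2 = bboxes[0]
    # Center point of the first detected element, e.g. as a click target in screenshot coordinates.
    print(labels[0] if labels else "element", ((x1 + x2) / 2, (y1 + y2) / 2))
else:
    print("No element detected for the given description.")
```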