Code to retrieve "points" like in the demo?
#1 by deepboothcells - opened
Could you provide code to retrieve the points when we ask Molmo to specifically show something in an image (like in the demo, in conjunction with Segment Anything)?
The points are returned as plain text in image coordinates, normalized to between 0 and 100. So it's just a matter of parsing them out and de-normalizing them. We can add more official code to do that, but for now you can use this:
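import re

import numpy as np


def extract_points(molmo_output, image_w, image_h):
    """Parse Molmo's point output and return a list of pixel-space [x, y] arrays."""
    all_points = []
    # Matches x="..." y="..." as well as numbered pairs such as x1="..." y1="...".
    for match in re.finditer(r'x\d*="\s*([0-9]+(?:\.[0-9]+)?)"\s+y\d*="\s*([0-9]+(?:\.[0-9]+)?)"', molmo_output):
        try:
            point = [float(match.group(i)) for i in range(1, 3)]
        except ValueError:
            pass
        else:
            point = np.array(point)
            if np.max(point) > 100:
                # Treat as an invalid output
                continue
            # De-normalize from the 0-100 range to pixel coordinates.
            point /= 100.0
            point = point * np.array([image_w, image_h])
            all_points.append(point)
    return all_points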
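To make the parsing and de-normalization concrete, here is a small worked example that uses only the standard library. It reuses the regular expression from the snippet above. The `sample_output` string is hand-written for illustration and is only an assumption about what the model's text looks like, and the 640x480 image size is likewise made up.

```python
import re

# Same pattern as in the snippet above: matches x="..." y="..." and
# numbered pairs such as x1="..." y1="...".
POINT_RE = re.compile(
    r'x\d*="\s*([0-9]+(?:\.[0-9]+)?)"\s+y\d*="\s*([0-9]+(?:\.[0-9]+)?)"'
)

# Hypothetical model output, written by hand (an assumption, not real output).
sample_output = '<points x1="25.0" y1="50.0" x2="75.0" y2="12.5" alt="cats">cats</points>'

# Hypothetical image size in pixels.
image_w, image_h = 640, 480

pixel_points = []
for match in POINT_RE.finditer(sample_output):
    x_norm, y_norm = float(match.group(1)), float(match.group(2))
    if max(x_norm, y_norm) > 100:
        # Values above 100 are outside the normalized range; skip them.
        continue
    # Scale from the 0-100 range to pixel coordinates.
    pixel_points.append((x_norm / 100.0 * image_w, y_norm / 100.0 * image_h))

print(pixel_points)
```

Each normalized value is divided by 100 and multiplied by the matching image dimension, so x is scaled by the width and y by the height.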
chrisc36 changed discussion status to closed