The new Qwen2-VL models seem to perform quite well at object detection. You can prompt them to respond with bounding boxes in a 1000 x 1000 pixel reference frame and then scale those boxes to the original image size. You can try it out with my Space, maxiw/Qwen2-VL-Detection.
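
For anyone wondering what the scaling step looks like, here is a minimal sketch. It assumes the model returns each box as `[x1, y1, x2, y2]` in the 1000 x 1000 reference frame; that box layout, the `scale_box` helper, and the `ref_size` parameter are illustrative assumptions, not code taken from the Space.

```python
def scale_box(box, image_width, image_height, ref_size=1000):
    """Map a box from the ref_size x ref_size frame to original image pixels.

    Assumes box is (x1, y1, x2, y2) with each value in [0, ref_size].
    This layout is an assumption for illustration only.
    """
    x1, y1, x2, y2 = box
    # Each axis is scaled independently, since the original image
    # is usually not square.
    scale_x = image_width / ref_size
    scale_y = image_height / ref_size
    return (x1 * scale_x, y1 * scale_y, x2 * scale_x, y2 * scale_y)


# Hypothetical model output for an image that is 2000 px wide and 500 px tall.
scaled = scale_box((100, 200, 500, 1000), image_width=2000, image_height=500)
print(scaled)
```

Because the two axes are scaled separately, a box that looks square in the reference frame will generally not be square in the original image.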