Is this ready to be used with the transformers library?

#1
by MihaiATK - opened

@danaaubakirova

Is this ready to be used with the transformers library?

If yes, please add examples and instructions showing what the prompt should look like for multiple images.

Also, I noticed that the original training script has a crop_anchors argument. It's not clear to me what it's for, or whether I need to alter my training document images in any way before processing. Do you happen to know? If you provide instructions for training with the transformers library, please mention how to handle it.

Thank you!

Ah, I found more info by searching the transformers repo: https://github.com/huggingface/transformers/pull/31792

Hello Mihaiii,

No, you don't need to preprocess the images in advance. You will be able to set this on the image processor in transformers with do_anchor_resize=True and do_adaptive_crop=True, which will prepare the crops for training. These options correspond to the Shape-Adaptive Cropping Module, which was introduced in the original paper to handle images of varying aspect ratios and resolutions.
The model will be available in transformers soon.
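
For anyone curious what the cropping step does conceptually, here is a rough sketch in plain Python. It is not the transformers implementation and not the original training code: the grid list, the 448-pixel crop size, and the function names `select_anchor` and `crop_boxes` are assumptions made for illustration. The idea is to pick the grid of crops whose shape best matches the image, resize the image to that grid, and cut it into equal tiles.

```python
import math
from typing import List, Tuple

# Hypothetical anchor grids as (rows, cols); the real model's set may differ.
ANCHOR_GRIDS: List[Tuple[int, int]] = [
    (1, 1), (1, 2), (2, 1), (1, 3), (3, 1),
    (2, 2), (2, 3), (3, 2), (3, 3),
]


def select_anchor(width: int, height: int,
                  grids: List[Tuple[int, int]] = ANCHOR_GRIDS,
                  crop_size: int = 448) -> Tuple[int, int]:
    """Pick the grid whose aspect ratio is closest to the image's.

    Ties (e.g. 1x1 vs 2x2) go to the grid whose total pixel area
    is closest to the image's area. Both distances are measured in
    log space so that 2x too wide and 2x too tall count equally.
    """
    image_ratio = width / height
    image_area = width * height

    def score(grid: Tuple[int, int]) -> Tuple[float, float]:
        rows, cols = grid
        ratio_gap = abs(math.log((cols / rows) / image_ratio))
        area_gap = abs(math.log(rows * cols * crop_size ** 2 / image_area))
        return (round(ratio_gap, 6), area_gap)

    return min(grids, key=score)


def crop_boxes(rows: int, cols: int,
               crop_size: int = 448) -> List[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes, row by row, for an image
    already resized to (cols * crop_size) x (rows * crop_size)."""
    return [
        (c * crop_size, r * crop_size, (c + 1) * crop_size, (r + 1) * crop_size)
        for r in range(rows)
        for c in range(cols)
    ]


if __name__ == "__main__":
    # A wide document page: twice as wide as it is tall.
    rows, cols = select_anchor(896, 448)
    print("grid:", (rows, cols))
    print("boxes:", crop_boxes(rows, cols))
```

With an image processor that does this internally, the input images can stay as they are; the sketch only shows why no manual cropping is needed beforehand.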

Thanks for the interest!
Best,
Dana
