Diminishing returns with model size
Hey guys,
I love your work. I've been using your model since yesterday and am very impressed. Thanks for making it open source! I see that even the smaller models perform quite well, while the bigger models bring only small improvements on top of that, which are of course still remarkable.
I noticed that whenever an image contains a lot of information or text, the model struggles a bit. I think this is because of the small input resolution, which I assume is the same even in the bigger models. If I may make a suggestion: please train a model that can take up to a megapixel, perhaps in a curriculum-learning fashion where the current model is extended at the end of training to handle four times the pixels (double the width and height). I'm pretty sure that would make it excellent at general OCR.
But regardless, this is amazing, and thanks for contributing to the community!
Best regards,
HD
Hello!
Thank you for your interest. May I ask how you are running our model? If you're using the code in the quick start, there is a max_num parameter that can be used to adjust the input image resolution. The default value is 6, which means that the maximum resolution of the input image is 6x448x448 pixels, for example 896x1344.
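To illustrate what that tile budget does, here is a minimal, self-contained sketch of dynamic tiling under a max_num cap. The helper names below (choose_grid, load_tiles) are illustrative only and are not the actual quick-start code, which ships its own preprocessing helper; the sketch just shows how the tile count caps the effective input resolution.

```python
# Illustrative sketch only: not the actual quick-start preprocessing.
# It shows how a max_num tile budget caps the effective input resolution
# when an image is split into 448x448 tiles.
from PIL import Image

TILE = 448  # tile side length mentioned above


def choose_grid(width: int, height: int, max_num: int) -> tuple[int, int]:
    """Pick a (cols, rows) grid with cols * rows <= max_num whose aspect
    ratio is closest to the input image's aspect ratio."""
    target = width / height
    best, best_diff = (1, 1), float("inf")
    for cols in range(1, max_num + 1):
        for rows in range(1, max_num // cols + 1):
            diff = abs(cols / rows - target)
            if diff < best_diff:
                best, best_diff = (cols, rows), diff
    return best


def load_tiles(path: str, max_num: int = 6) -> list[Image.Image]:
    """Resize the image onto the chosen grid and cut it into 448x448 crops."""
    img = Image.open(path).convert("RGB")
    cols, rows = choose_grid(*img.size, max_num)
    img = img.resize((cols * TILE, rows * TILE))
    return [
        img.crop((c * TILE, r * TILE, (c + 1) * TILE, (r + 1) * TILE))
        for r in range(rows)
        for c in range(cols)
    ]


# With max_num=6, a portrait page might get a 2x3 grid, i.e. 896x1344 pixels;
# raising max_num doubles or quadruples the tile budget for text-heavy images.
tiles = load_tiles("document.jpg", max_num=12)
print(f"{len(tiles)} tiles of {TILE}x{TILE}")
```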
If you're using these models on our online demo, you can adjust max_input_tiles in the Advanced Options sidebar on the left side. You can set it to 24, which means that the input resolution is at most 24x448x448 pixels, or about 4.8 million pixels.
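For reference, the pixel budgets for the two tile counts mentioned above work out as follows (simple arithmetic, nothing model-specific):

```python
# Pixel budget = number of 448x448 tiles times 448 times 448
def pixel_budget(num_tiles: int, tile_size: int = 448) -> int:
    return num_tiles * tile_size * tile_size

print(pixel_budget(6))   # 1204224 -> about 1.2 MP (quick-start default)
print(pixel_budget(24))  # 4816896 -> about 4.8 MP (demo with max_input_tiles=24)
```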
I hope this helps.
Best regards,
Zhe Chen
I was using the default max_num=6 value, and as soon as I bumped it up to 12, the results became much better! Thanks again, man, for the swift response!