MD MUHAIMIN RAHMAN
sezan92
AI & ML interests
AI, Reinforcement learning, Graph Neural Network, Computer Vision, Robotics
Recent Activity
reacted to prithivMLmods's post 8 days ago
OpenGVLab's InternVL3_5-2B-MPO [Mixed Preference Optimization (MPO)] is a compact vision-language model in the InternVL3.5 series. You can now experience it in the Tiny VLMs Lab, an app featuring 15+ multimodal VLMs ranging from 250M to 4B parameters. These models support tasks such as OCR, reasoning, single-shot answering with small models, and captioning (including ablated variants), across a broad range of visual categories. They are also capable of handling images with complex, sensitive, or nuanced content, while adapting to varying aspect ratios and resolutions.
Space/App: https://huggingface.co/spaces/prithivMLmods/Tiny-VLMs-Lab
Model: https://huggingface.co/OpenGVLab/InternVL3_5-2B-MPO
Collection: https://huggingface.co/collections/OpenGVLab/internvl35-68ac87bd52ebe953485927fb
Paper: https://arxiv.org/pdf/2508.18265
Multimodal Space Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
To learn more, visit the relevant spaces, collections, and model cards.
reacted to prithivMLmods's post with another emoji 8 days ago
updated a dataset 7 months ago: hf-vision/course-assets