@mahbubchula
You can simply cite the DOI, or mention that the model is hosted here:
https://huggingface.co/prithivMLmods/DeepCaption-VLA-7B?doi=true
Yes, you can!
@mahbubchula
Try the simple workflow I’ve created below:
Here, I used https://huggingface.co/prithivMLmods/Behemoth-3B-070225-post0.1, which is close in functionality to DeepCaption-VLA-7B. I switched to this model because it fits the VRAM budget of a T4 Colab instance better. You can swap in a different model depending on your use case, requirements, and available resources.
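Here is a minimal sketch of that workflow, assuming Behemoth-3B-070225-post0.1 follows the standard Qwen2.5-VL interface of its apparent base model; the image path and prompt text are placeholders to adapt to your own data:

```python
# Minimal captioning sketch for a T4 Colab instance (fp16 to stay within VRAM).
# Assumes the model loads via the Qwen2.5-VL classes in recent transformers.
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "prithivMLmods/Behemoth-3B-070225-post0.1"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 keeps the 3B model within T4 memory
    device_map="auto",
)

image = Image.open("example.jpg")  # placeholder path
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image in detail."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)

with torch.inference_mode():
    output_ids = model.generate(**inputs, max_new_tokens=256)

# Trim the prompt tokens before decoding so only the generated caption remains.
caption = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(caption)
```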
For detection functionality, refer to the following app: https://huggingface.co/spaces/sergiopaniego/vlm_object_understanding
*(Screenshots omitted: Demo UI and image inference output.)*
Note: flash_attention_2 is not used here, since T4 GPUs (pre-Ampere) do not support FlashAttention-2.
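Concretely, that just means loading with the default attention backend rather than FlashAttention-2; a hedged sketch using the standard `attn_implementation` argument from transformers:

```python
# On a T4, avoid "flash_attention_2" (it requires Ampere or newer GPUs);
# "sdpa" is PyTorch's built-in scaled-dot-product attention fallback.
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    attn_implementation="sdpa",  # instead of "flash_attention_2"
    device_map="auto",
)
```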