Vision-to-text capabilities
#120, opened by kmewhort
The announcements for this model highlighted exciting vision-to-text capabilities, but it's not clear from any of the documentation I can find how to use them. Are there any visual question answering (VQA) examples someone could share?