I'm sorry, but I am unable to view or describe images as I am a text-based program.
#19, opened by GusPuffy
Using the example in the model card, I am getting these outputs:
dynamic ViT batch size: 7
Please describe the image in detail.
This image is a promotional poster with some text on it. The poster's main colors are blue and white, with a large white letter "A" in the middle. The text on the poster includes "A-grade", "A-grade", "A-grade", "A-grade", "A-grade", "A-grade", ... [the phrase "A-grade" repeats until the output is cut off]
dynamic ViT batch size: 7
Please describe the image in detail.
This image is a promotional poster with some text on it. The poster's main colors are blue and white, with a large white letter "A" in the middle. The text on the poster includes "A-grade", "A-grade", "A-grade", "A-grade", "A-grade", "A-grade", ... [the phrase "A-grade" repeats until the output is cut off]
dynamic ViT batch size: 7
Please write a poem based on the image.
A poster in blue and white,
a capital A stands out.
Promotional messages gather within,
an eye-catching brilliance.
dynamic ViT batch size: 12
Describe these two images in detail.
I'm sorry, I am unable to view or describe images. I am a language model and cannot process visual information.
dynamic ViT batch size: 12
What are the similarities and differences between these two images?
I'm sorry, I am unable to view or describe images. I am a language model and cannot process visual information.
dynamic ViT batch size: 12, image_counts: [7, 5]
Describe the image in detail.
I'm sorry, but I am unable to describe the image as I am a text-based AI and do not have the ability to view or analyze images.
Describe the image in detail.
I'm sorry, but I am unable to view or describe images as I am a text-based program.
Hello, I ran into the same problem when I tried to use the model. The only time I got any different results was when I prompted in Traditional Chinese. In other languages, including Simplified Chinese, the model responded in a similar way. Please reply if you manage to get the model to respond correctly in other languages.
Thank you for your feedback. Because the V1.5 model was not trained on multi-image data, its performance when handling multiple images is unstable. You might want to try our latest InternVL2 series models, which might offer improvements.
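As a reading aid for the log lines in the original post: a line such as `dynamic ViT batch size: 12, image_counts: [7, 5]` appears to report the total number of image tiles sent to the vision encoder, followed by how many of those tiles belong to each image. The sketch below illustrates that bookkeeping in plain Python under that assumption. The function name `split_tiles_by_image` and the string placeholders for tiles are hypothetical and are not part of the InternVL code.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_tiles_by_image(tiles: Sequence[T], image_counts: Sequence[int]) -> List[List[T]]:
    """Split a flat sequence of tiles into one group per image.

    Hypothetical helper: `image_counts[i]` is assumed to be the number
    of tiles that came from image i, in the same order as `tiles`.
    """
    if sum(image_counts) != len(tiles):
        raise ValueError(
            f"image_counts sums to {sum(image_counts)}, but there are {len(tiles)} tiles"
        )
    groups: List[List[T]] = []
    start = 0
    for count in image_counts:
        groups.append(list(tiles[start:start + count]))
        start += count
    return groups


# Placeholder tiles standing in for the 12 tiles reported in the log line.
tiles = [f"tile_{i}" for i in range(12)]
groups = split_tiles_by_image(tiles, [7, 5])
print([len(g) for g in groups])
```

If the counts did not add up to the reported batch size, the helper would raise a `ValueError`, which is one quick sanity check to run before suspecting the model itself.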
czczup changed discussion status to closed