Image encoding

#8
by baoge1129 - opened

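Hello, I used InternViT-6B-448px-V1-5 to encode a 448×448 image, and the output of the last layer has shape [1, 1025, 3200]. What does each of these three numbers mean? And how should this tensor be processed afterwards?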


OpenGVLab org

(batch size, number of patches, embedding dimension)
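For readers wondering where 1025 comes from, here is a minimal sketch of the shape arithmetic. It assumes a patch size of 14 and a single class token prepended to the patch tokens; neither is stated in this thread, so check both against the model's config. `vit_output_shape` is a hypothetical helper written for illustration, not part of any library.

```python
def vit_output_shape(batch_size, image_size, patch_size, hidden_size, has_cls_token=True):
    """Return the expected (batch, tokens, hidden) shape of a ViT's last hidden state."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    grid = image_size // patch_size      # patches along each side of the image
    num_patches = grid * grid            # total patches in the square grid
    # Assumption: one extra class token is prepended to the patch tokens.
    num_tokens = num_patches + (1 if has_cls_token else 0)
    return (batch_size, num_tokens, hidden_size)


# Assumed values: patch_size=14 and hidden_size=3200 (the latter matches the shape in the question).
shape = vit_output_shape(batch_size=1, image_size=448, patch_size=14, hidden_size=3200)
print(shape)  # (1, 1025, 3200)
```

Under these assumptions, 448 / 14 = 32 patches per side, so 32 × 32 = 1024 patch tokens plus one class token gives 1025. If the class token sits at index 0, dropping it leaves 1024 patch embeddings that can be reshaped into a 32 × 32 grid for downstream use.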

zwgao changed discussion status to closed
