internlm
/

Intern-S1-mini-GGUF

Image-Text-to-Text

Model card Files Files and versions Community

ocr能力很差，但是官方的demo是没问题

#2

by pypry - opened 6 days ago

pypry

6 days ago

你们自己通过llama.cpp测过吗

pypry

6 days ago

我用的精度是f16的，llama sever的参数和系统提示词都是参考model card的设定

Intern Large Models org 2 days ago

@pypry Hi, pls. provide sample code and data to reproduce.

pypry

2 days ago

@unsubscribe
随便给个中文文档的截图，提示词是“识别并输出图中文字”。模型会胡乱输出。但是给一个不包含文字的图片，它能正确描述图片内容。

pypry

2 days ago

还有个问题是不管我上传的图片多大，llama sever 都会对图片进行缩放，这样对于大图的识别肯定会有问题。作为对比，我使用minicpm 4.5模型，llama sever会对大图进行分片

pypry

2 days ago

我的测试方式是使用llama server跑模型，通过open ai api去访问

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment