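---
backbone:
- convNext-Tiny
integrating: True
domain:
- cv
frameworks:
- pytorch
language:
- en
- ch
license: Apache License 2.0
metrics:
- Line Accuracy
tags:
- OCR
- Alibaba
- 文字识别
- 读光
tasks:
- ocr-recognition
studios:
- damo/cv_ocr-text-spotting
datasets:
  test:
  - damo/WebText_Dataset
widgets:
- task: ocr-recognition
  inputs:
  - type: image
  examples:
  - name: 1
    inputs:
    - name: image
      data: http://duguang-labelling.oss-cn-shanghai.aliyuncs.com/mass_img_tmp_20220922/ocr_recognition.jpg
---

# Text Recognition Model Introduction

Text recognition takes an image of a piece of text as input, recognizes the characters it contains, and outputs the corresponding string.

This model targets text recognition in general scenarios. We also provide the following models for other scenarios:

- [Handwriting](https://www.modelscope.cn/models/damo/cv_convnextTiny_ocr-recognition-handwritten_damo/summary)
- [Printed documents](https://www.modelscope.cn/models/damo/cv_convnextTiny_ocr-recognition-document_damo/summary)
- [Natural scenes](https://www.modelscope.cn/models/damo/cv_convnextTiny_ocr-recognition-scene_damo/summary)
- [License plates](https://www.modelscope.cn/models/damo/cv_convnextTiny_ocr-recognition-licenseplate_damo/summary)

Text detection models:

- [General-scenario line-level detection](https://modelscope.cn/models/damo/cv_resnet18_ocr-detection-line-level_damo/summary)
- [General-scenario word-level detection](https://modelscope.cn/models/damo/cv_resnet18_ocr-detection-word-level_damo/summary)

And a complete OCR capability that detects and recognizes the text in a whole image:

- [General-scenario whole-image detection and recognition](https://modelscope.cn/studios/damo/cv_ocr-text-spotting/summary)

Feel free to try them out!

## Model Description

The model consists of three main parts: a Convolutional Backbone that extracts visual features from the image, ConvTransformer Blocks that model context over those visual features, and a final CTC stage, where the CTC loss drives gradient-based optimization of the network during training and CTC decoding produces the recognized string. The structure of the recognition model is shown in the figure below: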
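To make the CTC decoding step in the model description more concrete, here is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It is an illustration, not this model's actual decoder: the blank index `0`, the toy `charset`, and the helper names `ctc_greedy_decode` and `ids_to_text` are all assumptions introduced for the example.

```python
from itertools import groupby
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol; the model's real index is not stated here


def ctc_greedy_decode(frame_ids: Sequence[int], blank: int = BLANK) -> List[int]:
    """Collapse consecutive repeated labels, then drop blanks (CTC best-path rule)."""
    collapsed = [label for label, _ in groupby(frame_ids)]
    return [label for label in collapsed if label != blank]


def ids_to_text(ids: Sequence[int], charset: Sequence[str]) -> str:
    """Map label ids to characters, assuming id 0 is blank so id k maps to charset[k - 1]."""
    return "".join(charset[i - 1] for i in ids)


# Hypothetical per-frame argmax labels over a toy 3-character vocabulary.
charset = ["a", "b", "c"]
frames = [0, 1, 1, 0, 1, 2, 2, 0, 3]
print(ids_to_text(ctc_greedy_decode(frames), charset))  # prints "aabc"
```

The blank between the two runs of label `1` is what lets a repeated character ("aa") survive the collapse step; without it the two runs would merge into a single "a".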