trocr-base-printed / README.md
metadata
tags:
  - trocr
  - image-to-text
  - endpoints-template
library_name: generic

Fork of microsoft/trocr-base-printed for an OCR Inference endpoint.

This repository implements a custom OCR task for 🤗 Inference Endpoints. The code for the customized pipeline is in pipeline.py.

To deploy this model as an Inference Endpoint, you have to select Custom as the task so that the pipeline.py file is used. Double-check that Custom is selected.

Run Request

The endpoint expects the image to be sent as binary data in the request body. Below are a cURL and a Python example.

cURL

  1. get image
wget https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg -O test.jpg
  2. send cURL request
curl --request POST \
  --url https://{ENDPOINT}/ \
  --header 'Content-Type: image/jpg' \
  --header 'Authorization: Bearer {HF_TOKEN}' \
  --data-binary '@test.jpg'
  3. the expected output
{"text": "INDLUS THE"}

Python

  1. get image
wget https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg -O test.jpg
  2. run request
import requests as r

ENDPOINT_URL = ""
HF_TOKEN = ""


def predict(path_to_image: str = None):
    # read the image as raw bytes; the endpoint expects a binary request body
    with open(path_to_image, "rb") as i:
        image_bytes = i.read()

    # same headers as in the cURL example above
    response = r.post(
        ENDPOINT_URL,
        headers={
            "Authorization": f"Bearer {HF_TOKEN}",
            "Content-Type": "image/jpg",
        },
        data=image_bytes,
    )
    return response.json()


prediction = predict(path_to_image="test.jpg")

expected output

{'text': 'INDLUS THE'}
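
The binary request from the cURL example can also be sketched in Python using only the standard library. The helper names `build_ocr_request`, `extract_text`, and `ocr` are hypothetical and not part of this repository. The `image/jpg` content type and the `{"text": ...}` response shape are taken from the cURL example above. Treat this as a minimal sketch under those assumptions, not as a reference client for the endpoint.

```python
import json
import urllib.request


def build_ocr_request(
    image_bytes: bytes,
    endpoint_url: str,
    token: str,
    content_type: str = "image/jpg",  # assumption: same value as the cURL example
) -> urllib.request.Request:
    """Build a POST request that carries the image as the raw request body."""
    return urllib.request.Request(
        endpoint_url,
        data=image_bytes,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
        method="POST",
    )


def extract_text(response_body: bytes) -> str:
    """Parse the JSON response and return its "text" field.

    Assumes the response looks like the cURL example: {"text": "..."}.
    """
    payload = json.loads(response_body)
    if not isinstance(payload, dict) or "text" not in payload:
        raise ValueError(f"unexpected response: {payload!r}")
    return payload["text"]


def ocr(path_to_image: str, endpoint_url: str, token: str, timeout: float = 30.0) -> str:
    """Read an image file, send it to the endpoint, and return the recognized text."""
    with open(path_to_image, "rb") as f:
        request = build_ocr_request(f.read(), endpoint_url, token)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_text(response.read())
```

With the endpoint URL and token filled in, `ocr("test.jpg", ENDPOINT_URL, HF_TOKEN)` sends the downloaded image and returns only the recognized string. Building the request separately from sending it makes the headers and body easy to inspect without a network call.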