license: openrail

trocr-old-russian

Info

The model is trained to recognize printed text in the Old Russian language.

Usage

Basic usage

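from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Load the fine-tuned recognition model.
hf_model = VisionEncoderDecoderModel.from_pretrained("Serovvans/trocr-prereform-orthography")

# Convert to RGB so grayscale or RGBA scans are handled the same way.
image = Image.open("./path/to/your/image").convert("RGB")

# The processor (image preprocessing and tokenizer) comes from the base TrOCR checkpoint.
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Generate token ids and decode them into text.
generated_ids = hf_model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)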

Usage for recognizing a book
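
The card gives no code for this section, so what follows is only a minimal sketch and not the author's method. It assumes the book has already been cut into single-line images stored in one folder, since TrOCR models are normally run on one text line at a time. The names `recognize_book`, `batched`, and `make_trocr_recognizer` are hypothetical helpers introduced for illustration; `model` and `processor` are assumed to be loaded as in the basic usage.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def batched(items: List, size: int) -> Iterable[List]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def recognize_book(
    lines_dir: str,
    recognize_batch: Callable[[List[Path]], List[str]],
    batch_size: int = 8,
) -> str:
    """Run OCR over every line image in `lines_dir`, in sorted filename order.

    Assumption: filenames sort in reading order (e.g. 0001.png, 0002.png, ...).
    """
    paths = sorted(
        p for p in Path(lines_dir).iterdir()
        if p.suffix.lower() in {".png", ".jpg", ".jpeg"}
    )
    texts: List[str] = []
    for chunk in batched(paths, batch_size):
        texts.extend(recognize_batch(chunk))
    return "\n".join(texts)


def make_trocr_recognizer(model, processor) -> Callable[[List[Path]], List[str]]:
    """Wrap an already loaded model and processor into a batch OCR function."""
    def recognize_batch(paths: List[Path]) -> List[str]:
        # Imported lazily so the helpers above need only the standard library.
        from PIL import Image
        images = [Image.open(p).convert("RGB") for p in paths]
        pixel_values = processor(images=images, return_tensors="pt").pixel_values
        generated_ids = model.generate(pixel_values)
        return processor.batch_decode(generated_ids, skip_special_tokens=True)
    return recognize_batch
```

With the objects from the basic usage, a call might look like `recognize_book("./lines", make_trocr_recognizer(hf_model, processor))`, where `./lines` is a placeholder path. Splitting page scans into line images is a separate step that this sketch does not cover.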


Metrics on the test set

  • CER (Character Error Rate) = 0.095
  • WER (Word Error Rate) = 0.298
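
The card does not say how these metrics were computed. Under the common definition, each is the Levenshtein edit distance between the prediction and the reference, divided by the reference length (in characters for CER, in words for WER). A minimal sketch of that definition, with hypothetical helper names `edit_distance`, `cer`, and `wer`:

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # delete r from the reference
                cur[j - 1] + 1,          # insert h into the reference
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate relative to the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate over whitespace-separated tokens."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Read this way, a CER of 0.095 means roughly one character edit per ten reference characters. Whether the reported numbers were averaged per sample or pooled over the whole test set is not stated in the card.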