license: openrail
trocr-old-russian
Info
The model is trained to recognize printed text in Old Russian.
- Base model for fine-tuning: microsoft/trocr-small-printed.
- Fine-tuned on 636k text images.
Usage
Basic usage
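from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Load the fine-tuned model and the processor (image preprocessing + tokenizer)
hf_model = VisionEncoderDecoderModel.from_pretrained("Serovvans/trocr-old-russian")
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")

# Open a single text-line image and convert it to 3-channel RGB for the processor
image = Image.open("./path/to/your/image").convert("RGB")

# Preprocess the image, generate token ids, and decode them to text
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = hf_model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)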
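When several line images need to be recognized, the same three calls (processor, `generate`, `batch_decode`) can be applied to a list of images at once. The following is a minimal sketch, not part of the original model card: the helper name `recognize_lines` and its `batch_size` parameter are hypothetical, and it assumes `model` and `processor` are the objects loaded in the basic usage.

```python
from typing import Any, List, Sequence


def recognize_lines(images: Sequence[Any], model: Any, processor: Any,
                    batch_size: int = 8) -> List[str]:
    """Recognize a sequence of PIL line images in fixed-size batches.

    Hypothetical helper: `model` is the VisionEncoderDecoderModel and
    `processor` the TrOCRProcessor from the basic usage.
    """
    texts: List[str] = []
    for start in range(0, len(images), batch_size):
        batch = list(images[start:start + batch_size])
        # The processor accepts a list of images and returns one stacked tensor.
        pixel_values = processor(images=batch, return_tensors="pt").pixel_values
        generated_ids = model.generate(pixel_values)
        # batch_decode returns one string per image in the batch.
        texts.extend(processor.batch_decode(generated_ids, skip_special_tokens=True))
    return texts


# Example call (file names are placeholders):
# lines = [Image.open(p).convert("RGB") for p in ["line1.png", "line2.png"]]
# print(recognize_lines(lines, hf_model, processor))
```

A smaller `batch_size` lowers peak memory use at the cost of more `generate` calls; the value 8 is an arbitrary default.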
Usage for recognizing a book