Devanagari OCR with TrOCR

This is an Optical Character Recognition (OCR) model for Devanagari script, built on the VisionEncoderDecoder architecture and fine-tuned on Nepali text. It uses the Hugging Face TrOCRProcessor to preprocess input images and to decode the generated token IDs into text.

Model Details

  • Model: syubraj/TrOCR_Nepali
  • Processor: TrOCRProcessor, which combines a Vision Transformer (ViT) feature extractor with a tokenizer.

How to Use

You can use this model in Python with the following steps:

from transformers import VisionEncoderDecoderModel, TrOCRProcessor
from PIL import Image

# Load the model and its processor. The processor bundles the image
# preprocessor and the tokenizer, so no separate tokenizer is needed.
model = VisionEncoderDecoderModel.from_pretrained("syubraj/TrOCR_Nepali")
processor = TrOCRProcessor.from_pretrained("syubraj/TrOCR_Nepali")

# Load the image and convert it to 3-channel RGB
image = Image.open("path_to_image").convert("RGB")

# Preprocess the image into a batch of pixel tensors ("pt" requires PyTorch)
pixel_values = processor(image, return_tensors="pt").pixel_values

# Generate token IDs, then decode them into text
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(generated_text)
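
The same steps can be applied to several images at once. The following is a minimal sketch, not part of the original model card: the helper names `chunks` and `recognize_batch`, the `batch_size` of 8, and the `max_new_tokens` limit of 64 are illustrative assumptions. It expects a `model` and `processor` loaded as shown above and a list of RGB PIL images, and it runs on whatever device the model is already on (CPU by default).

```python
from typing import Any, Iterator, List, Sequence


def chunks(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def recognize_batch(images: Sequence[Any], model: Any, processor: Any,
                    batch_size: int = 8, max_new_tokens: int = 64) -> List[str]:
    """Run OCR over `images` in mini-batches and return one string per image.

    `model` and `processor` are the objects loaded with `from_pretrained`;
    `batch_size` and `max_new_tokens` are hypothetical defaults to tune.
    """
    texts: List[str] = []
    for batch in chunks(images, batch_size):
        # The processor accepts a list of images and stacks them into one tensor
        pixel_values = processor(images=list(batch), return_tensors="pt").pixel_values
        # Cap the output length so a difficult image cannot generate indefinitely
        generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
        texts.extend(processor.batch_decode(generated_ids, skip_special_tokens=True))
    return texts


# Example call (the file names are placeholders):
#   images = [Image.open(p).convert("RGB") for p in ["line1.png", "line2.png"]]
#   for text in recognize_batch(images, model, processor):
#       print(text)
```

Batching mainly reduces per-call overhead. If memory is tight, lower `batch_size`; if longer lines come back truncated, raise `max_new_tokens`.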

Model size: 224M parameters (Safetensors, F32 tensors)