Setup Instructions
Clone the Surya OCR GitHub Repository
git clone https://github.com/VikParuchuri/surya.git
cd surya
Switch to v0.4.14
git checkout f7c6c04
Install Dependencies
The author has not provided a requirements.txt file, but the environment.yml exported from our conda environment has been uploaded. This file can be used to recreate the environment for the arabic_layout_model model.
ArabicDoc Pipeline
Download ArabicDoc.cpython-310-x86_64-linux-gnu.so, 10x_best.pt, and the surya folder from the Repository.
Place ArabicDoc.cpython-310-x86_64-linux-gnu.so, 10x_best.pt, and the surya folder in the same directory (they depend on each other).
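Before running the pipeline, it can help to confirm that the three items really sit side by side and that the interpreter matches the compiled module. The following is a minimal sketch, not part of the repository: the helper names `find_missing_items` and `environment_problems` are hypothetical, and the CPython 3.10 and Linux checks are inferred only from the `.so` filename.

```python
import platform
import sys
from pathlib import Path

# File and folder names taken from the setup instructions.
REQUIRED_ITEMS = (
    "ArabicDoc.cpython-310-x86_64-linux-gnu.so",
    "10x_best.pt",
    "surya",
)


def find_missing_items(base_dir):
    """Return the required items that are absent from base_dir (hypothetical helper)."""
    base = Path(base_dir)
    return [name for name in REQUIRED_ITEMS if not (base / name).exists()]


def environment_problems():
    """Return reasons the compiled module may fail to import (hypothetical helper).

    Assumption: the 'cpython-310' and 'linux' tags in the .so filename mean
    the module targets CPython 3.10 on Linux.
    """
    problems = []
    if platform.system() != "Linux":
        problems.append("compiled module targets Linux, found " + platform.system())
    if sys.version_info[:2] != (3, 10):
        problems.append(
            "compiled module targets CPython 3.10, found %d.%d" % sys.version_info[:2]
        )
    return problems


if __name__ == "__main__":
    # Check the current working directory, where the files are expected to live.
    for name in find_missing_items("."):
        print("missing:", name)
    for problem in environment_problems():
        print("warning:", problem)
```

Run it from the directory that holds the downloaded files. If it prints nothing, all three items are present and the interpreter matches the assumed target.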
# arabic_layout_model is imported from ArabicDoc.cpython-310-x86_64-linux-gnu.so, which is present in the repo.
# The compiled module works on Linux-based operating systems only.
from ArabicDoc import arabic_layout_model
from surya.postprocessing.heatmap import draw_bboxes_on_image
from PIL import Image
image_path = "sample.jpg"
image = Image.open(image_path)
bboxes = arabic_layout_model(image_path)
plotted_image = draw_bboxes_on_image(bboxes, image)
Refer to benchmark.ipynb for a comparison between the traditional Surya layout model and the new layout model.
Refer to the results folder to view the images produced by both models.