---
license: apache-2.0
---
## Setup Instructions
### Clone the Surya OCR GitHub Repository
```bash
git clone https://github.com/VikParuchuri/surya.git
cd surya
```
### Switch to v0.4.14
```bash
git checkout f7c6c04
```
### Install Dependencies
The author has not provided a `requirements.txt` file, but the `environment.yml` exported from our conda environment has been uploaded. Use this file to recreate the environment for `arabic_layout_model`, for example with `conda env create -f environment.yml`.
### ArabicDoc Pipeline
Download `ArabicDoc.cpython-310-x86_64-linux-gnu.so`, `10x_best.pt`, and the `surya` folder from the repository.
Place all three in the same directory; they depend on each other.
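
Before running the pipeline, it can help to confirm that the three items really sit side by side. The following is a minimal sketch, not part of the repository: the helper name `missing_pipeline_files` and its `directory` argument are hypothetical, and only the three file and folder names come from the instructions above.

```python
from pathlib import Path

# Names taken from the instructions above.
REQUIRED = [
    "ArabicDoc.cpython-310-x86_64-linux-gnu.so",
    "10x_best.pt",
    "surya",
]


def missing_pipeline_files(directory="."):
    """Return the required items that are absent from `directory` (hypothetical helper)."""
    base = Path(directory)
    missing = []
    for name in REQUIRED:
        path = base / name
        # `surya` is expected to be a folder; the other two are regular files.
        present = path.is_dir() if name == "surya" else path.is_file()
        if not present:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_pipeline_files(".")
    if missing:
        print("Missing from the current directory:", ", ".join(missing))
    else:
        print("All pipeline files are in place.")
```

Run it from the directory that holds the downloaded files; an empty result means the layout matches what the pipeline expects.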
```python
# `arabic_layout_model` is provided by ArabicDoc.cpython-310-x86_64-linux-gnu.so
# from the repo; this compiled module works on Linux-based OSes only.
from ArabicDoc import arabic_layout_model
from surya.postprocessing.heatmap import draw_bboxes_on_image
from PIL import Image

image_path = "sample.jpg"
image = Image.open(image_path)

# Get the bounding boxes from the layout model, then draw them on the image
bboxes = arabic_layout_model(image_path)
plotted_image = draw_bboxes_on_image(bboxes, image)
```
```
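
The `.so` file name carries a tag (`cpython-310-x86_64-linux-gnu`) that ties it to CPython 3.10 on x86_64 Linux, so `from ArabicDoc import ...` will not find the module under other interpreters. The sketch below checks whether the running interpreter accepts that file name, using the standard library's `importlib.machinery.EXTENSION_SUFFIXES`. The helper name `extension_matches` is hypothetical, and a matching name does not guarantee that the import succeeds, since the module's own dependencies must also be installed.

```python
from importlib.machinery import EXTENSION_SUFFIXES

SO_NAME = "ArabicDoc.cpython-310-x86_64-linux-gnu.so"


def extension_matches(filename, module_name, suffixes=EXTENSION_SUFFIXES):
    """Return True if `filename` equals `module_name` plus a suffix in `suffixes`.

    Python looks for an extension module as <module_name><suffix>, trying each
    suffix the interpreter supports, so the whole file name has to match.
    """
    return any(filename == module_name + suffix for suffix in suffixes)


if __name__ == "__main__":
    if extension_matches(SO_NAME, "ArabicDoc"):
        print("File name matches this interpreter:", SO_NAME)
    else:
        print("This interpreter will not pick up", SO_NAME)
        print("Suffixes it accepts:", EXTENSION_SUFFIXES)
```

If the check fails, recreating the conda environment from `environment.yml` on a Linux machine is the likely fix.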
#### Refer to `benchmark.ipynb` for a comparison between the traditional Surya layout model and the new layout model.
#### Refer to the `results` folder to view the images produced by both models.