---
license: apache-2.0
---

## Setup Instructions

### Clone the Surya OCR GitHub Repository

```bash
git clone https://github.com/VikParuchuri/surya.git
cd surya
```

### Switch to v0.4.14

```bash
git checkout f7c6c04
```

### Install Dependencies

The Surya author has not provided a `requirements.txt` file, so the `environment.yml` exported from our conda environment has been uploaded instead. It can be used to recreate the environment for `arabic_layout_model`, for example with `conda env create -f environment.yml`.


### ArabicDoc Pipeline 

Download `ArabicDoc.cpython-310-x86_64-linux-gnu.so`, `10x_best.pt`, and the `surya` folder from this repository.
Place all three in the same directory, because they depend on each other.

```python
# ArabicDoc is imported from ArabicDoc.cpython-310-x86_64-linux-gnu.so (in this repo),
# so this works on Linux-based operating systems only.
from ArabicDoc import arabic_layout_model
from surya.postprocessing.heatmap import draw_bboxes_on_image
from PIL import Image

image_path = "sample.jpg"
image = Image.open(image_path)

# The layout model takes the image path; the PIL image is used only for drawing.
bboxes = arabic_layout_model(image_path)
plotted_image = draw_bboxes_on_image(bboxes, image)
```
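
Because the pipeline depends on three items sitting side by side and on a platform-specific compiled module, a quick pre-flight check can save a confusing `ImportError`. The sketch below is only an illustration: `check_arabicdoc_setup` is a hypothetical helper, not part of this repository. The Linux and CPython 3.10 checks are assumptions inferred from the tags in the `.so` filename.

```python
import platform
import sys
from pathlib import Path

# File and folder names are taken from the instructions above.
REQUIRED = (
    "ArabicDoc.cpython-310-x86_64-linux-gnu.so",
    "10x_best.pt",
    "surya",
)


def check_arabicdoc_setup(directory="."):
    """Hypothetical helper: return a list of setup problems (empty if none found)."""
    problems = []
    base = Path(directory)
    for name in REQUIRED:
        if not (base / name).exists():
            problems.append(f"missing: {name}")
    # Assumption: the filename tags (cpython-310, linux-gnu) imply these requirements.
    if platform.system() != "Linux":
        problems.append("the compiled module is built for Linux")
    if sys.version_info[:2] != (3, 10):
        problems.append("the compiled module is built for CPython 3.10")
    return problems
```

Calling `check_arabicdoc_setup()` from the directory that holds the downloaded files returns an empty list when nothing obvious is wrong; otherwise each entry names one issue to fix before importing `ArabicDoc`.
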
#### Refer to `benchmark.ipynb` for a comparison between the original Surya layout model and the new layout model.
#### Refer to the `results` folder for output images from both models.