---
license: apache-2.0
language:
- ar
---

# Surya OCR Arabic

This repository contains the `surya-ocr-arabic-segment` model, a modified SegFormer architecture fine-tuned for segmenting Arabic documents.


## Setup Instructions

### Clone the Surya OCR GitHub Repository

To use the `SegformerForRegressionMask` class, you need to clone the Surya OCR GitHub repository:

```bash
git clone https://github.com/vikp/surya.git
cd surya
```

### Switch to v0.4.14

```bash
git checkout f7c6c04
```
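
To confirm that the checkout landed on the pinned commit, a small check like the one below can help. It is a minimal sketch: the helper names `head_commit` and `is_pinned` are hypothetical, and it assumes `git` is on your `PATH` and that you run it from inside the clone.

```python
import subprocess

# Short hash from the checkout step above.
PINNED_COMMIT = "f7c6c04"


def head_commit(repo_dir="."):
    """Return the full hash of the commit currently checked out in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. repo_dir is not a repository
    )
    return result.stdout.strip()


def is_pinned(full_hash, pinned=PINNED_COMMIT):
    """True when the full commit hash begins with the pinned short hash."""
    return full_hash.startswith(pinned)


# Usage, from inside the cloned repository:
#   print(is_pinned(head_commit()))
```

If the check reports `False`, rerun the `git checkout` step before installing dependencies.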

### Install Dependencies

You can install the required dependencies using the following command:

```bash
pip install -r requirements.txt
```

### Import and Use the Model

You can load and use the `surya-ocr-arabic-segment` model as follows:

```python
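# Import `SegformerForRegressionMask` from the cloned Surya OCR repository.
# The `surya.surya` prefix assumes the directory that contains the cloned
# `surya` folder is on Python's import path (for example, the script runs
# from that directory).
from surya.surya.model.detection.segformer import SegformerForRegressionMask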
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SegformerForRegressionMask.from_pretrained("ketanmore/surya-ocr-arabic-segment", torch_dtype=torch.float32).to(device)
```
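
After loading, it can be useful to switch the model to inference mode and confirm where it ended up. The sketch below uses only standard PyTorch calls. `describe_model` is a hypothetical helper, and the `nn.Linear` module is a stand-in so the example runs without downloading the checkpoint.

```python
import torch
from torch import nn


def describe_model(model: nn.Module) -> dict:
    """Switch a model to inference mode and summarize its parameters."""
    model.eval()  # disable dropout and other training-only behaviour
    first_param = next(model.parameters())
    return {
        "parameters": sum(p.numel() for p in model.parameters()),
        "device": str(first_param.device),
        "dtype": str(first_param.dtype),
        "training": model.training,
    }


# Stand-in module for illustration; pass the `model` loaded above instead.
print(describe_model(nn.Linear(4, 2)))
```

With the real checkpoint, the reported device should match the `device` selected earlier and the dtype should be `torch.float32`.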