---
license: mit
---

![alt text](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSksxbjFqxppkfVHAN30x6JjNc_3JGeGILZPA&s "Title")

# Fine-Tuning Script for the Surya OCR Layout Model

This repository contains the [layout-fine-tune.ipynb](https://huggingface.co/ketanmore/surya-layout-fine-tune-script/blob/main/layout-fine-tune.ipynb) notebook. Use it to fine-tune the [Surya Layout Model](https://huggingface.co/vikp/surya_layout2), which uses a modified SegFormer architecture.

## Setup Instructions

### Clone the Surya OCR GitHub Repository

```bash
git clone https://github.com/VikParuchuri/surya.git
cd surya
```

### Switch to v0.4.14

```bash
git checkout f7c6c04
```

### Install Dependencies

Install the required dependencies from inside the cloned repository:

```bash
pip install -r requirements.txt
```

## Image Pre-processing

For image pre-processing, import the `prepare_image_detection` function and the image processor loader directly from the [Surya OCR GitHub repository](https://github.com/VikParuchuri/surya/tree/v0.4.14).

```python
from surya.input.processing import prepare_image_detection
```

```python
from surya.model.detection.segformer import load_processor
```

```python
from PIL import Image

# Load the image and make sure it has three color channels
image = Image.open("path/to/image").convert("RGB")
# Resize and normalize the image into a tensor with the Surya image processor
images = [prepare_image_detection(img=image, processor=load_processor())]
```

```python
import torch

# Stack into one batch and match the model's dtype and device.
# `model` must already be loaded (see the model-loading section below).
images = torch.stack(images, dim=0).to(model.dtype).to(model.device)
```
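
When fine-tuning on more than one image, it is usually cheaper to load the processor once and reuse it for every image instead of calling `load_processor()` per image. The following is a minimal sketch of that idea, not part of the notebook: `build_batch`, `preprocess`, and `fake_preprocess` are hypothetical names, and the stand-in preprocessing only exists so the example runs without Surya installed.

```python
import torch


def build_batch(images, preprocess):
    """Apply `preprocess` to each image and stack the results into one
    (batch, channels, height, width) tensor. Hypothetical helper."""
    tensors = [preprocess(image) for image in images]
    return torch.stack(tensors, dim=0)


# Stand-in preprocessing, for illustration only. With Surya this could be:
#   processor = load_processor()
#   preprocess = lambda im: prepare_image_detection(img=im, processor=processor)
def fake_preprocess(image):
    return torch.zeros(3, 8, 8)


batch = build_batch(["page-1", "page-2"], fake_preprocess)
print(tuple(batch.shape))
```

The resulting tensor can then be cast to the model's dtype and moved to its device exactly as in the stacking step above.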

## Loading Model

```python
from surya.model.detection.segformer import load_model
```

```python
# Load the pretrained layout checkpoint from the Hugging Face Hub
model = load_model("vikp/surya_layout2")
```

```python
# Forward pass on the pre-processed batch
output = model(pixel_values=images)
```

### Note: Loss Function

The [Surya Layout Model](https://huggingface.co/vikp/surya_layout2) does not have a predefined loss function. You have to define one according to your dataset and requirements.
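
As one possible starting point, a per-pixel binary cross-entropy combined with a soft Dice term is a common choice for segmentation-style outputs. The sketch below is an assumption, not the loss used to train the original model: `layout_loss`, its parameters, the tensor shapes, and the stand-in `toy_model` are all hypothetical, and the small convolution only exists so the example runs without Surya installed.

```python
import torch
import torch.nn.functional as F


def layout_loss(pred, target, from_logits=True, dice_weight=0.5, eps=1e-6):
    """Per-pixel BCE plus soft Dice loss (hypothetical helper).

    pred:   float tensor of shape (batch, classes, H, W)
    target: float mask tensor with values in [0, 1]; it is resized to the
            spatial size of `pred` when the two differ
    """
    if target.shape[-2:] != pred.shape[-2:]:
        target = F.interpolate(target, size=pred.shape[-2:], mode="nearest")
    if from_logits:
        bce = F.binary_cross_entropy_with_logits(pred, target)
        prob = torch.sigmoid(pred)
    else:
        # Predictions are already probabilities; clamp for numerical safety
        prob = pred.clamp(eps, 1 - eps)
        bce = F.binary_cross_entropy(prob, target)
    # Soft Dice, computed per class over the batch and spatial dimensions
    dims = (0, 2, 3)
    intersection = (prob * target).sum(dims)
    total = prob.sum(dims) + target.sum(dims)
    dice = 1 - ((2 * intersection + eps) / (total + eps)).mean()
    return bce + dice_weight * dice


# One optimization step with a stand-in network (assumption: 2 output classes)
torch.manual_seed(0)
toy_model = torch.nn.Conv2d(3, 2, kernel_size=3, padding=1)
optimizer = torch.optim.AdamW(toy_model.parameters(), lr=1e-3)

pixel_values = torch.rand(2, 3, 32, 32)
masks = (torch.rand(2, 2, 32, 32) > 0.5).float()

toy_model.train()
loss = layout_loss(toy_model(pixel_values), masks, from_logits=True)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

With the real model, the call to `toy_model` would be replaced by the prediction tensor taken from `output`. Check whether that tensor has already been passed through a sigmoid; if it has, call the function with `from_logits=False`.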