Spaces:
Running
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
Hindi & English OCR with Keyword Search
This project implements a web-based prototype for Optical Character Recognition (OCR) on images containing text in both Hindi and English. It also includes a basic keyword search functionality based on the extracted text.
Features
- Upload and process images containing Hindi and English text
- Extract text from images using OCR
- Perform keyword search on the extracted text
- Web-based interface for easy interaction
Technology Stack
- Python
- Hugging Face Transformers (Qwen2-VL-2B-Instruct model)
- PyTorch
- Gradio (for web interface)
Setup and Installation
Clone the repository:
git clone [your-repo-url] cd [your-repo-name]
Install the required dependencies:
pip install transformers torch gradio Pillow
Download the Qwen2-VL-2B-Instruct model: The model will be automatically downloaded when you run the application for the first time.
Usage
Run the application:
python app.py
Open the provided URL in your web browser.
Upload an image containing Hindi and/or English text.
(Optional) Enter a keyword to search within the extracted text.
View the OCR results and any keyword matches.
Limitations
- The current implementation uses CPU for processing, which may be slower for large images.
Future Improvements
- Implement GPU support for faster processing
- Add support for multiple image uploads
- Enhance the user interface for better user experience
Link
https://huggingface.co/spaces/pranshh/ocr-assignment
Acknowledgements
This project uses the Qwen2-VL-2B-Instruct model from Hugging Face Transformers.