# Folder structure
```
ORAL_PDF_QA/
├── __pycache__/
├── bge_model_ctranslate2/
├── data/
├── parsed/
├── logs/
├── pdf/
├── pictures/
├── tables/
├── venv/
├── .gitignore
├── chroma_service.py
├── config.py
├── gradio_demo.py
├── pdf_parsing_service.py
├── questions.txt
├── README.MD
├── requirements.txt
└── utils.py
```
# Download
```
pip install -r requirements.txt
```
Download the `bge_model_ctranslate2` embedding model.<br>
Download the `parsed` folder from https://drive.google.com/drive/folders/174I-pX1f7_mGG28Wwd9JPOgnOS5O16BA?usp=sharing<br>
Download the `tables` folder (extracted tables) from https://drive.google.com/drive/folders/12r0F_Ce25kecUSzp_HvjHjhrV6LbyYyx?usp=sharing<br>
Download the `pictures` folder (extracted pictures) from https://drive.google.com/drive/folders/1EvTLNNrBvQr-_lIzZSRL8ayrevKTmtJK?usp=sharing<br>
Place each downloaded folder in the project root, matching the folder structure above.<br>
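
Before running anything, it can help to confirm that the downloads landed in the right place. The script below is a minimal sketch and not part of this repo: `missing_dirs` and `REQUIRED_DIRS` are hypothetical names, and it assumes the four downloaded folders sit directly under the project root, as in the folder structure above.

```python
from pathlib import Path

# Folder names come from the download steps; their location directly under
# the project root is an assumption based on the folder structure listing.
REQUIRED_DIRS = ["bge_model_ctranslate2", "parsed", "tables", "pictures"]


def missing_dirs(root="."):
    """Return the required folders that are absent or empty under `root`."""
    root_path = Path(root)
    missing = []
    for name in REQUIRED_DIRS:
        folder = root_path / name
        # An empty folder is reported too, since an empty download target
        # is as unusable as an absent one.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_dirs()
    if absent:
        print("Missing or empty folders:", ", ".join(absent))
    else:
        print("All downloaded folders are in place.")
```

Save it under any name in the project root and run it with `python`. It only checks that each folder exists and is non-empty; it does not validate the contents.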
# Usage
```
python chroma_service.py
```
```
python gradio_demo.py
```