fastapi
uvicorn
python-multipart
torch
datasets
pdf2image
accelerate
pytesseract
transformers[torch,sentencepiece]
haystack-ai
qdrant-haystack
fastembed-haystack
scikit-learn
pdfplumber
unstructured
pdfminer.six
pillow_heif
unstructured_inference
unstructured_pytesseract
opencv-python-headless