fastapi
uvicorn
python-multipart
torch
datasets
pdf2image
accelerate
pytesseract
transformers[torch,sentencepiece]
haystack-ai
qdrant-haystack
fastembed-haystack
scikit-learn
pdfplumber