# This requirements.txt file is intended for use with Hugging Face Spaces (Python 3.10), and so is not intended as the file to use to install packages for a local install.
# Please refer to the README.md file for install instructions for the app.

# --- Core and data packages ---
numpy<=2.4.4
pandas<=2.3.3
bleach<=6.3.0
polars<=1.38.1
pyarrow<=24.0.0
openpyxl<=3.1.5
boto3<=1.42.91
python-dotenv<=1.2.2
defusedxml<=0.7.1
Faker<=40.8.0
python-levenshtein<=0.27.3
rapidfuzz<=3.14.5
markdown<=3.10.2
tabulate<=0.10.0

# --- PDF / OCR / Redaction tools ---
pdfminer.six<=20260107
pdf2image<=1.17.0
pymupdf<=1.27.1
pikepdf<=10.3.0
opencv-python<=4.13.0.92
presidio_analyzer<=2.2.362
presidio_anonymizer<=2.2.362
presidio-image-redactor<=0.0.58

# --- Document generation ---
python-docx<=1.2.0

# --- Gradio and apps ---
gradio[mcp]==6.10.0
gradio-pdf-redaction<=0.0.25
gradio_image_annotation_redaction==0.5.5 # Custom annotator version with rotation, zoom, labels, and box IDs
spaces

# --- AWS Lambda runtime ---
awslambdaric<=3.1.1

# --- Machine learning / NLP ---
scikit-learn<=1.8.0
spacy<=3.8.14
en_core_web_lg @ https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.8.0/en_core_web_lg-3.8.0.tar.gz
spaczz<=0.6.1
transformers<=5.12.0
accelerate<=1.13.0
bitsandbytes<=0.49.2
sentencepiece<=0.2.1
optimum<=2.1.0

# --- Testing ---
pytest<=9.0.3
pytest-cov<=7.1.0

# --- LLM libraries ---
google-genai<=1.73.0
openai<=2.31.0

# --- PyTorch (CUDA 12.8) ---
--extra-index-url https://download.pytorch.org/whl/cu128
torch==2.8.0
torchvision==0.23.0

# flash-attn is optional (USE_FLASH_ATTENTION=True). Official cp310 wheels exist for torch<=2.8 only;
# with torch 2.9 + Python 3.10 the Space uses sdpa attention instead (no extra package needed).
flash-attn @ https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.8cxx11abiTRUE-cp310-cp310-linux_x86_64.whl