---
title: HNTAI - Medical Data Extraction API
emoji: 📉
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---

# HNTAI - Scalable Medical Data Extraction API

This is a scalable FastAPI-based API for extracting and processing medical data from various document formats, aligned with the "ChatGPT Version 3 - Scalable" architecture.

## Features

- Document text extraction (PDF, DOCX, images)
- Audio transcription
- Medical data extraction
- PHI (Protected Health Information) scrubbing with audit logging
- Text summarization with Redis caching
- PostgreSQL database integration for persistence
- Async processing for scalability
- Health endpoints (/health/live, /health/ready)
- Security features (non-root containers, secrets management, HIPAA compliance)

## Architecture Alignment

Fully aligned with "ChatGPT Version 3 - Scalable":

- FastAPI for async API handling
- Redis for caching and PHI stats
- PostgreSQL for audit logs and data persistence
- Kubernetes deployment with security contexts
- Network policies and HIPAA compliance
- Prometheus monitoring
- Proper resource limits and health probes

## Deployment Options

- **Hugging Face Spaces**: Lightweight Docker deployment (legacy)
- **Kubernetes**: Scalable production deployment with security features

## Environment Variables

- `DATABASE_URL`: PostgreSQL connection string
- `REDIS_URL`: Redis connection string
- `SECRET_KEY`: Application secret key
- `JWT_SECRET_KEY`: JWT signing key

## API Endpoints

- GET /health/live - Liveness health check
- GET /health/ready - Readiness health check
- GET /metrics - Prometheus metrics
- POST /generate_patient_summary - Generate comprehensive patient summaries (with streaming support)
- POST /upload - Upload and process medical documents
- GET /get_updated_medical_data - Retrieve processed medical data
- PUT /update_medical_data - Update medical data fields
- POST /transcribe - Transcribe audio files
- POST /extract_medical_data - Extract structured medical data
- POST /api/generate_summary - Generate text summaries
- POST /api/extract_medical_data_from_audio - Process audio recordings
- POST /api/patient_summary_openvino - Generate patient summaries using OpenVINO

## Development

### Code Quality

This project uses the following tools for code quality:

- **Black**: Code formatting
- **isort**: Import sorting
- **flake8**: Linting
- **mypy**: Type checking

Run quality checks:

```bash
black .
isort .
flake8 .
mypy .
```

### Testing

Run tests with:

```bash
python -m pytest
```

For more details, see the API documentation at `/docs`, [DEVELOPMENT.md](DEVELOPMENT.md) for development guides, and [DEPLOYMENT.md](DEPLOYMENT.md) for deployment instructions.
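
As an illustration of the four variables listed under Environment Variables, the sketch below shows one way an application could validate them at startup and fail fast when any is missing. It is not taken from the HNTAI codebase: the `Settings` class, the `load_settings` function, and the placeholder values are hypothetical names chosen for this example, and only the variable names themselves come from this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names as listed in the "Environment Variables" section.
REQUIRED_VARS = ("DATABASE_URL", "REDIS_URL", "SECRET_KEY", "JWT_SECRET_KEY")


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container (an assumption, not the project's own class)."""

    database_url: str
    redis_url: str
    secret_key: str
    jwt_secret_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the required variables from `env`; raise if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        secret_key=env["SECRET_KEY"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
    )


if __name__ == "__main__":
    # Placeholder values for demonstration only; real deployments would
    # inject these through the platform's secrets management.
    example_env = {
        "DATABASE_URL": "postgresql://user:password@localhost:5432/hntai",
        "REDIS_URL": "redis://localhost:6379/0",
        "SECRET_KEY": "change-me",
        "JWT_SECRET_KEY": "change-me-too",
    }
    settings = load_settings(example_env)
    print(settings.database_url)
```

Passing the environment as a mapping keeps the function easy to test with a plain dictionary, while the default of `os.environ` covers normal use.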