# HNTAI - Scalable Medical Data Extraction API - Development Guide

## Overview

This FastAPI-based application provides scalable medical data extraction services, fully aligned with the "ChatGPT Version 3 - Scalable" architecture. It features async processing, Redis caching, PostgreSQL persistence, and enterprise-grade security.

## Architecture

### Core Components

1. **FastAPI Application** (`app.py`)
   - Main application factory with lifespan events
   - CORS middleware for cross-origin requests
   - Centralized agent initialization
   - Route registration from APIRouter

2. **Configuration** (`config_settings.py`)
   - Pydantic-based settings with validation
   - Environment variable loading
   - Database and Redis URL configuration

3. **Inference Service** (`inference_service.py`)
   - Async text summarization using thread pools
   - Model caching for performance
   - Chunking for long text processing

4. **PHI Scrubber Service** (`phi_scrubber_service.py`)
   - Regex-based PHI detection and redaction
   - Audit logging to PostgreSQL
   - Redis-based statistics tracking

5. **API Routes** (`api/routes_fastapi.py`)
   - FastAPI APIRouter with async endpoints
   - Health checks (`/live`, `/ready`)
   - Placeholder routes for full migration

### Data Flow

```
Client Request → FastAPI → Route Handler → Agent/Service → Redis Cache → PostgreSQL → Response
```

## Development Setup

### Prerequisites

- Python 3.10+
- PostgreSQL 13+
- Redis 6+
- Docker (optional)

### Local Development

1. **Clone and Set Up Virtual Environment**

   ```bash
   git clone <repository-url>
   cd hntai
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

2. **Install Dependencies**

   ```bash
   pip install -r requirements.txt
   ```

3. **Set Up Database and Redis**

   ```bash
   # Start PostgreSQL (using Docker)
   docker run -d --name postgres -e POSTGRES_PASSWORD=password -p 5432:5432 postgres:13

   # Start Redis (using Docker)
   docker run -d --name redis -p 6379:6379 redis:6

   # Create database (connects to the container above over TCP; prompts for the password)
   createdb -h localhost -U postgres medical_ai
   ```

4.
   **Environment Variables**

   Create a `.env` file:

   ```bash
   DATABASE_URL=postgresql://postgres:password@localhost:5432/medical_ai
   REDIS_URL=redis://localhost:6379/0
   SECRET_KEY=your-secret-key-here
   JWT_SECRET_KEY=your-jwt-secret-key-here
   ```

5. **Run Database Migrations**

   ```bash
   # Apply schema
   psql -h localhost -U postgres -d medical_ai -f database/postgresql/001_schema.sql
   ```

6. **Run the Application**

   ```bash
   # Development mode
   python -m ai_med_extract.main

   # Or directly (create_app is an application factory, so pass --factory)
   uvicorn ai_med_extract.app:create_app --factory --reload --host 0.0.0.0 --port 7860
   ```

7. **Access the Application**
   - API: http://localhost:7860
   - Docs: http://localhost:7860/docs (FastAPI auto-generated)
   - Health: http://localhost:7860/live

### Debugging

1. **Enable Debug Logging**

   ```python
   import logging

   logging.basicConfig(level=logging.DEBUG)
   ```

2. **Run Uvicorn with Debug Logging**

   ```bash
   # Current Uvicorn releases have no --debug flag; use --log-level debug instead
   uvicorn ai_med_extract.app:create_app --factory --reload --log-level debug --host 0.0.0.0 --port 7860
   ```

3. **Test Endpoints**

   ```bash
   # Health check
   curl http://localhost:7860/live

   # API docs
   curl http://localhost:7860/openapi.json
   ```

4. **Database Debugging**

   ```bash
   # Connect to PostgreSQL
   psql -h localhost -U postgres -d medical_ai

   # Check PHI audit logs
   psql -h localhost -U postgres -d medical_ai -c "SELECT * FROM phi_audit_log LIMIT 10;"
   ```

5. **Redis Debugging**

   ```bash
   # Connect to Redis CLI
   redis-cli

   # Check keys (KEYS blocks the server while it runs; avoid it on large production instances)
   redis-cli KEYS '*'
   ```

## Production Deployment

### Option 1: Docker Deployment

1. **Build Docker Image**

   ```bash
   docker build -t hntai-api .
   ```

2. **Run Container**

   ```bash
   docker run -d \
     --name hntai-api \
     -p 7860:7860 \
     -e DATABASE_URL=postgresql://... \
     -e REDIS_URL=redis://... \
     -e SECRET_KEY=... \
     -e JWT_SECRET_KEY=... \
     hntai-api
   ```

### Option 2: Kubernetes Deployment

1. **Prerequisites**
   - Kubernetes cluster
   - kubectl configured
   - PostgreSQL and Redis services running

2. **Create Secrets**

   ```bash
   kubectl create secret generic medical-ai-secrets \
     --from-literal=DATABASE_URL=postgresql://... \
     --from-literal=REDIS_URL=redis://... \
     --from-literal=SECRET_KEY=... \
     --from-literal=JWT_SECRET_KEY=...
   ```

3.
   **Deploy to Kubernetes**

   ```bash
   kubectl apply -f infra/k8s/secure_deployment.yaml
   ```

4. **Verify Deployment**

   ```bash
   kubectl get pods -n medical-ai
   kubectl logs -n medical-ai deployment/medical-ai-service
   ```

### Option 3: Hugging Face Spaces (Legacy)

The application still supports HF Spaces deployment for lightweight use cases.

1. **Update app.py** for HF Spaces compatibility
2. **Deploy via HF Spaces** with Docker SDK

## Monitoring and Observability

### Prometheus Metrics

The application exposes metrics at the `/metrics` endpoint.

1. **Set Up Prometheus**

   ```bash
   kubectl apply -f monitoring/prometheus.yml
   ```

2. **Access Metrics**

   ```bash
   curl http://ai-service.medical-ai.svc.cluster.local:80/metrics
   ```

### Health Checks

- **Liveness** (`/live`): Basic health check
- **Readiness** (`/ready`): Checks if agents are initialized

### Logging

- Structured JSON logging
- PHI operations logged to database
- Error tracking with stack traces

## Security Features

### HIPAA Compliance

- PHI scrubbing with audit trails
- Non-root container execution
- Secrets management via Kubernetes
- Network policies restricting traffic

### Authentication

- JWT-based authentication (framework ready)
- API key support (configurable)

## API Usage

### Health Endpoints

```http
GET /live
GET /ready
```

### PHI Scrubbing

```http
POST /phi/scrub
Content-Type: application/json

{
  "text": "Patient John Doe, SSN 123-45-6789, diagnosed with diabetes."
}
```

Response:

```json
{
  "scrubbed_text": "Patient [REDACTED], SSN [REDACTED], diagnosed with diabetes.",
  "phi_found": ["NAME", "SSN"],
  "redaction_count": 2
}
```

### Text Summarization

```http
POST /api/generate_summary
Content-Type: application/json

{
  "text": "Long medical text...",
  "max_length": 150,
  "min_length": 50
}
```

### Generate Patient Summary

The `generate_patient_summary` endpoint has been migrated from the original Flask implementation to FastAPI.
It generates a comprehensive 4-section patient summary from EHR data, with support for streaming (SSE) to handle long-running tasks and prevent timeouts.

**Endpoint**: `POST /generate_patient_summary`

**Query Parameters**:

- `stream` (optional, default: `false`): Set to `true` for Server-Sent Events (SSE) streaming updates.

**Request Body** (JSON):

```json
{
  "patientid": "12345",
  "token": "your-auth-token",
  "key": "your-api-key",
  "patient_summarizer_model_name": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
  "patient_summarizer_model_type": "gguf",
  "generation_mode": "hq",
  "timeout_mode": "fast"
}
```

- `generation_mode`: `"hq"` (high-quality), `"fast"`, or `"rule"` (deterministic)
- `timeout_mode`: `"fast"` (8s EHR timeout) or `"extended"` (30s)

**Synchronous Response** (when `stream=false`):

```json
{
  "summary": "## Clinical Assessment\n- Patient details...\n\n## Key Trends & Changes\n- Changes detected...\n\n## Plan & Suggested Actions\n- Recommendations...\n\n## Direct Guidance for Physician\n- Clinical insights...",
  "baseline": "Patient baseline data...",
  "delta": "Changes from previous visits...",
  "timing": {"ehr_api": 2.5, "generation": 15.3, "total": 17.8},
  "model_used": "microsoft/Phi-3-mini-4k-instruct (gguf)",
  "timeout_mode_used": "fast"
}
```

**Streaming Response** (when `stream=true`):

- Returns a `text/event-stream` response with SSE events:
  - `type: progress` - Progress updates (e.g., 10%, 50%)
  - `type: complete` - Final result with full summary
  - `type: error` - Error details if failed
  - `type: heartbeat` - Keep-alive signals

**Notes**:

- The endpoint integrates with an external EHR API to fetch patient data.
- Supports multiple model types: GGUF, text-generation, summarization, seq2seq.
- Includes fallbacks for timeouts, API errors, and model failures.
- PHI scrubbing is applied automatically.
- Full implementation includes delta computation, baseline building, and 4-section markdown output.

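
A client of the streaming mode has to split the `text/event-stream` body into events and react to each event type. The following is a minimal, standard-library sketch of that step. It assumes each event carries a JSON object in its `data:` field with a `type` key matching the names listed above; this guide does not specify the wire format, so the payload fields (`percent`, `summary`, `message`) and the helper names `iter_sse_events` and `collect_summary` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per SSE event.

    Assumption: every event carries a JSON object in its `data:` field(s).
    """
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # Per the SSE format, one leading space after the colon is dropped.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    if data_parts:
        # Flush a final event that was not followed by a blank line.
        yield json.loads("\n".join(data_parts))


def collect_summary(lines: Iterable[str]) -> dict:
    """Consume the stream and return the payload of the `complete` event."""
    for event in iter_sse_events(lines):
        kind = event.get("type")
        if kind == "progress":
            print(f"progress: {event.get('percent')}%")  # hypothetical field
        elif kind == "heartbeat":
            continue  # keep-alive only, nothing to do
        elif kind == "error":
            raise RuntimeError(event.get("message", "summary generation failed"))
        elif kind == "complete":
            return event
    raise RuntimeError("stream ended without a complete event")


if __name__ == "__main__":
    sample_stream = [
        'data: {"type": "progress", "percent": 10}\n',
        "\n",
        'data: {"type": "heartbeat"}\n',
        "\n",
        'data: {"type": "complete", "summary": "## Clinical Assessment"}\n',
        "\n",
    ]
    print(collect_summary(sample_stream)["summary"])
```

Because `collect_summary` accepts any iterable of text lines, it can be fed the line iterator of a streaming HTTP response as well as a canned list like the one in the demo.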
### Other Endpoints (Migration in Progress)

- `POST /upload` - File upload and text extraction
- `POST /transcribe` - Audio transcription
- `POST /extract_medical_data` - Structured medical data extraction
- `POST /api/extract_medical_data_from_audio` - Audio-based medical extraction

## Troubleshooting

### Common Issues

1. **Model Loading Failures**
   - Check `HF_HOME` and cache directories
   - Ensure sufficient memory
   - Verify internet connectivity for model downloads

2. **Database Connection Errors**
   - Verify `DATABASE_URL` format
   - Check PostgreSQL service status
   - Ensure database exists and schema applied

3. **Redis Connection Issues**
   - Verify `REDIS_URL` format
   - Check Redis service availability
   - Monitor Redis memory usage

4. **PHI Scrubbing Not Working**
   - Check regex patterns in `phi_scrubber_service.py`
   - Verify Redis connection for stats
   - Check database audit logs

### Performance Tuning

- Adjust thread pools in `inference_service.py`
- Configure Redis connection pooling
- Set appropriate resource limits in K8s
- Monitor memory usage for model caching

## Contributing

1. Follow async/await patterns for new endpoints
2. Add proper error handling and logging
3. Update tests for new functionality
4. Ensure HIPAA compliance for PHI handling
5. Document API changes in this guide
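
Contribution guidelines 1 and 2, together with the thread-pool tuning note above, can be illustrated with a small sketch of calling a blocking model function from async code. It uses only the standard library. The names `blocking_summarize` and `summarize_async`, the pool size of 4, and the 30-second timeout are hypothetical stand-ins, not the contents of `inference_service.py`.

```python
import asyncio
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger("inference_sketch")

# Hypothetical pool size; tune it to CPU count and model memory footprint.
_EXECUTOR = ThreadPoolExecutor(max_workers=4)


def blocking_summarize(text: str, max_length: int = 150) -> str:
    """Stand-in for a CPU-bound model call; here it only truncates."""
    return text[:max_length]


async def summarize_async(
    text: str, max_length: int = 150, timeout: float = 30.0
) -> str:
    """Run the blocking call in the pool so the event loop stays responsive."""
    if not text.strip():
        raise ValueError("text must not be empty")
    loop = asyncio.get_running_loop()
    future = loop.run_in_executor(_EXECUTOR, blocking_summarize, text, max_length)
    try:
        return await asyncio.wait_for(future, timeout=timeout)
    except asyncio.TimeoutError:
        # Log metadata only; the text itself may contain PHI.
        # Note: the worker thread keeps running until the call returns.
        logger.error(
            "summarization timed out after %.1fs (input length %d)",
            timeout,
            len(text),
        )
        raise


async def main() -> None:
    # Two requests share the pool and run concurrently.
    results = await asyncio.gather(
        summarize_async("a" * 500, max_length=10),
        summarize_async("short note"),
    )
    print(results)


if __name__ == "__main__":
    asyncio.run(main())
```

An async route handler would simply `await summarize_async(...)` and translate `ValueError` or a timeout into the appropriate HTTP error response.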