HNTAI / DEVELOPMENT.md
Adhil Krishna G

HNTAI - Scalable Medical Data Extraction API - Development Guide

Overview

This FastAPI-based application provides scalable medical data extraction services, aligned with the "ChatGPT Version 3 - Scalable" architecture. It features async processing, Redis caching, PostgreSQL persistence, and security controls including PHI scrubbing with audit trails and Kubernetes-managed secrets.

Architecture

Core Components

  1. FastAPI Application (app.py)

    • Main application factory with lifespan events
    • CORS middleware for cross-origin requests
    • Centralized agent initialization
    • Route registration from APIRouter
  2. Configuration (config_settings.py)

    • Pydantic-based settings with validation
    • Environment variable loading
    • Database and Redis URL configuration
  3. Inference Service (inference_service.py)

    • Async text summarization using thread pools
    • Model caching for performance
    • Chunking for long text processing
  4. PHI Scrubber Service (phi_scrubber_service.py)

    • Regex-based PHI detection and redaction
    • Audit logging to PostgreSQL
    • Redis-based statistics tracking
  5. API Routes (api/routes_fastapi.py)

    • FastAPI APIRouter with async endpoints
    • Health checks (/live, /ready)
    • Placeholder routes for endpoints that have not yet been migrated

Data Flow

Client Request → FastAPI → Route Handler → Agent/Service → Redis Cache → PostgreSQL → Response
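
One way to read the flow above is as a cache-aside pattern: a handler checks the cache first, calls the service on a miss, then stores the result in the cache and records it in the database. The sketch below assumes that reading and uses in-memory stand-ins (a dict for Redis, a list for PostgreSQL) so it runs without external services; handle_request, cache_key, and InMemoryCache are hypothetical names, not part of the codebase.

```python
import hashlib
import json
from typing import Callable


class InMemoryCache:
    """Stand-in for Redis: string keys mapped to string values."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str):
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


def cache_key(route: str, payload: dict) -> str:
    # Hash the payload so raw request text never appears in a cache key.
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return f"{route}:{hashlib.sha256(body).hexdigest()}"


def handle_request(
    route: str,
    payload: dict,
    cache: InMemoryCache,
    store: list,
    compute: Callable[[dict], dict],
) -> dict:
    key = cache_key(route, payload)
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: skip the service call

    result = compute(payload)  # agent/service call
    cache.set(key, json.dumps(result))
    # Stand-in for a PostgreSQL insert (for example, an audit row).
    store.append({"route": route, "key": key})
    return result
```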

Development Setup

Prerequisites

  • Python 3.10+
  • PostgreSQL 13+
  • Redis 6+
  • Docker (optional)

Local Development

  1. Clone and Setup Virtual Environment

    git clone <repository>
    cd hntai
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  2. Install Dependencies

    pip install -r requirements.txt
    
  3. Setup Database and Redis

    # Start PostgreSQL (using Docker)
    docker run -d --name postgres -e POSTGRES_PASSWORD=password -p 5432:5432 postgres:13
    
    # Start Redis (using Docker)
    docker run -d --name redis -p 6379:6379 redis:6
    
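    # Create database (connects to the PostgreSQL container started above; prompts for the password)
    createdb -h localhost -U postgres medical_ai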
    
  4. Environment Variables

    Create a .env file:

    DATABASE_URL=postgresql://postgres:password@localhost:5432/medical_ai
    REDIS_URL=redis://localhost:6379/0
    SECRET_KEY=your-secret-key-here
    JWT_SECRET_KEY=your-jwt-secret-key-here
    
  5. Run Database Migrations

    # Apply schema (same host and user as the PostgreSQL container)
    psql -h localhost -U postgres -d medical_ai -f database/postgresql/001_schema.sql
    
  6. Run the Application

    # Development mode
    python -m ai_med_extract.main
    
    # Or directly
    uvicorn ai_med_extract.app:create_app --factory --reload --host 0.0.0.0 --port 7860
    
  7. Access the Application

    • API base URL: http://localhost:7860
    • OpenAPI schema: http://localhost:7860/openapi.json

Debugging

  1. Enable Debug Logging

    import logging
    logging.basicConfig(level=logging.DEBUG)
    
  2. Run Uvicorn with Debug Logging

    uvicorn ai_med_extract.app:create_app --factory --reload --log-level debug --host 0.0.0.0 --port 7860
    
  3. Test Endpoints

    # Health check
    curl http://localhost:7860/live
    
    # OpenAPI schema (FastAPI serves interactive docs at /docs by default)
    curl http://localhost:7860/openapi.json
    
  4. Database Debugging

    # Connect to PostgreSQL
    psql -h localhost -U postgres -d medical_ai
    
    -- Inside psql: check PHI audit logs
    SELECT * FROM phi_audit_log LIMIT 10;
    
  5. Redis Debugging

    # Connect to Redis CLI
    redis-cli
    
    # List keys (development only: KEYS blocks Redis while it scans; prefer SCAN on busy instances)
    KEYS *
    

Production Deployment

Option 1: Docker Deployment

  1. Build Docker Image

    docker build -t hntai-api .
    
  2. Run Container

    docker run -d \
      --name hntai-api \
      -p 7860:7860 \
      -e DATABASE_URL=postgresql://... \
      -e REDIS_URL=redis://... \
      -e SECRET_KEY=... \
      -e JWT_SECRET_KEY=... \
      hntai-api
    

Option 2: Kubernetes Deployment

  1. Prerequisites

    • Kubernetes cluster
    • kubectl configured
    • PostgreSQL and Redis services running
  2. Create Secrets

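    # Create the secret in the same namespace as the deployment
    # (medical-ai, as used in the verification step; the namespace must already exist)
    kubectl create secret generic medical-ai-secrets \
      --namespace medical-ai \
      --from-literal=DATABASE_URL=postgresql://... \
      --from-literal=REDIS_URL=redis://... \
      --from-literal=SECRET_KEY=... \
      --from-literal=JWT_SECRET_KEY=...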
    
  3. Deploy to Kubernetes

    kubectl apply -f infra/k8s/secure_deployment.yaml
    
  4. Verify Deployment

    kubectl get pods -n medical-ai
    kubectl logs -n medical-ai deployment/medical-ai-service
    

Option 3: Hugging Face Spaces (Legacy)

The application still supports HF Spaces deployment for lightweight use cases.

  1. Update app.py for HF Spaces compatibility
  2. Deploy via HF Spaces with Docker SDK

Monitoring and Observability

Prometheus Metrics

The application exposes Prometheus metrics at the /metrics endpoint.

  1. Setup Prometheus

    kubectl apply -f monitoring/prometheus.yml
    
  2. Access Metrics

    # Run from a pod inside the cluster; this service DNS name does not resolve from outside
    curl http://ai-service.medical-ai.svc.cluster.local:80/metrics
    

Health Checks

  • Liveness (/live): Basic health check
  • Readiness (/ready): Checks if agents are initialized
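
Because readiness depends on agent initialization, deployment scripts and tests may need to wait for /ready before sending traffic. A small polling helper along these lines could do that with only the standard library. The names probe and wait_until_ready are hypothetical, and the sketch assumes both endpoints answer HTTP 200 when healthy.

```python
import time
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection errors, timeouts, and non-2xx responses
        # (urlopen raises HTTPError, a URLError subclass, for those).
        return False


def wait_until_ready(base_url: str, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll /live and /ready until both succeed or attempts run out."""
    for _ in range(attempts):
        if probe(f"{base_url}/live") and probe(f"{base_url}/ready"):
            return True
        time.sleep(delay)
    return False
```

For example, wait_until_ready("http://localhost:7860") would block for up to about 30 seconds after starting the server locally.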

Logging

  • Structured JSON logging
  • PHI operations logged to database
  • Error tracking with stack traces
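
Structured JSON logging with stack traces can be approximated with a custom formatter from the standard logging module. The sketch below is an illustration under assumptions: the field names (timestamp, level, logger, message, stack_trace) are chosen for the example and may differ from what the application emits.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback so errors can be tracked downstream.
            entry["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def configure_logging(level: int = logging.INFO) -> None:
    """Route all root-logger output through the JSON formatter."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)
```

Log messages themselves should not contain raw PHI; only scrubbed text or identifiers belong in the message field.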

Security Features

HIPAA Compliance

  • PHI scrubbing with audit trails
  • Non-root container execution
  • Secrets management via Kubernetes
  • Network policies restricting traffic

Authentication

  • JWT-based authentication (framework ready)
  • API key support (configurable)

API Usage

Health Endpoints

GET /live
GET /ready

PHI Scrubbing

POST /phi/scrub
Content-Type: application/json

{
  "text": "Patient John Doe, SSN 123-45-6789, diagnosed with diabetes."
}

Response:

{
  "scrubbed_text": "Patient [REDACTED], SSN [REDACTED], diagnosed with diabetes.",
  "phi_found": ["NAME", "SSN"],
  "redaction_count": 2
}
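
To make the response shape concrete, here is a minimal sketch of regex-based redaction that reproduces the example above. The patterns are illustrative assumptions, not the ones in phi_scrubber_service.py. The NAME pattern in particular only catches capitalized names directly after the word "Patient", which is far too narrow for real PHI detection. Audit logging and Redis statistics are left out.

```python
import re

# Hypothetical patterns, applied in order; labels mirror the example response.
PHI_PATTERNS = [
    ("NAME", re.compile(r"(?<=Patient )[A-Z][a-z]+(?: [A-Z][a-z]+)+")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
]


def scrub_phi(text: str, placeholder: str = "[REDACTED]") -> dict:
    """Replace every match of each pattern and report what was found."""
    phi_found = []
    redaction_count = 0
    for label, pattern in PHI_PATTERNS:
        text, replaced = pattern.subn(placeholder, text)
        if replaced:
            phi_found.append(label)
            redaction_count += replaced
    return {
        "scrubbed_text": text,
        "phi_found": phi_found,
        "redaction_count": redaction_count,
    }
```

Calling scrub_phi with the request text shown above yields the same three fields as the documented response.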

Text Summarization

POST /api/generate_summary
Content-Type: application/json

{
  "text": "Long medical text...",
  "max_length": 150,
  "min_length": 50
}
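
The Inference Service is described earlier as running summarization in thread pools and chunking long text. A rough sketch of that combination follows. The summarize_chunk function is a placeholder that truncates its input instead of calling a model, and the word-based chunk size is an assumption; min_length is omitted because the placeholder has no use for it.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def chunk_text(text: str, max_words: int = 200) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_chunk(chunk: str, max_length: int = 150) -> str:
    """Placeholder for a blocking model call: keeps the first max_length characters."""
    return chunk[:max_length]


async def summarize(
    text: str,
    executor: ThreadPoolExecutor,
    max_words: int = 200,
    max_length: int = 150,
) -> str:
    """Summarize each chunk in the thread pool without blocking the event loop."""
    loop = asyncio.get_running_loop()
    tasks = [
        loop.run_in_executor(executor, summarize_chunk, chunk, max_length)
        for chunk in chunk_text(text, max_words)
    ]
    parts = await asyncio.gather(*tasks)  # results keep chunk order
    return " ".join(parts)
```

Sharing one executor across requests, rather than creating one per call, keeps the number of concurrent model invocations bounded by max_workers.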

Generate Patient Summary

The generate_patient_summary endpoint has been migrated from the original Flask implementation to FastAPI. It generates a comprehensive 4-section patient summary from EHR data, with support for streaming (SSE) to handle long-running tasks and prevent timeouts.

Endpoint: POST /generate_patient_summary

Query Parameters:

  • stream (optional, default: false): Set to true for Server-Sent Events (SSE) streaming updates.

Request Body (JSON):

{
  "patientid": "12345",
  "token": "your-auth-token",
  "key": "your-api-key",
  "patient_summarizer_model_name": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
  "patient_summarizer_model_type": "gguf",
  "generation_mode": "hq",
  "timeout_mode": "fast"
}

Field options (JSON does not allow comments, so they are listed here):

  • generation_mode: "hq" (high-quality), "fast", or "rule" (deterministic)
  • timeout_mode: "fast" (8s EHR timeout) or "extended" (30s)
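
A client can validate generation_mode and timeout_mode before sending the request, which avoids a round trip for a mistyped option. The helper below is a hypothetical client-side sketch. The field names and allowed values come from the request body above; build_summary_request itself is not part of the API.

```python
import json

GENERATION_MODES = {"hq", "fast", "rule"}
# EHR timeout in seconds for each documented timeout_mode.
EHR_TIMEOUT_SECONDS = {"fast": 8, "extended": 30}


def build_summary_request(
    patientid: str,
    token: str,
    key: str,
    model_name: str,
    model_type: str,
    generation_mode: str = "hq",
    timeout_mode: str = "fast",
) -> str:
    """Return the JSON body for POST /generate_patient_summary."""
    if generation_mode not in GENERATION_MODES:
        raise ValueError(f"generation_mode must be one of {sorted(GENERATION_MODES)}")
    if timeout_mode not in EHR_TIMEOUT_SECONDS:
        raise ValueError(f"timeout_mode must be one of {sorted(EHR_TIMEOUT_SECONDS)}")
    payload = {
        "patientid": patientid,
        "token": token,
        "key": key,
        "patient_summarizer_model_name": model_name,
        "patient_summarizer_model_type": model_type,
        "generation_mode": generation_mode,
        "timeout_mode": timeout_mode,
    }
    return json.dumps(payload)
```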

Synchronous Response (when stream=false):

{
  "summary": "## Clinical Assessment\n- Patient details...\n\n## Key Trends & Changes\n- Changes detected...\n\n## Plan & Suggested Actions\n- Recommendations...\n\n## Direct Guidance for Physician\n- Clinical insights...",
  "baseline": "Patient baseline data...",
  "delta": "Changes from previous visits...",
  "timing": {"ehr_api": 2.5, "generation": 15.3, "total": 17.8},
  "model_used": "microsoft/Phi-3-mini-4k-instruct (gguf)",
  "timeout_mode_used": "fast"
}

Streaming Response (when stream=true):

  • Returns a text/event-stream response with SSE events:
    • type: progress - Progress updates (e.g., 10%, 50%)
    • type: complete - Final result with full summary
    • type: error - Error details if failed
    • type: heartbeat - Keep-alive signals
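
On the client side, the stream can be consumed by parsing data: lines until a complete or error event arrives. The sketch below assumes each event carries a JSON object with a type field in its data: payload; the other field names used in the example (such as message) are hypothetical. It takes any iterable of text lines, so it can wrap a streaming HTTP response or a list of strings.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each SSE event in a stream of lines."""
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, one leading space after the colon is dropped.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (starting with ":") and other fields are ignored.


def wait_for_summary(lines: Iterable[str]) -> dict:
    """Skip progress and heartbeat events; return the complete event."""
    for event in iter_sse_events(lines):
        kind = event.get("type")
        if kind == "complete":
            return event
        if kind == "error":
            raise RuntimeError(event.get("message", "summary generation failed"))
    raise RuntimeError("stream ended before a complete event")
```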

Notes:

  • The endpoint integrates with an external EHR API to fetch patient data.
  • Supports multiple model types: GGUF, text-generation, summarization, seq2seq.
  • Includes fallbacks for timeouts, API errors, and model failures.
  • PHI scrubbing is applied automatically.
  • Full implementation includes delta computation, baseline building, and 4-section markdown output.

Other Endpoints (Migration in Progress)

  • POST /upload - File upload and text extraction
  • POST /transcribe - Audio transcription
  • POST /extract_medical_data - Structured medical data extraction
  • POST /api/extract_medical_data_from_audio - Audio-based medical extraction

Troubleshooting

Common Issues

  1. Model Loading Failures

    • Check HF_HOME and cache directories
    • Ensure sufficient memory
    • Verify internet connectivity for model downloads
  2. Database Connection Errors

    • Verify DATABASE_URL format
    • Check PostgreSQL service status
    • Ensure database exists and schema applied
  3. Redis Connection Issues

    • Verify REDIS_URL format
    • Check Redis service availability
    • Monitor Redis memory usage
  4. PHI Scrubbing Not Working

    • Check regex patterns in phi_scrubber_service.py
    • Verify Redis connection for stats
    • Check database audit logs
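
Several of the checks above come down to malformed or missing environment variables, so validating them at startup can surface the problem early. The sketch below is a standard-library stand-in for that kind of check, not the Pydantic settings class in config_settings.py. The Settings dataclass, check_url, and load_settings are hypothetical names, and the accepted URL schemes are assumptions.

```python
import os
from dataclasses import dataclass
from urllib.parse import urlsplit

REQUIRED = ("DATABASE_URL", "REDIS_URL", "SECRET_KEY", "JWT_SECRET_KEY")


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    secret_key: str
    jwt_secret_key: str


def check_url(name: str, value: str, schemes: set, need_path: bool = False) -> str:
    """Raise ValueError unless value looks like scheme://host[/path]."""
    parts = urlsplit(value)
    # Allow driver suffixes such as postgresql+asyncpg.
    if parts.scheme.split("+")[0] not in schemes:
        raise ValueError(f"{name}: unexpected scheme {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError(f"{name}: missing host")
    if need_path and not parts.path.strip("/"):
        raise ValueError(f"{name}: missing database name")
    return value


def load_settings(env=None) -> Settings:
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=check_url(
            "DATABASE_URL", env["DATABASE_URL"], {"postgresql", "postgres"}, need_path=True
        ),
        redis_url=check_url("REDIS_URL", env["REDIS_URL"], {"redis", "rediss"}),
        secret_key=env["SECRET_KEY"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
    )
```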

Performance Tuning

  • Adjust thread pools in inference_service.py
  • Configure Redis connection pooling
  • Set appropriate resource limits in K8s
  • Monitor memory usage for model caching
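
Model caching trades memory for load time, so bounding the cache is one way to keep memory predictable. The sketch below uses functools.lru_cache with a small maxsize so that the least recently used model is evicted once the limit is reached. Here load_model is a placeholder that records calls instead of loading weights, and both function names are hypothetical.

```python
from functools import lru_cache

LOADS: list[tuple[str, str]] = []  # records each real load, for illustration


def load_model(model_name: str, model_type: str) -> dict:
    """Placeholder for an expensive model load."""
    LOADS.append((model_name, model_type))
    return {"name": model_name, "type": model_type}


@lru_cache(maxsize=2)
def get_model(model_name: str, model_type: str) -> dict:
    """Return a cached model, loading it only on a cache miss."""
    return load_model(model_name, model_type)
```

lru_cache keeps its bookkeeping consistent across threads, but two threads that miss on the same key at the same time may both run the load, so a lock around get_model may be worth adding if loads are very expensive.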

Contributing

  1. Follow async/await patterns for new endpoints
  2. Add proper error handling and logging
  3. Update tests for new functionality
  4. Ensure HIPAA compliance for PHI handling
  5. Document API changes in this guide