HNTAI - Scalable Medical Data Extraction API - Development Guide
Overview
This FastAPI-based application provides scalable medical data extraction services, fully aligned with the "ChatGPT Version 3 - Scalable" architecture. It features async processing, Redis caching, PostgreSQL persistence, and enterprise-grade security.
Architecture
Core Components
FastAPI Application (app.py)
- Main application factory with lifespan events
- CORS middleware for cross-origin requests
- Centralized agent initialization
- Route registration from APIRouter
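To make the lifespan idea concrete, here is a minimal sketch of a startup/shutdown hook that initializes agents once and tears them down on exit. The agent names, the `ready` flag, and the `SimpleNamespace` stand-in for the application object are assumptions made so the example runs without FastAPI installed; they are not taken from app.py.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace


@asynccontextmanager
async def lifespan(app):
    # Startup: build shared agents once and keep them on app.state.
    # The two agent names are placeholders, not the project's real agents.
    app.state.agents = {"summarizer": object(), "phi_scrubber": object()}
    app.state.ready = True
    try:
        yield
    finally:
        # Shutdown: release agents and mark the app as not ready.
        app.state.agents.clear()
        app.state.ready = False


async def demo():
    # Stand-in for a FastAPI instance (hypothetical, for illustration only).
    app = SimpleNamespace(state=SimpleNamespace(agents={}, ready=False))
    async with lifespan(app):
        during = (app.state.ready, sorted(app.state.agents))
    after = (app.state.ready, sorted(app.state.agents))
    return during, after


during, after = asyncio.run(demo())
```

In a real application a function of this shape would typically be passed to the framework as `FastAPI(lifespan=lifespan)`.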
Configuration (config_settings.py)
- Pydantic-based settings with validation
- Environment variable loading
- Database and Redis URL configuration
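As a rough illustration of fail-fast configuration loading, the sketch below reads the four environment variables used in this guide and validates them at startup. It deliberately uses a standard-library dataclass rather than Pydantic to stay dependency-free, so it is a stand-in for the idea and not the contents of config_settings.py; `Settings` and `load_settings` are hypothetical names.

```python
import os
from dataclasses import dataclass
from urllib.parse import urlparse

REQUIRED = ["DATABASE_URL", "REDIS_URL", "SECRET_KEY", "JWT_SECRET_KEY"]


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    secret_key: str
    jwt_secret_key: str


def load_settings(env=None):
    """Read required settings from a mapping (os.environ by default) and validate them."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))

    # Check URL schemes so a misconfiguration fails at startup, not on first query.
    # A driver suffix such as postgresql+asyncpg is tolerated.
    db_scheme = urlparse(env["DATABASE_URL"]).scheme.split("+")[0]
    if db_scheme not in ("postgresql", "postgres"):
        raise ValueError("DATABASE_URL must use a postgresql:// scheme")
    if urlparse(env["REDIS_URL"]).scheme not in ("redis", "rediss"):
        raise ValueError("REDIS_URL must use redis:// or rediss://")

    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        secret_key=env["SECRET_KEY"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
    )
```

Calling `load_settings()` with no argument reads from the process environment; passing a dict makes the function easy to exercise in tests.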
Inference Service (inference_service.py)
- Async text summarization using thread pools
- Model caching for performance
- Chunking for long text processing
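The three ideas above (thread-pool offloading, model caching, chunking) can be sketched together as follows. This is an illustration under stated assumptions, not the code in inference_service.py: the function names, the pool size, the word-based chunking, and the placeholder "model" that keeps the first sentence are all hypothetical.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# Pool size is an arbitrary illustrative value.
_executor = ThreadPoolExecutor(max_workers=4)


@lru_cache(maxsize=2)
def get_model(name):
    """Build a model once per name; later calls reuse the cached object."""
    def summarize(text):
        # Placeholder summarizer: keep the first sentence instead of running a real model.
        return text.split(".")[0].strip() + "."
    return summarize


def chunk_text(text, max_words=200):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


async def summarize_async(text, model_name="demo-model", max_words=200):
    """Summarize each chunk in the thread pool so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    model = get_model(model_name)
    chunks = chunk_text(text, max_words)
    parts = await asyncio.gather(
        *(loop.run_in_executor(_executor, model, chunk) for chunk in chunks)
    )
    return " ".join(parts)
```

Running blocking model calls through `run_in_executor` keeps other requests moving while a summary is computed; the cache avoids reloading a model on every request.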
PHI Scrubber Service (phi_scrubber_service.py)
- Regex-based PHI detection and redaction
- Audit logging to PostgreSQL
- Redis-based statistics tracking
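For readers unfamiliar with regex-based redaction, a minimal sketch is shown here. The three patterns are illustrative assumptions and are not the patterns in phi_scrubber_service.py; real PHI detection needs a far broader rule set (names, dates, addresses, record numbers), and audit logging and Redis statistics are omitted. The result keys mirror the /phi/scrub response shown under API Usage.

```python
import re
from collections import Counter

# Illustrative patterns only (assumed for this sketch).
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def scrub_phi(text):
    """Replace every pattern match with [REDACTED] and report what was found."""
    counts = Counter()
    for label, pattern in PHI_PATTERNS.items():
        # subn returns the rewritten text and the number of substitutions made.
        text, n = pattern.subn("[REDACTED]", text)
        if n:
            counts[label] += n
    return {
        "scrubbed_text": text,
        "phi_found": sorted(counts),
        "redaction_count": sum(counts.values()),
    }
```

Counting substitutions per label is what would feed a statistics store or an audit row in a fuller implementation.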
API Routes (api/routes_fastapi.py)
- FastAPI APIRouter with async endpoints
- Health checks (/live, /ready)
- Placeholder routes for full migration
Data Flow
Client Request → FastAPI → Route Handler → Agent/Service → Redis Cache → PostgreSQL → Response
Development Setup
Prerequisites
- Python 3.10+
- PostgreSQL 13+
- Redis 6+
- Docker (optional)
Local Development
Clone and Set Up Virtual Environment
    git clone <repository>
    cd hntai
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
Install Dependencies
    pip install -r requirements.txt
Set Up Database and Redis
    # Start PostgreSQL (using Docker)
    docker run -d --name postgres -e POSTGRES_PASSWORD=password -p 5432:5432 postgres:13
    # Start Redis (using Docker)
    docker run -d --name redis -p 6379:6379 redis:6
    # Create database (connects to the Docker container as the postgres user)
    createdb -h localhost -U postgres medical_ai
Environment Variables
Create a .env file:
    DATABASE_URL=postgresql://postgres:password@localhost:5432/medical_ai
    REDIS_URL=redis://localhost:6379/0
    SECRET_KEY=your-secret-key-here
    JWT_SECRET_KEY=your-jwt-secret-key-here
Run Database Migrations
    # Apply schema
    psql -h localhost -U postgres -d medical_ai -f database/postgresql/001_schema.sql
Run the Application
    # Development mode
    python -m ai_med_extract.main
    # Or directly (create_app is an application factory, so pass --factory)
    uvicorn ai_med_extract.app:create_app --factory --reload --host 0.0.0.0 --port 7860
Access the Application
- API: http://localhost:7860
- Docs: http://localhost:7860/docs (FastAPI auto-generated)
- Health: http://localhost:7860/live
Debugging
Enable Debug Logging
    import logging
    logging.basicConfig(level=logging.DEBUG)
Use Verbose Server Logging
    # Recent Uvicorn releases no longer accept --debug; use --log-level debug instead
    uvicorn ai_med_extract.app:create_app --factory --reload --log-level debug --host 0.0.0.0 --port 7860
Test Endpoints
    # Health check
    curl http://localhost:7860/live
    # API docs
    curl http://localhost:7860/openapi.json
Database Debugging
    # Connect to PostgreSQL
    psql -h localhost -U postgres -d medical_ai
    -- At the psql prompt: check PHI audit logs
    SELECT * FROM phi_audit_log LIMIT 10;
Redis Debugging
    # Connect to Redis CLI
    redis-cli
    # At the redis-cli prompt: check keys
    KEYS *
Production Deployment
Option 1: Docker Deployment
Build Docker Image
    docker build -t hntai-api .
Run Container
    docker run -d \
      --name hntai-api \
      -p 7860:7860 \
      -e DATABASE_URL=postgresql://... \
      -e REDIS_URL=redis://... \
      -e SECRET_KEY=... \
      -e JWT_SECRET_KEY=... \
      hntai-api
Option 2: Kubernetes Deployment
Prerequisites
- Kubernetes cluster
- kubectl configured
- PostgreSQL and Redis services running
Create Secrets
    kubectl create secret generic medical-ai-secrets \
      --from-literal=DATABASE_URL=postgresql://... \
      --from-literal=REDIS_URL=redis://... \
      --from-literal=SECRET_KEY=... \
      --from-literal=JWT_SECRET_KEY=...
Deploy to Kubernetes
    kubectl apply -f infra/k8s/secure_deployment.yaml
Verify Deployment
    kubectl get pods -n medical-ai
    kubectl logs -n medical-ai deployment/medical-ai-service
Option 3: Hugging Face Spaces (Legacy)
The application still supports HF Spaces deployment for lightweight use cases.
- Update app.py for HF Spaces compatibility
- Deploy via HF Spaces with Docker SDK
Monitoring and Observability
Prometheus Metrics
The application exposes metrics at the /metrics endpoint.
Set Up Prometheus
    kubectl apply -f monitoring/prometheus.yml
Access Metrics
    curl http://ai-service.medical-ai.svc.cluster.local:80/metrics
Health Checks
- Liveness (/live): Basic health check
- Readiness (/ready): Checks if agents are initialized
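The difference between the two probes can be sketched as plain functions that return a status code and a body. The response bodies, the 503 status for "not ready", and the shape of the `agents` mapping are assumptions for illustration; the actual handlers in api/routes_fastapi.py may differ.

```python
def live():
    """Liveness: the process is up and able to answer."""
    return 200, {"status": "alive"}


def ready(agents):
    """Readiness: report 503 until every expected agent is initialized.

    `agents` maps an agent name to its instance, or to None while it is still loading.
    """
    missing = sorted(name for name, agent in agents.items() if agent is None)
    if missing:
        return 503, {"status": "not_ready", "missing": missing}
    return 200, {"status": "ready"}
```

Keeping liveness trivial and putting dependency checks only in readiness means an orchestrator stops routing traffic to a pod that is still loading models without restarting it.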
Logging
- Structured JSON logging
- PHI operations logged to database
- Error tracking with stack traces
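One way to get structured JSON logs with stack traces using only the standard library is a custom formatter, sketched here. The field names (`level`, `logger`, `message`, `stack_trace`) and the logger name are assumptions, not the project's actual log schema, and writing PHI operations to the database is not covered.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the formatted traceback when logging from an except block.
            payload["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name, stream):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


# Usage: log to an in-memory buffer (a real service would pass sys.stdout).
buffer = io.StringIO()
logger = build_logger("hntai.demo", buffer)
logger.info("scrub complete")
```

Because `json.dumps` escapes newlines, a multi-line traceback still occupies exactly one log line, which keeps log shippers simple.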
Security Features
HIPAA Compliance
- PHI scrubbing with audit trails
- Non-root container execution
- Secrets management via Kubernetes
- Network policies restricting traffic
Authentication
- JWT-based authentication (framework ready)
- API key support (configurable)
API Usage
Health Endpoints
GET /live
GET /ready
PHI Scrubbing
POST /phi/scrub
Content-Type: application/json
{
"text": "Patient John Doe, SSN 123-45-6789, diagnosed with diabetes."
}
Response:
{
"scrubbed_text": "Patient [REDACTED], SSN [REDACTED], diagnosed with diabetes.",
"phi_found": ["NAME", "SSN"],
"redaction_count": 2
}
Text Summarization
POST /api/generate_summary
Content-Type: application/json
{
"text": "Long medical text...",
"max_length": 150,
"min_length": 50
}
Generate Patient Summary
The generate_patient_summary endpoint has been migrated from the original Flask implementation to FastAPI. It generates a comprehensive 4-section patient summary from EHR data, with support for streaming (SSE) to handle long-running tasks and prevent timeouts.
Endpoint: POST /generate_patient_summary
Query Parameters:
- stream (optional, default: false): Set to true for Server-Sent Events (SSE) streaming updates.
Request Body (JSON):
{
"patientid": "12345",
"token": "your-auth-token",
"key": "your-api-key",
"patient_summarizer_model_name": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
"patient_summarizer_model_type": "gguf",
"generation_mode": "hq",
"timeout_mode": "fast"
}
Field Options:
- generation_mode: "hq" (high-quality), "fast", or "rule" (deterministic)
- timeout_mode: "fast" (8s EHR timeout) or "extended" (30s)
Synchronous Response (when stream=false):
{
"summary": "## Clinical Assessment\n- Patient details...\n\n## Key Trends & Changes\n- Changes detected...\n\n## Plan & Suggested Actions\n- Recommendations...\n\n## Direct Guidance for Physician\n- Clinical insights...",
"baseline": "Patient baseline data...",
"delta": "Changes from previous visits...",
"timing": {"ehr_api": 2.5, "generation": 15.3, "total": 17.8},
"model_used": "microsoft/Phi-3-mini-4k-instruct (gguf)",
"timeout_mode_used": "fast"
}
Streaming Response (when stream=true):
- Returns a text/event-stream response with SSE events:
  - type: progress - Progress updates (e.g., 10%, 50%)
  - type: complete - Final result with full summary
  - type: error - Error details if failed
  - type: heartbeat - Keep-alive signals
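A client consuming the stream needs to split it into events and react to each type. The sketch below assumes each event carries a JSON object in its `data:` field with a `type` key; that wire format, and the `progress`, `summary`, and `message` field names, are assumptions and should be checked against the real endpoint.

```python
import json


def parse_sse(stream_text):
    """Return one decoded JSON payload per SSE event (events are separated by blank lines)."""
    events = []
    for block in stream_text.replace("\r\n", "\n").split("\n\n"):
        data_lines = []
        for line in block.split("\n"):
            if line.startswith("data:"):
                value = line[5:]
                # SSE allows a single optional space after the colon.
                if value.startswith(" "):
                    value = value[1:]
                data_lines.append(value)
        if data_lines:
            events.append(json.loads("\n".join(data_lines)))
    return events


def final_summary(events):
    """Return the 'complete' event, raise on 'error', ignore progress and heartbeat."""
    for event in events:
        if event["type"] == "error":
            raise RuntimeError(event.get("message", "summary generation failed"))
        if event["type"] == "complete":
            return event
    return None
```

Heartbeat events carry no result, so a client can use them purely to reset its idle timer.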
Notes:
- The endpoint integrates with an external EHR API to fetch patient data.
- Supports multiple model types: GGUF, text-generation, summarization, seq2seq.
- Includes fallbacks for timeouts, API errors, and model failures.
- PHI scrubbing is applied automatically.
- Full implementation includes delta computation, baseline building, and 4-section markdown output.
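The timeout fallback mentioned in these notes could look roughly like the following. Only the 8-second and 30-second limits come from the timeout_mode options in this guide; the function name, the shape of the returned dictionary, and the idea of returning a fallback marker instead of raising are assumptions for illustration.

```python
import asyncio

# Limits in seconds, taken from the timeout_mode options ("fast" and "extended").
EHR_TIMEOUTS = {"fast": 8.0, "extended": 30.0}


async def fetch_with_fallback(fetch, timeout_mode="fast", timeouts=EHR_TIMEOUTS):
    """Await fetch() within the mode's limit; return a fallback marker on timeout or error.

    `fetch` is any zero-argument coroutine function standing in for the EHR API call.
    """
    try:
        data = await asyncio.wait_for(fetch(), timeout=timeouts[timeout_mode])
        return {"source": "ehr", "data": data}
    except asyncio.TimeoutError:
        return {"source": "fallback", "reason": "ehr_timeout"}
    except Exception as exc:
        # Any other failure (connection refused, bad response) also falls back.
        return {"source": "fallback", "reason": f"ehr_error: {exc}"}
```

A caller can inspect `source` to decide whether to run the model on fresh EHR data or switch to a deterministic path such as the "rule" generation mode.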
Other Endpoints (Migration in Progress)
- POST /upload - File upload and text extraction
- POST /transcribe - Audio transcription
- POST /extract_medical_data - Structured medical data extraction
- POST /api/extract_medical_data_from_audio - Audio-based medical extraction
Troubleshooting
Common Issues
Model Loading Failures
- Check HF_HOME and cache directories
- Ensure sufficient memory
- Verify internet connectivity for model downloads
Database Connection Errors
- Verify DATABASE_URL format
- Check PostgreSQL service status
- Ensure database exists and schema applied
Redis Connection Issues
- Verify REDIS_URL format
- Check Redis service availability
- Monitor Redis memory usage
PHI Scrubbing Not Working
- Check regex patterns in phi_scrubber_service.py
- Verify Redis connection for stats
- Check database audit logs
Performance Tuning
- Adjust thread pools in inference_service.py
- Configure Redis connection pooling
- Set appropriate resource limits in K8s
- Monitor memory usage for model caching
Contributing
- Follow async/await patterns for new endpoints
- Add proper error handling and logging
- Update tests for new functionality
- Ensure HIPAA compliance for PHI handling
- Document API changes in this guide