# HNTAI - Scalable Medical Data Extraction API - Development Guide

## Overview

This FastAPI application provides scalable medical data extraction services and follows the "ChatGPT Version 3 - Scalable" architecture. It combines async request processing, Redis caching, PostgreSQL persistence, and PHI-focused security controls (scrubbing, audit logging, and secrets management).

## Architecture

### Core Components

1. **FastAPI Application** (`app.py`)
   - Main application factory with lifespan events
   - CORS middleware for cross-origin requests
   - Centralized agent initialization
   - Route registration from APIRouter

2. **Configuration** (`config_settings.py`)
   - Pydantic-based settings with validation
   - Environment variable loading
   - Database and Redis URL configuration

3. **Inference Service** (`inference_service.py`)
   - Async text summarization using thread pools
   - Model caching for performance
   - Chunking for long text processing

4. **PHI Scrubber Service** (`phi_scrubber_service.py`)
   - Regex-based PHI detection and redaction
   - Audit logging to PostgreSQL
   - Redis-based statistics tracking

5. **API Routes** (`api/routes_fastapi.py`)
   - FastAPI APIRouter with async endpoints
   - Health checks (/live, /ready)
   - Placeholder routes for full migration
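
To make the Inference Service's approach concrete, here is a minimal sketch of chunking long text and running a blocking summarizer in a thread pool so the event loop stays free. The names `chunk_text`, `summarize_chunk`, and `summarize` are hypothetical, and `summarize_chunk` is a stand-in for a real model call; the actual code in `inference_service.py` may differ.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def chunk_text(text: str, max_words: int = 200) -> list[str]:
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_chunk(chunk: str) -> str:
    """Hypothetical stand-in for a blocking model call: keeps the first 5 words."""
    return " ".join(chunk.split()[:5])


async def summarize(text: str, executor: ThreadPoolExecutor, max_words: int = 200) -> str:
    """Summarize each chunk in a worker thread and join the partial summaries."""
    loop = asyncio.get_running_loop()
    tasks = [
        loop.run_in_executor(executor, summarize_chunk, chunk)
        for chunk in chunk_text(text, max_words)
    ]
    parts = await asyncio.gather(*tasks)  # results keep the original chunk order
    return " ".join(parts)
```

The pool size passed to `ThreadPoolExecutor(max_workers=...)` bounds how many chunks are processed concurrently, which is likely the main knob when tuning throughput against memory.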

### Data Flow

```
Client Request → FastAPI → Route Handler → Agent/Service → Redis Cache → PostgreSQL → Response
```
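
One plausible reading of this flow is a cache-aside pattern: check the cache first, and on a miss compute the result, persist it, then cache it. The sketch below assumes that reading and uses in-memory dicts as stand-ins for Redis and PostgreSQL; `handle_request` and the key format are hypothetical.

```python
import json
from typing import Callable

cache: dict[str, str] = {}      # stand-in for Redis (stores serialized JSON)
database: dict[str, dict] = {}  # stand-in for PostgreSQL


def handle_request(key: str, compute: Callable[[], dict]) -> dict:
    """Return a cached result if present; otherwise compute, persist, and cache it."""
    if key in cache:
        return json.loads(cache[key])
    result = compute()
    database[key] = result           # persist the fresh result
    cache[key] = json.dumps(result)  # cache it for later requests
    return result
```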

## Development Setup

### Prerequisites

- Python 3.10+
- PostgreSQL 13+
- Redis 6+
- Docker (optional)

### Local Development

1. **Clone and Setup Virtual Environment**
   ```bash
   git clone <repository>
   cd hntai
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

2. **Install Dependencies**
   ```bash
   pip install -r requirements.txt
   ```

3. **Setup Database and Redis**
   ```bash
   # Start PostgreSQL (using Docker)
   docker run -d --name postgres -e POSTGRES_PASSWORD=password -p 5432:5432 postgres:13

   # Start Redis (using Docker)
   docker run -d --name redis -p 6379:6379 redis:6

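   # Create database (runs createdb inside the PostgreSQL container started above;
   # wait a few seconds after startup if the server is not yet accepting connections)
   docker exec postgres createdb -U postgres medical_ai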
   ```

4. **Environment Variables**
   Create `.env` file:
   ```bash
   DATABASE_URL=postgresql://postgres:password@localhost:5432/medical_ai
   REDIS_URL=redis://localhost:6379/0
   SECRET_KEY=your-secret-key-here
   JWT_SECRET_KEY=your-jwt-secret-key-here
   ```

5. **Run Database Migrations**
   ```bash
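   # Apply schema (connects to the Dockerized PostgreSQL from step 3; prompts for the password)
   psql -h localhost -U postgres -d medical_ai -f database/postgresql/001_schema.sql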
   ```

6. **Run the Application**
   ```bash
   # Development mode
   python -m ai_med_extract.main

   # Or directly (create_app is an application factory, so pass --factory)
   uvicorn ai_med_extract.app:create_app --factory --reload --host 0.0.0.0 --port 7860
   ```

7. **Access the Application**
   - API: http://localhost:7860
   - Docs: http://localhost:7860/docs (FastAPI auto-generated)
   - Health: http://localhost:7860/live
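
Before starting the server, it can help to confirm that the variables from step 4 are actually set. The project uses Pydantic-based settings in `config_settings.py`; the sketch below is a standard-library stand-in with hypothetical names (`Settings`, `load_settings`), and it reads from the process environment only, so it does not parse the `.env` file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("DATABASE_URL", "REDIS_URL", "SECRET_KEY", "JWT_SECRET_KEY")


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    secret_key: str
    jwt_secret_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env, failing fast with the names of missing variables."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return Settings(**{name.lower(): env[name] for name in REQUIRED})
```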

### Debugging

1. **Enable Debug Logging**
   ```python
   import logging
   logging.basicConfig(level=logging.DEBUG)
   ```

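2. **Run Uvicorn with Debug Logging**
   ```bash
   # Recent Uvicorn releases no longer accept --debug; use --log-level debug with --reload
   uvicorn ai_med_extract.app:create_app --factory --reload --log-level debug --host 0.0.0.0 --port 7860
   ```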

3. **Test Endpoints**
   ```bash
   # Health check
   curl http://localhost:7860/live

   # API docs
   curl http://localhost:7860/openapi.json
   ```

4. **Database Debugging**
   ```bash
   # Connect to PostgreSQL
   psql -h localhost -U postgres -d medical_ai

   # Check PHI audit logs
   psql -h localhost -U postgres -d medical_ai -c "SELECT * FROM phi_audit_log LIMIT 10;"
   ```

5. **Redis Debugging**
   ```bash
   # Connect to Redis CLI
   redis-cli

   # List keys incrementally (SCAN-based; avoids blocking the server the way KEYS * can)
   redis-cli --scan
   ```
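
The endpoint checks above can also be scripted. This is a small smoke-test sketch using only the standard library; `http_status` and `smoke_test` are hypothetical helpers, and the default paths are the ones listed in this guide.

```python
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request (raises on connection errors)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def smoke_test(
    base_url: str,
    paths: tuple[str, ...] = ("/live", "/ready", "/openapi.json"),
    fetch: Callable[[str], int] = http_status,
) -> dict[str, bool]:
    """Map each path to True if it answered with HTTP 200, else False."""
    results = {}
    for path in paths:
        try:
            results[path] = fetch(base_url.rstrip("/") + path) == 200
        except OSError:  # URLError, HTTPError, and timeouts are OSError subclasses
            results[path] = False
    return results
```

Calling `smoke_test("http://localhost:7860")` against a running instance returns one boolean per path; the `fetch` parameter exists so the function can be exercised without a live server.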

## Production Deployment

### Option 1: Docker Deployment

1. **Build Docker Image**
   ```bash
   docker build -t hntai-api .
   ```

2. **Run Container**
   ```bash
   docker run -d \
     --name hntai-api \
     -p 7860:7860 \
     -e DATABASE_URL=postgresql://... \
     -e REDIS_URL=redis://... \
     -e SECRET_KEY=... \
     -e JWT_SECRET_KEY=... \
     hntai-api
   ```

### Option 2: Kubernetes Deployment

1. **Prerequisites**
   - Kubernetes cluster
   - kubectl configured
   - PostgreSQL and Redis services running

2. **Create Secrets**
   ```bash
   kubectl create secret generic medical-ai-secrets \
     --from-literal=DATABASE_URL=postgresql://... \
     --from-literal=REDIS_URL=redis://... \
     --from-literal=SECRET_KEY=... \
     --from-literal=JWT_SECRET_KEY=...
   ```

3. **Deploy to Kubernetes**
   ```bash
   kubectl apply -f infra/k8s/secure_deployment.yaml
   ```

4. **Verify Deployment**
   ```bash
   kubectl get pods -n medical-ai
   kubectl logs -n medical-ai deployment/medical-ai-service
   ```

### Option 3: Hugging Face Spaces (Legacy)

The application still supports HF Spaces deployment for lightweight use cases.

1. **Update app.py** for HF Spaces compatibility
2. **Deploy via HF Spaces** with Docker SDK

## Monitoring and Observability

### Prometheus Metrics

The application exposes metrics at the `/metrics` endpoint.

1. **Setup Prometheus**
   ```bash
   kubectl apply -f monitoring/prometheus.yml
   ```

2. **Access Metrics**
   ```bash
   # Run from a pod inside the cluster; the service DNS name does not resolve externally
   curl http://ai-service.medical-ai.svc.cluster.local:80/metrics
   ```

### Health Checks

- **Liveness** (`/live`): Basic health check
- **Readiness** (`/ready`): Checks if agents are initialized
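
The difference between the two probes can be sketched in framework-agnostic Python: liveness always succeeds while the process is up, and readiness fails until every agent is initialized. The response bodies, the 503 status, and the agent names here are assumptions, not the application's actual payloads.

```python
from typing import Mapping, Optional


def live() -> tuple[int, dict]:
    """Liveness: the process is running and able to answer."""
    return 200, {"status": "alive"}


def ready(agents: Mapping[str, Optional[object]]) -> tuple[int, dict]:
    """Readiness: succeed only when no agent is still uninitialized (None)."""
    pending = sorted(name for name, agent in agents.items() if agent is None)
    if pending:
        return 503, {"status": "not ready", "pending": pending}
    return 200, {"status": "ready"}
```

Returning a non-2xx status from the readiness probe is what lets an orchestrator such as Kubernetes hold traffic back until models have finished loading.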

### Logging

- Structured JSON logging
- PHI operations logged to database
- Error tracking with stack traces
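
As an illustration of structured JSON logging with stack traces, the following sketch uses only the standard `logging` module. The field names (`timestamp`, `level`, `logger`, `message`, `stack_trace`) are assumptions; the application's real log schema may differ.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the formatted traceback when an exception is attached
            payload["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level: int = logging.INFO) -> None:
    """Route root-logger output through JsonFormatter, replacing existing handlers."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logging.basicConfig(level=level, handlers=[handler], force=True)
```

After `configure_logging()`, a call such as `logging.getLogger("hntai").exception("failed")` inside an `except` block emits one JSON line with the traceback in `stack_trace`. Messages should be scrubbed before logging so PHI never reaches log output.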

## Security Features

### HIPAA Compliance

- PHI scrubbing with audit trails
- Non-root container execution
- Secrets management via Kubernetes
- Network policies restricting traffic

### Authentication

- JWT-based authentication (framework ready)
- API key support (configurable)
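
For API key support, one detail worth getting right is a constant-time comparison, so response timing does not reveal how much of a key matched. This is a minimal sketch with a hypothetical helper name; how the application stores and looks up keys is not covered here.

```python
import hmac


def api_key_is_valid(presented: str, expected: str) -> bool:
    """Compare API keys in constant time; an empty expected key never matches."""
    if not expected:
        return False
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```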

## API Usage

### Health Endpoints

```http
GET /live
GET /ready
```

### PHI Scrubbing

```http
POST /phi/scrub
Content-Type: application/json

{
  "text": "Patient John Doe, SSN 123-45-6789, diagnosed with diabetes."
}
```

Response:
```json
{
  "scrubbed_text": "Patient [REDACTED], SSN [REDACTED], diagnosed with diabetes.",
  "phi_found": ["NAME", "SSN"],
  "redaction_count": 2
}
```
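
The response shape above can be reproduced with a toy regex-based scrubber. The two patterns below are illustrative assumptions only (the name pattern matches just a capitalized first and last name after the word "Patient"); the real patterns live in `phi_scrubber_service.py` and need far broader coverage.

```python
import re

# Illustrative patterns only; not the service's actual rules.
PHI_PATTERNS = {
    "NAME": re.compile(r"(?<=Patient )[A-Z][a-z]+ [A-Z][a-z]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scrub(text: str) -> dict:
    """Replace each pattern match with [REDACTED] and report what was found."""
    found = []
    count = 0
    for label, pattern in PHI_PATTERNS.items():
        text, n = pattern.subn("[REDACTED]", text)
        if n:
            found.append(label)
            count += n
    return {"scrubbed_text": text, "phi_found": found, "redaction_count": count}
```

Running `scrub` on the request text from this section yields the same three fields as the documented response, with both the name and the SSN redacted.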

### Text Summarization

```http
POST /api/generate_summary
Content-Type: application/json

{
  "text": "Long medical text...",
  "max_length": 150,
  "min_length": 50
}
```
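
From Python, the same request can be built with the standard library. `build_summary_request` is a hypothetical helper that constructs the request without sending it; the response schema for this endpoint is not documented in this guide, so the sketch stops before parsing a reply.

```python
import json
import urllib.request


def build_summary_request(
    base_url: str, text: str, max_length: int = 150, min_length: int = 50
) -> urllib.request.Request:
    """Build a JSON POST request for the summarization endpoint."""
    payload = {"text": text, "max_length": max_length, "min_length": min_length}
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate_summary",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(request, timeout=...)` and choose a timeout that allows for model inference time.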

### Generate Patient Summary

The `generate_patient_summary` endpoint has been migrated from the original Flask implementation to FastAPI. It generates a comprehensive 4-section patient summary from EHR data, with support for streaming (SSE) to handle long-running tasks and prevent timeouts.

**Endpoint**: `POST /generate_patient_summary`

**Query Parameters**:
- `stream` (optional, default: `false`): Set to `true` for Server-Sent Events (SSE) streaming updates.

**Request Body** (JSON):
```json
{
  "patientid": "12345",
  "token": "your-auth-token",
  "key": "your-api-key",
  "patient_summarizer_model_name": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
  "patient_summarizer_model_type": "gguf",
  "generation_mode": "hq",
  "timeout_mode": "fast"
}
```

- `generation_mode`: `"hq"` (high-quality), `"fast"`, or `"rule"` (deterministic).
- `timeout_mode`: `"fast"` (8s EHR timeout) or `"extended"` (30s).

**Synchronous Response** (when `stream=false`):
```json
{
  "summary": "## Clinical Assessment\n- Patient details...\n\n## Key Trends & Changes\n- Changes detected...\n\n## Plan & Suggested Actions\n- Recommendations...\n\n## Direct Guidance for Physician\n- Clinical insights...",
  "baseline": "Patient baseline data...",
  "delta": "Changes from previous visits...",
  "timing": {"ehr_api": 2.5, "generation": 15.3, "total": 17.8},
  "model_used": "microsoft/Phi-3-mini-4k-instruct (gguf)",
  "timeout_mode_used": "fast"
}
```

**Streaming Response** (when `stream=true`):
- Returns a `text/event-stream` response with SSE events:
  - `type: progress` - Progress updates (e.g., 10%, 50%)
  - `type: complete` - Final result with full summary
  - `type: error` - Error details if failed
  - `type: heartbeat` - Keep-alive signals
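
A client consuming the stream might look like the sketch below. It assumes each SSE event carries a JSON object on its `data:` line with a `type` field matching the list above; that wire format and the helper names are assumptions, so check the actual stream before relying on them.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each SSE event; a blank line ends an event."""
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            data_lines.append(line[5:].removeprefix(" "))
        elif line == "" and data_lines:
            yield json.loads("\n".join(data_lines))
            data_lines = []


def wait_for_summary(lines: Iterable[str]) -> dict:
    """Consume events until completion; progress and heartbeat events are skipped."""
    for event in iter_sse_events(lines):
        kind = event.get("type")
        if kind == "complete":
            return event
        if kind == "error":
            raise RuntimeError(f"summary failed: {event}")
    raise RuntimeError("stream ended before a complete event")
```

`lines` can be any iterable of decoded text lines, for example the lines of a streaming HTTP response body.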

**Notes**:
- The endpoint integrates with an external EHR API to fetch patient data.
- Supports multiple model types: GGUF, text-generation, summarization, seq2seq.
- Includes fallbacks for timeouts, API errors, and model failures.
- PHI scrubbing is applied automatically.
- Full implementation includes delta computation, baseline building, and 4-section markdown output.

### Other Endpoints (Migration in Progress)
- `POST /upload` - File upload and text extraction
- `POST /transcribe` - Audio transcription
- `POST /extract_medical_data` - Structured medical data extraction
- `POST /api/extract_medical_data_from_audio` - Audio-based medical extraction

## Troubleshooting

### Common Issues

1. **Model Loading Failures**
   - Check HF_HOME and cache directories
   - Ensure sufficient memory
   - Verify internet connectivity for model downloads

2. **Database Connection Errors**
   - Verify DATABASE_URL format
   - Check PostgreSQL service status
   - Ensure the database exists and the schema has been applied

3. **Redis Connection Issues**
   - Verify REDIS_URL format
   - Check Redis service availability
   - Monitor Redis memory usage

4. **PHI Scrubbing Not Working**
   - Check regex patterns in phi_scrubber_service.py
   - Verify Redis connection for stats
   - Check database audit logs
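
For the "verify URL format" steps, a quick structural check can rule out typos before looking at the services themselves. `check_service_url` is a hypothetical helper; it inspects the URL only and does not attempt a connection.

```python
from urllib.parse import urlsplit


def check_service_url(
    url: str, expected_schemes: tuple[str, ...], require_path: bool = False
) -> list[str]:
    """Return a list of format problems found in a connection URL (empty if none)."""
    try:
        parts = urlsplit(url)
        _ = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError as exc:
        return [f"unparseable URL: {exc}"]
    problems = []
    if parts.scheme not in expected_schemes:
        problems.append(f"scheme {parts.scheme!r} not in {expected_schemes}")
    if not parts.hostname:
        problems.append("missing host")
    if require_path and not parts.path.strip("/"):
        problems.append("missing database name in path")
    return problems
```

For example, `check_service_url(os.environ["DATABASE_URL"], ("postgresql", "postgres"), require_path=True)` should return an empty list for the value shown in the setup section.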

### Performance Tuning

- Adjust thread pools in inference_service.py
- Configure Redis connection pooling
- Set appropriate resource limits in K8s
- Monitor memory usage for model caching

## Contributing

1. Follow async/await patterns for new endpoints
2. Add proper error handling and logging
3. Update tests for new functionality
4. Ensure HIPAA compliance for PHI handling
5. Document API changes in this guide