# AI Service (ai_med_extract)

Medical AI service for data extraction, PHI scrubbing, and patient summary generation.

## 📋 Table of Contents
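- [Quick Start](#quick-start)
- [Local Development](#local-development)
- [Docker Deployment](#docker-deployment)
- [Environment Variables](#environment-variables)
- [API Endpoints](#api-endpoints)
- [Testing](#testing)
- [Project Structure](#project-structure)
- [Kubernetes Deployment](#kubernetes-deployment)
- [Notes](#notes)
- [Additional Documentation](#additional-documentation)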

---

## Quick Start

### Prerequisites
- Python 3.10+
- Docker & Docker Compose (for containerized deployment)
- Optional: CUDA 11.8+ for GPU support

### Quick Development Server

```powershell
# From services/ai-service directory
cd src
python -m ai_med_extract.app run_dev
```

This runs Flask's built-in development server on port 7860.

### Smoke Test (No Model Loading)

```powershell
# From services/ai-service directory
python run_smoke_test.py
```

---

## Local Development

### Option 1: Development Server (Fast Iteration)

```powershell
cd .\services\ai-service\src
python -m ai_med_extract.app run_dev
```

### Option 2: WSGI/Gunicorn (Production-like)

```powershell
cd .\services\ai-service\src
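# Gunicorn needs a Unix-like environment (Linux, macOS, or WSL); it does not run on native Windows
pip install gunicorn
$env:PRELOAD_SMALL_MODELS="false"  # skip loading small models at startup
gunicorn -w 4 -b 0.0.0.0:7860 wsgi:app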
```

### Using PowerShell Script

```powershell
cd .\services\ai-service
.\run_local.ps1        # Run without rebuilding
.\run_local.ps1 -Build # Build and run
```

---

## Docker Deployment

### Build Image

```powershell
cd .\services\ai-service
docker build -f Dockerfile.prod -t ai-service:local .
```

### Run Container

```powershell
docker run --rm -p 7860:7860 `
  -e PRELOAD_SMALL_MODELS=false `
  -e HF_HOME=/tmp/huggingface `
  -e TORCH_HOME=/tmp/torch_cache `
  ai-service:local
```

### Docker Compose

```powershell
cd .\services\ai-service
docker-compose up --build     # Build and run
docker-compose logs -f        # Follow logs
```

### Push to Registry

```powershell
docker tag ai-service:local your-registry/ai-service:latest
docker push your-registry/ai-service:latest
```

---

## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `HF_SPACES` | Signals HF Spaces environment | `false` |
| `PRELOAD_GGUF` | Enable GGUF model preloading | `false` |
| `PRELOAD_SMALL_MODELS` | Load small models at startup | `false` |
| `HF_HOME` | Hugging Face cache directory | `/tmp/huggingface` |
| `TORCH_HOME` | PyTorch cache directory | `/tmp/torch` |
| `WHISPER_CACHE` | Whisper model cache | `/tmp/whisper` |
| `DATABASE_URL` | PostgreSQL connection string | Required for production |
| `REDIS_URL` | Redis connection string | Required for production |
| `SECRET_KEY` | Application secret key | Required |
| `JWT_SECRET_KEY` | JWT signing key | Required |

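The table above can be mirrored in code. The following is a minimal sketch, not the service's actual configuration loader: `ServiceSettings`, `load_settings`, and `env_flag` are hypothetical names, and the accepted truthy spellings are an assumption. `DATABASE_URL` and `REDIS_URL` are treated as optional because the table only requires them in production.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed spellings for "true"; the service may accept a different set.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    raw = env.get(name)
    return default if raw is None else raw.strip().lower() in TRUTHY


@dataclass(frozen=True)
class ServiceSettings:  # hypothetical container, not taken from the service code
    hf_spaces: bool
    preload_gguf: bool
    preload_small_models: bool
    hf_home: str
    torch_home: str
    whisper_cache: str
    database_url: Optional[str]
    redis_url: Optional[str]
    secret_key: str
    jwt_secret_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> ServiceSettings:
    """Read the documented variables, applying the defaults from the table."""
    missing = [k for k in ("SECRET_KEY", "JWT_SECRET_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return ServiceSettings(
        hf_spaces=env_flag(env, "HF_SPACES"),
        preload_gguf=env_flag(env, "PRELOAD_GGUF"),
        preload_small_models=env_flag(env, "PRELOAD_SMALL_MODELS"),
        hf_home=env.get("HF_HOME", "/tmp/huggingface"),
        torch_home=env.get("TORCH_HOME", "/tmp/torch"),
        whisper_cache=env.get("WHISPER_CACHE", "/tmp/whisper"),
        database_url=env.get("DATABASE_URL"),
        redis_url=env.get("REDIS_URL"),
        secret_key=env["SECRET_KEY"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
    )
```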
---

## API Endpoints

### Health & Monitoring
- `GET /health/live` - Liveness probe
- `GET /health/ready` - Readiness probe
- `GET /metrics` - Prometheus metrics

### Document Processing
- `POST /upload` - Upload and process documents
- `POST /transcribe` - Transcribe audio files
- `GET /get_updated_medical_data` - Retrieve processed data
- `PUT /update_medical_data` - Update medical data

### AI Processing
- `POST /generate_patient_summary` - Generate comprehensive patient summaries
- `POST /api/generate_summary` - Generate text summaries
- `POST /api/patient_summary_openvino` - OpenVINO-optimized summaries
- `POST /extract_medical_data` - Extract structured medical data

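This README does not specify the request schemas, so the following is only a sketch of how a client might call one of the JSON endpoints above using the standard library. The `text` field, the JSON response body, and the helper names `build_json_post` and `post_json` are assumptions; check the `/docs` endpoint for the real schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # port used by the development server


def build_json_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given endpoint path."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and decode the response, assuming it is JSON."""
    req = build_json_post(path, payload)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical payload; the real field names may differ:
# result = post_json("/extract_medical_data", {"text": "Patient reports mild headache."})
```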
### Model Management
- `POST /api/load_model` - Load specific AI models
- `GET /api/model_info` - Get model information
- `POST /api/switch_model` - Switch between models

### Verify Endpoints

```powershell
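# In Windows PowerShell 5.1, `curl` is an alias for Invoke-WebRequest; use `curl.exe` to call the real curl
curl http://localhost:7860/health/live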
curl http://localhost:7860/health/ready
curl http://localhost:7860/metrics
```
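For scripts that need to wait until the service is up (for example, before running integration tests), the readiness probe can be polled. This is a minimal sketch that assumes `/health/ready` answers with HTTP 200 once the service is ready and with an error status or a refused connection otherwise; `probe_ok` and `wait_until_ready` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx responses (HTTPError).
        return False


def wait_until_ready(
    url: str = "http://localhost:7860/health/ready",
    attempts: int = 30,
    delay: float = 2.0,
    probe: Callable[[str], bool] = probe_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

With the defaults, `wait_until_ready()` gives the service roughly a minute to come up. The `probe` and `sleep` parameters exist so the loop can be exercised without a running server.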

---

## Testing

### Smoke Test (No Models)

```powershell
python run_smoke_test.py
```

### Unit Tests

```powershell
python -m pytest tests/
```

### Integration Tests

```powershell
python -m pytest tests/integration/
```

---

## Project Structure

```
services/ai-service/
├── src/
│   ├── ai_med_extract/
│   │   ├── agents/         # AI agents and processors
│   │   ├── api/            # FastAPI routes
│   │   ├── services/       # Business logic services
│   │   ├── utils/          # Utilities and helpers
│   │   ├── app.py          # Flask application
│   │   └── main.py         # FastAPI application
│   ├── app.py              # Application entry point
│   ├── config_settings.py  # Configuration
│   └── wsgi.py             # WSGI entry point
├── k8s/
│   └── deployment.yaml     # Kubernetes manifests
├── docker-compose.yml      # Local Docker Compose
├── Dockerfile.prod         # Production Docker image
├── run_local.ps1           # PowerShell run script
└── README.md               # This file
```

---

## Kubernetes Deployment

Apply the Kubernetes manifests:

```bash
kubectl apply -f k8s/deployment.yaml
kubectl get pods -l app=ai-service
kubectl logs -f <pod-name>
```

---

## Notes

- **Model Caching**: The Docker Compose file mounts `./model_cache` to persist models between runs
- **GPU Support**: Adjust `Dockerfile.prod` for CUDA/GPU support
- **Secrets**: Never bake secrets into images; use environment variables or mounted secrets
- **Production**: Set `PRELOAD_SMALL_MODELS=true` only if models must be loaded when the container starts; preloading lengthens startup

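Because `SECRET_KEY` and `JWT_SECRET_KEY` are required and should be injected at runtime, one way to produce values for them is Python's `secrets` module. This is a sketch only: the 32-byte length and the helper names `generate_secret` and `render_env_lines` are assumptions, not project requirements.

```python
import secrets


def generate_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from `num_bytes` random bytes."""
    return secrets.token_urlsafe(num_bytes)


def render_env_lines() -> str:
    """Render KEY=value lines for the two required secrets, each with its own value."""
    names = ("SECRET_KEY", "JWT_SECRET_KEY")
    return "\n".join(f"{name}={generate_secret()}" for name in names)


# Example: print(render_env_lines()) and store the output in your secret manager,
# not in the image or in version control.
```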
---

## Additional Documentation

- **Production Deployment**: See `PRODUCTION_READY_SUMMARY.md` in `src/ai_med_extract/`
- **Integration Guide**: See `INTEGRATION_GUIDE.md` in `src/ai_med_extract/utils/`
- **Main Project README**: See `../../README.md` for overall project documentation

---

**For detailed guides and API documentation, see the main project README and the `/docs` endpoint when the service is running.**