# AI Service (ai_med_extract)

Medical AI service for data extraction, PHI scrubbing, and patient summary generation.

## 📋 Table of Contents

- [Quick Start](#quick-start)
- [Local Development](#local-development)
- [Docker Deployment](#docker-deployment)
- [Environment Variables](#environment-variables)
- [API Endpoints](#api-endpoints)
- [Testing](#testing)
- [Project Structure](#project-structure)
- [Kubernetes Deployment](#kubernetes-deployment)
- [Notes](#notes)
- [Additional Documentation](#additional-documentation)

---

## Quick Start

### Prerequisites

- Python 3.10+
- Docker & Docker Compose (for containerized deployment)
- Optional: CUDA 11.8+ for GPU support

### Quick Development Server

```powershell
# From services/ai-service directory
cd src
python -m ai_med_extract.app run_dev
```

This runs Flask's built-in development server on port 7860.

### Smoke Test (No Model Loading)

```powershell
# From services/ai-service directory
python run_smoke_test.py
```

---

## Local Development

### Option 1: Development Server (Fast Iteration)

```powershell
cd .\services\ai-service\src
python -m ai_med_extract.app run_dev
```

### Option 2: WSGI/Gunicorn (Production-like)

Gunicorn runs only on Unix-like systems, so on Windows run this inside WSL or a container.

```powershell
cd .\services\ai-service\src
pip install gunicorn
$env:PRELOAD_SMALL_MODELS="false"
gunicorn -w 4 -b 0.0.0.0:7860 wsgi:app
```

### Using PowerShell Script

```powershell
cd .\services\ai-service
.\run_local.ps1          # Run without rebuilding
.\run_local.ps1 -Build   # Build and run
```

---

## Docker Deployment

### Build Image

```powershell
cd .\services\ai-service
docker build -f Dockerfile.prod -t ai-service:local .
```

### Run Container

```powershell
# PowerShell uses a trailing backtick for line continuation
docker run --rm -p 7860:7860 `
  -e PRELOAD_SMALL_MODELS=false `
  -e HF_HOME=/tmp/huggingface `
  -e TORCH_HOME=/tmp/torch_cache `
  ai-service:local
```

### Docker Compose

```powershell
cd .\services\ai-service
docker-compose up --build   # Build and run
docker-compose logs -f      # Follow logs
```

### Push to Registry

```powershell
docker tag ai-service:local your-registry/ai-service:latest
docker push your-registry/ai-service:latest
```

---

## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `HF_SPACES` | Signals HF Spaces environment | `false` |
| `PRELOAD_GGUF` | Enable GGUF model preloading | `false` |
| `PRELOAD_SMALL_MODELS` | Load small models at startup | `false` |
| `HF_HOME` | Hugging Face cache directory | `/tmp/huggingface` |
| `TORCH_HOME` | PyTorch cache directory | `/tmp/torch` |
| `WHISPER_CACHE` | Whisper model cache | `/tmp/whisper` |
| `DATABASE_URL` | PostgreSQL connection string | Required for production |
| `REDIS_URL` | Redis connection string | Required for production |
| `SECRET_KEY` | Application secret key | Required |
| `JWT_SECRET_KEY` | JWT signing key | Required |

---

## API Endpoints

### Health & Monitoring

- `GET /health/live` - Liveness probe
- `GET /health/ready` - Readiness probe
- `GET /metrics` - Prometheus metrics

### Document Processing

- `POST /upload` - Upload and process documents
- `POST /transcribe` - Transcribe audio files
- `GET /get_updated_medical_data` - Retrieve processed data
- `PUT /update_medical_data` - Update medical data

### AI Processing

- `POST /generate_patient_summary` - Generate comprehensive patient summaries
- `POST /api/generate_summary` - Generate text summaries
- `POST /api/patient_summary_openvino` - OpenVINO-optimized summaries
- `POST /extract_medical_data` - Extract structured medical data

### Model Management

- `POST /api/load_model` - Load specific AI models
- `GET /api/model_info` - Get model information
- `POST /api/switch_model` - Switch between models

### Verify Endpoints

```powershell
curl http://localhost:7860/health/live
curl http://localhost:7860/health/ready
curl http://localhost:7860/metrics
```

---

## Testing

### Smoke Test (No Models)

```powershell
python run_smoke_test.py
```

### Unit Tests

```powershell
python -m pytest tests/
```

### Integration Tests

```powershell
python -m pytest tests/integration/
```

---

## Project Structure

```
services/ai-service/
├── src/
│   ├── ai_med_extract/
│   │   ├── agents/          # AI agents and processors
│   │   ├── api/             # FastAPI routes
│   │   ├── services/        # Business logic services
│   │   ├── utils/           # Utilities and helpers
│   │   ├── app.py           # Flask application
│   │   └── main.py          # FastAPI application
│   ├── app.py               # Application entry point
│   ├── config_settings.py   # Configuration
│   └── wsgi.py              # WSGI entry point
├── k8s/
│   └── deployment.yaml      # Kubernetes manifests
├── docker-compose.yml       # Local Docker Compose
├── Dockerfile.prod          # Production Docker image
├── run_local.ps1            # PowerShell run script
└── README.md                # This file
```

---

## Kubernetes Deployment

Apply the Kubernetes manifests, check the pods, and follow their logs:

```bash
kubectl apply -f k8s/deployment.yaml
kubectl get pods -l app=ai-service
kubectl logs -f -l app=ai-service
```

---

## Notes

- **Model Caching**: The Docker Compose file mounts `./model_cache` to persist models between runs
- **GPU Support**: Adjust `Dockerfile.prod` for CUDA/GPU support
- **Secrets**: Never bake secrets into images; use environment variables or mounted secrets
- **Production**: Set `PRELOAD_SMALL_MODELS=true` only if you need models loaded at container start

---

## Additional Documentation

- **Production Deployment**: See `PRODUCTION_READY_SUMMARY.md` in `src/ai_med_extract/`
- **Integration Guide**: See `INTEGRATION_GUIDE.md` in `src/ai_med_extract/utils/`
- **Main Project README**: See `../../README.md` for overall project documentation

---

**For detailed guides and API documentation, see the main project README and the `/docs` endpoint when the service is running.**
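
### Scripted Health Check (Sketch)

The `curl` checks under Verify Endpoints can also be scripted, for example to wait for a freshly started container before running tests. The following is a minimal sketch and not part of the service: `probe` and `wait_until_ready` are hypothetical helper names, and it assumes that `/health/ready` answers with HTTP 200 once the service is ready, which this README does not state explicitly.

```python
import time
import urllib.error
import urllib.request

BASE_URL = "http://localhost:7860"  # port taken from the run commands in this README


def probe(url, timeout=5.0):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, etc.


def wait_until_ready(base_url=BASE_URL, attempts=30, delay=2.0,
                     probe_fn=probe, sleep_fn=time.sleep):
    """Poll /health/ready until it returns 200 or the attempts run out.

    Assumption: a 200 response means the service is ready.
    probe_fn and sleep_fn are injectable so the loop can be tested offline.
    """
    for attempt in range(attempts):
        if probe_fn(f"{base_url}/health/ready") == 200:
            return True
        if attempt < attempts - 1:
            sleep_fn(delay)  # no sleep after the final attempt
    return False
```

Calling `wait_until_ready()` with the defaults polls `http://localhost:7860/health/ready` up to 30 times, two seconds apart, and returns `True` or `False`; `probe(f"{BASE_URL}/health/live")` returns the liveness status code, or `None` if the service is unreachable.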