
AI Service (ai_med_extract)

Medical AI service for data extraction, PHI scrubbing, and patient summary generation.


Quick Start

Prerequisites

  • Python 3.10+
  • Docker & Docker Compose (for containerized deployment)
  • Optional: CUDA 11.8+ for GPU support

Quick Development Server

# From services/ai-service directory
cd src
python -m ai_med_extract.app run_dev

This runs Flask's built-in development server on port 7860.

Smoke Test (No Model Loading)

# From services/ai-service directory
python run_smoke_test.py

Local Development

Option 1: Development Server (Fast Iteration)

cd .\services\ai-service\src
python -m ai_med_extract.app run_dev

Option 2: WSGI/Gunicorn (Production-like)

# Gunicorn runs only on Unix-like systems; on Windows, use WSL or Docker.
cd services/ai-service/src
pip install gunicorn
export PRELOAD_SMALL_MODELS=false
gunicorn -w 4 -b 0.0.0.0:7860 wsgi:app

Using PowerShell Script

cd .\services\ai-service
.\run_local.ps1        # Run without rebuilding
.\run_local.ps1 -Build # Build and run

Docker Deployment

Build Image

cd .\services\ai-service
docker build -f Dockerfile.prod -t ai-service:local .

Run Container

# Bash syntax shown; in PowerShell, replace each trailing \ with a backtick (`).
docker run --rm -p 7860:7860 \
  -e PRELOAD_SMALL_MODELS=false \
  -e HF_HOME=/tmp/huggingface \
  -e TORCH_HOME=/tmp/torch_cache \
  ai-service:local

Docker Compose

cd .\services\ai-service
docker-compose up --build     # Build and run
docker-compose logs -f        # Follow logs

Push to Registry

docker tag ai-service:local your-registry/ai-service:latest
docker push your-registry/ai-service:latest

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| HF_SPACES | Signals HF Spaces environment | false |
| PRELOAD_GGUF | Enable GGUF model preloading | false |
| PRELOAD_SMALL_MODELS | Load small models at startup | false |
| HF_HOME | Hugging Face cache directory | /tmp/huggingface |
| TORCH_HOME | PyTorch cache directory | /tmp/torch |
| WHISPER_CACHE | Whisper model cache | /tmp/whisper |
| DATABASE_URL | PostgreSQL connection string | Required for production |
| REDIS_URL | Redis connection string | Required for production |
| SECRET_KEY | Application secret key | Required |
| JWT_SECRET_KEY | JWT signing key | Required |
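
The variables listed above can be mirrored in a small loader. The following is a minimal sketch, assuming the service reads them from the process environment. `load_settings` and `env_flag` are hypothetical helpers, not part of ai_med_extract, and the set of strings treated as "true" is an assumption.

```python
import os

# Defaults copied from the environment variable list above.
DEFAULTS = {
    "HF_SPACES": "false",
    "PRELOAD_GGUF": "false",
    "PRELOAD_SMALL_MODELS": "false",
    "HF_HOME": "/tmp/huggingface",
    "TORCH_HOME": "/tmp/torch",
    "WHISPER_CACHE": "/tmp/whisper",
}
BOOLEAN_FLAGS = ("HF_SPACES", "PRELOAD_GGUF", "PRELOAD_SMALL_MODELS")
# Variables marked "Required".
REQUIRED = ("SECRET_KEY", "JWT_SECRET_KEY")
# Variables marked "Required for production".
REQUIRED_IN_PRODUCTION = ("DATABASE_URL", "REDIS_URL")


def env_flag(value):
    """Interpret a string as a boolean (accepted spellings are an assumption)."""
    return value.strip().lower() in ("1", "true", "yes", "on")


def load_settings(env=None, production=False):
    """Return a settings dict, raising if a required variable is missing."""
    env = os.environ if env is None else env
    required = REQUIRED + (REQUIRED_IN_PRODUCTION if production else ())
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    # Start from the documented defaults, then overlay whatever is set.
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    for name in BOOLEAN_FLAGS:
        settings[name] = env_flag(settings[name])
    for name in required:
        settings[name] = env[name]
    return settings
```

Calling `load_settings(production=True)` at startup would fail fast when `DATABASE_URL` or `REDIS_URL` is unset, instead of failing later on the first database or cache access.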

API Endpoints

Health & Monitoring

  • GET /health/live - Liveness probe
  • GET /health/ready - Readiness probe
  • GET /metrics - Prometheus metrics

Document Processing

  • POST /upload - Upload and process documents
  • POST /transcribe - Transcribe audio files
  • GET /get_updated_medical_data - Retrieve processed data
  • PUT /update_medical_data - Update medical data

AI Processing

  • POST /generate_patient_summary - Generate comprehensive patient summaries
  • POST /api/generate_summary - Generate text summaries
  • POST /api/patient_summary_openvino - OpenVINO-optimized summaries
  • POST /extract_medical_data - Extract structured medical data

Model Management

  • POST /api/load_model - Load specific AI models
  • GET /api/model_info - Get model information
  • POST /api/switch_model - Switch between models
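
To call one of the POST endpoints from Python, a request can be built with the standard library. This is a rough sketch only: it assumes the endpoint accepts a JSON body, and the `model_name` field is a hypothetical placeholder, since this README does not document request schemas. `build_json_post` is likewise a hypothetical helper.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # port taken from the Quick Start section


def build_json_post(path, payload, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for one of the endpoints."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload: the field name "model_name" is an assumption.
# Check the /docs endpoint of a running service for the actual schema.
request = build_json_post("/api/load_model", {"model_name": "example-model"})
# To send it: urllib.request.urlopen(request, timeout=30)
```

Keeping construction separate from sending makes it possible to inspect or log the request before it reaches the service.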

Verify Endpoints

curl http://localhost:7860/health/live
curl http://localhost:7860/health/ready
curl http://localhost:7860/metrics
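
The same check can be scripted, for example to wait for a freshly started container. A minimal sketch, assuming `/health/ready` answers with HTTP 200 once the service is ready (the README does not state the status codes). `probe` and `wait_until_ready` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, and similar


def wait_until_ready(base_url, attempts=30, delay=2.0, probe_fn=probe):
    """Poll /health/ready until it returns 200 or the attempts run out."""
    url = base_url.rstrip("/") + "/health/ready"
    for attempt in range(attempts):
        if probe_fn(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # back off before the next attempt
    return False
```

For instance, `wait_until_ready("http://localhost:7860")` could gate an integration test run on the service having finished starting.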

Testing

Smoke Test (No Models)

python run_smoke_test.py

Unit Tests

python -m pytest tests/

Integration Tests

python -m pytest tests/integration/

Project Structure

services/ai-service/
├── src/
│   ├── ai_med_extract/
│   │   ├── agents/          # AI agents and processors
│   │   ├── api/             # FastAPI routes
│   │   ├── services/        # Business logic services
│   │   ├── utils/           # Utilities and helpers
│   │   ├── app.py           # Flask application
│   │   └── main.py          # FastAPI application
│   ├── app.py               # Application entry point
│   ├── config_settings.py   # Configuration
│   └── wsgi.py              # WSGI entry point
├── k8s/
│   └── deployment.yaml      # Kubernetes manifests
├── docker-compose.yml       # Local Docker Compose
├── Dockerfile.prod          # Production Docker image
├── run_local.ps1            # PowerShell run script
└── README.md                # This file

Kubernetes Deployment

Apply the Kubernetes manifests:

kubectl apply -f k8s/deployment.yaml
kubectl get pods -l app=ai-service
kubectl logs -f <pod-name>

Notes

  • Model Caching: The Docker Compose file mounts ./model_cache to persist models between runs
  • GPU Support: Adjust Dockerfile.prod for CUDA/GPU support
  • Secrets: Never bake secrets into images; use environment variables or mounted secrets
  • Production: Set PRELOAD_SMALL_MODELS=true only if you need models at container start
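
The note on secrets can be illustrated with a small lookup that prefers an environment variable and falls back to a mounted file. This is a sketch under stated assumptions: `read_secret` is a hypothetical helper, and `/run/secrets` is the conventional Docker secrets mount point, not a path this project is known to use.

```python
import os
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets", env=None):
    """Return a secret from the environment, falling back to a mounted file."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value:
        return value
    # Assumed layout: one file per secret, named after the variable.
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        return secret_file.read_text(encoding="utf-8").strip()
    raise RuntimeError(f"Secret {name!r} not found in environment or {secrets_dir}")
```

With this pattern, `read_secret("JWT_SECRET_KEY")` works unchanged whether the key is injected with `-e` at `docker run` time or mounted as a file by an orchestrator.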

Additional Documentation

  • Production Deployment: See PRODUCTION_READY_SUMMARY.md in src/ai_med_extract/
  • Integration Guide: See INTEGRATION_GUIDE.md in src/ai_med_extract/utils/
  • Main Project README: See ../../README.md for overall project documentation

For detailed guides and API documentation, see the main project README and the /docs endpoint when the service is running.