Deployment Instructions
This document provides deployment instructions for the Medical AI Service in various environments.
Local Development
Prerequisites
- Python 3.10+
- Docker (optional, for containerized testing)
Setup
- Clone the repository
- Install dependencies:
pip install -r requirements.txt
- Set environment variables (see Configuration section)
- Run the application:
python -m uvicorn ai_med_extract.app:create_app --host 0.0.0.0 --port 7860
Testing
- Health check:
curl http://localhost:7860/health/live
- API docs:
http://localhost:7860/docs (FastAPI Swagger UI)
Docker Deployment
Build and Run
docker build -t medical-ai-service .
docker run -p 7860:7860 -e SECRET_KEY=your-secret -e DATABASE_URL=your-db medical-ai-service
Configuration
- Exposes port 7860
- Runs FastAPI app with uvicorn
- Includes model caching optimizations
Kubernetes Deployment
Prerequisites
- Kubernetes cluster
- kubectl configured
- Secrets created for database, Redis, and JWT keys
Deploy
kubectl apply -f infra/k8s/secure_deployment.yaml
Features
- Horizontal Pod Autoscaler (2-10 replicas based on CPU/memory)
- Resource limits: 1-4 CPU, 4-8Gi memory
- Prometheus monitoring annotations
- Security contexts and network policies
Scaling
The HPA automatically scales based on:
- CPU utilization > 70%
- Memory utilization > 80%
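To make the scaling behavior concrete, the sketch below applies the replica formula from the Kubernetes HPA documentation, `desired = ceil(current * observed / target)`, to each metric and keeps the larger result. It assumes the 70% and 80% figures are the HPA target utilizations and that the 2-10 replica range from the Features list applies. It ignores the HPA tolerance band and stabilization windows, so treat it as an approximation rather than the controller's exact behavior.

```python
import math

MIN_REPLICAS = 2       # lower bound from the Features list
MAX_REPLICAS = 10      # upper bound from the Features list
CPU_TARGET = 70.0      # percent, assumed to be the HPA CPU target
MEMORY_TARGET = 80.0   # percent, assumed to be the HPA memory target


def desired_replicas(current: int, cpu_util: float, mem_util: float) -> int:
    """Estimate the replica count the HPA would request.

    Each metric proposes ceil(current * observed / target); the largest
    proposal wins and is clamped to the configured replica range.
    """
    by_cpu = math.ceil(current * cpu_util / CPU_TARGET)
    by_mem = math.ceil(current * mem_util / MEMORY_TARGET)
    return max(MIN_REPLICAS, min(MAX_REPLICAS, max(by_cpu, by_mem)))


if __name__ == "__main__":
    print(desired_replicas(2, cpu_util=90.0, mem_util=40.0))   # scale up on CPU
    print(desired_replicas(8, cpu_util=140.0, mem_util=50.0))  # capped at the maximum
    print(desired_replicas(4, cpu_util=35.0, mem_util=40.0))   # scale down to the minimum
```

For example, 2 replicas at 90% CPU give ceil(2 * 90 / 70) = 3 replicas, while 8 replicas at 140% CPU would call for 16 and are capped at 10.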
Hugging Face Spaces Deployment
Prerequisites
- Hugging Face account
- Space created with Docker runtime
Configuration
- Dockerfile exposes port 7860
- FastAPI app listens on 0.0.0.0:7860
- requirements.txt includes all dependencies
- .huggingface.yaml with
runtime: docker
- .dockerignore and .gitignore present
Deploy
# Test locally
docker build -t hntai-app .
docker run -p 7860:7860 hntai-app
# Push to HF Spaces
# App available at your-space-name.hf.space
Configuration
Required Environment Variables
- SECRET_KEY: Application secret key
- JWT_SECRET_KEY: JWT signing key
- DATABASE_URL: PostgreSQL connection string
- REDIS_URL: Redis connection string
Optional
- ENVIRONMENT: prod/dev (default: prod)
- PORT: Service port (default: 7860)
- CORS_ORIGINS: Allowed CORS origins (default: *)
- Model cache directories and other settings in config_settings.py
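As an illustration of how these variables might be validated at startup, here is a minimal sketch. It is not the contents of config_settings.py: the `Settings` class and `load_settings` function are hypothetical, and treating CORS_ORIGINS as a comma-separated list is an assumption.

```python
import os
from dataclasses import dataclass

REQUIRED_VARS = ("SECRET_KEY", "JWT_SECRET_KEY", "DATABASE_URL", "REDIS_URL")


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the documented variables."""
    secret_key: str
    jwt_secret_key: str
    database_url: str
    redis_url: str
    environment: str = "prod"
    port: int = 7860
    cors_origins: tuple[str, ...] = ("*",)


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default), failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    # Assumption: multiple origins are separated by commas.
    origins = tuple(o.strip() for o in env.get("CORS_ORIGINS", "*").split(",") if o.strip())
    return Settings(
        secret_key=env["SECRET_KEY"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        environment=env.get("ENVIRONMENT", "prod"),
        port=int(env.get("PORT", "7860")),
        cors_origins=origins or ("*",),
    )
```

Failing at startup with the full list of missing names tends to be easier to debug than a connection error that surfaces on the first request.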
Monitoring
Health Checks
- /health/live: Liveness probe
- /health/ready: Readiness probe
Metrics
- /metrics: Prometheus metrics endpoint
- Includes performance metrics, model loading status
Logging
- Structured JSON logs for production
- Configurable log levels
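The sketch below shows one way to produce structured JSON logs with a configurable level using only the standard library. It is not the service's actual logging setup: the `LOG_LEVEL` variable, the `medical_ai_service` logger name, and the chosen JSON fields are all assumptions made for illustration.

```python
import json
import logging
import os
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S%z"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level=None, stream=None) -> logging.Logger:
    """Attach a JSON handler; the level comes from the argument or LOG_LEVEL (hypothetical)."""
    level_name = (level or os.environ.get("LOG_LEVEL", "INFO")).upper()
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("medical_ai_service")  # hypothetical logger name
    logger.handlers = [handler]   # replace earlier handlers to avoid duplicate lines
    logger.setLevel(level_name)   # raises ValueError for an unknown level name
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = configure_logging()
    log.info("service started on port %d", 7860)
```

One JSON object per line keeps the output easy to parse for log collectors that read container stdout.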