# Deployment Instructions
This document describes how to deploy the Medical AI Service for local development, with Docker, on Kubernetes, and on Hugging Face Spaces.
## Local Development
### Prerequisites
- Python 3.10+
- Docker (optional, for containerized testing)
### Setup
1. Clone the repository
2. Install dependencies: `pip install -r requirements.txt`
3. Set environment variables (see Configuration section)
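4. Run the application: `python -m uvicorn ai_med_extract.app:create_app --factory --host 0.0.0.0 --port 7860` (assuming `create_app` is a zero-argument application factory, `--factory` tells uvicorn to call it instead of treating it as the ASGI app itself)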
### Testing
- Health check: `curl http://localhost:7860/health/live`
- API docs: `http://localhost:7860/docs` (FastAPI Swagger UI)
## Docker Deployment
### Build and Run
```bash
docker build -t medical-ai-service .
docker run -p 7860:7860 -e SECRET_KEY=your-secret -e JWT_SECRET_KEY=your-jwt-secret -e DATABASE_URL=your-db -e REDIS_URL=your-redis medical-ai-service
```
### Configuration
- Exposes port 7860
- Runs FastAPI app with uvicorn
- Includes model caching optimizations
## Kubernetes Deployment
### Prerequisites
- Kubernetes cluster
- kubectl configured
- Secrets created for database, Redis, and JWT keys
### Deploy
```bash
kubectl apply -f infra/k8s/secure_deployment.yaml
```
### Features
- Horizontal Pod Autoscaler (2-10 replicas based on CPU/memory)
- Resource limits: 1-4 CPU, 4-8Gi memory
- Prometheus monitoring annotations
- Security contexts and network policies
### Scaling
The HPA adjusts the replica count to keep average pod utilization near its targets, scaling out when either threshold is exceeded:
- CPU utilization > 70%
- Memory utilization > 80%
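The scaling rule can be sketched with the formula given in the Kubernetes HPA documentation, `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, where the largest proposal wins when several metrics are configured. The sketch below plugs in the 70%/80% targets and the 2-10 replica bounds listed above. It deliberately ignores the HPA's tolerance band and stabilization windows, and `desired_replicas` is a hypothetical helper, not part of this repository.

```python
import math

MIN_REPLICAS, MAX_REPLICAS = 2, 10           # bounds from the Features list
TARGETS = {"cpu": 70.0, "memory": 80.0}      # target average utilization, in percent


def desired_replicas(current_replicas: int, utilization: dict[str, float]) -> int:
    """Approximate the HPA decision for the given average utilization (percent)."""
    proposals = [
        math.ceil(current_replicas * utilization[name] / target)
        for name, target in TARGETS.items()
        if name in utilization
    ]
    # With multiple metrics the HPA takes the largest proposed replica count.
    wanted = max(proposals, default=current_replicas)
    # Clamp to the configured minimum and maximum.
    return max(MIN_REPLICAS, min(MAX_REPLICAS, wanted))


# 4 pods at 90% CPU and 50% memory: CPU proposes 6, memory proposes 3.
print(desired_replicas(4, {"cpu": 90.0, "memory": 50.0}))  # prints 6
```

This is useful for estimating how far a load spike would push the deployment before it reaches the 10-replica ceiling.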
## Hugging Face Spaces Deployment
### Prerequisites
- Hugging Face account
- Space created with Docker runtime
### Configuration
1. Dockerfile exposes port 7860
2. FastAPI app listens on 0.0.0.0:7860
3. requirements.txt includes all dependencies
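4. `.huggingface.yaml` with `runtime: docker` (note that Spaces reads its own settings, such as `sdk: docker` and `app_port: 7860`, from the YAML front matter of `README.md`)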
5. .dockerignore and .gitignore present
### Deploy
```bash
# Test locally
docker build -t hntai-app .
docker run -p 7860:7860 hntai-app
# Push to HF Spaces
# App available at https://<owner>-<space-name>.hf.space
```
## Configuration
### Required Environment Variables
- `SECRET_KEY`: Application secret key
- `JWT_SECRET_KEY`: JWT signing key
- `DATABASE_URL`: PostgreSQL connection string
- `REDIS_URL`: Redis connection string
### Optional
- `ENVIRONMENT`: prod/dev (default: prod)
- `PORT`: Service port (default: 7860)
- `CORS_ORIGINS`: Allowed CORS origins (default: *)
- Model cache directories and other settings in config_settings.py
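A startup check for these variables might look like the sketch below. It uses only the variable names and defaults listed above. `load_settings` is a hypothetical helper for illustration and does not reflect how `config_settings.py` actually loads its values.

```python
import os
from collections.abc import Mapping

REQUIRED = ("SECRET_KEY", "JWT_SECRET_KEY", "DATABASE_URL", "REDIS_URL")
DEFAULTS = {"ENVIRONMENT": "prod", "PORT": "7860", "CORS_ORIGINS": "*"}


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return required and optional settings, raising if a required one is missing."""
    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    settings = {name: env[name] for name in REQUIRED}
    # Optional variables fall back to the documented defaults.
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name) or default
    return settings
```

Calling `load_settings()` once at startup fails fast with a single message naming every missing variable, rather than failing later on the first database or Redis call.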
## Monitoring
### Health Checks
- `/health/live`: Liveness probe
- `/health/ready`: Readiness probe
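In a deploy script or CI job, the readiness probe can be polled before traffic or tests are sent to the service. The sketch below assumes that `/health/ready` answers with HTTP 200 once the service is ready and with an error status (or no connection) before that. The document does not state the exact status codes, so treat that as an assumption. `wait_until_ready` is a hypothetical helper.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll /health/ready until it returns HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(f"{base_url}/health/ready", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, or an error status (urllib raises HTTPError for 4xx/5xx).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example (assumes the service runs locally on the default port):
# ok = wait_until_ready("http://localhost:7860")
```

Polling the readiness endpoint rather than the liveness one avoids sending requests while models are still loading.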
### Metrics
- `/metrics`: Prometheus metrics endpoint
- Includes performance metrics, model loading status
### Logging
- Structured JSON logs for production
- Configurable log levels