# Deployment Instructions

This document explains how to deploy the Medical AI Service for local development, with Docker, on Kubernetes, and on Hugging Face Spaces.

## Local Development

### Prerequisites
- Python 3.10+
- Docker (optional, for containerized testing)

### Setup
1. Clone the repository
2. Install dependencies: `pip install -r requirements.txt`
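3. Set environment variables (see the top-level Configuration section near the end of this document)
4. Run the application: `python -m uvicorn ai_med_extract.app:create_app --factory --host 0.0.0.0 --port 7860` (`--factory` tells uvicorn to call `create_app` to build the app rather than treating it as the app object)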

### Testing
- Health check: `curl http://localhost:7860/health/live`
- API docs: `http://localhost:7860/docs` (FastAPI Swagger UI)

## Docker Deployment

### Build and Run
```bash
docker build -t medical-ai-service .
docker run -p 7860:7860 -e SECRET_KEY=your-secret -e JWT_SECRET_KEY=your-jwt-secret \
  -e DATABASE_URL=your-db -e REDIS_URL=your-redis medical-ai-service
```

### Configuration
- Exposes port 7860
- Runs FastAPI app with uvicorn
- Includes model caching optimizations

## Kubernetes Deployment

### Prerequisites
- Kubernetes cluster
- kubectl configured
- Secrets created for database, Redis, and JWT keys

### Deploy
```bash
kubectl apply -f infra/k8s/secure_deployment.yaml
```

### Features
- Horizontal Pod Autoscaler (2-10 replicas based on CPU/memory)
- Resource limits: 1-4 CPU, 4-8Gi memory
- Prometheus monitoring annotations
- Security contexts and network policies

### Scaling
The HPA scales out when average pod utilization exceeds either target below, and scales back in when load drops:
- CPU utilization: 70%
- Memory utilization: 80%
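
To make the scaling behavior concrete, the sketch below applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler, `ceil(currentReplicas * currentUtilization / targetUtilization)`, to the 70% CPU and 80% memory targets and the 2-10 replica range listed above. The function names are illustrative and the 10% tolerance is the Kubernetes default, not a value taken from `secure_deployment.yaml`. The real controller also applies stabilization windows and pod-readiness rules that are omitted here.

```python
import math


def desired_replicas(current_replicas: int, current_pct: float, target_pct: float,
                     min_replicas: int = 2, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Replica count one metric asks for, clamped to the HPA's min/max."""
    ratio = current_pct / target_pct
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: the HPA skips scaling for this metric.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


def hpa_replicas(current_replicas: int, cpu_pct: float, mem_pct: float,
                 cpu_target: float = 70.0, mem_target: float = 80.0) -> int:
    """With several metrics, the HPA uses the largest desired replica count."""
    return max(
        desired_replicas(current_replicas, cpu_pct, cpu_target),
        desired_replicas(current_replicas, mem_pct, mem_target),
    )
```

For example, `hpa_replicas(4, cpu_pct=90, mem_pct=85)` yields 6: CPU asks for `ceil(4 * 90 / 70) = 6`, memory is within tolerance of its target and asks for no change, and the larger request wins.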

## Hugging Face Spaces Deployment

### Prerequisites
- Hugging Face account
- Space created with Docker runtime

### Configuration
1. `Dockerfile` exposes port 7860
2. FastAPI app listens on `0.0.0.0:7860`
3. `requirements.txt` includes all dependencies
4. `.huggingface.yaml` with `runtime: docker`
5. `.dockerignore` and `.gitignore` present
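
Part of this checklist can be verified before pushing. The sketch below is a hypothetical preflight helper, not something shipped in the repository: `check_space_files` and `read_repo` are made-up names, and the checks are simple text matches. It cannot confirm item 2 (the listen address) or that `requirements.txt` is complete.

```python
import re
from pathlib import Path
from typing import Dict, List, Mapping

# File names taken from the checklist above.
REQUIRED_FILES = ("Dockerfile", "requirements.txt", ".huggingface.yaml",
                  ".dockerignore", ".gitignore")


def check_space_files(files: Mapping[str, str]) -> List[str]:
    """Return a list of problems found in a {file name: content} mapping."""
    problems = [f"missing {name}" for name in REQUIRED_FILES if name not in files]
    if "Dockerfile" in files and not re.search(
        r"^\s*EXPOSE\s+.*\b7860\b", files["Dockerfile"],
        flags=re.MULTILINE | re.IGNORECASE,
    ):
        problems.append("Dockerfile does not EXPOSE 7860")
    if ".huggingface.yaml" in files and not re.search(
        r"^\s*runtime:\s*docker\s*$", files[".huggingface.yaml"],
        flags=re.MULTILINE,
    ):
        problems.append(".huggingface.yaml lacks `runtime: docker`")
    return problems


def read_repo(root: str) -> Dict[str, str]:
    """Read whichever of the required files exist under the repository root."""
    base = Path(root)
    return {
        name: (base / name).read_text(encoding="utf-8")
        for name in REQUIRED_FILES
        if (base / name).is_file()
    }
```

Running `check_space_files(read_repo("."))` from the repository root returns an empty list when every text check passes.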

### Deploy
```bash
# Test locally
docker build -t hntai-app .
docker run -p 7860:7860 hntai-app

# Push to HF Spaces
# App available at your-space-name.hf.space
```

## Configuration

### Required Environment Variables
- `SECRET_KEY`: Application secret key
- `JWT_SECRET_KEY`: JWT signing key
- `DATABASE_URL`: PostgreSQL connection string
- `REDIS_URL`: Redis connection string

### Optional
- `ENVIRONMENT`: prod/dev (default: prod)
- `PORT`: Service port (default: 7860)
- `CORS_ORIGINS`: Allowed CORS origins (default: *)
- Model cache directories and other settings are defined in `config_settings.py`
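
A service can fail fast when a required variable is missing instead of erroring on the first request. The following is a minimal sketch of that idea and does not reproduce `config_settings.py`: the `Settings` class and `load_settings` function are hypothetical, and treating `CORS_ORIGINS` as a comma-separated list is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple

REQUIRED_VARS = ("SECRET_KEY", "JWT_SECRET_KEY", "DATABASE_URL", "REDIS_URL")


@dataclass(frozen=True)
class Settings:
    secret_key: str
    jwt_secret_key: str
    database_url: str
    redis_url: str
    environment: str = "prod"              # documented default
    port: int = 7860                       # documented default
    cors_origins: Tuple[str, ...] = ("*",)  # documented default


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Validate required variables and apply the documented defaults."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    # Assumption: multiple origins are separated by commas.
    origins = tuple(
        o.strip() for o in env.get("CORS_ORIGINS", "*").split(",") if o.strip()
    )
    return Settings(
        secret_key=env["SECRET_KEY"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        environment=env.get("ENVIRONMENT", "prod"),
        port=int(env.get("PORT", "7860")),
        cors_origins=origins or ("*",),
    )
```

Passing the environment as a mapping keeps the function easy to test with a plain dictionary.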

## Monitoring

### Health Checks
- `/health/live`: Liveness probe
- `/health/ready`: Readiness probe
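
In Kubernetes terms, a failing liveness probe causes the container to be restarted, while a failing readiness probe only removes the pod from traffic. A deploy or test script can apply the same idea by waiting for `/health/ready` before sending requests. The sketch below uses only the standard library. The helper names, timeout values, and the assumption that the endpoint returns HTTP 200 once the service is ready are illustrative.

```python
import time
from typing import Callable
from urllib.request import urlopen


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """True if a GET on url returns HTTP 200; False on any error status or failure."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


def wait_until_ready(probe: Callable[[], bool], timeout: float = 60.0,
                     interval: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll probe() until it succeeds or the timeout budget is used up."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() + interval > deadline:
            return False  # another wait would overshoot the deadline
        sleep(interval)
```

A typical call would be `wait_until_ready(lambda: http_ok("http://localhost:7860/health/ready"))`.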

### Metrics
- `/metrics`: Prometheus metrics endpoint
- Includes performance metrics and model loading status
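
Assuming `/metrics` serves the standard Prometheus text exposition format, a script can read individual values without a Prometheus server. The parser below is a simplified sketch: it keeps each sample's name and label set as one string key and ignores optional timestamps. The metric names in the usage note are invented, since this document does not list the real ones.

```python
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each sample (metric name plus labels, verbatim) to its value."""
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # The last "}" closes the label set; the value follows it.
            head, _, rest = line.rpartition("}")
            key = head + "}"
        else:
            parts = line.split(None, 1)
            if len(parts) < 2:
                continue  # malformed line with no value
            key, rest = parts
        fields = rest.split()
        if fields:
            samples[key] = float(fields[0])  # fields[1], if present, is a timestamp
    return samples
```

Given a body containing a line such as `model_loaded 1` (a hypothetical metric), `parse_metrics(body)["model_loaded"]` returns `1.0`.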

### Logging
- Structured JSON logs for production
- Configurable log levels
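
As an illustration of structured JSON logging with a configurable level, the sketch below uses only the standard `logging` module. It is not the service's actual logging setup: the field names and the `configure_logging` helper are assumptions, and the variable or setting that would supply the level is not specified in this document.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level: str = "INFO") -> None:
    """Send root-logger output to stderr as JSON at the given level."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level.upper())
```

After `configure_logging("debug")`, a call such as `logging.getLogger("svc").info("model loaded")` emits one JSON line with `timestamp`, `level`, `logger`, and `message` fields.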