sachinchandrankallar committed
Commit af1ef97 · 1 Parent(s): 2b64d2e

Revert "Update Dockerfiles to use `asgi:app` as the entry point, resolving deployment issues caused by the removal of `app.py`. This change ensures compatibility with the new structure and improves initialization for production environments."

DEPLOYMENT.md DELETED
@@ -1,325 +0,0 @@
- # Deployment Guide
-
- ## Quick Start
-
- ### Docker Deployment (Recommended)
-
- ```bash
- # Build the container
- docker build -t hntai:latest .
-
- # Run the container
- docker run -p 7860:7860 hntai:latest
- ```
-
- Access the application at `http://localhost:7860`
-
- ### Local Development
-
- ```bash
- # Install dependencies
- pip install -r requirements.txt
-
- # Run with uvicorn
- python -m uvicorn asgi:app --reload --host 0.0.0.0 --port 7860
-
- # Or using start script
- python start_hf_spaces.py
- ```
-
- ## Deployment Options
-
- ### 1. Docker (Standard)
-
- **File**: `Dockerfile`
- **Entry Point**: `asgi.py`
-
- ```bash
- docker build -t hntai:latest .
- docker run -p 7860:7860 \
-   -e REDIS_URL=redis://redis:6379 \
-   -e DATABASE_URL=postgresql://user:pass@db:5432/hntai \
-   hntai:latest
- ```
-
- ### 2. Docker (Optimized)
-
- **File**: `Dockerfile.optimized`
- **Entry Point**: `asgi.py`
- **Features**: Better caching, optimized layers
-
- ```bash
- docker build -f Dockerfile.optimized -t hntai:optimized .
- docker run -p 7860:7860 hntai:optimized
- ```
-
- ### 3. Docker Compose
-
- **File**: `services/ai-service/docker-compose.yml`
-
- ```bash
- cd services/ai-service
- docker-compose up -d
- ```
-
- ### 4. Hugging Face Spaces
-
- **Configuration**: `.huggingface.yaml`
- **Entry Point**: `services/ai-service/src/ai_med_extract/app:app`
-
- The application automatically detects HF Spaces environment and configures accordingly.
-
- ### 5. Kubernetes
-
- **Manifests**: `infra/k8s/secure_deployment.yaml`
-
- ```bash
- kubectl apply -f infra/k8s/secure_deployment.yaml
- kubectl get pods -l app=hntai
- ```
-
- ## Entry Points
-
- ### Primary Entry Points
-
- 1. **`asgi.py`** (Root) - Docker/Production deployment
-    - Used by Dockerfiles
-    - Lazy loading, optimized for production
-    - Proper path setup for imports
-
- 2. **`start_hf_spaces.py`** (Root) - Hugging Face Spaces
-    - Detects HF Spaces environment
-    - Configures for Spaces constraints
-    - Minimal preloading
-
- 3. **`services/ai-service/src/ai_med_extract/main.py`** - Development
-    - Direct execution
-    - Full configuration
-    - Used by `python -m` invocation
-
- ### Application Module
-
- - **`services/ai-service/src/ai_med_extract/app.py`** - Core app
-   - `create_app()` - Creates FastAPI instance
-   - `initialize_agents()` - Sets up AI agents
-   - `run_dev()` - Development server
-
- ## Environment Variables
-
- ### Required
- - None (application runs with sensible defaults)
-
- ### Optional
-
- | Variable | Description | Default |
- |----------|-------------|---------|
- | `REDIS_URL` | Redis connection string | Not set (disabled) |
- | `DATABASE_URL` | PostgreSQL connection string | Not set (disabled) |
- | `HF_SPACES` | Hugging Face Spaces mode | `false` |
- | `FAST_MODE` | Skip model preloading | `false` |
- | `PRELOAD_GGUF` | Preload GGUF models | `false` |
- | `SECRET_KEY` | Application secret key | Auto-generated |
- | `JWT_SECRET_KEY` | JWT signing key | Auto-generated |
-
- ### Cache Directories
-
- | Variable | Default | Purpose |
- |----------|---------|---------|
- | `HF_HOME` | `/tmp/huggingface` | Hugging Face cache |
- | `TORCH_HOME` | `/tmp/torch` | PyTorch cache |
- | `WHISPER_CACHE` | `/tmp/whisper` | Whisper models |
- | `XDG_CACHE_HOME` | `/tmp` | General cache |
-
- ## Configuration Profiles
-
- ### Development
- ```bash
- # Full model preloading, all features
- python -m uvicorn asgi:app --reload
- ```
-
- ### Production
- ```bash
- # Optimized, lazy loading
- docker run -p 7860:7860 hntai:latest
- ```
-
- ### HuggingFace Spaces
- ```bash
- # Minimal resources, fast startup
- export HF_SPACES=true
- export FAST_MODE=true
- python start_hf_spaces.py
- ```
-
- ## Health Checks
-
- ### Endpoints
-
- - **Liveness**: `GET /health/live`
-   - Returns 200 if application is running
-
- - **Readiness**: `GET /health/ready`
-   - Returns 200 if application is ready to serve requests
-
- - **Metrics**: `GET /api/performance_metrics`
-   - Returns system metrics (memory, CPU, etc.)
-
- ### Docker Health Check
-
- ```dockerfile
- HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
-     CMD curl -f http://localhost:7860/health/live || exit 1
- ```
-
- ## Troubleshooting
-
- ### Issue: "Could not import module 'app'"
-
- **Solution**: Use `asgi.py` instead of `app.py`
- ```bash
- # Wrong
- uvicorn app:app
-
- # Correct
- uvicorn asgi:app
- ```
-
- ### Issue: Models taking too long to load
-
- **Solution**: Enable fast mode
- ```bash
- export FAST_MODE=true
- # Models will lazy-load on first use
- ```
-
- ### Issue: Out of memory
-
- **Solution**: Reduce model preloading
- ```bash
- export PRELOAD_GGUF=false
- export FAST_MODE=true
- ```
-
- ### Issue: Redis/Database connection errors
-
- **Solution**: Application works without Redis/Database (features disabled gracefully)
- ```bash
- # No action needed - optional features
- # Or configure if needed:
- export REDIS_URL=redis://localhost:6379
- export DATABASE_URL=postgresql://user:pass@localhost:5432/db
- ```
-
- ## Performance Tuning
-
- ### Memory Optimization
-
- ```bash
- # Set conservative limits
- export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:64
- export OMP_NUM_THREADS=2
- export MKL_NUM_THREADS=2
- ```
-
- ### Thread Configuration
-
- ```bash
- # For CPU-bound workloads
- export OMP_NUM_THREADS=4
- export MKL_NUM_THREADS=4
- export NUMEXPR_NUM_THREADS=4
- ```
-
- ### GGUF Models
-
- ```bash
- # Configure GGUF behavior
- export GGUF_N_THREADS=4
- export GGUF_N_BATCH=64
- ```
-
- ## Monitoring
-
- ### Prometheus Metrics
-
- Available at `/metrics`
-
- ### Logging
-
- Structured JSON logs to stdout:
- ```json
- {
-   "timestamp": "2024-11-06T18:54:42",
-   "level": "INFO",
-   "message": "Agent initialization complete",
-   "memory_mb": 485.7,
-   "cpu_percent": 33.1
- }
- ```
-
- ### API Documentation
-
- - **Swagger UI**: `http://localhost:7860/docs`
- - **ReDoc**: `http://localhost:7860/redoc`
-
- ## Security
-
- ### Container Security
-
- - Non-root user (where supported)
- - Read-only root filesystem (where possible)
- - Resource limits configured
- - Security headers enabled
-
- ### Network Security
-
- - CORS configured (customize in production)
- - Rate limiting available
- - HTTPS recommended for production
-
- ## Scaling
-
- ### Horizontal Scaling
-
- ```bash
- # Run multiple instances behind load balancer
- docker run -p 7861:7860 hntai:latest
- docker run -p 7862:7860 hntai:latest
- docker run -p 7863:7860 hntai:latest
- ```
-
- ### Kubernetes Scaling
-
- ```bash
- kubectl scale deployment hntai --replicas=3
- ```
-
- ## Maintenance
-
- ### Cache Clearing
-
- ```bash
- # Clear model caches
- rm -rf /tmp/huggingface/* /tmp/torch/* /tmp/whisper/*
-
- # Or restart container (caches in /tmp)
- docker restart <container-id>
- ```
-
- ### Log Rotation
-
- Logs to stdout - configure external log aggregation
-
- ### Backup
-
- - Application is stateless
- - Backup Redis/Database if used
- - Model caches are re-downloadable
-
- ## Support
-
- - **Documentation**: Check `/docs` endpoint
- - **Issues**: GitHub Issues
- - **Logs**: Check container logs `docker logs <container-id>`
-
Dockerfile CHANGED
@@ -223,4 +223,4 @@ ENTRYPOINT ["/entrypoint.sh"]
  EXPOSE 7860

  # Use uvicorn for FastAPI (ASGI) without reload for production
- CMD ["uvicorn", "asgi:app", "--host", "0.0.0.0", "--port", "7860"]
+ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
Dockerfile.optimized CHANGED
@@ -100,4 +100,4 @@ ENTRYPOINT ["/entrypoint.sh"]
  EXPOSE 7860

  # Use uvicorn with no-reload to prevent duplicate route registration
- CMD ["uvicorn", "asgi:app", "--host", "0.0.0.0", "--port", "7860", "--no-reload"]
+ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860", "--no-reload"]
REFACTORING_IMPROVEMENTS.md CHANGED
@@ -216,54 +216,9 @@ No new environment variables required. Existing variables continue to work:
  - `DATABASE_URL` - Database connection
  - `PRELOAD_GGUF` - Preload GGUF models

- ## Deployment Fix
-
- ### Issue
- After initial refactoring, Docker deployments failed with:
- ```
- ERROR: Error loading ASGI app. Could not import module "app".
- ```
-
- ### Root Cause
- - Deleted root-level `app.py` caused Dockerfile CMD failure
- - Dockerfile was configured to use `uvicorn app:app`
- - Naming conflict with `services/ai-service/src/app.py`
-
- ### Solution
- ✅ Created `asgi.py` as the deployment entry point
- ✅ Updated Dockerfiles to use `uvicorn asgi:app`
- ✅ Configured for fast-mode initialization (lazy loading)
- ✅ Properly sets up Python path for imports
-
- ### Updated Files
- - **Created**: `asgi.py` - Deployment entry point
- - **Modified**: `Dockerfile` - Updated CMD to use `asgi:app`
- - **Modified**: `Dockerfile.optimized` - Updated CMD to use `asgi:app`
- - **Removed**: `app.py` (root) - Avoided naming conflicts
-
- ### Deployment Commands
-
- ```bash
- # Docker (root Dockerfile)
- docker build -t hntai:latest .
- docker run -p 7860:7860 hntai:latest
- # Uses: uvicorn asgi:app --host 0.0.0.0 --port 7860
-
- # Docker Optimized
- docker build -f Dockerfile.optimized -t hntai:optimized .
- docker run -p 7860:7860 hntai:optimized
- # Uses: uvicorn asgi:app --host 0.0.0.0 --port 7860 --no-reload
-
- # HuggingFace Spaces (uses .huggingface.yaml)
- # Automatically uses: services/ai-service/src/ai_med_extract/app:app
-
- # Local development
- python -m uvicorn asgi:app --reload --host 0.0.0.0 --port 7860
- ```
-
  ## Conclusion

- This refactoring significantly improves the codebase quality while maintaining 100% backward compatibility. All functionality is preserved, and the application runs successfully with the refactored code. The deployment issue has been resolved with the new `asgi.py` entry point. The foundation is now set for continued improvements and easier maintenance going forward.
+ This refactoring significantly improves the codebase quality while maintaining 100% backward compatibility. All functionality is preserved, and the application runs successfully with the refactored code. The foundation is now set for continued improvements and easier maintenance going forward.

  **Status**: ✅ Ready for Production
app.py DELETED
@@ -1,31 +0,0 @@
- """
- Deployment entry point for ASGI servers (uvicorn, gunicorn, etc.).
- This is a thin wrapper that imports the actual FastAPI app from the ai_med_extract package.
-
- Usage:
-     uvicorn app:app --host 0.0.0.0 --port 7860
-     gunicorn -k uvicorn.workers.UvicornWorker app:app
- """
- import sys
- import os
- from pathlib import Path
-
- # Ensure the services/ai-service/src directory is in the Python path
- src_dir = Path(__file__).parent / "services" / "ai-service" / "src"
- if src_dir.exists():
-     sys.path.insert(0, str(src_dir))
-
- # Set environment for deployment
- os.environ.setdefault('PYTHONUNBUFFERED', '1')
-
- # Import the FastAPI app from the actual implementation
- # Use absolute import with module name to avoid conflicts
- from ai_med_extract.app import create_app, initialize_agents
-
- # Create and initialize the app for deployment
- app = create_app(initialize=False)
- initialize_agents(app, preload_small_models=False)
-
- # Export for ASGI servers
- __all__ = ["app"]
-
asgi.py DELETED
@@ -1,31 +0,0 @@
- """
- Deployment entry point for ASGI servers (uvicorn, gunicorn, etc.).
- This is a thin wrapper that imports the actual FastAPI app from the ai_med_extract package.
-
- Usage:
-     uvicorn asgi:app --host 0.0.0.0 --port 7860
-     gunicorn -k uvicorn.workers.UvicornWorker asgi:app
- """
- import sys
- import os
- from pathlib import Path
-
- # Ensure the services/ai-service/src directory is in the Python path
- src_dir = Path(__file__).parent / "services" / "ai-service" / "src"
- if src_dir.exists():
-     sys.path.insert(0, str(src_dir))
-
- # Set environment for deployment
- os.environ.setdefault('PYTHONUNBUFFERED', '1')
- os.environ.setdefault('FAST_MODE', 'true')  # Don't preload models during import
-
- # Import the FastAPI app from the actual implementation
- from ai_med_extract.app import create_app, initialize_agents
-
- # Create and initialize the app for deployment
- app = create_app(initialize=False)
- initialize_agents(app, preload_small_models=False)
-
- # Export for ASGI servers
- __all__ = ["app"]
-
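
Compared with the deleted root `app.py`, the main behavioral difference in the deleted `asgi.py` is the extra `FAST_MODE` default. Both files use `os.environ.setdefault`, so a value already supplied by the operator (for example via `docker run -e`) should take precedence over the wrapper's default. A minimal sketch of that behavior, using a plain dict in place of `os.environ`; the helper name `apply_deployment_defaults` is hypothetical and does not exist in the repository:

```python
def apply_deployment_defaults(env):
    """Apply the wrapper's defaults without overriding values already present."""
    # setdefault writes a value only when the key is missing,
    # which is the same contract os.environ.setdefault follows.
    env.setdefault("PYTHONUNBUFFERED", "1")
    env.setdefault("FAST_MODE", "true")
    return env


# Nothing set by the operator: both defaults apply.
print(apply_deployment_defaults({}))

# FAST_MODE set explicitly: the explicit value is kept.
print(apply_deployment_defaults({"FAST_MODE": "false"}))
```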
services/ai-service/src/app.py ADDED
@@ -0,0 +1,11 @@
+ """Top-level service app shim.
+
+ This module is intentionally a thin wrapper that re-exports the
+ canonical `create_app` and `initialize_agents` functions from the
+ `ai_med_extract` package. Keep the real implementation inside
+ `ai_med_extract` to avoid duplication.
+ """
+ from ai_med_extract.app import create_app, initialize_agents, run_dev  # noqa: F401
+
+ __all__ = ["create_app", "initialize_agents", "run_dev"]
+
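
The Dockerfiles in this commit point uvicorn at the import string `app:app`, which names a module and an attribute inside it. The shim added here re-exports `create_app`, `initialize_agents`, and `run_dev`, and as shown in this diff it defines no module-level `app`. Which `app` module is actually found at runtime depends on the container's working directory and Python path, which this diff does not show. The sketch below illustrates the two ways a `module:attribute` lookup can fail. It uses `importlib` directly and is not uvicorn's actual implementation; `resolve_import_string` is a hypothetical helper, and standard-library modules stand in for the project's own.

```python
import importlib


def resolve_import_string(import_string):
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, _, attr_name = import_string.partition(":")
    if not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {import_string!r}")
    # Failure mode 1: the module cannot be imported (ModuleNotFoundError).
    module = importlib.import_module(module_name)
    # Failure mode 2: the module imports but does not define the attribute.
    try:
        return getattr(module, attr_name)
    except AttributeError:
        raise AttributeError(
            f"module {module_name!r} has no attribute {attr_name!r}"
        ) from None


# 'json:dumps' resolves because the module exists and defines the attribute.
dumps = resolve_import_string("json:dumps")
print(dumps({"ok": True}))
```

Under these assumptions, a module that only exposes factory functions would hit the second failure mode for `app:app`, while a missing module would hit the first.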