Spaces: salvinjose/HNTAI

sachinchandrankallar committed on Nov 6, 2025

Commit af1ef97 (1 parent: 2b64d2e)

Revert "Update Dockerfiles to use `asgi:app` as the entry point, resolving deployment issues caused by the removal of `app.py`. This change ensures compatibility with the new structure and improves initialization for production environments."

Files changed (7)

DEPLOYMENT.md +0 -325
Dockerfile +1 -1
Dockerfile.optimized +1 -1
REFACTORING_IMPROVEMENTS.md +1 -46
app.py +0 -31
asgi.py +0 -31
services/ai-service/src/app.py +11 -0

DEPLOYMENT.md DELETED

@@ -1,325 +0,0 @@
-# Deployment Guide
-## Quick Start
-### Docker Deployment (Recommended)
-```bash
-# Build the container
-docker build -t hntai:latest .
-# Run the container
-docker run -p 7860:7860 hntai:latest
-```
-Access the application at `http://localhost:7860`
-### Local Development
-```bash
-# Install dependencies
-pip install -r requirements.txt
-# Run with uvicorn
-python -m uvicorn asgi:app --reload --host 0.0.0.0 --port 7860
-# Or using start script
-python start_hf_spaces.py
-```
-## Deployment Options
-### 1. Docker (Standard)
-**File**: `Dockerfile`
-**Entry Point**: `asgi.py`
-```bash
-docker build -t hntai:latest .
-docker run -p 7860:7860 \
-  -e REDIS_URL=redis://redis:6379 \
-  -e DATABASE_URL=postgresql://user:pass@db:5432/hntai \
-  hntai:latest
-```
-### 2. Docker (Optimized)
-**File**: `Dockerfile.optimized`
-**Entry Point**: `asgi.py`
-**Features**: Better caching, optimized layers
-```bash
-docker build -f Dockerfile.optimized -t hntai:optimized .
-docker run -p 7860:7860 hntai:optimized
-```
-### 3. Docker Compose
-**File**: `services/ai-service/docker-compose.yml`
-```bash
-cd services/ai-service
-docker-compose up -d
-```
-### 4. Hugging Face Spaces
-**Configuration**: `.huggingface.yaml`
-**Entry Point**: `services/ai-service/src/ai_med_extract/app:app`
-The application automatically detects HF Spaces environment and configures accordingly.
-### 5. Kubernetes
-**Manifests**: `infra/k8s/secure_deployment.yaml`
-```bash
-kubectl apply -f infra/k8s/secure_deployment.yaml
-kubectl get pods -l app=hntai
-```
-## Entry Points
-### Primary Entry Points
-1. **`asgi.py`** (Root) - Docker/Production deployment
-   - Used by Dockerfiles
-   - Lazy loading, optimized for production
-   - Proper path setup for imports
-2. **`start_hf_spaces.py`** (Root) - Hugging Face Spaces
-   - Detects HF Spaces environment
-   - Configures for Spaces constraints
-   - Minimal preloading
-3. **`services/ai-service/src/ai_med_extract/main.py`** - Development
-   - Direct execution
-   - Full configuration
-   - Used by `python -m` invocation
-### Application Module
-- **`services/ai-service/src/ai_med_extract/app.py`** - Core app
-  - `create_app()` - Creates FastAPI instance
-  - `initialize_agents()` - Sets up AI agents
-  - `run_dev()` - Development server
-## Environment Variables
-### Required
-- None (application runs with sensible defaults)
-### Optional
-| Variable | Description | Default |
-|----------|-------------|---------|
-| `REDIS_URL` | Redis connection string | Not set (disabled) |
-| `DATABASE_URL` | PostgreSQL connection string | Not set (disabled) |
-| `HF_SPACES` | Hugging Face Spaces mode | `false` |
-| `FAST_MODE` | Skip model preloading | `false` |
-| `PRELOAD_GGUF` | Preload GGUF models | `false` |
-| `SECRET_KEY` | Application secret key | Auto-generated |
-| `JWT_SECRET_KEY` | JWT signing key | Auto-generated |
-### Cache Directories
-| Variable | Default | Purpose |
-|----------|---------|---------|
-| `HF_HOME` | `/tmp/huggingface` | Hugging Face cache |
-| `TORCH_HOME` | `/tmp/torch` | PyTorch cache |
-| `WHISPER_CACHE` | `/tmp/whisper` | Whisper models |
-| `XDG_CACHE_HOME` | `/tmp` | General cache |
-## Configuration Profiles
-### Development
-```bash
-# Full model preloading, all features
-python -m uvicorn asgi:app --reload
-```
-### Production
-```bash
-# Optimized, lazy loading
-docker run -p 7860:7860 hntai:latest
-```
-### HuggingFace Spaces
-```bash
-# Minimal resources, fast startup
-export HF_SPACES=true
-export FAST_MODE=true
-python start_hf_spaces.py
-```
-## Health Checks
-### Endpoints
-- **Liveness**: `GET /health/live`
-  - Returns 200 if application is running
-- **Readiness**: `GET /health/ready`
-  - Returns 200 if application is ready to serve requests
-- **Metrics**: `GET /api/performance_metrics`
-  - Returns system metrics (memory, CPU, etc.)
-### Docker Health Check
-```dockerfile
-HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
-  CMD curl -f http://localhost:7860/health/live || exit 1
-```
-## Troubleshooting
-### Issue: "Could not import module 'app'"
-**Solution**: Use `asgi.py` instead of `app.py`
-```bash
-# Wrong
-uvicorn app:app
-# Correct
-uvicorn asgi:app
-```
-### Issue: Models taking too long to load
-**Solution**: Enable fast mode
-```bash
-export FAST_MODE=true
-# Models will lazy-load on first use
-```
-### Issue: Out of memory
-**Solution**: Reduce model preloading
-```bash
-export PRELOAD_GGUF=false
-export FAST_MODE=true
-```
-### Issue: Redis/Database connection errors
-**Solution**: Application works without Redis/Database (features disabled gracefully)
-```bash
-# No action needed - optional features
-# Or configure if needed:
-export REDIS_URL=redis://localhost:6379
-export DATABASE_URL=postgresql://user:pass@localhost:5432/db
-```
-## Performance Tuning
-### Memory Optimization
-```bash
-# Set conservative limits
-export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:64
-export OMP_NUM_THREADS=2
-export MKL_NUM_THREADS=2
-```
-### Thread Configuration
-```bash
-# For CPU-bound workloads
-export OMP_NUM_THREADS=4
-export MKL_NUM_THREADS=4
-export NUMEXPR_NUM_THREADS=4
-```
-### GGUF Models
-```bash
-# Configure GGUF behavior
-export GGUF_N_THREADS=4
-export GGUF_N_BATCH=64
-```
-## Monitoring
-### Prometheus Metrics
-Available at `/metrics`
-### Logging
-Structured JSON logs to stdout:
-```json
-{
-  "timestamp": "2024-11-06T18:54:42",
-  "level": "INFO",
-  "message": "Agent initialization complete",
-  "memory_mb": 485.7,
-  "cpu_percent": 33.1
-}
-```
-### API Documentation
-- **Swagger UI**: `http://localhost:7860/docs`
-- **ReDoc**: `http://localhost:7860/redoc`
-## Security
-### Container Security
-- Non-root user (where supported)
-- Read-only root filesystem (where possible)
-- Resource limits configured
-- Security headers enabled
-### Network Security
-- CORS configured (customize in production)
-- Rate limiting available
-- HTTPS recommended for production
-## Scaling
-### Horizontal Scaling
-```bash
-# Run multiple instances behind load balancer
-docker run -p 7861:7860 hntai:latest
-docker run -p 7862:7860 hntai:latest
-docker run -p 7863:7860 hntai:latest
-```
-### Kubernetes Scaling
-```bash
-kubectl scale deployment hntai --replicas=3
-```
-## Maintenance
-### Cache Clearing
-```bash
-# Clear model caches
-rm -rf /tmp/huggingface/* /tmp/torch/* /tmp/whisper/*
-# Or restart container (caches in /tmp)
-docker restart <container-id>
-```
-### Log Rotation
-Logs to stdout - configure external log aggregation
-### Backup
-- Application is stateless
-- Backup Redis/Database if used
-- Model caches are re-downloadable
-## Support
-- **Documentation**: Check `/docs` endpoint
-- **Issues**: GitHub Issues
-- **Logs**: Check container logs `docker logs <container-id>`
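
Reviewer note (illustrative, not part of the commit): the removed guide documents several optional environment variables and their defaults. The sketch below reads a subset of those variables into a settings object using the defaults from the tables above. `DeploySettings`, `load_settings`, and the accepted truthy spellings are assumptions made for this example; the diff does not show how the application itself parses these variables.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: which spellings count as "true" is not specified in the guide.
_TRUTHY = {"1", "true", "yes", "on"}


def _flag(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


@dataclass(frozen=True)
class DeploySettings:
    """Hypothetical container for the variables listed in the removed guide."""
    redis_url: Optional[str]      # None means the Redis feature is disabled
    database_url: Optional[str]   # None means the database feature is disabled
    hf_spaces: bool
    fast_mode: bool
    preload_gguf: bool
    hf_home: str


def load_settings(env: Mapping[str, str] = os.environ) -> DeploySettings:
    """Build settings from an environment mapping, applying documented defaults."""
    return DeploySettings(
        redis_url=env.get("REDIS_URL"),
        database_url=env.get("DATABASE_URL"),
        hf_spaces=_flag(env, "HF_SPACES"),
        fast_mode=_flag(env, "FAST_MODE"),
        preload_gguf=_flag(env, "PRELOAD_GGUF"),
        hf_home=env.get("HF_HOME", "/tmp/huggingface"),
    )
```

Passing the environment as a mapping keeps the function easy to exercise without mutating `os.environ`, for example `load_settings({"FAST_MODE": "true"})`.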

Dockerfile CHANGED

@@ -223,4 +223,4 @@ ENTRYPOINT ["/entrypoint.sh"]
 EXPOSE 7860
 # Use uvicorn for FastAPI (ASGI) without reload for production
-CMD ["uvicorn", "asgi:app", "--host", "0.0.0.0", "--port", "7860"]
+CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]

Dockerfile.optimized CHANGED

@@ -100,4 +100,4 @@ ENTRYPOINT ["/entrypoint.sh"]
 EXPOSE 7860
 # Use uvicorn with no-reload to prevent duplicate route registration
-CMD ["uvicorn", "asgi:app", "--host", "0.0.0.0", "--port", "7860", "--no-reload"]
+CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860", "--no-reload"]

REFACTORING_IMPROVEMENTS.md CHANGED

@@ -216,54 +216,9 @@ No new environment variables required. Existing variables continue to work:
 - `DATABASE_URL` - Database connection
 - `PRELOAD_GGUF` - Preload GGUF models
-## Deployment Fix
-### Issue
-After initial refactoring, Docker deployments failed with:
-```
-ERROR: Error loading ASGI app. Could not import module "app".
-```
-### Root Cause
-- Deleted root-level `app.py` caused Dockerfile CMD failure
-- Dockerfile was configured to use `uvicorn app:app`
-- Naming conflict with `services/ai-service/src/app.py`
-### Solution
-✅ Created `asgi.py` as the deployment entry point
-✅ Updated Dockerfiles to use `uvicorn asgi:app`
-✅ Configured for fast-mode initialization (lazy loading)
-✅ Properly sets up Python path for imports
-### Updated Files
-- **Created**: `asgi.py` - Deployment entry point
-- **Modified**: `Dockerfile` - Updated CMD to use `asgi:app`
-- **Modified**: `Dockerfile.optimized` - Updated CMD to use `asgi:app`
-- **Removed**: `app.py` (root) - Avoided naming conflicts
-### Deployment Commands
-```bash
-# Docker (root Dockerfile)
-docker build -t hntai:latest .
-docker run -p 7860:7860 hntai:latest
-# Uses: uvicorn asgi:app --host 0.0.0.0 --port 7860
-# Docker Optimized
-docker build -f Dockerfile.optimized -t hntai:optimized .
-docker run -p 7860:7860 hntai:optimized
-# Uses: uvicorn asgi:app --host 0.0.0.0 --port 7860 --no-reload
-# HuggingFace Spaces (uses .huggingface.yaml)
-# Automatically uses: services/ai-service/src/ai_med_extract/app:app
-# Local development
-python -m uvicorn asgi:app --reload --host 0.0.0.0 --port 7860
-```
 ## Conclusion
-This refactoring significantly improves the codebase quality while maintaining 100% backward compatibility. All functionality is preserved, and the application runs successfully with the refactored code. The deployment issue has been resolved with the new `asgi.py` entry point. The foundation is now set for continued improvements and easier maintenance going forward.
+This refactoring significantly improves the codebase quality while maintaining 100% backward compatibility. All functionality is preserved, and the application runs successfully with the refactored code. The foundation is now set for continued improvements and easier maintenance going forward.
 **Status**: ✅ Ready for Production

app.py DELETED

@@ -1,31 +0,0 @@
-"""
-Deployment entry point for ASGI servers (uvicorn, gunicorn, etc.).
-This is a thin wrapper that imports the actual FastAPI app from the ai_med_extract package.
-Usage:
-    uvicorn app:app --host 0.0.0.0 --port 7860
-    gunicorn -k uvicorn.workers.UvicornWorker app:app
-"""
-import sys
-import os
-from pathlib import Path
-# Ensure the services/ai-service/src directory is in the Python path
-src_dir = Path(__file__).parent / "services" / "ai-service" / "src"
-if src_dir.exists():
-    sys.path.insert(0, str(src_dir))
-# Set environment for deployment
-os.environ.setdefault('PYTHONUNBUFFERED', '1')
-# Import the FastAPI app from the actual implementation
-# Use absolute import with module name to avoid conflicts
-from ai_med_extract.app import create_app, initialize_agents
-# Create and initialize the app for deployment
-app = create_app(initialize=False)
-initialize_agents(app, preload_small_models=False)
-# Export for ASGI servers
-__all__ = ["app"]

asgi.py DELETED

@@ -1,31 +0,0 @@
-"""
-Deployment entry point for ASGI servers (uvicorn, gunicorn, etc.).
-This is a thin wrapper that imports the actual FastAPI app from the ai_med_extract package.
-Usage:
-    uvicorn asgi:app --host 0.0.0.0 --port 7860
-    gunicorn -k uvicorn.workers.UvicornWorker asgi:app
-"""
-import sys
-import os
-from pathlib import Path
-# Ensure the services/ai-service/src directory is in the Python path
-src_dir = Path(__file__).parent / "services" / "ai-service" / "src"
-if src_dir.exists():
-    sys.path.insert(0, str(src_dir))
-# Set environment for deployment
-os.environ.setdefault('PYTHONUNBUFFERED', '1')
-os.environ.setdefault('FAST_MODE', 'true')  # Don't preload models during import
-# Import the FastAPI app from the actual implementation
-from ai_med_extract.app import create_app, initialize_agents
-# Create and initialize the app for deployment
-app = create_app(initialize=False)
-initialize_agents(app, preload_small_models=False)
-# Export for ASGI servers
-__all__ = ["app"]

services/ai-service/src/app.py ADDED

@@ -0,0 +1,11 @@
+"""Top-level service app shim.
+This module is intentionally a thin wrapper that re-exports the
+canonical `create_app` and `initialize_agents` functions from the
+`ai_med_extract` package. Keep the real implementation inside
+`ai_med_extract` to avoid duplication.
+"""
+from ai_med_extract.app import create_app, initialize_agents, run_dev  # noqa: F401
+__all__ = ["create_app", "initialize_agents", "run_dev"]
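
Reviewer note (illustrative, not part of the commit): targets such as `app:app` in the Dockerfile `CMD` name a module and an attribute inside it. The shim added here re-exports `create_app`, `initialize_agents`, and `run_dev`, and its `__all__` does not list an `app` object, so whether `app:app` resolves in the image depends on which `app` module is importable there, which this diff does not show. A minimal sketch of how a `module:attribute` string could be checked before deploying, assuming a hypothetical helper named `resolve_target` (this is not uvicorn's own loader):

```python
import importlib


def resolve_target(target: str):
    """Resolve a "module:attribute" string to the object it names.

    Raises ValueError for a malformed target, ImportError if the module
    cannot be imported, and AttributeError if the attribute is missing.
    """
    module_name, sep, attr_name = target.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {target!r}")

    # Import the module by name using the current sys.path.
    module = importlib.import_module(module_name)

    try:
        return getattr(module, attr_name)
    except AttributeError:
        raise AttributeError(
            f"module {module_name!r} has no attribute {attr_name!r}"
        ) from None
```

Running something like `resolve_target("app:app")` inside the built image, with the same working directory and `PYTHONPATH` as the container's `CMD`, would surface a missing module or attribute before the server starts.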