AI Service (ai_med_extract)
Medical AI service for data extraction, PHI scrubbing, and patient summary generation.
Quick Start
Prerequisites
- Python 3.10+
- Docker & Docker Compose (for containerized deployment)
- Optional: CUDA 11.8+ for GPU support
Quick Development Server
# From services/ai-service directory
cd src
python -m ai_med_extract.app run_dev
This runs Flask's built-in development server on port 7860.
Smoke Test (No Model Loading)
# From services/ai-service directory
python run_smoke_test.py
Local Development
Option 1: Development Server (Fast Iteration)
cd .\services\ai-service\src
python -m ai_med_extract.app run_dev
Option 2: WSGI/Gunicorn (Production-like)
cd .\services\ai-service\src
pip install gunicorn
$env:PRELOAD_SMALL_MODELS="false"
gunicorn -w 4 -b 0.0.0.0:7860 wsgi:app
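The command-line flags above can also live in a Gunicorn config file, which is easier to keep under version control. The following is a minimal sketch under stated assumptions: the file name and location, the `GUNICORN_TIMEOUT` variable, and the 120-second value are illustrative and not taken from this project. Only `bind`, `workers`, and `timeout` are standard Gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file, placed next to wsgi.py)
import os

# Same values as the command-line flags: -b 0.0.0.0:7860 -w 4
bind = "0.0.0.0:7860"
workers = 4

# Assumption: allow more time than Gunicorn's 30-second default worker
# timeout, overridable through a hypothetical GUNICORN_TIMEOUT variable.
timeout = int(os.environ.get("GUNICORN_TIMEOUT", "120"))
```

With such a file in place, the server could be started with `gunicorn -c gunicorn.conf.py wsgi:app`.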
Using PowerShell Script
cd .\services\ai-service
.\run_local.ps1 # Run without rebuilding
.\run_local.ps1 -Build # Build and run
Docker Deployment
Build Image
cd .\services\ai-service
docker build -f Dockerfile.prod -t ai-service:local .
Run Container
docker run --rm -p 7860:7860 \
-e PRELOAD_SMALL_MODELS=false \
-e HF_HOME=/tmp/huggingface \
-e TORCH_HOME=/tmp/torch_cache \
ai-service:local
Docker Compose
cd .\services\ai-service
docker-compose up --build # Build and run
docker-compose logs -f # Follow logs
Push to Registry
docker tag ai-service:local your-registry/ai-service:latest
docker push your-registry/ai-service:latest
Environment Variables
| Variable | Description | Default |
|---|---|---|
| `HF_SPACES` | Signals HF Spaces environment | false |
| `PRELOAD_GGUF` | Enable GGUF model preloading | false |
| `PRELOAD_SMALL_MODELS` | Load small models at startup | false |
| `HF_HOME` | Hugging Face cache directory | /tmp/huggingface |
| `TORCH_HOME` | PyTorch cache directory | /tmp/torch |
| `WHISPER_CACHE` | Whisper model cache | /tmp/whisper |
| `DATABASE_URL` | PostgreSQL connection string | Required for production |
| `REDIS_URL` | Redis connection string | Required for production |
| `SECRET_KEY` | Application secret key | Required |
| `JWT_SECRET_KEY` | JWT signing key | Required |
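The variables in this table could be loaded into a typed settings object along the following lines. This is a minimal sketch, not the contents of the project's `config_settings.py`: the `Settings` dataclass, `load_settings`, `_as_bool`, and the `production` flag are hypothetical names. Enforcing `DATABASE_URL` and `REDIS_URL` only when `production=True` is an assumption based on the "Required for production" wording.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    hf_spaces: bool
    preload_gguf: bool
    preload_small_models: bool
    hf_home: str
    torch_home: str
    whisper_cache: str
    database_url: Optional[str]
    redis_url: Optional[str]
    secret_key: str
    jwt_secret_key: str


def load_settings(env: Mapping[str, str] = os.environ,
                  production: bool = False) -> Settings:
    """Build Settings from env, applying the defaults listed in the table."""
    # The two keys are always required; the connection strings only in production.
    required = ["SECRET_KEY", "JWT_SECRET_KEY"]
    if production:
        required += ["DATABASE_URL", "REDIS_URL"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )

    return Settings(
        hf_spaces=_as_bool(env.get("HF_SPACES", "false")),
        preload_gguf=_as_bool(env.get("PRELOAD_GGUF", "false")),
        preload_small_models=_as_bool(env.get("PRELOAD_SMALL_MODELS", "false")),
        hf_home=env.get("HF_HOME", "/tmp/huggingface"),
        torch_home=env.get("TORCH_HOME", "/tmp/torch"),
        whisper_cache=env.get("WHISPER_CACHE", "/tmp/whisper"),
        database_url=env.get("DATABASE_URL"),
        redis_url=env.get("REDIS_URL"),
        secret_key=env["SECRET_KEY"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
    )
```

Failing at startup when a required variable is missing surfaces misconfiguration before the first request arrives.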
API Endpoints
Health & Monitoring
- `GET /health/live` - Liveness probe
- `GET /health/ready` - Readiness probe
- `GET /metrics` - Prometheus metrics
Document Processing
- `POST /upload` - Upload and process documents
- `POST /transcribe` - Transcribe audio files
- `GET /get_updated_medical_data` - Retrieve processed data
- `PUT /update_medical_data` - Update medical data
AI Processing
- `POST /generate_patient_summary` - Generate comprehensive patient summaries
- `POST /api/generate_summary` - Generate text summaries
- `POST /api/patient_summary_openvino` - OpenVINO-optimized summaries
- `POST /extract_medical_data` - Extract structured medical data
Model Management
- `POST /api/load_model` - Load specific AI models
- `GET /api/model_info` - Get model information
- `POST /api/switch_model` - Switch between models
Verify Endpoints
curl http://localhost:7860/health/live
curl http://localhost:7860/health/ready
curl http://localhost:7860/metrics
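The same checks can be scripted, for example as a post-deploy gate. The following is a minimal sketch using only the standard library. `http_status`, `check_endpoints`, and the injectable `fetch` parameter are hypothetical names, and treating HTTP 200 as the only healthy response is an assumption about how these endpoints report status.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable

PATHS = ("/health/live", "/health/ready", "/metrics")


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 503.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def check_endpoints(base_url: str,
                    paths: Iterable[str] = PATHS,
                    fetch: Callable[[str], int] = http_status) -> Dict[str, bool]:
    """Map each path to True when it answers with HTTP 200."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) == 200 for path in paths}
```

Calling `check_endpoints("http://localhost:7860")` against a running service returns a dictionary keyed by path. The `fetch` parameter exists so the logic can be tested without a live server.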
Testing
Smoke Test (No Models)
python run_smoke_test.py
Unit Tests
python -m pytest tests/
Integration Tests
python -m pytest tests/integration/
Project Structure
services/ai-service/
├── src/
│   ├── ai_med_extract/
│   │   ├── agents/            # AI agents and processors
│   │   ├── api/               # FastAPI routes
│   │   ├── services/          # Business logic services
│   │   ├── utils/             # Utilities and helpers
│   │   ├── app.py             # Flask application
│   │   └── main.py            # FastAPI application
│   ├── app.py                 # Application entry point
│   ├── config_settings.py     # Configuration
│   └── wsgi.py                # WSGI entry point
├── k8s/
│   └── deployment.yaml        # Kubernetes manifests
├── docker-compose.yml         # Local Docker Compose
├── Dockerfile.prod            # Production Docker image
├── run_local.ps1              # PowerShell run script
└── README.md                  # This file
βββ README.md # This file
Kubernetes Deployment
Apply the Kubernetes manifests:
kubectl apply -f k8s/deployment.yaml
kubectl get pods -l app=ai-service
kubectl logs -f <pod-name>
Notes
- Model Caching: The Docker Compose file mounts `./model_cache` to persist models between runs
- GPU Support: Adjust `Dockerfile.prod` for CUDA/GPU support
- Secrets: Never bake secrets into images; use environment variables or mounted secrets
- Production: Set `PRELOAD_SMALL_MODELS=true` only if you need models at container start
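One way to follow the secrets note in application code is to look for a mounted file first and fall back to the environment. The following is a minimal sketch: the `read_secret` helper is hypothetical, and the `/run/secrets` default is the Docker secrets convention, used here for illustration rather than as something this service is known to rely on.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_secret(name: str,
                secrets_dir: str = "/run/secrets",
                env: Optional[Mapping[str, str]] = None) -> str:
    """Return a secret from <secrets_dir>/<name>, else from the environment."""
    env = os.environ if env is None else env
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        # Mounted secret files often end with a newline; strip it.
        return secret_file.read_text(encoding="utf-8").strip()
    value = env.get(name)
    if not value:
        raise RuntimeError(
            f"Secret {name!r} not found in {secrets_dir} or the environment"
        )
    return value
```

For example, `read_secret("SECRET_KEY")` would prefer a mounted file over an environment variable of the same name, so neither value needs to be written into the image.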
Additional Documentation
- Production Deployment: See `PRODUCTION_READY_SUMMARY.md` in `src/ai_med_extract/`
- Integration Guide: See `INTEGRATION_GUIDE.md` in `src/ai_med_extract/utils/`
- Main Project README: See `../../README.md` for overall project documentation
For detailed guides and API documentation, see the main project README and the /docs endpoint when the service is running.