AI Service (ai_med_extract)

Medical AI service for data extraction, PHI scrubbing, and patient summary generation.

Quick Start

Prerequisites

Python 3.10+
Docker & Docker Compose (for containerized deployment)
Optional: CUDA 11.8+ for GPU support

Quick Development Server

# From services/ai-service directory
cd src
python -m ai_med_extract.app run_dev

This runs Flask's built-in development server on port 7860.

Smoke Test (No Model Loading)

# From services/ai-service directory
python run_smoke_test.py

Local Development

Option 1: Development Server (Fast Iteration)

cd .\services\ai-service\src
python -m ai_med_extract.app run_dev

Option 2: WSGI/Gunicorn (Production-like)

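# Gunicorn runs only on Unix-like systems; on Windows, use WSL or Option 1.
cd .\services\ai-service\src
pip install gunicorn
$env:PRELOAD_SMALL_MODELS="false"   # bash/zsh: export PRELOAD_SMALL_MODELS=false
gunicorn -w 4 -b 0.0.0.0:7860 wsgi:app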
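
The command-line flags above can also be kept in a Gunicorn config file, which is plain Python. The following is a minimal sketch, not the project's actual configuration: the file name `gunicorn.conf.py` and the timeout value are assumptions chosen for illustration, while `bind`, `workers`, `timeout`, and `preload_app` are standard Gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file name, placed next to wsgi.py)

# Same bind address and worker count as `gunicorn -w 4 -b 0.0.0.0:7860`.
bind = "0.0.0.0:7860"
workers = 4

# Seconds a worker may stay silent before Gunicorn kills and restarts it.
# Gunicorn's default is 30; 300 is an illustrative value for slow model
# inference, not a setting taken from this project.
timeout = 300

# Leave the application import to each worker instead of the master process.
preload_app = False
```

With such a file in place, the server could be started with `gunicorn -c gunicorn.conf.py wsgi:app`.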

Using PowerShell Script

cd .\services\ai-service
.\run_local.ps1        # Run without rebuilding
.\run_local.ps1 -Build # Build and run

Docker Deployment

Build Image

cd .\services\ai-service
docker build -f Dockerfile.prod -t ai-service:local .

Run Container

# bash/zsh syntax; in PowerShell, replace each trailing \ with a backtick (`)
docker run --rm -p 7860:7860 \
  -e PRELOAD_SMALL_MODELS=false \
  -e HF_HOME=/tmp/huggingface \
  -e TORCH_HOME=/tmp/torch_cache \
  ai-service:local

Docker Compose

cd .\services\ai-service
docker-compose up --build     # Build and run
docker-compose logs -f        # Follow logs

Push to Registry

docker tag ai-service:local your-registry/ai-service:latest
docker push your-registry/ai-service:latest

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `HF_SPACES` | Signals HF Spaces environment | `false` |
| `PRELOAD_GGUF` | Enable GGUF model preloading | `false` |
| `PRELOAD_SMALL_MODELS` | Load small models at startup | `false` |
| `HF_HOME` | Hugging Face cache directory | `/tmp/huggingface` |
| `TORCH_HOME` | PyTorch cache directory | `/tmp/torch` |
| `WHISPER_CACHE` | Whisper model cache | `/tmp/whisper` |
| `DATABASE_URL` | PostgreSQL connection string | Required for production |
| `REDIS_URL` | Redis connection string | Required for production |
| `SECRET_KEY` | Application secret key | Required |
| `JWT_SECRET_KEY` | JWT signing key | Required |
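
To make the defaults and the "Required" column concrete, here is a minimal sketch of how a startup routine might read these variables. It is not the service's actual configuration code: `Settings`, `load_settings`, the `production` switch, and the rule for parsing boolean flags are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Grouping taken from the "Default" column of the table above.
REQUIRED = ("SECRET_KEY", "JWT_SECRET_KEY")
REQUIRED_IN_PRODUCTION = ("DATABASE_URL", "REDIS_URL")


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Parse a boolean flag; the accepted spellings are an assumption."""
    return env.get(name, "false").strip().lower() in ("1", "true", "yes")


@dataclass(frozen=True)
class Settings:
    hf_spaces: bool
    preload_gguf: bool
    preload_small_models: bool
    hf_home: str
    torch_home: str
    whisper_cache: str
    database_url: Optional[str]
    redis_url: Optional[str]
    secret_key: str
    jwt_secret_key: str


def load_settings(env: Mapping[str, str] = os.environ,
                  production: bool = False) -> Settings:
    """Read the documented variables, failing fast on missing required ones."""
    required = REQUIRED + (REQUIRED_IN_PRODUCTION if production else ())
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing))
    return Settings(
        hf_spaces=_flag(env, "HF_SPACES"),
        preload_gguf=_flag(env, "PRELOAD_GGUF"),
        preload_small_models=_flag(env, "PRELOAD_SMALL_MODELS"),
        hf_home=env.get("HF_HOME", "/tmp/huggingface"),
        torch_home=env.get("TORCH_HOME", "/tmp/torch"),
        whisper_cache=env.get("WHISPER_CACHE", "/tmp/whisper"),
        database_url=env.get("DATABASE_URL"),
        redis_url=env.get("REDIS_URL"),
        secret_key=env["SECRET_KEY"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
    )
```

Failing at startup with the full list of missing names tends to be easier to debug than a later error on first use of the database or token signing.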

API Endpoints

Health & Monitoring

GET /health/live - Liveness probe
GET /health/ready - Readiness probe
GET /metrics - Prometheus metrics

Document Processing

POST /upload - Upload and process documents
POST /transcribe - Transcribe audio files
GET /get_updated_medical_data - Retrieve processed data
PUT /update_medical_data - Update medical data

AI Processing

POST /generate_patient_summary - Generate comprehensive patient summaries
POST /api/generate_summary - Generate text summaries
POST /api/patient_summary_openvino - OpenVINO-optimized summaries
POST /extract_medical_data - Extract structured medical data

Model Management

POST /api/load_model - Load specific AI models
GET /api/model_info - Get model information
POST /api/switch_model - Switch between models

Verify Endpoints

curl http://localhost:7860/health/live
curl http://localhost:7860/health/ready
curl http://localhost:7860/metrics
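
The same checks can be scripted, for example in a deployment pipeline. The sketch below uses only the Python standard library; `check_endpoints` and its parameters are hypothetical names, and recording an unreachable service as status `0` is a convention chosen here, not something the service defines.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable

# The three endpoints from the curl commands above.
PATHS = ("/health/live", "/health/ready", "/metrics")


def check_endpoints(base_url: str = "http://localhost:7860",
                    paths: Iterable[str] = PATHS,
                    opener: Callable = urllib.request.urlopen,
                    timeout: float = 5.0) -> Dict[str, int]:
    """Return the HTTP status for each path, or 0 if the request failed."""
    results: Dict[str, int] = {}
    for path in paths:
        try:
            with opener(base_url + path, timeout=timeout) as response:
                results[path] = response.status
        except urllib.error.HTTPError as exc:
            # The server answered with an error status (e.g. not ready yet).
            results[path] = exc.code
        except (urllib.error.URLError, OSError):
            # Connection refused, DNS failure, timeout, and similar.
            results[path] = 0
    return results
```

Calling `check_endpoints()` against a running instance returns a dictionary keyed by path; a script could exit non-zero unless every value is `200`.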

Testing

Smoke Test (No Models)

python run_smoke_test.py

Unit Tests

python -m pytest tests/

Integration Tests

python -m pytest tests/integration/

Project Structure

services/ai-service/
├── src/
│   ├── ai_med_extract/
│   │   ├── agents/          # AI agents and processors
│   │   ├── api/             # FastAPI routes
│   │   ├── services/        # Business logic services
│   │   ├── utils/           # Utilities and helpers
│   │   ├── app.py           # Flask application
│   │   └── main.py          # FastAPI application
│   ├── app.py              # Application entry point
│   ├── config_settings.py  # Configuration
│   └── wsgi.py             # WSGI entry point
├── k8s/
│   └── deployment.yaml     # Kubernetes manifests
├── docker-compose.yml      # Local Docker Compose
├── Dockerfile.prod         # Production Docker image
├── run_local.ps1           # PowerShell run script
└── README.md               # This file

Kubernetes Deployment

Apply the Kubernetes manifests:

kubectl apply -f k8s/deployment.yaml
kubectl get pods -l app=ai-service
kubectl logs -f <pod-name>

Notes

Model Caching: The Docker Compose file mounts ./model_cache to persist models between runs
GPU Support: Adjust Dockerfile.prod for CUDA/GPU support
Secrets: Never bake secrets into images; use environment variables or mounted secrets
Production: Set PRELOAD_SMALL_MODELS=true only if you need models at container start
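
The trade-off behind `PRELOAD_SMALL_MODELS` can be sketched as eager versus lazy loading. The code below is an illustration under stated assumptions, not the service's model manager: `ModelRegistry`, `build_registry`, and the loader callables are hypothetical, and the real service may parse the flag differently.

```python
import os
import threading
from typing import Callable, Dict, List, Mapping


class ModelRegistry:
    """Loads each model at most once, on first request (hypothetical helper)."""

    def __init__(self, loaders: Dict[str, Callable[[], object]]):
        self._loaders = loaders
        self._models: Dict[str, object] = {}
        self._lock = threading.Lock()

    def get(self, name: str) -> object:
        # The lock keeps two concurrent requests from loading the same model.
        with self._lock:
            if name not in self._models:
                self._models[name] = self._loaders[name]()
            return self._models[name]

    def preload_all(self) -> None:
        for name in self._loaders:
            self.get(name)

    @property
    def loaded(self) -> List[str]:
        with self._lock:
            return sorted(self._models)


def build_registry(loaders: Dict[str, Callable[[], object]],
                   env: Mapping[str, str] = os.environ) -> ModelRegistry:
    """Preload at startup only when PRELOAD_SMALL_MODELS is set to true."""
    registry = ModelRegistry(loaders)
    if env.get("PRELOAD_SMALL_MODELS", "false").strip().lower() == "true":
        registry.preload_all()
    return registry
```

In this sketch, leaving the flag at `false` keeps startup fast and moves the loading cost to the first request that needs a model, while `true` pays that cost once before the service accepts traffic.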

Additional Documentation

Production Deployment: See PRODUCTION_READY_SUMMARY.md in src/ai_med_extract/
Integration Guide: See INTEGRATION_GUIDE.md in src/ai_med_extract/utils/
Main Project README: See ../../README.md for overall project documentation

For detailed guides and API documentation, see the main project README and the /docs endpoint when the service is running.