HNTAI - Medical Data Extraction & AI Processing Platform

HNTAI is a scalable AI platform for medical data extraction, processing, and analysis. It is built with FastAPI and supports multiple model backends (Transformers, OpenVINO, and GGUF) with automatic GPU/CPU device selection.

🏥 Overview

HNTAI is a production-ready medical AI platform that provides:

  • Medical Document Processing: Text extraction from PDF, DOCX, and image files, plus audio transcription
  • Protected Health Information (PHI) Scrubbing: HIPAA-compliant data anonymization
  • AI-Powered Summarization: Multi-model support with automatic device optimization
  • Patient Summary Generation: Comprehensive clinical assessments
  • Simplified Architecture: Clean, maintainable codebase with essential features

🚀 Key Features

🤖 Multi-Model AI Support

  • Transformers Models: Hugging Face models with automatic GPU/CPU detection
  • OpenVINO Optimization: Intel-optimized models for production performance
  • GGUF Models: Quantized models for efficient inference
  • Automatic Device Selection: GPU when available, CPU fallback
  • Model Caching: Intelligent model management and caching

📄 Document Processing

  • Multi-format Support: PDF, DOCX, images, audio files
  • OCR Integration: Tesseract-based text extraction
  • Audio Transcription: Whisper-based speech-to-text
  • Batch Processing: Async processing for scalability
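
As a rough illustration of the batch-processing bullet, the sketch below runs several document jobs concurrently with asyncio while capping how many run at once. `process_document` and `process_batch` are hypothetical stand-ins, not functions from this repository.

```python
import asyncio


async def process_document(name: str) -> str:
    """Hypothetical stand-in for one extraction job (not part of HNTAI)."""
    await asyncio.sleep(0)  # yield control, as real I/O-bound work would
    return f"processed:{name}"


async def process_batch(names: list[str], max_concurrent: int = 4) -> list[str]:
    """Process all documents concurrently, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(name: str) -> str:
        async with semaphore:
            return await process_document(name)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(n) for n in names))


if __name__ == "__main__":
    print(asyncio.run(process_batch(["a.pdf", "b.docx", "c.png"])))
```

The semaphore keeps memory use bounded when a batch is large; the limit of 4 is an arbitrary choice for the example.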

🔒 Security & Compliance

  • HIPAA Compliance: PHI scrubbing with audit logging
  • Data Encryption: Secure data handling and storage
  • Audit Trails: Comprehensive logging for compliance
  • Non-root Containers: Security-hardened deployments

📊 Monitoring & Observability

  • Health Endpoints: /health/live, /health/ready
  • Basic Metrics: Simple performance tracking
  • Structured Logging: Application logging
  • Audit Logging: HIPAA-compliant audit trails

🏗️ Architecture

┌──────────────────────────────────────────────────────────┐
│                   FastAPI Application                    │
│                        (main.py)                         │
└──────────────────────────────────────────────────────────┘
                             │
        ┌────────────────────┼────────────────────┐
        │                    │                    │
        ▼                    ▼                    ▼
┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│   Routes       │   │   Agents       │   │   Utils        │
│                │   │                │   │                │
│ - /upload      │   │ - Text         │   │ - Model        │
│ - /transcribe  │   │   Extractor    │   │   Manager      │
│ - /generate    │   │ - PHI          │   │ - JSON         │
│   _summary     │   │   Scrubber     │   │   Parser       │
│                │   │ - Patient      │   │ - Config       │
│                │   │   Summary      │   │                │
│                │   │ - Whisper      │   │                │
└────────────────┘   └────────────────┘   └────────────────┘
        │                    │                    │
        └────────────────────┼────────────────────┘
                             │
        ┌────────────────────┼────────────────────┐
        │                    │                    │
        ▼                    ▼                    ▼
┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│   Models       │   │   Database     │   │   Health       │
│                │   │   (Optional)   │   │                │
│ - Transformers │   │ - Audit Logs   │   │ - /health      │
│ - GGUF         │   │   (HIPAA)      │   │ - /metrics     │
│ - OpenVINO     │   │                │   │                │
│ - Whisper      │   │                │   │                │
└────────────────┘   └────────────────┘   └────────────────┘

🛠️ Installation

Prerequisites

  • Python 3.11+
  • CUDA 11.8+ (for GPU support)
  • Docker (for containerized deployment)
  • PostgreSQL 13+ (optional - for audit logs)

Local Development

  1. Clone the repository:
git clone <repository-url>
cd HNTAI
  2. Create virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Set up environment variables:
export DATABASE_URL="postgresql://user:password@localhost:5432/hntai"  # Optional - for audit logs
export SECRET_KEY="your-secret-key"
export JWT_SECRET_KEY="your-jwt-secret"
export HF_HOME="/tmp/huggingface"
  5. Run the application:
# Development server
python -m uvicorn services.ai-service.src.ai_med_extract.main:app --reload --host 0.0.0.0 --port 7860

# Or using the service directly
cd services/ai-service
python src/ai_med_extract/main.py

Docker Deployment

  1. Build the image:
docker build -t hntai:latest .
  2. Run with Docker Compose:
docker-compose up -d

Kubernetes Deployment

  1. Apply Kubernetes manifests:
kubectl apply -f infra/k8s/secure_deployment.yaml
  2. Check deployment status:
kubectl get pods -l app=hntai

📚 API Documentation

Core Endpoints

Health & Monitoring

  • GET /health/live - Liveness probe
  • GET /health/ready - Readiness probe
  • GET /metrics - Prometheus metrics
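
A client or deployment script could poll the readiness probe before sending traffic. The sketch below assumes the service listens on port 7860 (as in the run command above) and that `/health/ready` answers with HTTP 200 once the app is ready; the README does not state the status codes, so treat that as an assumption. `http_status` and `wait_until_ready` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, ...


def wait_until_ready(
    base_url: str,
    attempts: int = 10,
    delay: float = 1.0,
    fetch: Callable[[str], int] = http_status,
) -> bool:
    """Poll /health/ready until it returns 200 or the attempts run out."""
    for _ in range(attempts):
        if fetch(f"{base_url}/health/ready") == 200:  # 200 == ready is assumed
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_ready("http://localhost:7860")` blocks for at most `attempts * delay` seconds plus request time; the `fetch` parameter exists only so the polling logic can be exercised without a running server.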

Document Processing

  • POST /upload - Upload and process documents
  • POST /transcribe - Transcribe audio files
  • GET /get_updated_medical_data - Retrieve processed data
  • PUT /update_medical_data - Update medical data

AI Processing

  • POST /generate_patient_summary - Generate comprehensive patient summaries
  • POST /api/generate_summary - Generate text summaries
  • POST /api/patient_summary_openvino - OpenVINO-optimized summaries
  • POST /extract_medical_data - Extract structured medical data

Model Management

  • POST /api/load_model - Load specific AI models
  • GET /api/model_info - Get model information
  • POST /api/switch_model - Switch between models

🤖 AI Model Configuration

Supported Model Types

1. Transformers Models

{
    "model_name": "microsoft/Phi-3-mini-4k-instruct",
    "model_type": "text-generation"
}

2. OpenVINO Models

{
    "model_name": "OpenVINO/Phi-3-mini-4k-instruct-fp16-ov",
    "model_type": "openvino"
}

3. GGUF Models

{
    "model_name": "microsoft/Phi-3-mini-4k-instruct-gguf",
    "model_type": "gguf"
}
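
One plausible way to use these payloads is as the JSON body of `POST /api/load_model`. The README lists the endpoint and the payloads separately and does not confirm that they pair up this way, so the sketch below is an assumption; `build_load_model_request` is a hypothetical helper and the base URL is the local development address.

```python
import json
import urllib.request


def build_load_model_request(
    base_url: str, model_name: str, model_type: str
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for /api/load_model."""
    body = json.dumps({"model_name": model_name, "model_type": model_type})
    return urllib.request.Request(
        url=f"{base_url}/api/load_model",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_load_model_request(
    "http://localhost:7860", "microsoft/Phi-3-mini-4k-instruct", "text-generation"
)
# To actually send it against a running server:
# urllib.request.urlopen(request)
```

The interactive documentation at the `/docs` endpoint is the place to confirm the real request schema.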

Automatic Device Detection

The system automatically detects and uses:

  • GPU: When CUDA is available
  • CPU: Fallback when GPU is not available
  • OpenVINO: Intel-optimized inference when an OpenVINO model is configured
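
The GPU/CPU choice can be sketched with PyTorch's CUDA check. This is a minimal illustration, not the project's actual selection logic; `select_device` is a hypothetical name, and the import is guarded so the snippet also runs where PyTorch is not installed.

```python
def select_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # optional here; may be absent on minimal installs
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(select_device())
```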

🔧 Configuration

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| DATABASE_URL | PostgreSQL connection string (optional - for audit logs) | Not required |
| SECRET_KEY | Application secret key | Required |
| JWT_SECRET_KEY | JWT signing key | Required |
| HF_HOME | Hugging Face cache directory | /tmp/huggingface |
| TORCH_HOME | PyTorch cache directory | /tmp/torch |
| WHISPER_CACHE | Whisper model cache | /tmp/whisper |
| HF_SPACES | Hugging Face Spaces mode | false |
| PRELOAD_GGUF | Preload GGUF models | false |
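
For illustration, the variables above could be read at startup roughly as follows. This is a sketch under assumptions: `load_settings` is a hypothetical helper, and treating only the string "true" (case-insensitive) as an enabled flag is a guess at how the boolean variables are parsed.

```python
import os
from collections.abc import Mapping

REQUIRED = ("SECRET_KEY", "JWT_SECRET_KEY")
DEFAULTS = {
    "HF_HOME": "/tmp/huggingface",
    "TORCH_HOME": "/tmp/torch",
    "WHISPER_CACHE": "/tmp/whisper",
    "HF_SPACES": "false",
    "PRELOAD_GGUF": "false",
}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read settings from env, applying the documented defaults."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required variables: {', '.join(missing)}")
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    settings.update({name: env[name] for name in REQUIRED})
    settings["DATABASE_URL"] = env.get("DATABASE_URL")  # optional, may be None
    # Assumption: only the string "true" (any case) enables a flag
    for flag in ("HF_SPACES", "PRELOAD_GGUF"):
        settings[flag] = settings[flag].strip().lower() == "true"
    return settings
```

Failing fast on missing required keys surfaces configuration mistakes at startup rather than on the first authenticated request.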

Model Configuration

The system supports flexible model configuration through model_config.py:

# Default models for different tasks
DEFAULT_MODELS = {
    "text-generation": {
        "primary": "microsoft/Phi-3-mini-4k-instruct",
        "fallback": "facebook/bart-base"
    },
    "openvino": {
        "primary": "OpenVINO/Phi-3-mini-4k-instruct-fp16-ov",
        "fallback": "microsoft/Phi-3-mini-4k-instruct"
    },
    "gguf": {
        "primary": "microsoft/Phi-3-mini-4k-instruct-gguf",
        "fallback": "microsoft/Phi-3-mini-4k-instruct-gguf"
    }
}
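
The primary/fallback pairs suggest a try-then-fall-back loading pattern. The sketch below shows one way that could work; it is not taken from `model_config.py`, and both `load_with_fallback` and the `loader` callable are hypothetical.

```python
from typing import Callable

# Copied from the configuration above so the sketch is self-contained
DEFAULT_MODELS = {
    "text-generation": {
        "primary": "microsoft/Phi-3-mini-4k-instruct",
        "fallback": "facebook/bart-base",
    },
}


def load_with_fallback(
    task: str, loader: Callable[[str], object]
) -> tuple[str, object]:
    """Try the primary model for a task, then the fallback if loading fails."""
    entry = DEFAULT_MODELS[task]
    try:
        return entry["primary"], loader(entry["primary"])
    except Exception as exc:  # broad on purpose: loaders fail in many ways
        print(f"Primary model failed ({exc}); trying fallback")
        return entry["fallback"], loader(entry["fallback"])
```

If the fallback also fails, its exception propagates to the caller, which is usually what a startup routine wants.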

🧪 Testing

Run Tests

# Unit tests
python -m pytest tests/

# Smoke test (no model loading)
cd services/ai-service
python run_smoke_test.py

# Integration tests
python -m pytest tests/integration/

Code Quality

# Format code
black .
isort .

# Lint code
flake8 .
mypy .

# Type checking
mypy services/ai-service/src/ai_med_extract/

📊 Monitoring

Health Checks

  • Liveness: GET /health/live - Application is running
  • Readiness: GET /health/ready - Application is ready to serve requests

Metrics

  • Prometheus: GET /metrics - Application and model metrics
  • Custom Metrics: Model inference time, success rates, error rates
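
To make the custom metrics concrete, here is a small in-memory sketch that records inference time and error counts per model. It does not use Prometheus and is not the project's metrics code; `InferenceMetrics` is a hypothetical class for illustration only.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class InferenceMetrics:
    """In-memory counters for inference calls (illustrative, not Prometheus)."""

    def __init__(self) -> None:
        self.durations: dict[str, list[float]] = defaultdict(list)
        self.errors: dict[str, int] = defaultdict(int)

    @contextmanager
    def track(self, model_name: str):
        """Time the wrapped block and count it as an error if it raises."""
        start = time.perf_counter()
        try:
            yield
        except Exception:
            self.errors[model_name] += 1
            raise
        finally:
            self.durations[model_name].append(time.perf_counter() - start)

    def success_rate(self, model_name: str) -> float:
        """Fraction of tracked calls that finished without raising."""
        total = len(self.durations[model_name])
        return 1.0 if total == 0 else 1 - self.errors[model_name] / total
```

Usage is `with metrics.track("model-name"): run_inference()`, where `run_inference` stands for whatever call is being measured.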

Logging

  • Structured Logging: JSON-formatted logs
  • Audit Trails: PHI access and modification logs
  • Performance Logs: Model loading and inference timing
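
JSON-formatted logs can be produced with the standard `logging` module alone. The formatter below is a minimal sketch; the field names and the logger name `hntai.example` are assumptions, not the project's actual log schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("hntai.example")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("model loaded in %.2f s", 1.5)
```

One JSON object per line keeps the output easy to ingest with log aggregation tools.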

🔒 Security Features

HIPAA Compliance

  • PHI Scrubbing: Automatic removal of protected health information
  • Audit Logging: Comprehensive access and modification logs
  • Data Encryption: Secure data handling and storage
  • Access Controls: Role-based access to sensitive data
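
To give a feel for what PHI scrubbing involves, the sketch below masks a few easily recognized identifier formats with regular expressions and reports how many of each it replaced. It is deliberately simplistic and is not the logic in `phi_scrubber.py`; the patterns and the `scrub_phi` helper are illustrative assumptions, and regexes alone are not sufficient for HIPAA de-identification.

```python
import re

# Illustrative patterns only; real PHI detection needs far broader coverage
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def scrub_phi(text: str) -> tuple[str, dict[str, int]]:
    """Replace matches with [LABEL] tags and count replacements per label."""
    counts: dict[str, int] = {}
    for label, pattern in PHI_PATTERNS.items():
        text, counts[label] = pattern.subn(f"[{label}]", text)
    return text, counts


scrubbed, counts = scrub_phi("SSN 123-45-6789, call 555-123-4567")
print(scrubbed)
print(counts)
```

The per-label counts could feed an audit log entry without recording the removed values themselves.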

Container Security

  • Non-root Containers: Security-hardened container images
  • Resource Limits: CPU and memory limits
  • Network Policies: Secure network communication
  • Secrets Management: Secure handling of sensitive configuration

🚀 Deployment Options

1. Local Development

python -m uvicorn services.ai-service.src.ai_med_extract.main:app --reload

2. Docker

docker run -p 7860:7860 hntai:latest

3. Kubernetes

kubectl apply -f infra/k8s/secure_deployment.yaml

4. Hugging Face Spaces

# Configure for HF Spaces
export HF_SPACES=true
# The app.py file automatically detects HF Spaces environment

📁 Project Structure

HNTAI/
├── services/
│   └── ai-service/
│       └── src/
│           └── ai_med_extract/
│               ├── agents/              # Core agents (simplified)
│               │   ├── text_extractor.py
│               │   ├── phi_scrubber.py
│               │   ├── patient_summary_agent.py
│               │   └── medical_data_extractor.py
│               ├── api/
│               │   └── routes_fastapi.py  # All routes in one file
│               ├── utils/
│               │   ├── unified_model_manager.py  # Single model manager
│               │   ├── robust_json_parser.py
│               │   └── model_config.py
│               ├── app.py               # FastAPI app setup
│               ├── main.py              # Entry point
│               ├── health_endpoints.py  # Simple health checks
│               └── database_audit.py    # HIPAA audit logging
├── docs/
│   ├── hf-spaces/              # HF Spaces deployment guides
│   └── archive/                # Archived documentation
├── app.py                      # HF Spaces wrapper (minimal)
├── preload_models.py           # Model preloading
├── requirements.txt
└── README.md

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes
  4. Run tests: python -m pytest
  5. Commit changes: git commit -m 'Add amazing feature'
  6. Push to branch: git push origin feature/amazing-feature
  7. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📚 Documentation

Main Documentation

  • README_DEPLOYMENT.md - Quick deployment reference for HF Spaces
  • services/ai-service/README.md - Detailed service documentation

Deployment Guides (docs/hf-spaces/)

  • HF_SPACES_QUICKSTART.md - 10-minute deployment guide
  • DEPLOYMENT_CHECKLIST.md - Step-by-step checklist
  • MODEL_USAGE_GUIDE.md - Model configuration and usage
  • HF_SPACES_DEPLOYMENT.md - Complete deployment reference

Additional Resources

  • docs/archive/ - Historical documentation and summaries
  • services/ai-service/src/ai_med_extract/PRODUCTION_READY_SUMMARY.md - Production notes
  • services/ai-service/src/ai_med_extract/utils/INTEGRATION_GUIDE.md - Integration guide

🆘 Support

  • Documentation: Check the /docs endpoint for interactive API documentation
  • Issues: Report bugs and feature requests via GitHub Issues
  • Discussions: Join community discussions for questions and support

🔄 Changelog

Latest Updates

  • ✅ Simplified architecture - Removed over-engineered components
  • ✅ Unified model management - Single model manager for all model types
  • ✅ Consolidated routes - All API endpoints in one file
  • ✅ Simplified agents - Removed duplicate implementations
  • ✅ Enhanced security and HIPAA compliance - Maintained audit logging
  • ✅ Cleaner codebase - 50% fewer files, 40% less code

Built with ❤️ for the medical AI community