---
title: HNTAI - Medical Data Extraction API
emoji: 📉
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# HNTAI - Scalable Medical Data Extraction API
HNTAI is a scalable, FastAPI-based API for extracting and processing medical data from documents in several formats. It follows the "ChatGPT Version 3 - Scalable" architecture.
## Features
- Document text extraction (PDF, DOCX, Images)
- Audio transcription
- Medical data extraction
- PHI (Protected Health Information) scrubbing with audit logging
- Text summarization with Redis caching
- PostgreSQL database integration for persistence
- Async processing for scalability
- Health endpoints (`/health/live`, `/health/ready`)
- Security features (non-root containers, secrets management, HIPAA compliance)
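
To illustrate what PHI scrubbing with audit logging can look like, here is a minimal sketch. It is not the project's scrubber: the `PHI_PATTERNS` table, the `ScrubResult` type, and the `scrub_phi` function are hypothetical names, and the three regular expressions cover only a tiny fraction of what real PHI detection requires.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration only; real PHI detection needs
# far broader coverage (names, dates, addresses, record numbers, ...).
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


@dataclass
class ScrubResult:
    text: str                                   # text with identifiers replaced
    audit: dict = field(default_factory=dict)   # category -> redaction count


def scrub_phi(text: str) -> ScrubResult:
    """Replace each matched identifier with a [LABEL] placeholder and count the hits."""
    audit = {}
    for label, pattern in PHI_PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            audit[label] = count
    return ScrubResult(text=text, audit=audit)
```

The audit dictionary holds only counts per category, never the redacted values, so it could be written to an audit log without re-exposing the PHI.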
## Architecture Alignment
The stack follows the "ChatGPT Version 3 - Scalable" architecture:
- FastAPI for async API handling
- Redis for caching and PHI stats
- PostgreSQL for audit logs and data persistence
- Kubernetes deployment with security contexts
- Network policies and HIPAA compliance
- Prometheus monitoring
- Proper resource limits and health probes
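
The difference between the two health probes can be sketched as follows. This README does not state what the readiness endpoint actually checks, so the dependency names and the `liveness` and `readiness` functions below are assumptions for illustration, written in plain Python rather than as FastAPI routes.

```python
from typing import Callable, Dict, Tuple


def liveness() -> Tuple[int, dict]:
    # Liveness only asserts that the process is up and able to answer.
    return 200, {"status": "alive"}


def readiness(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run each dependency check; report 503 if any check fails or raises."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises is treated the same as a failed check.
            results[name] = False
    ready = all(results.values())
    body = {"status": "ready" if ready else "not_ready", "checks": results}
    return (200 if ready else 503), body
```

With this split, a failed PostgreSQL or Redis check would take a pod out of rotation through the readiness probe without the liveness probe triggering a restart.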
## Deployment Options
- **Hugging Face Spaces**: Lightweight Docker deployment (legacy)
- **Kubernetes**: Scalable production deployment with security features
## Environment Variables
- `DATABASE_URL`: PostgreSQL connection string
- `REDIS_URL`: Redis connection string
- `SECRET_KEY`: Application secret key
- `JWT_SECRET_KEY`: JWT signing key
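
A minimal sketch of reading these variables at startup, assuming all four are required (this README does not say which are optional). The `Settings` class and `load_settings` function are hypothetical names, not part of the project.

```python
import os
from dataclasses import dataclass

# Assumption: every variable listed above must be set.
REQUIRED_VARS = ("DATABASE_URL", "REDIS_URL", "SECRET_KEY", "JWT_SECRET_KEY")


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    secret_key: str
    jwt_secret_key: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default), failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(**{name.lower(): env[name] for name in REQUIRED_VARS})
```

Failing at startup with the full list of missing names tends to be easier to debug than a connection error surfacing on the first request.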
## API Endpoints
- `GET /health/live` - Liveness health check
- `GET /health/ready` - Readiness health check
- `GET /metrics` - Prometheus metrics
- `POST /generate_patient_summary` - Generate comprehensive patient summaries (with streaming support)
- `POST /upload` - Upload and process medical documents
- `GET /get_updated_medical_data` - Retrieve processed medical data
- `PUT /update_medical_data` - Update medical data fields
- `POST /transcribe` - Transcribe audio files
- `POST /extract_medical_data` - Extract structured medical data
- `POST /api/generate_summary` - Generate text summaries
- `POST /api/extract_medical_data_from_audio` - Process audio recordings
- `POST /api/patient_summary_openvino` - Generate patient summaries using OpenVINO
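
The endpoints above can be called from any HTTP client. The sketch below uses only the Python standard library. It assumes the API is reachable at `http://localhost:7860` (the `app_port` from the Space configuration). `build_request` and `call` are hypothetical helpers, and request body schemas are not documented in this README, so check `/docs` before sending payloads.

```python
import json
import urllib.request

# Assumption: a local instance listening on the app_port from the Space config.
BASE_URL = "http://localhost:7860"


def build_request(method: str, path: str, payload=None) -> urllib.request.Request:
    """Create a request for an endpoint; a payload, if given, is sent as JSON."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def call(req: urllib.request.Request, timeout: float = 10.0):
    """Send the request and return (status_code, body_text)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Against a running instance, `call(build_request("GET", "/health/ready"))` returns the status code and response body of the readiness check.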
## Development
### Code Quality
This project uses the following tools for code quality:
- **Black**: Code formatting
- **isort**: Import sorting
- **flake8**: Linting
- **mypy**: Type checking
Run quality checks:
```bash
black .
isort .
flake8 .
mypy .
```
### Testing
Run tests with:
```bash
python -m pytest
```
For more details, check the API documentation at `/docs`, [DEVELOPMENT.md](DEVELOPMENT.md) for development guides, and [DEPLOYMENT.md](DEPLOYMENT.md) for deployment instructions.