# Tebyan Medical (تبيان الطبي): Architecture

## System Overview

Tebyan Medical is a production-grade Arabic medical report analysis platform. Users upload lab reports (PDF or image); the system extracts findings, generates a clinical interpretation in Arabic, and provides an interactive, voice-enabled chat assistant.

```
Frontend (Next.js 15)           Backend (FastAPI)        External Services
─────────────────────           ─────────────────        ─────────────────
Upload → /api/analyze ────────► AgentCoordinator         Groq (LLM/STT)
Chat   → /api/chat    ────────► RAG + LLM streaming      Supabase (pgvector)
Voice  → /api/voice   ────────► WhisperSTT / TTS         Cohere (rerank)
Risk   → /api/risk    ────────► RiskEngine               Google Vision/TTS
```

---

## Backend Layer

### Entry Point

`backend/main.py`: FastAPI application. Registers all routes, mounts middleware, and wires dependency-injected singletons via `@lru_cache`.

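As a minimal sketch of the `@lru_cache` wiring mentioned above: a zero-argument factory decorated with `functools.lru_cache` returns the same instance on every call, so it can act as a singleton provider (for example as the target of a FastAPI `Depends(...)`). The `Settings` and `Router` classes and the `get_*` function names are hypothetical stand-ins, not taken from `backend/main.py`.

```python
from functools import lru_cache


class Settings:
    """Hypothetical stand-in for the application's configuration object."""

    def __init__(self) -> None:
        self.analysis_model = "llama-3.3-70b-versatile"


class Router:
    """Hypothetical stand-in for a service that is expensive to construct."""

    def __init__(self, settings: Settings) -> None:
        self.settings = settings


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    # The body runs once; later calls return the cached instance.
    return Settings()


@lru_cache(maxsize=1)
def get_router() -> Router:
    # Depends on another cached factory, so both objects are shared.
    return Router(get_settings())
```

Calling `get_settings.cache_clear()` discards the cached instance, which is a convenient way to reset state between tests.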
### Multi-Agent Pipeline (`services/agents/`)

This package replaces the flat `services/agent/pipeline.py` (kept for backward compatibility) with a structured agent graph:

```
AgentCoordinator
├── OCRAgent: PDF (fitz + EasyOCR) + Image (Google Vision fallback)
├── ExtractionAgent: regex parse → LLM fallback → physiological bounds filter
├── ClassificationAgent: panel detection (CBC, Thyroid, Liver, Kidney, Lipid, Diabetes)
├── MedicalReasoningAgent: RAG retrieval + panel-specific LLM prompt
└── SafetyAgent: PDPL/NDMO compliance filter + emergency detection
```

Each agent extends `AgentBase`, which provides retry-with-backoff and structured logging via `AgentContext`.

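The retry-with-backoff and structured-logging behaviour could look roughly like the sketch below. Only the names `AgentBase` and `AgentContext` (and the `raw_text` and `logs` fields) come from this document; the method names, attempt count, and delay values are assumptions made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class AgentContext:
    """Shared state passed through the pipeline (fields are illustrative)."""

    raw_text: str = ""
    logs: List[Dict[str, Any]] = field(default_factory=list)

    def log(self, agent: str, event: str, **extra: Any) -> None:
        self.logs.append({"agent": agent, "event": event, **extra})


class AgentBase:
    max_attempts = 3   # assumed value
    base_delay = 0.5   # seconds; doubled after each failed attempt (assumed)

    def __init__(self, sleep: Callable[[float], None] = time.sleep) -> None:
        self._sleep = sleep  # injectable so tests do not have to wait

    def execute(self, ctx: AgentContext) -> None:
        raise NotImplementedError

    def run(self, ctx: AgentContext) -> bool:
        """Run execute() with exponential backoff; return True on success."""
        name = type(self).__name__
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.execute(ctx)
                ctx.log(name, "ok", attempt=attempt)
                return True
            except Exception as exc:
                ctx.log(name, "error", attempt=attempt, error=str(exc))
                if attempt < self.max_attempts:
                    self._sleep(self.base_delay * 2 ** (attempt - 1))
        return False  # caller decides how to degrade (e.g. leave raw_text empty)
```

Returning `False` instead of re-raising lets a coordinator continue with empty input rather than abort the whole request.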
### RAG Stack (`services/rag/`, `services/search/`)

- **Embedding**: `intfloat/multilingual-e5-large` (1024-dim, via HuggingFace)
- **Vector store**: Supabase pgvector (`match_documents` RPC, 2834+ medical chunks)
- **Query expansion**: Groq LLM generates 3 alternate queries; `query_parser.py` adds Arabic synonym expansions
- **Retrieval**: BM25 fallback + Cohere `rerank-v3.5` cross-encoder
- **Cache**: 5-minute TTL in-memory LRU (`services/cache.py`) prevents redundant embedding calls

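A 5-minute TTL cache with LRU eviction can be built on `collections.OrderedDict`. The sketch below is one possible shape, not the contents of `services/cache.py`; the class name, method names, and `max_size` default are assumptions.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional


class TTLCache:
    def __init__(
        self,
        max_size: int = 256,            # assumed capacity
        ttl: float = 300.0,             # 5 minutes, as stated above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_size = max_size
        self.ttl = ttl
        self._clock = clock
        self._data: OrderedDict = OrderedDict()  # key -> (expires_at, value)

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]          # expired: drop it and report a miss
            return None
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self._clock() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the least recently used entry
```

Keying the cache on the normalised query string means an identical query within five minutes skips the embedding call entirely.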
### Risk Engine (`services/risk/`)

Evidence-based clinical threshold scoring for 6 conditions. `FeatureExtractor` normalises findings into a 35-feature vector; each scorer (`_score_diabetes`, `_score_cardiovascular`, etc.) applies WHO/ADA/ACC clinical cutoffs. ML `.pkl` model files in `services/risk/models/` override rule-based scores when present.

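To illustrate the rule-based path with an optional model override, here is a hedged sketch of a diabetes scorer. The cutoffs are the ADA diagnostic thresholds (HbA1c ≥ 6.5 % or fasting glucose ≥ 126 mg/dL for diabetes; HbA1c ≥ 5.7 % or fasting glucose ≥ 100 mg/dL for prediabetes). The feature names, the returned score values, and the `model` callable are assumptions for illustration, not the project's actual scoring scale.

```python
from typing import Callable, Mapping, Optional

Features = Mapping[str, float]


def score_diabetes(features: Features) -> float:
    """Rule-based score in [0, 1]; the 0.9 / 0.5 / 0.1 levels are illustrative."""
    hba1c = features.get("hba1c_pct")                 # hypothetical feature name
    fasting = features.get("fasting_glucose_mg_dl")   # hypothetical feature name

    if (hba1c is not None and hba1c >= 6.5) or (fasting is not None and fasting >= 126):
        return 0.9  # diabetes range
    if (hba1c is not None and hba1c >= 5.7) or (fasting is not None and fasting >= 100):
        return 0.5  # prediabetes range
    return 0.1      # no threshold crossed, or values missing


def diabetes_risk(
    features: Features,
    model: Optional[Callable[[Features], float]] = None,
) -> float:
    """Use the trained model when one was loaded; otherwise fall back to rules."""
    if model is not None:
        return model(features)
    return score_diabetes(features)
```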
### Voice (`services/voice/`)

- **STT**: `WhisperSTT` wraps Groq `whisper-large-v3`. Accepts WebM/MP4/OGG/WAV (25 MB max).
- **TTS**: Provider chain: Google Cloud TTS (Wavenet-A) → gTTS (free fallback) → ElevenLabs.

### LLM Router (`services/llm/router.py`)

`LLMRouter` wraps a primary provider (`GroqProvider`) and optional fallback (`HuggingFaceProvider`). Model selection: `llama-3.3-70b-versatile` for analysis, `llama-3.1-8b-instant` for chat.

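The primary/fallback arrangement and per-task model selection might be structured as follows. The `Provider` protocol, the `complete` method, and the `task` argument are hypothetical; only the class name `LLMRouter` and the two model identifiers come from the text above.

```python
from typing import Optional, Protocol

MODEL_BY_TASK = {
    "analysis": "llama-3.3-70b-versatile",
    "chat": "llama-3.1-8b-instant",
}


class Provider(Protocol):
    def complete(self, prompt: str, model: str) -> str: ...


class LLMRouter:
    def __init__(self, primary: Provider, fallback: Optional[Provider] = None) -> None:
        self.primary = primary
        self.fallback = fallback

    def complete(self, prompt: str, task: str = "chat") -> str:
        model = MODEL_BY_TASK[task]
        try:
            return self.primary.complete(prompt, model)
        except Exception:
            if self.fallback is None:
                raise  # nothing to fall back to: surface the original error
            # A real fallback would likely map to its own model name;
            # the same identifier is passed through here for brevity.
            return self.fallback.complete(prompt, model)
```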
### Security (`middleware/`)

- **`AuditMiddleware`**: Writes one JSON record per request to rotating log files (`logs/audit/audit.jsonl`). Marks PDPL-sensitive paths. Skips health/docs endpoints.
- **`validate_upload`**: Magic-byte sniffing (anti-MIME-spoofing), 20 MB size limit, extension blocklist.
- **`sanitize_text`**: Strips HTML tags, null bytes, XSS patterns, and SQL injection signatures from all user text inputs.

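Magic-byte sniffing means trusting the first bytes of the payload rather than the client-supplied MIME type or file name. The sketch below shows the idea for PDF, PNG, and JPEG signatures; the function signature, the `UploadError` type, the accepted formats, and the blocklist entries are assumptions, not the project's actual `validate_upload`.

```python
from pathlib import PurePath

MAX_BYTES = 20 * 1024 * 1024  # 20 MB limit, as stated above

# Well-known leading file signatures mapped to the MIME type they imply.
MAGIC = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}

BLOCKED_EXTENSIONS = {".exe", ".sh", ".bat", ".js", ".php"}  # illustrative list


class UploadError(ValueError):
    """Raised when an upload fails validation (hypothetical type)."""


def validate_upload(filename: str, data: bytes) -> str:
    """Return the sniffed MIME type, or raise UploadError."""
    if len(data) > MAX_BYTES:
        raise UploadError("file exceeds the 20 MB limit")
    if PurePath(filename).suffix.lower() in BLOCKED_EXTENSIONS:
        raise UploadError("blocked file extension")
    for signature, mime in MAGIC.items():
        if data.startswith(signature):
            return mime
    raise UploadError("unrecognised file signature")
```

A file named `report.pdf` that actually starts with an executable header is rejected here, because the name never influences the sniffed type.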
### Rate Limiting (`services/ratelimit.py`)

In-memory sliding window (no external dependency). Limits: analyze=5/min, chat=30/min, search=60/min. Uses `X-Forwarded-For` for IP detection behind proxies.

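A sliding-window limiter keeps the timestamps of recent requests per client and route, discarding those older than the window. The sketch below uses the limits listed above; the class and function names are hypothetical. Note that `X-Forwarded-For` is client-controlled, so taking its first entry is only safe when a trusted proxy sets or overwrites the header.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Optional, Tuple

LIMITS = {"analyze": 5, "chat": 30, "search": 60}  # requests per window
WINDOW_SECONDS = 60.0


class SlidingWindowLimiter:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._hits: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, route: str) -> bool:
        now = self._clock()
        hits = self._hits[(client_ip, route)]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= WINDOW_SECONDS:
            hits.popleft()
        if len(hits) >= LIMITS[route]:
            return False
        hits.append(now)
        return True


def client_ip(forwarded_for: Optional[str], peer_ip: str) -> str:
    """Take the first X-Forwarded-For entry, else the socket peer address."""
    if forwarded_for:
        return forwarded_for.split(",")[0].strip()
    return peer_ip
```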

---

## Frontend Layer

**Next.js 15 App Router**, RTL Arabic, Tailwind CSS v4, Framer Motion.

| Component | Purpose |
|---|---|
| `upload-section.tsx` | File picker + `/api/analyze` call + loading state |
| `analysis-history.tsx` | Saved analyses list + semantic search + health trend chart |
| `health-trend-chart.tsx` | Recharts line chart: tracks lab values over time with alerts |
| `risk-dashboard.tsx` | Calls `/api/risk` + renders 6 radial gauge cards (collapsible) |
| `chat-bot.tsx` | Floating chat panel: streaming SSE + voice input/output |
| `voice-recorder.tsx` | MediaRecorder → `/api/voice/transcribe` + TTS playback |
| `compare-analyses.tsx` | Side-by-side analysis diff |

---

## Data Flow: Analysis Request

```
1. User uploads PDF/image
2. validate_upload() → size + MIME + magic bytes
3. AgentCoordinator.run()
   a. OCRAgent → raw_text
   b. ExtractionAgent → findings[] (with impossible-value filter)
   c. ClassificationAgent → panel_code
   d. MedicalReasoningAgent → RAG context + LLM report
   e. SafetyAgent → filtered report
4. Response: { findings, summary, report }
5. Frontend saves to Supabase via /api/analyses/save
6. RiskDashboard calls /api/risk with findings
```

---

## Key Design Decisions

- **No streaming for analysis**: Analysis takes 8–15 s, and a partially streamed JSON response would be malformed, so the response is returned in full after all agents complete.
- **RAG cache before agent**: `MedicalReasoningAgent` checks `rag_cache` before calling pgvector, which avoids redundant 300 ms embedding round-trips on identical queries.
- **Agents over monolith**: Each agent is independently retryable and observable via `AgentContext.logs`. Failures degrade gracefully: an OCR failure sets `raw_text=""`, and downstream agents handle empty input without crashing.
- **Rule-based risk scoring**: ML `.pkl` models are optional overrides, so the platform is useful immediately without training data.
- **In-memory rate limiter**: Avoids a Redis dependency for the MVP. Replace it with a Redis-backed limiter for multi-process deployments.