تبيان الطبي (Tebyan Medical): Architecture
System Overview
Tebyan Medical is a production-grade Arabic medical report analysis platform. Users upload lab reports (PDF or image); the system extracts findings, generates a clinical interpretation in Arabic, and provides an interactive voice-enabled chat assistant.
Frontend (Next.js 15)      Backend (FastAPI)      External Services
─────────────────────      ─────────────────      ─────────────────
Upload → /api/analyze ───► AgentCoordinator       Groq (LLM/STT)
Chat   → /api/chat    ───► RAG + LLM streaming    Supabase (pgvector)
Voice  → /api/voice   ───► WhisperSTT / TTS       Cohere (rerank)
Risk   → /api/risk    ───► RiskEngine             Google Vision/TTS
Backend Layer
Entry Point
backend/main.py: FastAPI application. Registers all routes, mounts middleware, and wires dependency-injected singletons via @lru_cache.
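As a rough illustration of the @lru_cache singleton pattern mentioned above, the sketch below caches a zero-argument factory so every caller gets the same instance. The class body and the `get_llm_router` name are hypothetical stand-ins; the project's real wiring is not shown in this document.

```python
from functools import lru_cache


class LLMRouter:
    """Stand-in for a service that is expensive to construct (hypothetical body)."""

    def __init__(self) -> None:
        self.ready = True


@lru_cache(maxsize=1)
def get_llm_router() -> LLMRouter:
    # The first call builds the object; every later call returns the cached one.
    return LLMRouter()


# In a FastAPI route the factory would typically be injected as:
#     router: LLMRouter = Depends(get_llm_router)
first = get_llm_router()
second = get_llm_router()
```

Because the factory takes no arguments, the cache holds exactly one entry, which is what makes it behave like a process-wide singleton. Tests can reset it with `get_llm_router.cache_clear()`.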
Multi-Agent Pipeline (services/agents/)
This package replaces the flat services/agent/pipeline.py (kept for backward compatibility) with a structured agent graph:
AgentCoordinator
├── OCRAgent: PDF (fitz + EasyOCR) + Image (Google Vision fallback)
├── ExtractionAgent: regex parse → LLM fallback → physiological bounds filter
├── ClassificationAgent: panel detection (CBC, Thyroid, Liver, Kidney, Lipid, Diabetes)
├── MedicalReasoningAgent: RAG retrieval + panel-specific LLM prompt
└── SafetyAgent: PDPL/NDMO compliance filter + emergency detection
Each agent extends AgentBase, which provides retry-with-backoff and structured logging via AgentContext.
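One plausible shape for that base class is sketched below. The field names, retry count, and delay are assumptions made for illustration; only the names AgentBase and AgentContext and the retry-with-backoff idea come from the text.

```python
import logging
import time
from dataclasses import dataclass, field
from typing import Any

logger = logging.getLogger("agents")


@dataclass
class AgentContext:
    """Shared state handed from agent to agent (fields are assumed)."""
    data: dict[str, Any] = field(default_factory=dict)
    logs: list[str] = field(default_factory=list)

    def log(self, message: str) -> None:
        self.logs.append(message)
        logger.info(message)


class AgentBase:
    name = "agent"
    max_retries = 3     # assumed default
    base_delay = 0.01   # seconds; assumed, kept tiny so the demo runs fast

    def run(self, ctx: AgentContext) -> AgentContext:
        raise NotImplementedError

    def execute(self, ctx: AgentContext) -> AgentContext:
        for attempt in range(1, self.max_retries + 1):
            try:
                result = self.run(ctx)
                ctx.log(f"{self.name}: ok on attempt {attempt}")
                return result
            except Exception as exc:
                ctx.log(f"{self.name}: attempt {attempt} failed: {exc}")
                if attempt == self.max_retries:
                    raise
                # Exponential backoff: base, 2x base, 4x base, ...
                time.sleep(self.base_delay * 2 ** (attempt - 1))
        return ctx  # only reached if max_retries < 1


class FlakyAgent(AgentBase):
    """Demo agent that fails twice, then succeeds."""
    name = "flaky"
    calls = 0

    def run(self, ctx: AgentContext) -> AgentContext:
        self.calls += 1
        if self.calls < 3:
            raise RuntimeError("transient")
        ctx.data["done"] = True
        return ctx


ctx = FlakyAgent().execute(AgentContext())
```

After the run, `ctx.logs` holds one entry per attempt, which is the kind of per-agent trace the structured logging is meant to provide.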
RAG Stack (services/rag/, services/search/)
- Embedding: intfloat/multilingual-e5-large (1024-dim, via HuggingFace)
- Vector store: Supabase pgvector (match_documents RPC, 2834+ medical chunks)
- Query expansion: Groq LLM generates 3 alternate queries; query_parser.py adds Arabic synonym expansions
- Retrieval: BM25 fallback + Cohere rerank-v3.5 cross-encoder
- Cache: 5-minute TTL in-memory LRU (services/cache.py) prevents redundant embedding calls
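The cache entry in the list above combines two eviction rules: entries expire after a fixed time, and the least recently used entry is dropped when the cache is full. A minimal sketch of that combination follows; the class name, method names, and injectable clock are assumptions, not the contents of services/cache.py.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory LRU cache whose entries also expire after `ttl` seconds."""

    def __init__(self, max_size: int = 256, ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_size = max_size
        self.ttl = ttl              # 300 s matches the 5-minute TTL in the text
        self._clock = clock         # injectable so tests need not sleep
        self._store: OrderedDict = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]    # expired: treat as a miss
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self._clock() + self.ttl, value)
        self._store.move_to_end(key)
        while len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict least recently used


now = [0.0]
cache = TTLCache(max_size=2, ttl=300.0, clock=lambda: now[0])
cache.set("query-1", [0.1, 0.2])
hit = cache.get("query-1")    # within the TTL
now[0] = 301.0
miss = cache.get("query-1")   # past the TTL
```

Keying the cache on the normalised query text means an identical question skips the embedding call entirely until its entry expires.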
Risk Engine (services/risk/)
Evidence-based clinical threshold scoring for 6 conditions. FeatureExtractor normalises findings into a 35-feature vector; each scorer (_score_diabetes, _score_cardiovascular, etc.) applies WHO/ADA/ACC clinical cutoffs. ML .pkl model files in services/risk/models/ override rule-based scores when present.
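To make the rule-based path concrete, here is a hedged sketch of a diabetes scorer using the published ADA diagnostic cutoffs (HbA1c 5.7% and 6.5%, fasting glucose 100 and 126 mg/dL). The feature names, the 0.0 to 1.0 scale, and the way an optional model overrides the rules are all assumptions; the actual _score_diabetes logic and the 35-feature layout are not shown in this document.

```python
from typing import Callable, Mapping, Optional


def score_diabetes(features: Mapping[str, float]) -> float:
    """Return an illustrative score: 0.0 normal, 0.5 prediabetes range, 1.0 diabetes range."""
    hba1c = features.get("hba1c_percent")             # hypothetical feature name
    fasting = features.get("fasting_glucose_mg_dl")   # hypothetical feature name
    score = 0.0
    if hba1c is not None:
        if hba1c >= 6.5:
            score = max(score, 1.0)
        elif hba1c >= 5.7:
            score = max(score, 0.5)
    if fasting is not None:
        if fasting >= 126:
            score = max(score, 1.0)
        elif fasting >= 100:
            score = max(score, 0.5)
    return score


def risk_score(
    features: Mapping[str, float],
    model: Optional[Callable[[Mapping[str, float]], float]] = None,
) -> float:
    # A loaded ML model, when present, takes precedence over the rules.
    if model is not None:
        return model(features)
    return score_diabetes(features)
```

For example, `risk_score({"hba1c_percent": 7.1})` falls through to the rules, while passing a `model` callable bypasses them, mirroring the "override when present" behaviour described above.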
Voice (services/voice/)
- STT: WhisperSTT wraps Groq whisper-large-v3. Accepts WebM/MP4/OGG/WAV (25 MB max).
- TTS: Provider chain: Google Cloud TTS (Wavenet-A) → gTTS (free fallback) → ElevenLabs.
LLM Router (services/llm/router.py)
LLMRouter wraps a primary provider (GroqProvider) and optional fallback (HuggingFaceProvider). Model selection: llama-3.3-70b-versatile for analysis, llama-3.1-8b-instant for chat.
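The routing behaviour can be sketched as below. The `complete` method, the `Provider` protocol, and the two stub providers are hypothetical; only the model names and the primary/fallback arrangement are taken from the text.

```python
from typing import Optional, Protocol


class Provider(Protocol):
    def complete(self, model: str, prompt: str) -> str: ...


# Model choice per task, as stated in the text.
MODEL_BY_TASK = {
    "analysis": "llama-3.3-70b-versatile",
    "chat": "llama-3.1-8b-instant",
}


class LLMRouter:
    def __init__(self, primary: Provider, fallback: Optional[Provider] = None) -> None:
        self.primary = primary
        self.fallback = fallback

    def complete(self, task: str, prompt: str) -> str:
        model = MODEL_BY_TASK[task]
        try:
            return self.primary.complete(model, prompt)
        except Exception:
            if self.fallback is None:
                raise
            # Simplification: a real fallback would likely map to its own model ID.
            return self.fallback.complete(model, prompt)


class DownProvider:
    """Stub that always fails, standing in for an unavailable primary."""
    def complete(self, model: str, prompt: str) -> str:
        raise ConnectionError("primary unavailable")


class EchoProvider:
    """Stub that echoes its inputs, standing in for a working provider."""
    def complete(self, model: str, prompt: str) -> str:
        return f"{model}:{prompt}"


router = LLMRouter(primary=DownProvider(), fallback=EchoProvider())
reply = router.complete("chat", "hi")
```

Keeping the task-to-model map in one place means callers ask for "analysis" or "chat" and never hard-code a model name.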
Security (middleware/)
- AuditMiddleware: Writes one JSON record per request to rotating log files (logs/audit/audit.jsonl). Marks PDPL-sensitive paths. Skips health/docs endpoints.
- validate_upload: Magic-byte sniffing (anti-MIME-spoofing), 20 MB size limit, extension blocklist.
- sanitize_text: Strips HTML tags, null bytes, XSS patterns, and SQL injection signatures from all user text inputs.
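Magic-byte sniffing means checking the first bytes of the file against the signature of the type the client claims, rather than trusting the declared MIME type. The sketch below shows the idea for PDF, PNG, and JPEG, whose signatures are standard. The function signature, the exception type, and the reading of "20 MB" as 20 × 1024 × 1024 bytes are assumptions, and the extension blocklist is omitted.

```python
MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # assumed interpretation of the 20 MB limit

# Well-known leading bytes for each accepted type.
MAGIC_SIGNATURES = {
    "application/pdf": (b"%PDF-",),
    "image/png": (b"\x89PNG\r\n\x1a\n",),
    "image/jpeg": (b"\xff\xd8\xff",),
}


class UploadError(ValueError):
    """Raised when an upload fails validation (hypothetical exception type)."""


def validate_upload(data: bytes, declared_mime: str) -> str:
    if len(data) > MAX_UPLOAD_BYTES:
        raise UploadError("file exceeds the 20 MB limit")
    signatures = MAGIC_SIGNATURES.get(declared_mime)
    if signatures is None:
        raise UploadError(f"unsupported type: {declared_mime}")
    # bytes.startswith accepts a tuple and matches any of its members.
    if not data.startswith(signatures):
        raise UploadError("content does not match the declared type")
    return declared_mime
```

A file named report.png that actually contains HTML or script text fails the signature check even though its extension and declared MIME type look acceptable.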
Rate Limiting (services/ratelimit.py)
In-memory sliding window (no external dependency). Limits: analyze=5/min, chat=30/min, search=60/min. Uses X-Forwarded-For for IP detection behind proxies.
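A sliding-window limiter keeps the timestamps of recent requests per client and route, discards those older than the window, and rejects a request when the remaining count has reached the limit. The sketch below uses the limits quoted above; the class and function names and the injectable clock are assumptions rather than the contents of services/ratelimit.py.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Optional

LIMITS = {"analyze": 5, "chat": 30, "search": 60}  # requests per minute, from the text
WINDOW_SECONDS = 60.0


class SlidingWindowLimiter:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._hits = defaultdict(deque)  # (ip, route) -> timestamps of recent requests

    def allow(self, ip: str, route: str) -> bool:
        now = self._clock()
        hits = self._hits[(ip, route)]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= WINDOW_SECONDS:
            hits.popleft()
        if len(hits) >= LIMITS[route]:
            return False
        hits.append(now)
        return True


def client_ip(forwarded_for: Optional[str], peer_ip: str) -> str:
    # The left-most X-Forwarded-For entry is the original client. The header is
    # only trustworthy when a proxy you control sets or overwrites it.
    if forwarded_for:
        return forwarded_for.split(",")[0].strip()
    return peer_ip


now = [0.0]
limiter = SlidingWindowLimiter(clock=lambda: now[0])
results = [limiter.allow("203.0.113.7", "analyze") for _ in range(6)]
now[0] = 60.0
after_window = limiter.allow("203.0.113.7", "analyze")
```

Because the state lives in a per-process dictionary, each worker process enforces its own counts, which is why a shared store is needed once the app runs with several processes.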
Frontend Layer
Next.js 15 App Router, RTL Arabic, Tailwind CSS v4, Framer Motion.
| Component | Purpose |
|---|---|
| upload-section.tsx | File picker + /api/analyze call + loading state |
| analysis-history.tsx | Saved analyses list + semantic search + health trend chart |
| health-trend-chart.tsx | Recharts line chart; tracks lab values over time with alerts |
| risk-dashboard.tsx | Calls /api/risk + renders 6 radial gauge cards (collapsible) |
| chat-bot.tsx | Floating chat panel; streaming SSE + voice input/output |
| voice-recorder.tsx | MediaRecorder → /api/voice/transcribe + TTS playback |
| compare-analyses.tsx | Side-by-side analysis diff |
Data Flow: Analysis Request
1. User uploads PDF/image
2. validate_upload(): size + MIME + magic bytes
3. AgentCoordinator.run()
   a. OCRAgent → raw_text
   b. ExtractionAgent → findings[] (with impossible-value filter)
   c. ClassificationAgent → panel_code
   d. MedicalReasoningAgent → RAG context + LLM report
   e. SafetyAgent → filtered report
4. Response: { findings, summary, report }
5. Frontend saves to Supabase via /api/analyses/save
6. RiskDashboard calls /api/risk with findings
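Step 3 of the flow above can be sketched as a sequence of stages that share one state object. Everything here is a stand-in: the stage functions are stubs, the plausibility bounds are hypothetical, and the real AgentCoordinator API is not shown in this document. The sketch only illustrates the ordering, the impossible-value filter, and how an OCR failure leaves raw_text empty without stopping the run.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical plausibility bounds; the real filter's limits are not documented here.
BOUNDS = {"hemoglobin": (1.0, 25.0)}


@dataclass
class PipelineState:
    raw_text: str = ""
    findings: list = field(default_factory=list)
    panel_code: str = "UNKNOWN"
    summary: str = ""
    report: str = ""
    logs: list = field(default_factory=list)


def ocr_ok(state: PipelineState) -> None:
    state.raw_text = "hemoglobin 13.5\nhemoglobin 135"


def ocr_broken(state: PipelineState) -> None:
    raise RuntimeError("unreadable scan")


def extract(state: PipelineState) -> None:
    for name, value in re.findall(r"(\w+)\s+(\d+(?:\.\d+)?)", state.raw_text):
        low, high = BOUNDS.get(name, (float("-inf"), float("inf")))
        if low <= float(value) <= high:  # drop physiologically impossible values
            state.findings.append({"name": name, "value": float(value)})


def classify(state: PipelineState) -> None:
    if any(f["name"] == "hemoglobin" for f in state.findings):
        state.panel_code = "CBC"


def reason(state: PipelineState) -> None:
    if state.findings:  # a real agent would call RAG and the LLM here
        state.summary = f"{len(state.findings)} finding(s) in panel {state.panel_code}"
        state.report = state.summary


def run_pipeline(stages: list[Callable[[PipelineState], None]]) -> dict:
    state = PipelineState()
    for stage in stages:
        try:
            stage(state)
            state.logs.append(f"{stage.__name__}: ok")
        except Exception as exc:
            # Degrade gracefully: keep the defaults and let later stages run.
            state.logs.append(f"{stage.__name__}: failed ({exc})")
    return {"findings": state.findings, "summary": state.summary, "report": state.report}


good = run_pipeline([ocr_ok, extract, classify, reason])
degraded = run_pipeline([ocr_broken, extract, classify, reason])
```

In the degraded run every downstream stage sees empty input and returns without raising, so the caller still receives a well-formed response with empty fields.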
Key Design Decisions
- No streaming for analysis: Analysis takes 8 to 15 s, and a partially streamed JSON body would be malformed, so the response is returned in full after all agents complete.
- RAG cache before agent: MedicalReasoningAgent checks rag_cache before calling pgvector, which avoids redundant 300 ms embedding round-trips on identical queries.
- Agents over monolith: Each agent is independently retryable and observable via AgentContext.logs. Failures degrade gracefully: an OCR failure sets raw_text="", and downstream agents handle empty input without crashing.
- Rule-based risk scoring: ML .pkl models are optional overrides. The platform is useful immediately without training data.
- In-memory rate limiter: Avoids a Redis dependency for the MVP. Replace with a Redis-backed limiter for multi-process deployments.