# Orsync Scenarist Backend — Frontend Integration Reference

> **Version**: v7.0 · **Backend port**: 7860 · **Frontend target**: Next.js (App Router), TypeScript
>
> This is the single authoritative reference for building a frontend against this backend. It covers every endpoint with its exact request/response shape (derived from the live route files), all data models, frontend feature specifications, history/tab management, and recommended page architecture. Build from this document directly — do not guess shapes from the old doc.

---

## 1. Project Overview

**Orsync Scenarist v7.0** is an enterprise pharma strategic intelligence platform. The backend is a Python FastAPI application that exposes six functional areas:

| Area | What it does |
|------|-------------|
| **Strategy Engine** | Vectorize campaign text into a 12D behavioral vector, run GMM clustering on the HCP population, score campaign-to-cluster fit via Mahalanobis distance, auto-optimize rejected campaigns |
| **HCP Knowledge Graph** | Neo4j graph of ~480 doctor nodes with institution, topic, and cluster relationships; queryable by code name, cluster, institution, or topic |
| **Persona Engine** | Returns a 12-feature behavioral profile for any doctor or cluster archetype; used to initialise simulation sessions |
| **Simulation Engine** | WebRTC-style multi-turn roleplay: AI physician persona responds to rep pitches; integrates D-ID avatar streams and Hume prosody analysis |
| **MOHP** | Medical Objection Handling Protocol — given a rep statement, returns compliance objections keyed to cluster-specific guideline databases |
| **Analytics** | Redis-backed session store (7-day TTL); full conversation, emotion timeline, adherence score, and campaign snapshot per session |

**Runtime dependencies** (all optional with graceful fallback):

| Service | Default address | Used for |
|---------|----------------|---------|
| Redis | `redis://localhost:6379/0` | Sessions, semantic cache, outbox, DLQ, event streams |
| Neo4j | `bolt://localhost:7687` | Doctor knowledge graph |
| ChromaDB | `localhost:8100` | Campaign memory (RAG retrieval), approximate semantic cache |
| Ollama | `https://ollama.com` | LLM inference — campaign vectorization, optimization, simulation replies, MOHP |
| D-ID | API key in env | WebRTC avatar stream |
| Hume | API key in env | Prosody / emotion analysis on audio input |

When a dependency is absent the backend degrades gracefully — it never crashes. See the "Graceful degradation" note in §10 (Error Handling) for detail.

---

## 2. Connection Details

| Setting | Default | Description |
|---------|---------|-------------|
| Base URL | `http://localhost:7860` | Backend API server |
| API Docs | `http://localhost:7860/docs` | Swagger UI (auto-generated) |
| OpenAPI JSON | `http://localhost:7860/openapi.json` | Machine-readable schema |
| CORS | `*` (development) | All origins allowed in dev mode |

---

## 3. Authentication

The backend has **no authentication** in development mode. All endpoints are open.

In production, place the backend behind a reverse proxy (nginx, Caddy) that provides your auth layer. CORS is `*` by default — restrict it with `CORS_ALLOWED_ORIGINS`.

---

## 4. Complete API Reference

### 4.1 Health Check

```
GET /healthz
→ { "status": "ok" }
```

---

### 4.2 Strategy Engine (Primary Frontend Flow)

**Prefix**: `/api/strategy`

#### 4.2.1 Full Evaluate (Main Endpoint — Start Here)

This is the **primary endpoint** for the campaign analysis workflow. It takes raw campaign text and returns everything in one response: vectorization, clustering, heatmap, rejection check, and (only when the campaign is rejected) optimization.
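A minimal sketch of how a frontend might call this endpoint, assuming the request fields and constraints documented for it in this section. The helper names `buildFullEvaluateRequest` and `fullEvaluate` are hypothetical, and the empty default `baseUrl` assumes requests go through a same-origin proxy.

```typescript
// Hypothetical helpers; only the endpoint path and field names come from this reference.
type Region = "egypt" | "saudi" | "gulf" | null;

interface FullEvaluateRequest {
  campaign_text: string;
  rejection_distance_threshold?: number;
  region?: Region;
}

// Validate locally so obvious mistakes never reach the backend as a 422.
function buildFullEvaluateRequest(
  campaignText: string,
  region: Region = null,
  threshold?: number,
): FullEvaluateRequest {
  if (campaignText.length < 1) {
    throw new Error("campaign_text must be at least 1 character");
  }
  if (threshold !== undefined && !(threshold > 0)) {
    throw new Error("rejection_distance_threshold must be > 0");
  }
  const body: FullEvaluateRequest = { campaign_text: campaignText, region };
  if (threshold !== undefined) {
    body.rejection_distance_threshold = threshold;
  }
  return body;
}

// Assumption: an empty baseUrl means "same origin" (for example behind a dev proxy).
async function fullEvaluate(
  body: FullEvaluateRequest,
  baseUrl = "",
): Promise<unknown> {
  const res = await fetch(`${baseUrl}/api/strategy/full-evaluate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`full-evaluate failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

The return type is left as `unknown` on purpose: cast it to a response type of your own once you have modelled the response shape.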
```
POST /api/strategy/full-evaluate
Content-Type: application/json

{
  "campaign_text": "string (required, min 1 char)",
  "rejection_distance_threshold": 3.0,            // optional, > 0
  "region": "egypt" | "saudi" | "gulf" | null     // optional, filter HCPs by region
}

→ 200 OK
{
  // --- Campaign vector (raw 12D, snake_case FEATURE_KEYS) ---
  "campaign_vector_12d": [0.72, 0.65, 0.55, 0.40, 0.70, 0.60, 0.38, 0.50, 0.45, 0.55, 0.30, 0.42],

  // --- Campaign vector projected into GMM PCA latent space (n_pca_components-dimensional) ---
  "campaign_vector_pca": [1.23, -0.67, 0.11],

  // --- GMM clustering output ---
  "gmm": {
    "k": 4,
    "data_source": "gold",                        // or "synthetic_seed"
    "n_pca_components": 3,
    "pca_explained_variance_ratio": [0.45, 0.25, 0.15],
    "centroids": [[...], [...], [...], [...]],    // k × pca_dim arrays
    "covariances": [[[...]], ...],                // k × pca_dim × pca_dim matrices
    "assignments": [0, 1, 0, 2, 3, ...],          // cluster assignment per doctor in population
    "member_points_2d": [[x, y], ...]             // 2D PCA scatter coords for each population member
  },

  // --- Heatmap (cluster-fit ranking) ---
  // ⚠️ This is an OBJECT with a "ranking" key, NOT a bare array.
  "heatmap": {
    "ranking": [
      {
        "cluster_id": 1,
        "label": "The Commercial Adopter",
        "distance": 1.23,                         // Mahalanobis distance (lower = better fit)
        "probability": 0.45                       // softmax over distances (higher = better fit)
      },
      ...
    ]
  },

  // --- Cluster cards (one per cluster, sorted by probability desc) ---
  "cluster_cards": [
    {
      "id": "c1",                                 // ⚠️ STRING, e.g. "c0", "c1", "c2", "c3"
      "cluster_id": 1,
      "name": "The Commercial Adopter",
      "score": 45.0,                              // ⚠️ 0–100 float (probability × 100)
      "distance": 1.2300,
      "probability": 0.4500
    },
    ...
  ],

  "rejection_distance_threshold": 3.0,
  "rejected": false,                              // true when min distance > threshold

  // --- Optimization result (non-null ONLY when rejected=true) ---
  "optimized": null,                              // or full optimization object — see §5.3

  "vectorization_model": "gemma4:31b-cloud"       // or "fallback-no-api-key"
}
```

#### 4.2.2 Vectorize Campaign Text

> **Important**: Campaign feature keys use `snake_case` (`therapeutic_focus` etc.) — these are different from the persona behavioral display labels used in the War Room. See §5.2 for both sets.

```
POST /api/strategy/vectorize
Content-Type: application/json

{
  "text": "Campaign message text here",
  "campaign_id": "optional-id"                    // optional
}

→ 200 OK
{
  "features": {
    "therapeutic_focus": 0.72,
    "messaging_tone": 0.65,
    "target_seniority": 0.55,
    "channel_preference": 0.40,
    "kol_alignment": 0.70,
    "trial_phase_relevance": 0.60,
    "formulary_impact": 0.38,
    "patient_population_size": 0.50,
    "competitive_positioning": 0.45,
    "regulatory_stage": 0.55,
    "budget_tier": 0.30,
    "urgency_score": 0.42
  },
  "normalized_features": { /* same 12 keys, MinMax-scaled to [0,1] */ },
  "embedding": [0.12, -0.34, ...],                // 384-float ONNX MiniLM vector
  "embedding_model": "onnx-minilm",
  "model": "gemma4:31b-cloud"                     // or "fallback-no-api-key"
}
```

#### 4.2.3 Build Heatmap

```
POST /api/strategy/heatmap
Content-Type: application/json

{
  "campaign_vector": [12 floats],
  "centroids": [[...], [...], ...],               // from gmm.centroids
  "covariances": [[[...]], [[...]], ...],         // from gmm.covariances
  "cluster_top_doctors": {"0": ["HCP-00-001"], "1": ["HCP-01-042"]}   // optional
}

→ 200 OK
[
  { "cluster_id": 1, "label": "...", "distance": 1.2, "probability": 0.45, "top_doctors": [...] },
  ...
]
```

#### 4.2.4 Optimize Campaign

```
POST /api/strategy/optimize
Content-Type: application/json

{
  "campaign_text": "Original campaign text",
  "target_cluster": 0,                            // optional — cluster to optimize toward
  "target_centroid_vector": [12 floats]           // optional — explicit target 12D vector
}

→ 200 OK
{
  "original_text": "...",
  "optimized_text": "...",
  "improvements": [
    "Increase therapeutic_focus emphasis from 0.25 toward 0.95.",
    "Reduce budget_tier emphasis from 0.40 toward 0.20."
  ],
  "reason": "heuristic_fallback_no_api_key",      // or "heuristic_fallback_llm_error" | "llm_rewrite"
  "reasoning": "Applied deterministic optimization...",
  "rewrite_rationale": "...",
  "optimized_vector": {
    "therapeutic_focus": 0.85,
    // ... all 12 snake_case feature keys
  },
  "optimized_normalized_vector": { /* same 12 keys, normalized */ },
  "retrieved_examples": ["example text 1", "example text 2"],   // RAG-retrieved similar campaigns
  "retrieval_diagnostics": [
    {
      "campaign_id": "...",
      "campaign_text": "...",
      "metadata": { "outcome": "success", "cluster_id": 0, ... },
      "distance": 0.12
    }
  ],
  "target_cluster": 0,
  "target_centroid_vector": [12 floats],
  "alignment_score": 0.87                         // cosine similarity between optimized vector and target
}
```

> **Note**: When `OLLAMA_API_KEY` is not set, the optimizer uses a deterministic heuristic rewrite. The `reason` field signals which path was taken.
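Because `reason` tells the UI which optimization path ran, it maps naturally onto a badge. The sketch below is one possible mapping; the three `reason` values come from this reference, while the badge labels, the `ReasonBadge` type, and the `describeReason` helper are hypothetical.

```typescript
// The three documented values of the optimizer's `reason` field.
type OptimizationReason =
  | "heuristic_fallback_no_api_key"
  | "heuristic_fallback_llm_error"
  | "llm_rewrite";

// Hypothetical view model for a small status badge.
interface ReasonBadge {
  label: string;
  usedLlm: boolean;
}

const REASON_BADGES: Record<OptimizationReason, ReasonBadge> = {
  llm_rewrite: { label: "LLM rewrite", usedLlm: true },
  heuristic_fallback_no_api_key: { label: "Heuristic (no API key)", usedLlm: false },
  heuristic_fallback_llm_error: { label: "Heuristic (LLM error)", usedLlm: false },
};

// Accepts a plain string so an unexpected value from a newer backend
// still renders something instead of throwing.
function describeReason(reason: string): ReasonBadge {
  // Own-property check: keys such as "toString" must not resolve via the prototype.
  if (Object.prototype.hasOwnProperty.call(REASON_BADGES, reason)) {
    return REASON_BADGES[reason as OptimizationReason];
  }
  return { label: `Unknown (${reason})`, usedLlm: false };
}
```

Rendering the heuristic result with a visible badge, rather than hiding it, keeps the page useful when no LLM key is configured.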

#### 4.2.5 Evaluate Strategy (Advanced — BYO Centroids)

```
POST /api/strategy/evaluate
Content-Type: application/json

{
  "campaign_text": "string",
  "centroids": [[...], ...],
  "covariances": [[[...]], ...],
  "cluster_top_doctors": null,
  "rejection_distance_threshold": 3.0
}

→ 200 OK
{
  "campaign_vector": {...},
  "heatmap": [...],
  "rejection_distance_threshold": 3.0,
  "rejected": false,
  "optimized": null
}
```

#### 4.2.6 Store Campaign Memory

```
POST /api/strategy/memory/store
Content-Type: application/json

{
  "campaign_text": "string",
  "campaign_id": "optional-id",
  "outcome": "string",
  "success_score": 0.85,                          // 0.0–1.0
  "is_successful": true,
  "cluster_id": 0,
  "extra_metadata": {}
}

→ 200 OK
{
  "stored": true,
  "campaign_id": "...",
  "embedding_model": "onnx-minilm",
  "is_successful": true
}
```

#### 4.2.7 Get Cluster Doctors

```
GET /api/strategy/cluster/{cluster_id}/doctors?limit=50&region=egypt

→ 200 OK
{
  "cluster_id": 0,
  "total_in_cluster": 120,
  "total_in_db": 480,
  "k": 4,
  "region": "egypt",
  "doctors": [
    {
      "cluster_id": 0,
      "name": "Dr. Ahmed Hassan",
      "region": "egypt",
      "headline": "...",
      "location": "Cairo",
      "company": "Cairo University Hospital",
      "job": "Professor of Oncology",
      "school": "Cairo University",
      "school_degree": "MD, PhD",
      "primary_specialty": "Medical Oncology",
      "seniority_level": "Senior",
      "highest_academic_degree": "PhD",
      "total_years_experience": 22,
      "expected_age": 50,
      "age_group": "46-55",
      "current_role_tenure": 8,
      "kol_status": "National KOL",
      "digital_presence": "High",
      "academic_affiliation": "University Professor",
      "workplace_category": "Academic Medical Center",
      "institutional_tier": "Tier 1",
      "adoption_profile": "Early Adopter",
      "channel_preference": "Conference + Digital"
    },
    ...
  ]
}
```

---

### 4.3 Persona Engine

**Prefix**: `/api/persona`

> **Two distinct feature systems exist in this backend** — do not confuse them:
>
> | System | Keys | Used in |
> |--------|------|---------|
> | **Campaign features** | `snake_case` (`therapeutic_focus`, etc.) | `POST /api/strategy/vectorize`, `full-evaluate`, optimization, cluster matching |
> | **Persona behavioral traits** | Display names (`"Scientific Rigor"`, etc.) | `GET /api/persona/*`, War Room radar charts |

#### Persona Behavioral Trait Axes (12 Display-Name Axes)

These are the axes returned by the persona endpoints for radar chart rendering:

| # | Axis | High value means |
|---|------|------------------|
| 0 | Scientific Rigor | Demands strong evidence before acting |
| 1 | Innovation Appetite | Willing to adopt novel treatments early |
| 2 | Guideline Adherence | Strictly follows clinical guidelines |
| 3 | Price Sensitivity | Cost heavily influences decision-making |
| 4 | Risk Tolerance | Comfortable accepting treatment-related risks |
| 5 | Peer Influence | Swayed by what respected colleagues do |
| 6 | Evidence Threshold | Needs more evidence before prescribing |
| 7 | Formulary Weight | Formulary listing is a prerequisite |
| 8 | Patient Centricity | Patient outcomes drive decisions |
| 9 | Digital Readiness | Embraces digital/remote engagement channels |
| 10 | KOL Alignment | Follows Key Opinion Leader guidance |
| 11 | Trial Participation | Actively participates in clinical trials |

#### Cluster Archetypes

| ID | Label | Key trait profile |
|----|-------|-------------------|
| 0 | The Academic Skeptic | Scientific Rigor 0.95, Evidence Threshold 0.92, KOL Alignment 0.85, Innovation Appetite 0.30 |
| 1 | The Commercial Adopter | Innovation Appetite 0.92, Digital Readiness 0.88, Risk Tolerance 0.85, Evidence Threshold 0.45 |
| 2 | The Guideline Loyalist | Guideline Adherence 0.95, Peer Influence 0.80, Evidence Threshold 0.75, Risk Tolerance 0.10 |
| 3 | Price-Sensitive Pragmatist | Price Sensitivity 0.95, Formulary Weight 0.90, Patient Centricity 0.80, KOL Alignment 0.35 |

#### 4.3.1 Get Persona from Cluster

```
GET /api/persona/from-cluster/{cluster_id}

→ 200 OK
{
  "codeName": "HCP-00-042",
  "clusterId": 0,
  "clusterLabel": "The Academic Skeptic",
  "traits": [
    { "axis": "Scientific Rigor", "value": 0.92 },
    { "axis": "Innovation Appetite", "value": 0.35 },
    ...                                           // 12 total
  ]
}
```

#### 4.3.2 Get Specific Doctor Persona

```
GET /api/persona/{code_name}

→ 200 OK
{
  "codeName": "HCP-00-042",
  "clusterId": 0,
  "clusterLabel": "The Academic Skeptic",
  "traits": [...],
  "h_index": 41,
  "works_count": 129,
  "cited_by_count": 6958,
  "years_active": 16
}
```

---

### 4.4 Simulation Engine (WebRTC-Style Roleplay)

**Prefix**: `/api/simulation`

The simulation flow is a multi-step conversation:

```
┌─────────────────────────────────────────────────┐
│  1. POST /api/simulation/start                  │
│     → Returns session_id + WebRTC offer         │
│                                                 │
│  2. POST /api/simulation/handshake              │
│     → Complete WebRTC signaling                 │
│                                                 │
│  3. POST /api/simulation/ice-candidate          │
│     → Exchange ICE candidates (can repeat)      │
│                                                 │
│  4. POST /api/simulation/turn     ← REPEAT      │
│     → Send rep's message, get AI persona reply  │
│                                                 │
│  (Optional) GET /api/simulation/cache/{key}     │
│     → Check semantic cache for similar turns    │
└─────────────────────────────────────────────────┘
```

#### 4.4.1 Start Simulation

```
POST /api/simulation/start
Content-Type: application/json

{
  "persona_id": "HCP-00-042",                     // required — doctor code_name from cluster doctors
  "campaign_id": "camp-001",                      // optional
  "campaign_snapshot": { ... }                    // optional — complete full-evaluate result for session replay
}

→ 200 OK
{
  "session_id": "uuid-string",
  "started_at_epoch": 1713350400,
  "target_handshake_ms": 400,
  "did_payload": { ... },                         // D-ID stream creation payload
  "did_offer": {
    "type": "offer",
    "sdp": "v=0\r\n..."                           // WebRTC SDP offer (mock when DID_API_KEY absent)
  },
  "did_ice_servers": [],                          // ICE servers (empty in mock mode)
  "hume_prosody_enabled": false,                  // true when HUME_API_KEY is set
  "semantic_cache_similarity_threshold": 0.95,
  "stream_mode": "mock"                           // "mock" or "did"
}
```

#### 4.4.2 Complete Handshake

```
POST /api/simulation/handshake
Content-Type: application/json

{
  "session_id": "uuid-string",
  "answer": { "type": "answer", "sdp": "..." }    // WebRTC SDP answer
}

→ 200 OK
{
  "session_id": "uuid-string",
  "status": "connected",
  "result": { "connected": true, "mode": "mock" } // mock mode when no DID key
}
```

#### 4.4.3 ICE Candidate Exchange

```
POST /api/simulation/ice-candidate
Content-Type: application/json

{
  "session_id": "uuid-string",
  "candidate": { ... }                            // ICE candidate object
}

→ 200 OK
{ "accepted": true }
```

#### 4.4.4 Simulation Turn (Core Loop)

```
POST /api/simulation/turn
Content-Type: application/json

{
  "session_id": "uuid-string",
  "input_text": "Doctor, our Phase 3 trial showed...",   // default: empty string
  "input_audio_base64": ""                        // optional audio for Hume prosody
}

→ 200 OK
{
  "session_id": "uuid-string",
  "cache_hit": false,                             // true if a semantically similar response was cached
  "cache_entry_id": "uuid|null",
  "cache_similarity": 0.0,                        // cosine similarity to cached entry (0.0 if not cached)
  "response": "Interesting, but I need to see the full subgroup analysis...",   // AI persona reply text
  "audio": null,                                  // reserved for future D-ID audio output
  "prosody": null                                 // Hume prosody result (null when no audio or no Hume key)
}
```

> **Note**: To get adherence score, emotion timeline, and full conversation history for a session, call `GET /api/analytics/session/{session_id}` after the turn loop ends.
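The turn loop above can be sketched as a pure transcript update plus a thin network wrapper. This is an illustration under assumptions, not the backend's client: `TranscriptLine`, `appendTurn`, `sendTurn`, and the injected `PostJson` function are hypothetical names; only the endpoint path and the `session_id`, `input_text`, `response`, and `cache_hit` fields come from this reference.

```typescript
// Subset of the documented /api/simulation/turn response used by this sketch.
interface TurnResponse {
  session_id: string;
  cache_hit: boolean;
  cache_similarity: number;
  response: string;
}

// Hypothetical UI model: one line per speaker turn.
interface TranscriptLine {
  role: "user" | "assistant";
  text: string;
  cached: boolean;
}

// Hypothetical transport: any function that POSTs JSON and resolves to the parsed body.
type PostJson = (path: string, body: unknown) => Promise<unknown>;

// Pure update: returns a new transcript with the rep's message and the persona's reply.
function appendTurn(
  transcript: readonly TranscriptLine[],
  inputText: string,
  reply: TurnResponse,
): TranscriptLine[] {
  return [
    ...transcript,
    { role: "user", text: inputText, cached: false },
    { role: "assistant", text: reply.response, cached: reply.cache_hit },
  ];
}

// One iteration of the loop: send the rep's text, fold the reply into the transcript.
async function sendTurn(
  post: PostJson,
  sessionId: string,
  inputText: string,
  transcript: readonly TranscriptLine[],
): Promise<TranscriptLine[]> {
  const reply = (await post("/api/simulation/turn", {
    session_id: sessionId,
    input_text: inputText,
  })) as TurnResponse;
  return appendTurn(transcript, inputText, reply);
}
```

Injecting `post` keeps the loop testable without a running backend; in the browser it would typically be a small wrapper around `fetch`.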

#### 4.4.5 Check Semantic Cache

```
GET /api/simulation/cache/{cache_key}

→ 200 OK
{ "hit": true, "value": "cached response text" }
```

---

### 4.5 MOHP — Objection Detection

**Prefix**: `/api/mohp`

```
POST /api/mohp/evaluate
Content-Type: application/json

{
  "session_id": "uuid-string",
  "input_text": "This drug is 100% effective with no side effects",
  "cluster_id": 0,                                // 0–7
  "persona_id": ""                                // optional
}

→ 200 OK
{
  "session_id": "uuid-string",
  "objections": [
    {
      "id": "obj-uuid",
      "timestamp": "2026-04-17T12:00:00Z",
      "objection": "Absolute efficacy claim without evidence",
      "guideline": "Avoid absolute claims — cite specific trial data",
      "severity": "high",
      "matched_keywords": ["100%", "no side effects"]
    }
  ],
  "count": 1
}
```

---

### 4.6 Knowledge Graph

**Prefix**: `/api/graph`

#### 4.6.1 Ingest Doctors into Graph

```
POST /api/graph/ingest
Content-Type: application/json

{ "records": [ { ...doctor data... }, ... ] }

→ { "status": "ok", "ingested": 120 }
```

#### 4.6.2 Get Doctor from Graph

```
GET /api/graph/doctor/{code_name}

→ 200 OK
{
  "code_name": "HCP-00-042",
  "cluster_id": 0,
  "h_index": 41,
  "institution": "Brigham and Women's Hospital",
  "topics": ["Biomarker Discovery", "Minimal Residual Disease"],
  ...
}
```

#### 4.6.3 Get Doctors by Cluster

```
GET /api/graph/cluster/{cluster_id}/doctors?limit=50
→ { "cluster_id": 0, "doctors": [...], "count": 50 }
```

#### 4.6.4 Get Doctors by Institution

```
GET /api/graph/institution/{institution_name}/doctors?limit=50
→ { "institution": "Mount Sinai", "doctors": [...], "count": 12 }
```

#### 4.6.5 Get Doctors by Topic

```
GET /api/graph/topic/{topic_name}/doctors?limit=50
→ { "topic": "Biomarker Discovery", "doctors": [...], "count": 8 }
```

#### 4.6.6 Institution Summary

```
GET /api/graph/institutions/summary?limit=20
→ { "institutions": [{ "name": "...", "doctor_count": 15, ... }, ...] }
```

#### 4.6.7 Topic Overlap Between Doctors

```
GET /api/graph/overlap?code_name_a=HCP-00-001&code_name_b=HCP-00-042
→ { "doctor_a": "HCP-00-001", "doctor_b": "HCP-00-042", "shared_topics": ["Biomarker Discovery"], "count": 1 }
```

---

### 4.7 Analytics

**Prefix**: `/api/analytics`

#### 4.7.1 List Sessions

```
GET /api/analytics/sessions?limit=50
→ { "sessions": [ { "sessionId": "...", "personaId": "...", "score": 0.8, ... }, ... ] }
```

#### 4.7.2 Get Session Detail

```
GET /api/analytics/session/{session_id}

→ 200 OK
{
  "sessionId": "uuid",
  "personaId": "HCP-00-042",
  "campaignId": "camp-001",
  "clusterId": 0,
  "durationMs": 245000,
  "adherenceScore": 0.72,                         // 0.0–1.0
  "emotionTimeline": [
    { "timestampMs": 0, "userValence": 0.52, "userArousal": 0.34, "avatarResistance": 0.58 },
    ...
  ],
  "totalPoints": 5,
  "deliveredPoints": 3,
  "objections": [
    {
      "id": "mohp-abc12345",
      "objection": "Potential compliance concern...",
      "guideline": "NCCN Category 2A Evidence Requirement",
      "severity": "medium",
      "matched_keywords": ["efficacy"],
      "cluster_source": "0",
      "response_latency_ms": 1200,
      "mohp_aligned": true,
      "user_response": "..."
    }
  ],
  "conversation": [
    { "id": "uuid", "role": "user", "text": "...", "timestamp": 1713350400000, "meta": {} },
    { "id": "uuid", "role": "assistant", "text": "...", "timestamp": 1713350408000, "meta": {} }
  ],
  "campaignSnapshot": { ... }                     // full-evaluate result passed at session start (if any)
}
```

#### 4.7.3 Delete Session

```
DELETE /api/analytics/session/{session_id}

→ 200 OK
{ "deleted": true, "session_id": "uuid" }
```

---

### 4.8 Math Engine (Low-Level)

**Prefix**: `/api/math`

```
POST /api/math/vectorize
Content-Type: application/json

[{doctor_record}, {doctor_record}, ...]
→ vectorized result
```

```
POST /api/math/cluster
Content-Type: application/json

[{doctor_record}, {doctor_record}, ...]
→ GMM clustering result
```

---

### 4.9 Pipeline (Data Ingestion)

**Prefix**: `/api/pipeline`

```
POST /api/pipeline/ingest    → { "status": "queued", "event_id": "uuid" }
POST /api/pipeline/dispatch  → { "processed": true }
POST /api/pipeline/seed      → { "status": "seeded", "records_loaded": 480, "records_ingested": 480, "source_file": "doctors_unified.json" }
```

**`/api/pipeline/seed`** loads the gold doctor dataset into Neo4j. Call this once after first setup to populate the knowledge graph.

---

### 4.10 System Stats

**Prefix**: `/api/stats`

```
GET /api/stats/embedding   → { "model_name": "onnx-minilm", "dimension": 384, ... }
GET /api/stats/projection  → { "ready": true, "input_dim": 384, "output_dim": 12, ... }
GET /api/stats/cache       → { "semantic_cache_keys": 42, "session_keys": 5, "simulation_session_keys": 3 }
GET /api/stats/dlq         → { "dlq_depth": 0 }
GET /api/stats/outbox      → { "pending_outbox_events": 0 }
```

---

### 4.11 Admin — Embeddings

**Prefix**: `/admin/embeddings`

```
GET  /admin/embeddings/status   → { "model_name": "onnx-minilm", "dimension": 384, "known_models": {...} }
POST /admin/embeddings/swap     → swap embedding model (body: { "model_name": "...", "reindex": true })
POST /admin/embeddings/reindex  → re-embed all ChromaDB collections
```

### 4.12 Admin — Dead Letter Queue

```
GET  /admin/dlq?limit=100  → { "items": [...] }
POST /admin/dlq/replay     → { "replayed": true, "item": {...} }
```

---

## 5. Data Models Reference

### 5.1 Doctor Row (from `/api/strategy/cluster/{id}/doctors`)

The strategy cluster endpoint returns full doctor rows from the master CSV. Exact fields:

```json
{
  "cluster_id": 0,
  "name": "Dr. Ahmed Hassan",
  "region": "egypt",
  "headline": "Professor of Medical Oncology, Cairo University",
  "location": "Cairo, Egypt",
  "company": "Cairo University Hospital",
  "job": "Professor of Medical Oncology",
  "school": "Cairo University",
  "school_degree": "MD, PhD",
  "primary_specialty": "Medical Oncology",
  "seniority_level": "Senior",
  "highest_academic_degree": "PhD",
  "total_years_experience": 22,
  "expected_age": 50,
  "age_group": "46-55",
  "current_role_tenure": 8,
  "kol_status": "National KOL",
  "digital_presence": "High",
  "academic_affiliation": "University Professor",
  "workplace_category": "Academic Medical Center",
  "institutional_tier": "Tier 1",
  "adoption_profile": "Early Adopter",
  "channel_preference": "Conference + Digital"
}
```

### 5.2 Campaign Feature Keys — Two Parallel Systems

#### System A: Campaign Feature Keys (snake_case) — strategy & optimization

Used in `POST /api/strategy/vectorize`, `full-evaluate`, `/optimize`, and all strategy endpoints:

```
therapeutic_focus
messaging_tone
target_seniority
channel_preference
kol_alignment
trial_phase_relevance
formulary_impact
patient_population_size
competitive_positioning
regulatory_stage
budget_tier
urgency_score
```

Values are floats in `[0.0, 1.0]`. 0.5 is the neutral baseline.

#### System B: Persona Trait Axes (display names) — persona & War Room

Used in `GET /api/persona/*` responses, radar charts, and simulation persona cards:

```
Scientific Rigor
Innovation Appetite
Guideline Adherence
Price Sensitivity
Risk Tolerance
Peer Influence
Evidence Threshold
Formulary Weight
Patient Centricity
Digital Readiness
KOL Alignment
Trial Participation
```

### 5.3 Optimization Result Shape

Returned as `optimized` when `rejected=true` in `full-evaluate`, or directly from `POST /api/strategy/optimize`:

```json
{
  "original_text": "...",
  "optimized_text": "...",
  "improvements": [
    "Increase therapeutic_focus emphasis from 0.25 toward 0.95.",
    "Reduce budget_tier emphasis from 0.40 toward 0.20."
  ],
  "reason": "heuristic_fallback_no_api_key",
  "reasoning": "Applied deterministic optimization against the largest segment-fit gaps for The Academic Skeptic.",
  "rewrite_rationale": "...",
  "optimized_vector": { "therapeutic_focus": 0.85, ... },
  "optimized_normalized_vector": { "therapeutic_focus": 0.85, ... },
  "retrieved_examples": ["similar past campaign text...", "..."],
  "retrieval_diagnostics": [
    { "campaign_id": "...", "campaign_text": "...", "metadata": {...}, "distance": 0.12 }
  ],
  "target_cluster": 0,
  "target_centroid_vector": [12 floats],
  "alignment_score": 0.87
}
```

**`reason` values**:

- `heuristic_fallback_no_api_key` — Ollama key absent; used deterministic rewrite
- `heuristic_fallback_llm_error` — LLM call failed; fell back to heuristic
- `llm_rewrite` — Full LLM-powered optimization (requires Ollama API key)

### 5.4 Heatmap Ranking Entry

```json
{
  "cluster_id": 1,
  "label": "The Commercial Adopter",
  "distance": 1.23,        // Mahalanobis distance in PCA space (lower = better fit)
  "probability": 0.45      // softmax-normalized probability (higher = better fit)
}
```

> Note: `heatmap.ranking` entries do **not** include `top_doctors` in the current backend implementation.

---

## 6. Suggested Frontend Pages / Views

| Page | Route | Primary Endpoints | Description |
|------|-------|-------------------|-------------|
| **Dashboard** | `/` | `GET /api/stats/*`, `GET /api/analytics/sessions` | System overview + recent simulation sessions |
| **Strategy Lab** | `/strategy` | `POST /api/strategy/full-evaluate` | Multi-tab campaign evaluation (see §8) |
| **Evaluation History** | `/history` | localStorage (client-side store) | Browsable list of all past evaluations (see §7) |
| **HCP Explorer** | `/hcp` | `GET /api/strategy/cluster/{id}/doctors`, `GET /api/persona/{code_name}` | Browse doctors by cluster, view profiles |
| **Knowledge Graph** | `/graph` | `GET /api/graph/*` | Institution/topic/doctor graph visualization |
| **War Room (Simulation)** | `/war-room` | `POST /api/simulation/start` → `turn` loop | Live roleplay practice sessions |
| **Session Review** | `/analytics/session/{id}` | `GET /api/analytics/session/{id}` | Post-simulation review: conversation, emotion timeline, adherence |
| **Admin** | `/admin` | `/admin/embeddings/*`, `/admin/dlq/*`, `/api/stats/*` | Embedding model management, DLQ monitoring |

---

## 7. Evaluation History Feature Specification

> **The backend has no dedicated evaluation history endpoint.** History must be managed client-side using `localStorage` (or IndexedDB via a library like `idb` for larger payloads).
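A minimal sketch of client-side history persistence, assuming entries are kept as one JSON array under the key `orsync:evaluation-history`, newest first. `KeyValueStore`, `HistoryItem`, `loadHistory`, and `addToHistory` are hypothetical names; `KeyValueStore` mirrors just the two Web Storage methods the sketch needs, so `window.localStorage` can be passed in directly.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical minimum every stored entry needs; real entries carry more fields.
interface HistoryItem {
  id: string;
  timestamp: number;
}

const HISTORY_STORAGE_KEY = "orsync:evaluation-history";

// Read the stored array; treat missing, malformed, or non-array data as empty history.
function loadHistory<T extends HistoryItem>(store: KeyValueStore): T[] {
  const raw = store.getItem(HISTORY_STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as T[]) : [];
  } catch {
    return []; // corrupted JSON: start over rather than break the page
  }
}

// Prepend the new entry (most recent first) and persist the whole array.
function addToHistory<T extends HistoryItem>(store: KeyValueStore, entry: T): T[] {
  const next = [entry, ...loadHistory<T>(store)];
  store.setItem(HISTORY_STORAGE_KEY, JSON.stringify(next));
  return next;
}
```

In a Next.js client component this would be called as `addToHistory(window.localStorage, entry)` inside an effect or event handler, since `localStorage` is not available during server rendering.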
### 7.1 Data Structure Each time `POST /api/strategy/full-evaluate` completes successfully, persist a history entry: ```typescript interface EvaluationHistoryEntry { // Identity id: string; // nanoid() or crypto.randomUUID() tabId: string; // which tab originated this evaluation timestamp: number; // Date.now() // Input campaignText: string; region: string | null; rejectionThreshold: number; // Output (full backend response — stored verbatim) result: FullEvaluateResponse; // the complete JSON from /api/strategy/full-evaluate // Derived display fields (pre-computed for list rendering, avoid re-parsing) topClusterName: string; // result.cluster_cards[0].name topClusterScore: number; // result.cluster_cards[0].score rejected: boolean; vectorizationModel: string; } ``` ### 7.2 Storage Key ```typescript const HISTORY_STORAGE_KEY = "orsync:evaluation-history"; ``` Store as a JSON-serialized array. Most recent first. ### 7.3 History Page (`/history`) The history page must: 1. **List all entries** — show: timestamp (formatted), first 80 chars of `campaignText`, `topClusterName`, `topClusterScore` badge (0–100), rejected badge (red) if applicable, region tag 2. **Click to restore** — clicking any entry navigates to `/strategy?historyId={entry.id}` which opens a new tab in the Strategy Lab pre-populated with all the entry's data (no API call needed — data is already stored) 3. **Delete entry** — per-row delete button removes from localStorage 4. **Clear all** — confirm dialog → wipe all history 5. **Search/filter** — filter by campaign text substring, cluster, region, or rejected/accepted 6. 
**Export** — download all entries as a JSON file ### 7.4 History State Schema (Zustand / React Context) ```typescript interface HistoryStore { entries: EvaluationHistoryEntry[]; // Actions addEntry: (entry: EvaluationHistoryEntry) => void; deleteEntry: (id: string) => void; clearAll: () => void; getEntry: (id: string) => EvaluationHistoryEntry | undefined; } ``` Hydrate from `localStorage` on mount. Persist on every write. --- ## 8. Multi-Tab Evaluation Feature Specification The Strategy Lab (`/strategy`) must support multiple simultaneous independent evaluations, each in its own **virtual tab** within the page (not a browser tab — think VS Code-style tabs within the app). ### 8.1 Tab Data Model ```typescript interface EvaluationTab { id: string; // nanoid() label: string; // "Evaluation 1", or user-renamed title createdAt: number; // Form state campaignText: string; region: string | null; rejectionThreshold: number; // Execution state status: "idle" | "loading" | "success" | "error"; error: string | null; // Result (null until first successful evaluation) result: FullEvaluateResponse | null; evaluatedAt: number | null; // Active inner tab (Clusters | Features | Heatmap | GMM Details | Optimization) activeResultTab: string; } ``` ### 8.2 Behaviour | Action | Behaviour | |--------|-----------| | **New Evaluation** button | Creates a new `EvaluationTab` with `status: "idle"`, switches to it | | **Run** in a tab | Calls `POST /api/strategy/full-evaluate` with that tab's form data; sets status to "loading" | | **Success** | Sets `status: "success"`, stores `result`, appends to evaluation history (§7) | | **Error** | Sets `status: "error"`, stores error message | | **Close tab** | Removes tab; confirms if tab has unsaved/un-run changes | | **Rename tab** | Double-click the tab label → inline edit | | **Switch tab** | Immediately switches view; does not interrupt in-flight requests on other tabs | | **Reload** | All tabs and their results are persisted in `localStorage`; 
restored on page load | | **Click cluster card** → doctors | Opens the doctor drawer within the same tab; does not change other tabs | ### 8.3 Persistence Key ```typescript const TABS_STORAGE_KEY = "orsync:strategy-tabs"; ``` Serialize the full `EvaluationTab[]` array to localStorage on every state change. ### 8.4 Default Initial State On first load (empty localStorage): ```typescript [{ id: nanoid(), label: "Evaluation 1", campaignText: "", region: null, rejectionThreshold: 3.0, status: "idle", error: null, result: null, evaluatedAt: null, activeResultTab: "clusters" }] ``` ### 8.5 Result Panel Tabs Once a tab has a successful result, show 5 inner tabs: | Tab Key | Content | |---------|---------| | `clusters` | `cluster_cards` — score cards with name, score badge, distance, probability | | `features` | `campaign_vector_12d` — bar or radar chart using the 12 snake_case feature keys | | `heatmap` | `heatmap.ranking` — cluster heatmap sorted by probability; show distance and probability per row | | `gmm` | `gmm` metadata — k, data_source, n_pca_components, explained_variance_ratio; 2D scatter of `member_points_2d` | | `optimization` | Show if `rejected=true`: full `optimized` object — original vs optimized text diff, improvements list, alignment_score, reason badge | --- ## 9. Typical Frontend Workflows ### Workflow 1: Campaign Evaluation (Primary Flow) ``` 1. User opens /strategy (Strategy Lab) 2. A default tab "Evaluation 1" is shown 3. User types campaign text, optionally selects region and threshold 4. User clicks "Run Evaluation" 5. Frontend calls POST /api/strategy/full-evaluate 6. On success: a. Display result tabs (Clusters, Features, Heatmap, GMM, Optimization) b. Persist entry to history store (localStorage) 7. User optionally: a. Clicks a cluster card → drawer opens → GET /api/strategy/cluster/{id}/doctors b. Clicks "New Evaluation" → new empty tab is created c. 
Navigates to /history to revisit any past evaluation ``` ### Workflow 2: Restore Evaluation from History ``` 1. User navigates to /history 2. Sees list of past evaluations sorted by timestamp (newest first) 3. Clicks any entry 4. App opens /strategy?historyId={id} 5. Strategy Lab reads localStorage, finds entry by id 6. Creates new tab pre-populated with all data (no API call) 7. User can re-run or just view the previous result ``` ### Workflow 3: Simulation (War Room) ``` 1. User picks a doctor from HCP Explorer or from a cluster card doctor list 2. Navigates to /war-room?personaId=HCP-00-042 3. Frontend optionally pre-fetches persona: GET /api/persona/HCP-00-042 4. User clicks "Start Session" 5. POST /api/simulation/start { persona_id: "HCP-00-042", campaign_snapshot: // pass if available } 6. Complete handshake: POST /api/simulation/handshake { session_id, answer: did_offer } 7. Turn loop: - User types message → POST /api/simulation/turn { session_id, input_text } - Display response text (turn.response field) - Optionally show Hume prosody when audio is enabled 8. After session → GET /api/analytics/session/{session_id} 9. Display review: conversation, emotion timeline (userValence/userArousal/avatarResistance), adherenceScore ``` ### Workflow 4: Data Initialization (First Setup) ``` 1. POST /api/pipeline/seed → seeds Neo4j with 480 gold doctors 2. GET /api/stats/embedding → confirm embedding model 3. GET /api/stats/projection → confirm projection bridge is ready 4. GET /api/graph/cluster/0/doctors?limit=5 → verify doctors are in graph ``` --- ## 10. 
Error Handling

All errors follow standard HTTP status codes:

| Code | Meaning | When |
|------|---------|------|
| `400` | Bad Request | Invalid request body, validation errors |
| `404` | Not Found | Doctor/session/resource not found |
| `422` | Unprocessable Entity | Pydantic validation failure (body shape wrong) |
| `500` | Internal Server Error | Unexpected server error, dependency unavailable |

**422 response shape** (FastAPI Pydantic validation error):

```json
{
  "detail": [
    {
      "type": "string_too_short",
      "loc": ["body", "campaign_text"],
      "msg": "String should have at least 1 character",
      "input": "",
      "ctx": { "min_length": 1 }
    }
  ]
}
```

**Graceful degradation**: When Ollama, Redis, Neo4j, or ChromaDB are unavailable, the backend generally does **not** return 500. It falls back to heuristics or returns empty/mock responses. The exception is the graph endpoints, which can return 404/500 when Neo4j is down (see section 13). The frontend should always render something, e.g. show the heuristic optimization result even when `reason = "heuristic_fallback_no_api_key"`.

---

## 11. Environment Variables

The backend reads from `.env` (all have development defaults):

| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | `7860` | Server port |
| `ENVIRONMENT` | `development` | `development` / `staging` / `production` |
| `REDIS_URL` | `redis://localhost:6379/0` | Redis connection |
| `NEO4J_URI` | `bolt://localhost:7687` | Neo4j bolt endpoint |
| `NEO4J_USER` | `neo4j` | Neo4j username |
| `NEO4J_PASSWORD` | `password` | Neo4j password |
| `CHROMA_HOST` | `localhost` | ChromaDB host |
| `CHROMA_PORT` | `8100` | ChromaDB port |
| `OLLAMA_HOST` | `https://ollama.com` | Ollama API endpoint |
| `OLLAMA_API_KEY` | *(empty)* | Ollama Cloud API key (required for LLM features) |
| `OLLAMA_MODEL` | `gemma4:31b-cloud` | LLM model name |
| `EMBEDDING_MODEL` | `onnx-minilm` | Embedding model (`onnx-minilm` runs locally, no key needed) |
| `STRATEGY_REJECTION_DISTANCE_THRESHOLD` | `3.0` | Default campaign rejection threshold |
| `DID_API_KEY` | *(empty)* | D-ID
API key (required for real WebRTC avatar streams) |
| `HUME_API_KEY` | *(empty)* | Hume API key (required for prosody analysis) |
| `CORS_ALLOWED_ORIGINS` | `*` | Comma-separated allowed CORS origins |
| `OUTBOX_TRANSPORT` | `redis` | Outbox transport: `redis` or `amqp` |
| `RABBITMQ_URL` | *(empty)* | RabbitMQ URL when `OUTBOX_TRANSPORT=amqp` |

---

## 12. Frontend Next.js Configuration

When building a Next.js frontend on port 3001, configure API proxying in `next.config.ts`:

```typescript
// next.config.ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  async rewrites() {
    return [
      {
        source: "/api/:path*",
        destination: "http://localhost:7860/api/:path*",
      },
      {
        source: "/admin/:path*",
        destination: "http://localhost:7860/admin/:path*",
      },
    ];
  },
};

export default nextConfig;
```

With this setup all `fetch("/api/strategy/full-evaluate")` calls from the browser are proxied to the backend. No CORS issues. No hardcoded backend URL in client code.

### TypeScript Types Quick Reference

```typescript
// Campaign feature keys (snake_case): use these for strategy/vectorize and full-evaluate
type CampaignFeatureKey =
  | "therapeutic_focus"
  | "messaging_tone"
  | "target_seniority"
  | "channel_preference"
  | "kol_alignment"
  | "trial_phase_relevance"
  | "formulary_impact"
  | "patient_population_size"
  | "competitive_positioning"
  | "regulatory_stage"
  | "budget_tier"
  | "urgency_score";

// Persona trait axes (display names): use these for persona radar charts
type PersonaTraitAxis =
  | "Scientific Rigor"
  | "Innovation Appetite"
  | "Guideline Adherence"
  | "Price Sensitivity"
  | "Risk Tolerance"
  | "Peer Influence"
  | "Evidence Threshold"
  | "Formulary Weight"
  | "Patient Centricity"
  | "Digital Readiness"
  | "KOL Alignment"
  | "Trial Participation";

// Cluster IDs
type ClusterId = 0 | 1 | 2 | 3;

// Cluster card id format
type ClusterCardId = "c0" | "c1" | "c2" | "c3";

interface HeatmapRankingEntry {
  cluster_id: ClusterId;
  label: string;
  distance: number; // Mahalanobis, lower
  // = better
  probability: number; // softmax, higher = better
}

interface ClusterCard {
  id: ClusterCardId; // string: "c0", "c1", etc.
  cluster_id: ClusterId;
  name: string;
  score: number; // 0–100 float (probability × 100)
  distance: number;
  probability: number; // 0–1
}

interface FullEvaluateResponse {
  campaign_vector_12d: number[];
  campaign_vector_pca: number[];
  gmm: {
    k: number;
    data_source: string;
    n_pca_components: number;
    pca_explained_variance_ratio: number[];
    centroids: number[][];
    covariances: number[][][];
    assignments: number[];
    member_points_2d: [number, number][];
  };
  heatmap: { ranking: HeatmapRankingEntry[] }; // NOTE: object, not array
  cluster_cards: ClusterCard[];
  rejection_distance_threshold: number;
  rejected: boolean;
  optimized: OptimizationResult | null;
  vectorization_model: string;
}

interface OptimizationResult {
  original_text: string;
  optimized_text: string;
  improvements: string[];
  reason:
    | "heuristic_fallback_no_api_key"
    | "heuristic_fallback_llm_error"
    | "llm_rewrite";
  reasoning: string;
  rewrite_rationale: string;
  // Type arguments on the three Record fields below were lost in extraction;
  // string keys with number/unknown values are an assumption.
  optimized_vector: Record<string, number>;
  optimized_normalized_vector: Record<string, number>;
  retrieved_examples: string[];
  retrieval_diagnostics: Array<{
    campaign_id: string;
    campaign_text: string;
    metadata: Record<string, unknown>;
    distance: number;
  }>;
  target_cluster: ClusterId | null;
  target_centroid_vector: number[] | null;
  alignment_score: number | null;
}
```

---

## 13.
Fallback Behaviour Reference

| Dependency | Missing/down | Backend behaviour |
|-----------|-------------|-------------------|
| Ollama API key | Not set | Campaign vectorization uses heuristic keyword scoring; optimization uses deterministic heuristic rewrite |
| Ollama model | Unavailable | Same as above (heuristic fallback) |
| D-ID API key | Not set | Simulation returns a mock WebRTC offer; `stream_mode: "mock"` |
| Hume API key | Not set | `prosody` field in turn response is `null`; `hume_prosody_enabled: false` in start response |
| Redis | Down | Session persistence fails silently; analytics endpoints return 404 for sessions |
| Neo4j | Down | Graph endpoints return 404/500; persona endpoints fall back to cluster profile data |
| ChromaDB | Down | RAG retrieval fails silently; optimization uses empty example pool |

---

*End of FRONTEND_INTEGRATION.md. All shapes verified against live backend source code.*
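
The fallback signals listed in section 13 can be turned into user-facing notices with a small helper. The following is a minimal sketch, not backend code: `describeFallbacks`, `FallbackSignals`, and the notice strings are hypothetical names chosen for illustration. It assumes `reason` is read from `optimized` in the full-evaluate response, and that both `stream_mode` and `hume_prosody_enabled` are read from the simulation start response.

```typescript
// Hypothetical helper: these names are illustrative and not part of the backend.
type OptimizationReason =
  | "heuristic_fallback_no_api_key"
  | "heuristic_fallback_llm_error"
  | "llm_rewrite";

interface FallbackSignals {
  optimizationReason?: OptimizationReason | null; // optimized.reason (full-evaluate)
  streamMode?: string | null; // stream_mode (assumed: simulation start response)
  humeProsodyEnabled?: boolean | null; // hume_prosody_enabled (simulation start response)
}

// Returns one notice per degraded dependency; an empty array means no fallback was detected.
function describeFallbacks(signals: FallbackSignals): string[] {
  const notices: string[] = [];

  if (
    signals.optimizationReason === "heuristic_fallback_no_api_key" ||
    signals.optimizationReason === "heuristic_fallback_llm_error"
  ) {
    notices.push("Optimization used the heuristic rewrite (LLM unavailable).");
  }
  if (signals.streamMode === "mock") {
    notices.push("Avatar stream is mocked (no D-ID API key).");
  }
  // Only an explicit `false` counts; a missing field is treated as unknown.
  if (signals.humeProsodyEnabled === false) {
    notices.push("Prosody analysis is disabled (no Hume API key).");
  }
  return notices;
}
```

A page could call `describeFallbacks` after each evaluation or session start and render the returned strings as non-blocking banners, so the result is still shown when a dependency is degraded.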