# Orsync Scenarist Backend — Frontend Integration Reference

> **Version**: v7.0 · **Backend port**: 7860 · **Frontend target**: Next.js (App Router), TypeScript
>
> This is the single authoritative reference for building a frontend against this backend. It covers every endpoint with its exact request/response shape (derived from the live route files), all data models, frontend feature specifications, history/tab management, and recommended page architecture. Build from this document directly — do not guess shapes from the old doc.

---

## 1. Project Overview

**Orsync Scenarist v7.0** is an enterprise pharma strategic intelligence platform. The backend is a Python FastAPI application that exposes six functional areas:

| Area | What it does |
|------|-------------|
| **Strategy Engine** | Vectorize campaign text into a 12D behavioral vector, run GMM clustering on the HCP population, score campaign-to-cluster fit via Mahalanobis distance, auto-optimize rejected campaigns |
| **HCP Knowledge Graph** | Neo4j graph of ~480 doctor nodes with institution, topic, and cluster relationships; queryable by code name, cluster, institution, or topic |
| **Persona Engine** | Returns a 12-feature behavioral profile for any doctor or cluster archetype; used to initialise simulation sessions |
| **Simulation Engine** | WebRTC-style multi-turn roleplay: AI physician persona responds to rep pitches; integrates D-ID avatar streams and Hume prosody analysis |
| **MOHP** | Medical Objection Handling Protocol — given a rep statement, returns compliance objections keyed to cluster-specific guideline databases |
| **Analytics** | Redis-backed session store (7-day TTL); full conversation, emotion timeline, adherence score, and campaign snapshot per session |

**Runtime dependencies** (all optional with graceful fallback):

| Service | Default address | Used for |
|---------|----------------|---------|
| Redis | `redis://localhost:6379/0` | Sessions, semantic cache, outbox, DLQ, event streams |
| Neo4j | `bolt://localhost:7687` | Doctor knowledge graph |
| ChromaDB | `localhost:8100` | Campaign memory (RAG retrieval), approximate semantic cache |
| Ollama | `https://ollama.com` | LLM inference — campaign vectorization, optimization, simulation replies, MOHP |
| D-ID | API key in env | WebRTC avatar stream |
| Hume | API key in env | Prosody / emotion analysis on audio input |

When a dependency is absent, the backend degrades gracefully rather than crashing. See the Fallback Behaviour section for details.

---

## 2. Connection Details

| Setting | Default | Description |
|---------|---------|-------------|
| Base URL | `http://localhost:7860` | Backend API server |
| API Docs | `http://localhost:7860/docs` | Swagger UI (auto-generated) |
| OpenAPI JSON | `http://localhost:7860/openapi.json` | Machine-readable schema |
| CORS | `*` (development) | All origins allowed in dev mode |

---

## 3. Authentication

The backend has **no authentication** in development mode. All endpoints are open. In production, place the backend behind a reverse proxy (nginx, Caddy) that provides your auth layer. CORS is `*` by default; restrict it with `CORS_ALLOWED_ORIGINS`.

---

## 4. Complete API Reference

### 4.1 Health Check

```
GET /healthz

→ { "status": "ok" }
```

---

### 4.2 Strategy Engine (Primary Frontend Flow)

**Prefix**: `/api/strategy`

#### 4.2.1 Full Evaluate (Main Endpoint — Start Here)

This is the **primary endpoint** for the campaign analysis workflow. It takes raw campaign text and returns everything: vectorization, clustering, heatmap, rejection check, and optimization.

```
POST /api/strategy/full-evaluate
Content-Type: application/json

{
  "campaign_text": "string (required, min 1 char)",
  "rejection_distance_threshold": 3.0,       // optional, > 0
  "region": "egypt" | "saudi" | "gulf" | null // optional, filter HCPs by region
}

→ 200 OK
{
  // --- Campaign vector (raw 12D, snake_case FEATURE_KEYS) ---
  "campaign_vector_12d": [0.72, 0.65, 0.55, 0.40, 0.70, 0.60, 0.38, 0.50, 0.45, 0.55, 0.30, 0.42],

  // --- Campaign vector projected into GMM PCA latent space (n_pca_components-dimensional) ---
  "campaign_vector_pca": [1.23, -0.67, 0.11],

  // --- GMM clustering output ---
  "gmm": {
    "k": 4,
    "data_source": "gold",                    // or "synthetic_seed"
    "n_pca_components": 3,
    "pca_explained_variance_ratio": [0.45, 0.25, 0.15],
    "centroids": [[...], [...], [...], [...]],  // k × pca_dim arrays
    "covariances": [[[...]], ...],              // k × pca_dim × pca_dim matrices
    "assignments": [0, 1, 0, 2, 3, ...],       // cluster assignment per doctor in population
    "member_points_2d": [[x, y], ...]           // 2D PCA scatter coords for each population member
  },

  // --- Heatmap (cluster-fit ranking) ---
  // ⚠️  This is an OBJECT with a "ranking" key, NOT a bare array.
  "heatmap": {
    "ranking": [
      {
        "cluster_id": 1,
        "label": "The Commercial Adopter",
        "distance": 1.23,                      // Mahalanobis distance (lower = better fit)
        "probability": 0.45                    // softmax over distances (higher = better fit)
      },
      ...
    ]
  },

  // --- Cluster cards (one per cluster, sorted by probability desc) ---
  "cluster_cards": [
    {
      "id": "c1",                              // ⚠️  STRING, e.g. "c0", "c1", "c2", "c3"
      "cluster_id": 1,
      "name": "The Commercial Adopter",
      "score": 45.0,                           // ⚠️  0–100 float (probability × 100)
      "distance": 1.2300,
      "probability": 0.4500
    },
    ...
  ],

  "rejection_distance_threshold": 3.0,
  "rejected": false,                           // true when min distance > threshold

  // --- Optimization result (non-null ONLY when rejected=true) ---
  "optimized": null,                           // or full optimization object — see §5.3

  "vectorization_model": "gemma4:31b-cloud"    // or "fallback-no-api-key"
}
```
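
Later sections reference a `FullEvaluateResponse` type without defining it, so a minimal TypeScript sketch of the request and response shapes may help. Field names are copied from the example above; the element types of the `gmm` arrays are inferred from the inline comments, and `buildFullEvaluateRequest` and `topClusterCard` are hypothetical helpers, not part of the backend.

```typescript
type Region = "egypt" | "saudi" | "gulf";

interface FullEvaluateRequest {
  campaign_text: string;
  rejection_distance_threshold?: number;
  region?: Region | null;
}

interface ClusterCard {
  id: string;          // string id such as "c1"
  cluster_id: number;
  name: string;
  score: number;       // 0-100
  distance: number;
  probability: number;
}

interface FullEvaluateResponse {
  campaign_vector_12d: number[];
  campaign_vector_pca: number[];
  gmm: {
    k: number;
    data_source: string;
    n_pca_components: number;
    pca_explained_variance_ratio: number[];
    centroids: number[][];
    covariances: number[][][];
    assignments: number[];
    member_points_2d: number[][];
  };
  heatmap: {
    ranking: { cluster_id: number; label: string; distance: number; probability: number }[];
  };
  cluster_cards: ClusterCard[];
  rejection_distance_threshold: number;
  rejected: boolean;
  optimized: Record<string, unknown> | null;
  vectorization_model: string;
}

// Hypothetical helper: applies the documented constraints (text of at least
// 1 character, threshold > 0) on the client before the request is sent.
function buildFullEvaluateRequest(
  campaignText: string,
  options: { threshold?: number; region?: Region | null } = {}
): FullEvaluateRequest {
  if (campaignText.length < 1) {
    throw new Error("campaign_text must contain at least 1 character");
  }
  const body: FullEvaluateRequest = { campaign_text: campaignText };
  if (options.threshold !== undefined) {
    if (!(options.threshold > 0)) {
      throw new Error("rejection_distance_threshold must be > 0");
    }
    body.rejection_distance_threshold = options.threshold;
  }
  if (options.region !== undefined) {
    body.region = options.region;
  }
  return body;
}

// cluster_cards is documented as sorted by probability (descending),
// so the first card is the best fit.
function topClusterCard(res: FullEvaluateResponse): ClusterCard | undefined {
  return res.cluster_cards.length > 0 ? res.cluster_cards[0] : undefined;
}
```

The returned object can be passed to `JSON.stringify` and used as the body of `POST /api/strategy/full-evaluate`.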

#### 4.2.2 Vectorize Campaign Text

> **Important**: Campaign feature keys use `snake_case` (`therapeutic_focus` etc.) — these are different from the persona behavioral display labels used in the War Room. See §5.2 for both sets.

```
POST /api/strategy/vectorize
Content-Type: application/json

{
  "text": "Campaign message text here",
  "campaign_id": "optional-id"   // optional
}

→ 200 OK
{
  "features": {
    "therapeutic_focus":        0.72,
    "messaging_tone":           0.65,
    "target_seniority":         0.55,
    "channel_preference":       0.40,
    "kol_alignment":            0.70,
    "trial_phase_relevance":    0.60,
    "formulary_impact":         0.38,
    "patient_population_size":  0.50,
    "competitive_positioning":  0.45,
    "regulatory_stage":         0.55,
    "budget_tier":              0.30,
    "urgency_score":            0.42
  },
  "normalized_features": { /* same 12 keys, MinMax-scaled to [0,1] */ },
  "embedding": [0.12, -0.34, ...],      // 384-float ONNX MiniLM vector
  "embedding_model": "onnx-minilm",
  "model": "gemma4:31b-cloud"           // or "fallback-no-api-key"
}
```

#### 4.2.3 Build Heatmap

```
POST /api/strategy/heatmap
Content-Type: application/json

{
  "campaign_vector": [12 floats],
  "centroids": [[...], [...], ...],           // from gmm.centroids
  "covariances": [[[...]], [[...]], ...],     // from gmm.covariances
  "cluster_top_doctors": {"0": ["HCP-00-001"], "1": ["HCP-01-042"]}  // optional
}

→ 200 OK
[
  { "cluster_id": 1, "label": "...", "distance": 1.2, "probability": 0.45, "top_doctors": [...] },
  ...
]
```
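
Note that this endpoint returns a bare array, whereas `full-evaluate` wraps the ranking in a `{ "ranking": [...] }` object. A small normalizer can hide that difference from UI components. This is only a sketch: `normalizeRanking` is a hypothetical name, and typing `top_doctors` as `string[]` is an assumption based on the request example.

```typescript
interface RankingEntry {
  cluster_id: number;
  label: string;
  distance: number;
  probability: number;
  top_doctors?: string[]; // assumed element type; only seen on this endpoint
}

// Either documented shape: bare array (heatmap endpoint) or
// { ranking } object (full-evaluate).
type HeatmapPayload = RankingEntry[] | { ranking: RankingEntry[] };

// Always returns a flat array ordered by probability, best fit first.
function normalizeRanking(payload: HeatmapPayload): RankingEntry[] {
  const entries = Array.isArray(payload) ? payload : payload.ranking;
  // Copy before sorting so the original response object is not mutated.
  return entries.slice().sort((a, b) => b.probability - a.probability);
}
```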

#### 4.2.4 Optimize Campaign

```
POST /api/strategy/optimize
Content-Type: application/json

{
  "campaign_text": "Original campaign text",
  "target_cluster": 0,                        // optional — cluster to optimize toward
  "target_centroid_vector": [12 floats]        // optional — explicit target 12D vector
}

→ 200 OK
{
  "original_text": "...",
  "optimized_text": "...",
  "improvements": [
    "Increase therapeutic_focus emphasis from 0.25 toward 0.95.",
    "Reduce budget_tier emphasis from 0.40 toward 0.20."
  ],
  "reason": "heuristic_fallback_no_api_key",  // or "heuristic_fallback_llm_error" | "llm_rewrite"
  "reasoning": "Applied deterministic optimization...",
  "rewrite_rationale": "...",
  "optimized_vector": {
    "therapeutic_focus": 0.85,
    // ... all 12 snake_case feature keys
  },
  "optimized_normalized_vector": { /* same 12 keys, normalized */ },
  "retrieved_examples": ["example text 1", "example text 2"],  // RAG-retrieved similar campaigns
  "retrieval_diagnostics": [
    {
      "campaign_id": "...",
      "campaign_text": "...",
      "metadata": { "outcome": "success", "cluster_id": 0, ... },
      "distance": 0.12
    }
  ],
  "target_cluster": 0,
  "target_centroid_vector": [12 floats],
  "alignment_score": 0.87                     // cosine similarity between optimized vector and target
}
```

> **Note**: When `ollama_api_key` is absent, the optimizer uses a deterministic heuristic rewrite. The `reason` field signals which path was taken.

#### 4.2.5 Evaluate Strategy (Advanced — BYO Centroids)

```
POST /api/strategy/evaluate
Content-Type: application/json

{
  "campaign_text": "string",
  "centroids": [[...], ...],
  "covariances": [[[...]], ...],
  "cluster_top_doctors": null,
  "rejection_distance_threshold": 3.0
}

→ 200 OK
{
  "campaign_vector": {...},
  "heatmap": [...],
  "rejection_distance_threshold": 3.0,
  "rejected": false,
  "optimized": null
}
```

#### 4.2.6 Store Campaign Memory

```
POST /api/strategy/memory/store
Content-Type: application/json

{
  "campaign_text": "string",
  "campaign_id": "optional-id",
  "outcome": "string",
  "success_score": 0.85,     // 0.0–1.0
  "is_successful": true,
  "cluster_id": 0,
  "extra_metadata": {}
}

→ 200 OK
{
  "stored": true,
  "campaign_id": "...",
  "embedding_model": "onnx-minilm",
  "is_successful": true
}
```

#### 4.2.7 Get Cluster Doctors

```
GET /api/strategy/cluster/{cluster_id}/doctors?limit=50&region=egypt

→ 200 OK
{
  "cluster_id": 0,
  "total_in_cluster": 120,
  "total_in_db": 480,
  "k": 4,
  "region": "egypt",
  "doctors": [
    {
      "cluster_id": 0,
      "name": "Dr. Ahmed Hassan",
      "region": "egypt",
      "headline": "...",
      "location": "Cairo",
      "company": "Cairo University Hospital",
      "job": "Professor of Oncology",
      "school": "Cairo University",
      "school_degree": "MD, PhD",
      "primary_specialty": "Medical Oncology",
      "seniority_level": "Senior",
      "highest_academic_degree": "PhD",
      "total_years_experience": 22,
      "expected_age": 50,
      "age_group": "46-55",
      "current_role_tenure": 8,
      "kol_status": "National KOL",
      "digital_presence": "High",
      "academic_affiliation": "University Professor",
      "workplace_category": "Academic Medical Center",
      "institutional_tier": "Tier 1",
      "adoption_profile": "Early Adopter",
      "channel_preference": "Conference + Digital"
    },
    ...
  ]
}
```
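
A minimal sketch of building this URL from the doctor drawer, assuming both query parameters are optional (the reference only shows them together). `clusterDoctorsUrl` is a hypothetical helper; the default base URL is taken from §2.

```typescript
function clusterDoctorsUrl(
  clusterId: number,
  params: { limit?: number; region?: string } = {},
  baseUrl: string = "http://localhost:7860"
): string {
  const query: string[] = [];
  if (params.limit !== undefined) {
    query.push("limit=" + encodeURIComponent(String(params.limit)));
  }
  if (params.region !== undefined) {
    query.push("region=" + encodeURIComponent(params.region));
  }
  const path =
    baseUrl + "/api/strategy/cluster/" + encodeURIComponent(String(clusterId)) + "/doctors";
  // Only append "?" when at least one parameter was supplied.
  return query.length > 0 ? path + "?" + query.join("&") : path;
}
```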

---

### 4.3 Persona Engine

**Prefix**: `/api/persona`

> **Two distinct feature systems exist in this backend** — do not confuse them:
>
> | System | Keys | Used in |
> |--------|------|---------|
> | **Campaign features** | `snake_case` (`therapeutic_focus`, etc.) | `POST /api/strategy/vectorize`, `full-evaluate`, optimization, cluster matching |
> | **Persona behavioral traits** | Display names (`"Scientific Rigor"`, etc.) | `GET /api/persona/*`, War Room radar charts |

#### Persona Behavioral Trait Axes (12 Display-Name Axes)

These are the axes returned by the persona endpoints for radar chart rendering:

| # | Axis | High value means |
|---|------|------------------|
| 0 | Scientific Rigor | Demands strong evidence before acting |
| 1 | Innovation Appetite | Willing to adopt novel treatments early |
| 2 | Guideline Adherence | Strictly follows clinical guidelines |
| 3 | Price Sensitivity | Cost heavily influences decision-making |
| 4 | Risk Tolerance | Comfortable accepting treatment-related risks |
| 5 | Peer Influence | Swayed by what respected colleagues do |
| 6 | Evidence Threshold | Needs more evidence before prescribing |
| 7 | Formulary Weight | Formulary listing is a prerequisite |
| 8 | Patient Centricity | Patient outcomes drive decisions |
| 9 | Digital Readiness | Embraces digital/remote engagement channels |
| 10 | KOL Alignment | Follows Key Opinion Leader guidance |
| 11 | Trial Participation | Actively participates in clinical trials |

#### Cluster Archetypes

| ID | Label | Key trait profile |
|----|-------|-------------------|
| 0 | The Academic Skeptic | Scientific Rigor 0.95, Evidence Threshold 0.92, KOL Alignment 0.85, Innovation Appetite 0.30 |
| 1 | The Commercial Adopter | Innovation Appetite 0.92, Digital Readiness 0.88, Risk Tolerance 0.85, Evidence Threshold 0.45 |
| 2 | The Guideline Loyalist | Guideline Adherence 0.95, Peer Influence 0.80, Evidence Threshold 0.75, Risk Tolerance 0.10 |
| 3 | Price-Sensitive Pragmatist | Price Sensitivity 0.95, Formulary Weight 0.90, Patient Centricity 0.80, KOL Alignment 0.35 |

#### 4.3.1 Get Persona from Cluster

```
GET /api/persona/from-cluster/{cluster_id}

→ 200 OK
{
  "codeName": "HCP-00-042",
  "clusterId": 0,
  "clusterLabel": "The Academic Skeptic",
  "traits": [
    { "axis": "Scientific Rigor", "value": 0.92 },
    { "axis": "Innovation Appetite", "value": 0.35 },
    ...  // 12 total
  ]
}
```
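
For radar charts and persona cards, the `traits` array can be reshaped on the client. The sketch below assumes trait values lie in `[0, 1]`, as in the examples; `topTraits` and `toRadarSeries` are hypothetical helper names.

```typescript
interface PersonaTrait {
  axis: string;
  value: number;
}

interface RadarPoint {
  label: string;
  percent: number; // 0-100, rounded for display
}

// The n strongest traits, highest value first (input is not mutated).
function topTraits(traits: PersonaTrait[], n: number): PersonaTrait[] {
  return traits
    .slice()
    .sort((a, b) => b.value - a.value)
    .slice(0, n);
}

// One radar point per axis, keeping the order returned by the backend.
function toRadarSeries(traits: PersonaTrait[]): RadarPoint[] {
  return traits.map((t) => ({ label: t.axis, percent: Math.round(t.value * 100) }));
}
```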

#### 4.3.2 Get Specific Doctor Persona

```
GET /api/persona/{code_name}

→ 200 OK
{
  "codeName": "HCP-00-042",
  "clusterId": 0,
  "clusterLabel": "The Academic Skeptic",
  "traits": [...],
  "h_index": 41,
  "works_count": 129,
  "cited_by_count": 6958,
  "years_active": 16
}
```

---

### 4.4 Simulation Engine (WebRTC-Style Roleplay)

**Prefix**: `/api/simulation`

The simulation flow is a multi-step conversation:

```
┌─────────────────────────────────────────────────┐
│  1. POST /api/simulation/start                  │
│     → Returns session_id + WebRTC offer         │
│                                                 │
│  2. POST /api/simulation/handshake              │
│     → Complete WebRTC signaling                 │
│                                                 │
│  3. POST /api/simulation/ice-candidate          │
│     → Exchange ICE candidates (can repeat)      │
│                                                 │
│  4. POST /api/simulation/turn   ← REPEAT        │
│     → Send rep's message, get AI persona reply  │
│                                                 │
│  (Optional) GET /api/simulation/cache/{key}     │
│     → Check semantic cache for similar turns    │
└─────────────────────────────────────────────────┘
```
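
The order in the diagram can be mirrored by a small client-side guard so the UI never sends a turn before the handshake completes. This is a sketch under the assumption that the numbered order is mandatory; the reference does not state whether the backend enforces it. `advancePhase` and the phase names are hypothetical.

```typescript
type SimulationPhase = "idle" | "started" | "connected";
type SimulationStep = "start" | "handshake" | "ice-candidate" | "turn";

// Returns the phase after `step`, or throws if the step is out of order.
function advancePhase(phase: SimulationPhase, step: SimulationStep): SimulationPhase {
  if (step === "start") {
    if (phase !== "idle") throw new Error("session already started");
    return "started";
  }
  if (step === "handshake") {
    if (phase !== "started") throw new Error("call /start before /handshake");
    return "connected";
  }
  // "ice-candidate" and "turn" follow the handshake in the diagram and may repeat.
  if (phase !== "connected") {
    throw new Error("complete the handshake before " + step);
  }
  return phase;
}
```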

#### 4.4.1 Start Simulation

```
POST /api/simulation/start
Content-Type: application/json

{
  "persona_id": "HCP-00-042",       // required — doctor code_name from cluster doctors
  "campaign_id": "camp-001",        // optional
  "campaign_snapshot": { ... }      // optional — full full-evaluate result for session replay
}

→ 200 OK
{
  "session_id": "uuid-string",
  "started_at_epoch": 1713350400,
  "target_handshake_ms": 400,
  "did_payload": { ... },            // D-ID stream creation payload
  "did_offer": {
    "type": "offer",
    "sdp": "v=0\r\n..."             // WebRTC SDP offer (mock when DID_API_KEY absent)
  },
  "did_ice_servers": [],             // ICE servers (empty in mock mode)
  "hume_prosody_enabled": false,     // true when HUME_API_KEY is set
  "semantic_cache_similarity_threshold": 0.95,
  "stream_mode": "mock"              // "mock" or "did"
}
```

#### 4.4.2 Complete Handshake

```
POST /api/simulation/handshake
Content-Type: application/json

{
  "session_id": "uuid-string",
  "answer": { "type": "answer", "sdp": "..." }   // WebRTC SDP answer
}

→ 200 OK
{
  "session_id": "uuid-string",
  "status": "connected",
  "result": { "connected": true, "mode": "mock" }   // mock mode when no DID key
}
```

#### 4.4.3 ICE Candidate Exchange

```
POST /api/simulation/ice-candidate
Content-Type: application/json

{
  "session_id": "uuid-string",
  "candidate": { ... }          // ICE candidate object
}

→ 200 OK
{ "accepted": true }
```

#### 4.4.4 Simulation Turn (Core Loop)

```
POST /api/simulation/turn
Content-Type: application/json

{
  "session_id": "uuid-string",
  "input_text": "Doctor, our Phase 3 trial showed...",   // default: empty string
  "input_audio_base64": ""                                // optional audio for Hume prosody
}

→ 200 OK
{
  "session_id": "uuid-string",
  "cache_hit": false,              // true if a semantically similar response was cached
  "cache_entry_id": "uuid|null",
  "cache_similarity": 0.0,         // cosine similarity to cached entry (0.0 if not cached)
  "response": "Interesting, but I need to see the full subgroup analysis...",   // AI persona reply text
  "audio": null,                   // reserved for future D-ID audio output
  "prosody": null                  // Hume prosody result (null when no audio or no Hume key)
}
```

> **Note**: To get adherence score, emotion timeline, and full conversation history for a session, call `GET /api/analytics/session/{session_id}` after the turn loop ends.

#### 4.4.5 Check Semantic Cache

```
GET /api/simulation/cache/{cache_key}

→ 200 OK
{
  "hit": true,
  "value": "cached response text"
}
```

---

### 4.5 MOHP — Objection Detection

**Prefix**: `/api/mohp`

```
POST /api/mohp/evaluate
Content-Type: application/json

{
  "session_id": "uuid-string",
  "input_text": "This drug is 100% effective with no side effects",
  "cluster_id": 0,               // 0–7
  "persona_id": ""               // optional
}

→ 200 OK
{
  "session_id": "uuid-string",
  "objections": [
    {
      "id": "obj-uuid",
      "timestamp": "2026-04-17T12:00:00Z",
      "objection": "Absolute efficacy claim without evidence",
      "guideline": "Avoid absolute claims — cite specific trial data",
      "severity": "high",
      "matched_keywords": ["100%", "no side effects"]
    }
  ],
  "count": 1
}
```
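
When several objections come back, the UI will probably want the most severe first. The sketch below assumes a severity order of `high`, `medium`, `low`; only `high` and `medium` appear in this reference, so `low` and the fallback for unknown values are assumptions. `sortBySeverity` is a hypothetical helper.

```typescript
interface MohpObjection {
  id: string;
  timestamp: string;
  objection: string;
  guideline: string;
  severity: string;
  matched_keywords: string[];
}

// Assumed ordering; "low" is not shown anywhere in this document.
const SEVERITY_RANK: { [severity: string]: number } = { high: 0, medium: 1, low: 2 };

function sortBySeverity(objections: MohpObjection[]): MohpObjection[] {
  const rank = (severity: string): number => {
    const r = SEVERITY_RANK[severity];
    return r === undefined ? 99 : r; // unknown severities sort last
  };
  return objections.slice().sort((a, b) => rank(a.severity) - rank(b.severity));
}
```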

---

### 4.6 Knowledge Graph

**Prefix**: `/api/graph`

#### 4.6.1 Ingest Doctors into Graph

```
POST /api/graph/ingest
Content-Type: application/json

{ "records": [ { ...doctor data... }, ... ] }

→ { "status": "ok", "ingested": 120 }
```

#### 4.6.2 Get Doctor from Graph

```
GET /api/graph/doctor/{code_name}

→ 200 OK
{
  "code_name": "HCP-00-042",
  "cluster_id": 0,
  "h_index": 41,
  "institution": "Brigham and Women's Hospital",
  "topics": ["Biomarker Discovery", "Minimal Residual Disease"],
  ...
}
```

#### 4.6.3 Get Doctors by Cluster

```
GET /api/graph/cluster/{cluster_id}/doctors?limit=50

→ { "cluster_id": 0, "doctors": [...], "count": 50 }
```

#### 4.6.4 Get Doctors by Institution

```
GET /api/graph/institution/{institution_name}/doctors?limit=50

→ { "institution": "Mount Sinai", "doctors": [...], "count": 12 }
```

#### 4.6.5 Get Doctors by Topic

```
GET /api/graph/topic/{topic_name}/doctors?limit=50

→ { "topic": "Biomarker Discovery", "doctors": [...], "count": 8 }
```

#### 4.6.6 Institution Summary

```
GET /api/graph/institutions/summary?limit=20

→ { "institutions": [{ "name": "...", "doctor_count": 15, ... }, ...] }
```

#### 4.6.7 Topic Overlap Between Doctors

```
GET /api/graph/overlap?code_name_a=HCP-00-001&code_name_b=HCP-00-042

→ {
  "doctor_a": "HCP-00-001",
  "doctor_b": "HCP-00-042",
  "shared_topics": ["Biomarker Discovery"],
  "count": 1
}
```

---

### 4.7 Analytics

**Prefix**: `/api/analytics`

#### 4.7.1 List Sessions

```
GET /api/analytics/sessions?limit=50

→ { "sessions": [ { "sessionId": "...", "personaId": "...", "score": 0.8, ... }, ... ] }
```

#### 4.7.2 Get Session Detail

```
GET /api/analytics/session/{session_id}

→ 200 OK
{
  "sessionId": "uuid",
  "personaId": "HCP-00-042",
  "campaignId": "camp-001",
  "clusterId": 0,
  "durationMs": 245000,
  "adherenceScore": 0.72,                  // 0.0–1.0
  "emotionTimeline": [
    {
      "timestampMs": 0,
      "userValence": 0.52,
      "userArousal": 0.34,
      "avatarResistance": 0.58
    },
    ...
  ],
  "totalPoints": 5,
  "deliveredPoints": 3,
  "objections": [
    {
      "id": "mohp-abc12345",
      "objection": "Potential compliance concern...",
      "guideline": "NCCN Category 2A Evidence Requirement",
      "severity": "medium",
      "matched_keywords": ["efficacy"],
      "cluster_source": "0",
      "response_latency_ms": 1200,
      "mohp_aligned": true,
      "user_response": "..."
    }
  ],
  "conversation": [
    { "id": "uuid", "role": "user", "text": "...", "timestamp": 1713350400000, "meta": {} },
    { "id": "uuid", "role": "assistant", "text": "...", "timestamp": 1713350408000, "meta": {} }
  ],
  "campaignSnapshot": { ... }              // full-evaluate result passed at session start (if any)
}
```
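
For the Session Review header, a few display values can be derived from this payload. The following is a sketch only: `summarizeSession` is a hypothetical helper, and averaging `avatarResistance` is an illustrative choice rather than a metric defined by the backend.

```typescript
interface EmotionPoint {
  timestampMs: number;
  userValence: number;
  userArousal: number;
  avatarResistance: number;
}

interface SessionSlice {
  durationMs: number;
  adherenceScore: number; // 0.0-1.0
  totalPoints: number;
  deliveredPoints: number;
  emotionTimeline: EmotionPoint[];
}

interface SessionSummary {
  duration: string; // "m:ss"
  adherencePercent: number;
  deliveredRatio: number;
  meanAvatarResistance: number | null; // null when the timeline is empty
}

function summarizeSession(s: SessionSlice): SessionSummary {
  const totalSeconds = Math.round(s.durationMs / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  let resistanceSum = 0;
  for (const point of s.emotionTimeline) {
    resistanceSum += point.avatarResistance;
  }
  const n = s.emotionTimeline.length;
  return {
    duration: minutes + ":" + (seconds < 10 ? "0" : "") + seconds,
    adherencePercent: Math.round(s.adherenceScore * 100),
    // Guard against division by zero when no talking points were defined.
    deliveredRatio: s.totalPoints > 0 ? s.deliveredPoints / s.totalPoints : 0,
    meanAvatarResistance: n === 0 ? null : resistanceSum / n,
  };
}
```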

#### 4.7.3 Delete Session

```
DELETE /api/analytics/session/{session_id}

→ 200 OK
{ "deleted": true, "session_id": "uuid" }
```

---

### 4.8 Math Engine (Low-Level)

**Prefix**: `/api/math`

```
POST /api/math/vectorize
Content-Type: application/json
[{doctor_record}, {doctor_record}, ...]

→ vectorized result
```

```
POST /api/math/cluster
Content-Type: application/json
[{doctor_record}, {doctor_record}, ...]

→ GMM clustering result
```

---

### 4.9 Pipeline (Data Ingestion)

**Prefix**: `/api/pipeline`

```
POST /api/pipeline/ingest     → { "status": "queued", "event_id": "uuid" }
POST /api/pipeline/dispatch   → { "processed": true }
POST /api/pipeline/seed       → { "status": "seeded", "records_loaded": 480, "records_ingested": 480, "source_file": "doctors_unified.json" }
```

**`/api/pipeline/seed`** loads the gold doctor dataset into Neo4j. Call this once after first setup to populate the knowledge graph.

---

### 4.10 System Stats

**Prefix**: `/api/stats`

```
GET /api/stats/embedding    → { "model_name": "onnx-minilm", "dimension": 384, ... }
GET /api/stats/projection   → { "ready": true, "input_dim": 384, "output_dim": 12, ... }
GET /api/stats/cache        → { "semantic_cache_keys": 42, "session_keys": 5, "simulation_session_keys": 3 }
GET /api/stats/dlq          → { "dlq_depth": 0 }
GET /api/stats/outbox       → { "pending_outbox_events": 0 }
```

---

### 4.11 Admin — Embeddings

**Prefix**: `/admin/embeddings`

```
GET  /admin/embeddings/status  → { "model_name": "onnx-minilm", "dimension": 384, "known_models": {...} }
POST /admin/embeddings/swap    → swap embedding model (body: { "model_name": "...", "reindex": true })
POST /admin/embeddings/reindex → re-embed all ChromaDB collections
```

### 4.12 Admin — Dead Letter Queue

```
GET  /admin/dlq?limit=100       → { "items": [...] }
POST /admin/dlq/replay          → { "replayed": true, "item": {...} }
```

---

## 5. Data Models Reference

### 5.1 Doctor Row (from `/api/strategy/cluster/{id}/doctors`)

The strategy cluster endpoint returns full doctor rows from the master CSV. Exact fields:

```json
{
  "cluster_id": 0,
  "name": "Dr. Ahmed Hassan",
  "region": "egypt",
  "headline": "Professor of Medical Oncology, Cairo University",
  "location": "Cairo, Egypt",
  "company": "Cairo University Hospital",
  "job": "Professor of Medical Oncology",
  "school": "Cairo University",
  "school_degree": "MD, PhD",
  "primary_specialty": "Medical Oncology",
  "seniority_level": "Senior",
  "highest_academic_degree": "PhD",
  "total_years_experience": 22,
  "expected_age": 50,
  "age_group": "46-55",
  "current_role_tenure": 8,
  "kol_status": "National KOL",
  "digital_presence": "High",
  "academic_affiliation": "University Professor",
  "workplace_category": "Academic Medical Center",
  "institutional_tier": "Tier 1",
  "adoption_profile": "Early Adopter",
  "channel_preference": "Conference + Digital"
}
```

### 5.2 Feature Keys: Two Parallel Systems

#### System A: Campaign Feature Keys (snake_case) — strategy & optimization

Used in `POST /api/strategy/vectorize`, `full-evaluate`, `/optimize`, and all strategy endpoints:

```
therapeutic_focus        messaging_tone          target_seniority
channel_preference       kol_alignment           trial_phase_relevance
formulary_impact         patient_population_size competitive_positioning
regulatory_stage         budget_tier             urgency_score
```

Values are floats in `[0.0, 1.0]`. 0.5 is the neutral baseline.
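
To render `campaign_vector_12d` as a labelled bar chart, the raw array has to be paired with these keys. The sketch below assumes the array index order matches the key order of the `/vectorize` `features` example; the two examples in this document carry the same values in the same order, which is consistent with that assumption but does not confirm it. `vectorToBars` is a hypothetical helper.

```typescript
// Key order as listed in the /api/strategy/vectorize response example.
const FEATURE_KEYS = [
  "therapeutic_focus",
  "messaging_tone",
  "target_seniority",
  "channel_preference",
  "kol_alignment",
  "trial_phase_relevance",
  "formulary_impact",
  "patient_population_size",
  "competitive_positioning",
  "regulatory_stage",
  "budget_tier",
  "urgency_score",
] as const;

type FeatureKey = (typeof FEATURE_KEYS)[number];

interface FeatureBar {
  key: FeatureKey;
  value: number;
  deltaFromNeutral: number; // value minus the 0.5 neutral baseline
}

function vectorToBars(vector: number[]): FeatureBar[] {
  if (vector.length !== FEATURE_KEYS.length) {
    throw new Error("expected " + FEATURE_KEYS.length + " values, got " + vector.length);
  }
  return vector.map((value, i) => ({
    key: FEATURE_KEYS[i] as FeatureKey,
    value: value,
    deltaFromNeutral: value - 0.5,
  }));
}
```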

#### System B: Persona Trait Axes (display names) — persona & War Room

Used in `GET /api/persona/*` responses, radar charts, and simulation persona cards:

```
Scientific Rigor    Innovation Appetite    Guideline Adherence    Price Sensitivity
Risk Tolerance      Peer Influence         Evidence Threshold     Formulary Weight
Patient Centricity  Digital Readiness      KOL Alignment          Trial Participation
```

### 5.3 Optimization Result Shape

When `rejected=true` in `full-evaluate`, or from `POST /api/strategy/optimize`:

```json
{
  "original_text": "...",
  "optimized_text": "...",
  "improvements": [
    "Increase therapeutic_focus emphasis from 0.25 toward 0.95.",
    "Reduce budget_tier emphasis from 0.40 toward 0.20."
  ],
  "reason": "heuristic_fallback_no_api_key",
  "reasoning": "Applied deterministic optimization against the largest segment-fit gaps for The Academic Skeptic.",
  "rewrite_rationale": "...",
  "optimized_vector": { "therapeutic_focus": 0.85, ... },
  "optimized_normalized_vector": { "therapeutic_focus": 0.85, ... },
  "retrieved_examples": ["similar past campaign text...", "..."],
  "retrieval_diagnostics": [
    { "campaign_id": "...", "campaign_text": "...", "metadata": {...}, "distance": 0.12 }
  ],
  "target_cluster": 0,
  "target_centroid_vector": [12 floats],
  "alignment_score": 0.87
}
```

**`reason` values**:
- `heuristic_fallback_no_api_key` — Ollama key absent; used deterministic rewrite
- `heuristic_fallback_llm_error` — LLM call failed; fell back to heuristic
- `llm_rewrite` — Full LLM-powered optimization (requires Ollama API key)
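
The Optimization tab's reason badge can be driven directly by these three values. In the sketch below the labels and tones are illustrative choices, and `reasonBadge` is a hypothetical helper; unknown values fall through unchanged so a newer backend does not break the UI.

```typescript
interface ReasonBadge {
  label: string;
  tone: "success" | "warning" | "info";
}

function reasonBadge(reason: string): ReasonBadge {
  if (reason === "llm_rewrite") {
    return { label: "LLM rewrite", tone: "success" };
  }
  if (reason === "heuristic_fallback_no_api_key") {
    return { label: "Heuristic (no API key)", tone: "info" };
  }
  if (reason === "heuristic_fallback_llm_error") {
    return { label: "Heuristic (LLM error)", tone: "warning" };
  }
  // Not one of the documented values: show it verbatim.
  return { label: reason, tone: "info" };
}
```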

### 5.4 Heatmap Ranking Entry

```json
{
  "cluster_id": 1,
  "label": "The Commercial Adopter",
  "distance": 1.23,     // Mahalanobis distance in PCA space (lower = better fit)
  "probability": 0.45   // softmax-normalized probability (higher = better fit)
}
```

> Note: `heatmap.ranking` entries do **not** include `top_doctors` in the current backend implementation.

---

## 6. Suggested Frontend Pages / Views

| Page | Route | Primary Endpoints | Description |
|------|-------|-------------------|-------------|
| **Dashboard** | `/` | `GET /api/stats/*`, `GET /api/analytics/sessions` | System overview + recent simulation sessions |
| **Strategy Lab** | `/strategy` | `POST /api/strategy/full-evaluate` | Multi-tab campaign evaluation (see §8) |
| **Evaluation History** | `/history` | localStorage (client-side store) | Browsable list of all past evaluations (see §7) |
| **HCP Explorer** | `/hcp` | `GET /api/strategy/cluster/{id}/doctors`, `GET /api/persona/{code_name}` | Browse doctors by cluster, view profiles |
| **Knowledge Graph** | `/graph` | `GET /api/graph/*` | Institution/topic/doctor graph visualization |
| **War Room (Simulation)** | `/war-room` | `POST /api/simulation/start` → `turn` loop | Live roleplay practice sessions |
| **Session Review** | `/analytics/session/{id}` | `GET /api/analytics/session/{id}` | Post-simulation review: conversation, emotion timeline, adherence |
| **Admin** | `/admin` | `/admin/embeddings/*`, `/admin/dlq/*`, `/api/stats/*` | Embedding model management, DLQ monitoring |

---

## 7. Evaluation History Feature Specification

> **The backend has no dedicated evaluation history endpoint.** History must be managed client-side using `localStorage` (or IndexedDB via a library like `idb` for larger payloads).

### 7.1 Data Structure

Each time `POST /api/strategy/full-evaluate` completes successfully, persist a history entry:

```typescript
interface EvaluationHistoryEntry {
  // Identity
  id: string;                    // nanoid() or crypto.randomUUID()
  tabId: string;                 // which tab originated this evaluation
  timestamp: number;             // Date.now()

  // Input
  campaignText: string;
  region: string | null;
  rejectionThreshold: number;

  // Output (full backend response — stored verbatim)
  result: FullEvaluateResponse;  // the complete JSON from /api/strategy/full-evaluate

  // Derived display fields (pre-computed for list rendering, avoid re-parsing)
  topClusterName: string;        // result.cluster_cards[0].name
  topClusterScore: number;       // result.cluster_cards[0].score
  rejected: boolean;
  vectorizationModel: string;
}
```
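
One way to build such an entry is a pure function that takes the tab's form data and the backend response. This is a sketch: `toHistoryEntry` is a hypothetical helper, the id generator is injected (for example `() => crypto.randomUUID()` in the browser), and the empty-string and zero fallbacks for a response without cluster cards are assumptions.

```typescript
interface EvaluateResultLite {
  cluster_cards: { name: string; score: number }[];
  rejected: boolean;
  vectorization_model: string;
}

interface HistoryEntry<R extends EvaluateResultLite> {
  id: string;
  tabId: string;
  timestamp: number;
  campaignText: string;
  region: string | null;
  rejectionThreshold: number;
  result: R;
  topClusterName: string;
  topClusterScore: number;
  rejected: boolean;
  vectorizationModel: string;
}

function toHistoryEntry<R extends EvaluateResultLite>(
  input: { tabId: string; campaignText: string; region: string | null; rejectionThreshold: number },
  result: R,
  makeId: () => string,
  now: () => number = Date.now
): HistoryEntry<R> {
  // cluster_cards is sorted by probability (descending), so index 0 is the top cluster.
  const top = result.cluster_cards.length > 0 ? result.cluster_cards[0] : undefined;
  return {
    id: makeId(),
    tabId: input.tabId,
    timestamp: now(),
    campaignText: input.campaignText,
    region: input.region,
    rejectionThreshold: input.rejectionThreshold,
    result: result,
    topClusterName: top ? top.name : "",
    topClusterScore: top ? top.score : 0,
    rejected: result.rejected,
    vectorizationModel: result.vectorization_model,
  };
}
```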

### 7.2 Storage Key

```typescript
const HISTORY_STORAGE_KEY = "orsync:evaluation-history";
```

Store as a JSON-serialized array. Most recent first.

### 7.3 History Page (`/history`)

The history page must:

1. **List all entries** — show: timestamp (formatted), first 80 chars of `campaignText`, `topClusterName`, `topClusterScore` badge (0–100), rejected badge (red) if applicable, region tag
2. **Click to restore** — clicking any entry navigates to `/strategy?historyId={entry.id}` which opens a new tab in the Strategy Lab pre-populated with all the entry's data (no API call needed — data is already stored)
3. **Delete entry** — per-row delete button removes from localStorage
4. **Clear all** — confirm dialog → wipe all history
5. **Search/filter** — filter by campaign text substring, cluster, region, or rejected/accepted
6. **Export** — download all entries as a JSON file
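
The search/filter requirement can be met with a single pure function over the stored entries. This is a minimal sketch: `filterHistory` and the `HistoryFilter` shape are hypothetical, text matching is a case-insensitive substring test, and the cluster filter compares against `topClusterName`.

```typescript
interface HistoryRow {
  campaignText: string;
  topClusterName: string;
  region: string | null;
  rejected: boolean;
}

// Every field is optional; an omitted field does not restrict the result.
interface HistoryFilter {
  text?: string;
  cluster?: string;
  region?: string | null;
  rejected?: boolean;
}

function filterHistory<T extends HistoryRow>(rows: T[], f: HistoryFilter): T[] {
  const needle = f.text ? f.text.toLowerCase() : "";
  return rows.filter(
    (row) =>
      (needle === "" || row.campaignText.toLowerCase().indexOf(needle) !== -1) &&
      (f.cluster === undefined || row.topClusterName === f.cluster) &&
      (f.region === undefined || row.region === f.region) &&
      (f.rejected === undefined || row.rejected === f.rejected)
  );
}
```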

### 7.4 History State Schema (Zustand / React Context)

```typescript
interface HistoryStore {
  entries: EvaluationHistoryEntry[];

  // Actions
  addEntry: (entry: EvaluationHistoryEntry) => void;
  deleteEntry: (id: string) => void;
  clearAll: () => void;
  getEntry: (id: string) => EvaluationHistoryEntry | undefined;
}
```

Hydrate from `localStorage` on mount. Persist on every write.
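
The hydrate and persist steps can be sketched as plain functions over a storage object, which keeps them testable outside the browser. `KeyValueStore` is a hypothetical interface covering the two `localStorage` methods used here; the function names are illustrative, and treating corrupt JSON as an empty history is an assumption.

```typescript
// Subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface StoredEntry {
  id: string;
  timestamp: number;
}

const HISTORY_KEY = "orsync:evaluation-history";

// Hydrate: returns [] when nothing is stored or the stored value is unusable.
function loadHistory<E extends StoredEntry>(store: KeyValueStore): E[] {
  const raw = store.getItem(HISTORY_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as E[]) : [];
  } catch (_err) {
    return [];
  }
}

// Persist on write: newest entry first, replacing any entry with the same id.
function addHistoryEntry<E extends StoredEntry>(store: KeyValueStore, entry: E): E[] {
  const rest = loadHistory<E>(store).filter((e) => e.id !== entry.id);
  const next = [entry].concat(rest);
  store.setItem(HISTORY_KEY, JSON.stringify(next));
  return next;
}

function deleteHistoryEntry<E extends StoredEntry>(store: KeyValueStore, id: string): E[] {
  const next = loadHistory<E>(store).filter((e) => e.id !== id);
  store.setItem(HISTORY_KEY, JSON.stringify(next));
  return next;
}
```

In the browser, `setItem` can throw when the storage quota is exceeded, so a production store may want to catch that and trim old entries or move to IndexedDB.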

---

## 8. Multi-Tab Evaluation Feature Specification

The Strategy Lab (`/strategy`) must support multiple simultaneous independent evaluations, each in its own **virtual tab** within the page (not a browser tab — think VS Code-style tabs within the app).

### 8.1 Tab Data Model

```typescript
interface EvaluationTab {
  id: string;                        // nanoid()
  label: string;                     // "Evaluation 1", or user-renamed title
  createdAt: number;

  // Form state
  campaignText: string;
  region: string | null;
  rejectionThreshold: number;

  // Execution state
  status: "idle" | "loading" | "success" | "error";
  error: string | null;

  // Result (null until first successful evaluation)
  result: FullEvaluateResponse | null;
  evaluatedAt: number | null;

  // Active inner tab key (see §8.5)
  activeResultTab: "clusters" | "features" | "heatmap" | "gmm" | "optimization";
}
```

### 8.2 Behaviour

| Action | Behaviour |
|--------|-----------|
| **New Evaluation** button | Creates a new `EvaluationTab` with `status: "idle"`, switches to it |
| **Run** in a tab | Calls `POST /api/strategy/full-evaluate` with that tab's form data; sets status to "loading" |
| **Success** | Sets `status: "success"`, stores `result`, appends to evaluation history (§7) |
| **Error** | Sets `status: "error"`, stores error message |
| **Close tab** | Removes tab; confirms if tab has unsaved/un-run changes |
| **Rename tab** | Double-click the tab label → inline edit |
| **Switch tab** | Immediately switches view; does not interrupt in-flight requests on other tabs |
| **Reload** | All tabs and their results are persisted in `localStorage`; restored on page load |
| **Click cluster card** → doctors | Opens the doctor drawer within the same tab; does not change other tabs |

### 8.3 Persistence Key

```typescript
const TABS_STORAGE_KEY = "orsync:strategy-tabs";
```

Serialize the full `EvaluationTab[]` array to localStorage on every state change.

### 8.4 Default Initial State

On first load (empty localStorage):

```typescript
[{
  id: nanoid(),
  label: "Evaluation 1",
  createdAt: Date.now(),
  campaignText: "",
  region: null,
  rejectionThreshold: 3.0,
  status: "idle",
  error: null,
  result: null,
  evaluatedAt: null,
  activeResultTab: "clusters"
}]
```

### 8.5 Result Panel Tabs

Once a tab has a successful result, show 5 inner tabs:

| Tab Key | Content |
|---------|---------|
| `clusters` | `cluster_cards` — score cards with name, score badge, distance, probability |
| `features` | `campaign_vector_12d` — bar or radar chart using the 12 snake_case feature keys |
| `heatmap` | `heatmap.ranking` — cluster heatmap sorted by probability; show distance and probability per row |
| `gmm` | `gmm` metadata: `k`, `data_source`, `n_pca_components`, `pca_explained_variance_ratio`; 2D scatter of `member_points_2d` |
| `optimization` | Show only if `rejected` is `true`: the full `optimized` object, i.e. original vs optimized text diff, `improvements` list, `alignment_score`, and `reason` badge |
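
The tab keys and the visibility rule for the optimization content can be captured in a few lines. This is a sketch under assumptions: `RESULT_TAB_KEYS`, `isResultTabKey`, and `hasOptimizationContent` are hypothetical names, and the extra null check on `optimized` is a defensive choice rather than documented backend behaviour.

```typescript
// The five inner tab keys from the table above, in display order.
const RESULT_TAB_KEYS = ["clusters", "features", "heatmap", "gmm", "optimization"] as const;
type ResultTabKey = (typeof RESULT_TAB_KEYS)[number];

// Guards values read back from storage or a URL before using them as a tab key.
function isResultTabKey(value: string): value is ResultTabKey {
  return (RESULT_TAB_KEYS as readonly string[]).includes(value);
}

// Only the two response fields needed for the visibility decision.
interface OptimizationGate {
  rejected: boolean;
  optimized: object | null;
}

// Optimization content is rendered only for rejected campaigns that carry a payload.
function hasOptimizationContent(result: OptimizationGate): boolean {
  return result.rejected && result.optimized !== null;
}
```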

---

## 9. Typical Frontend Workflows

### Workflow 1: Campaign Evaluation (Primary Flow)

```
1. User opens /strategy (Strategy Lab)
2. A default tab "Evaluation 1" is shown
3. User types campaign text, optionally selects region and threshold
4. User clicks "Run Evaluation"
5. Frontend calls POST /api/strategy/full-evaluate
6. On success:
   a. Display result tabs (Clusters, Features, Heatmap, GMM, Optimization)
   b. Persist entry to history store (localStorage)
7. User optionally:
   a. Clicks a cluster card → drawer opens → GET /api/strategy/cluster/{id}/doctors
   b. Clicks "New Evaluation" → new empty tab is created
   c. Navigates to /history to revisit any past evaluation
```
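
Steps 4 to 6 can be sketched as a single function that turns the form state into an outcome for the tab. Only `campaign_text` is confirmed as a body field by the 422 example in §10; the `region` and `rejection_distance_threshold` body fields are assumptions here and should be checked against the endpoint section. `FetchLike`, `EvaluateForm`, and `runEvaluation` are hypothetical names.

```typescript
// Minimal fetch-like signature so the sketch can be exercised without a network.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface EvaluateForm {
  campaignText: string;
  region: string | null;
  rejectionThreshold: number;
}

type EvaluateOutcome =
  | { status: "success"; result: unknown; evaluatedAt: number }
  | { status: "error"; error: string };

async function runEvaluation(form: EvaluateForm, fetchFn: FetchLike): Promise<EvaluateOutcome> {
  try {
    const res = await fetchFn("/api/strategy/full-evaluate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        campaign_text: form.campaignText,
        region: form.region, // assumed field name
        rejection_distance_threshold: form.rejectionThreshold, // assumed field name
      }),
    });
    if (!res.ok) return { status: "error", error: `HTTP ${res.status}` };
    return { status: "success", result: await res.json(), evaluatedAt: Date.now() };
  } catch (e) {
    // Network failure or invalid JSON: surface it as a tab-level error.
    return { status: "error", error: e instanceof Error ? e.message : String(e) };
  }
}
```

In the browser the global `fetch` satisfies `FetchLike`, so the call site is `runEvaluation(form, fetch)`. The returned object maps directly onto the `status`, `error`, `result`, and `evaluatedAt` fields of the tab.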

### Workflow 2: Restore Evaluation from History

```
1. User navigates to /history
2. Sees list of past evaluations sorted by timestamp (newest first)
3. Clicks any entry
4. App opens /strategy?historyId={id}
5. Strategy Lab reads localStorage, finds entry by id
6. Creates new tab pre-populated with all data (no API call)
7. User can re-run or just view the previous result
```
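
Steps 5 and 6 amount to a lookup followed by a copy into a fresh tab. The sketch below assumes a hypothetical `HistoryEntry` shape; align it with the history store defined in §7 before use. The `"Restored evaluation"` label and the `restoreTab` name are likewise illustrative.

```typescript
// Hypothetical history entry shape; the authoritative one is defined in §7.
interface HistoryEntry {
  id: string;
  timestamp: number;
  campaignText: string;
  region: string | null;
  rejectionThreshold: number;
  result: unknown;
}

// Mirrors the EvaluationTab fields (§8.1) for a tab that already holds a result.
interface RestoredTab {
  id: string;
  label: string;
  createdAt: number;
  campaignText: string;
  region: string | null;
  rejectionThreshold: number;
  status: "success";
  error: null;
  result: unknown;
  evaluatedAt: number;
  activeResultTab: "clusters";
}

// Builds a pre-populated tab from a stored entry; returns null for an unknown id.
function restoreTab(
  entries: readonly HistoryEntry[],
  historyId: string,
  newTabId: string,
  now: number
): RestoredTab | null {
  const entry = entries.find((e) => e.id === historyId);
  if (!entry) return null;
  return {
    id: newTabId,
    label: "Restored evaluation", // illustrative label
    createdAt: now,
    campaignText: entry.campaignText,
    region: entry.region,
    rejectionThreshold: entry.rejectionThreshold,
    status: "success",
    error: null,
    result: entry.result,
    evaluatedAt: entry.timestamp,
    activeResultTab: "clusters",
  };
}
```

No API call is involved: the tab is built entirely from the stored entry, and the user can still press Run to re-evaluate.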

### Workflow 3: Simulation (War Room)

```
1. User picks a doctor from HCP Explorer or from a cluster card doctor list
2. Navigates to /war-room?personaId=HCP-00-042
3. Frontend optionally pre-fetches persona: GET /api/persona/HCP-00-042
4. User clicks "Start Session"
5. POST /api/simulation/start {
     persona_id: "HCP-00-042",
     campaign_snapshot: <current tab's full-evaluate result>  // pass if available
   }
6. Complete handshake: POST /api/simulation/handshake { session_id, answer: did_offer }
7. Turn loop:
   - User types message → POST /api/simulation/turn { session_id, input_text }
   - Display response text (turn.response field)
   - Optionally show Hume prosody when audio is enabled
8. After session → GET /api/analytics/session/{session_id}
9. Display review: conversation, emotion timeline (userValence/userArousal/avatarResistance), adherenceScore
```
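
The turn loop in step 7 can be sketched as follows. The request fields `session_id` and `input_text` come from the workflow above; reading the reply from a top-level `response` string and a nullable `prosody` field is an assumption about the turn response, and `PostJson`, `sendTurn`, and `runTurns` are hypothetical names.

```typescript
// Hypothetical transport: POSTs a JSON body and resolves with the parsed JSON response.
type PostJson = (url: string, body: unknown) => Promise<unknown>;

interface TurnReply {
  text: string;
  prosody: unknown; // null when Hume prosody is not enabled
}

// Assumes the turn response carries the reply in a top-level `response` string
// and an optional `prosody` value.
async function sendTurn(post: PostJson, sessionId: string, inputText: string): Promise<TurnReply> {
  const raw = await post("/api/simulation/turn", { session_id: sessionId, input_text: inputText });
  const data = (raw ?? {}) as { response?: unknown; prosody?: unknown };
  if (typeof data.response !== "string") {
    throw new Error("Turn response did not contain a text reply");
  }
  return { text: data.response, prosody: data.prosody ?? null };
}

// Sends messages strictly one at a time: each turn waits for the previous reply.
async function runTurns(post: PostJson, sessionId: string, messages: string[]): Promise<TurnReply[]> {
  const replies: TurnReply[] = [];
  for (const message of messages) {
    replies.push(await sendTurn(post, sessionId, message));
  }
  return replies;
}
```

Sequential sending matters for a multi-turn roleplay, since each physician reply depends on the conversation so far.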

### Workflow 4: Data Initialization (First Setup)

```
1. POST /api/pipeline/seed        → seeds Neo4j with 480 gold doctors
2. GET /api/stats/embedding       → confirm embedding model
3. GET /api/stats/projection      → confirm projection bridge is ready
4. GET /api/graph/cluster/0/doctors?limit=5  → verify doctors are in graph
```
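
The four setup calls are order-dependent, so a small runner that stops at the first failure is a reasonable way to script them. This is a sketch: `Requester`, `SETUP_STEPS`, and `runSetup` are hypothetical names, and only the HTTP status is checked, not the response bodies.

```typescript
// Hypothetical transport that issues the request and reports only success and status.
type Requester = (method: "GET" | "POST", path: string) => Promise<{ ok: boolean; status: number }>;

// The four calls from the workflow above, in order.
const SETUP_STEPS: ReadonlyArray<{ method: "GET" | "POST"; path: string; purpose: string }> = [
  { method: "POST", path: "/api/pipeline/seed", purpose: "seed Neo4j" },
  { method: "GET", path: "/api/stats/embedding", purpose: "confirm embedding model" },
  { method: "GET", path: "/api/stats/projection", purpose: "confirm projection bridge" },
  { method: "GET", path: "/api/graph/cluster/0/doctors?limit=5", purpose: "verify doctors in graph" },
];

// Runs the steps in order and stops at the first non-2xx response.
async function runSetup(request: Requester): Promise<{ completed: string[]; failedAt: string | null }> {
  const completed: string[] = [];
  for (const step of SETUP_STEPS) {
    const res = await request(step.method, step.path);
    if (!res.ok) {
      return { completed, failedAt: `${step.method} ${step.path} (HTTP ${res.status})` };
    }
    completed.push(step.path);
  }
  return { completed, failedAt: null };
}
```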

---

## 10. Error Handling

All errors follow standard HTTP status codes:

| Code | Meaning | When |
|------|---------|------|
| `400` | Bad Request | Invalid request body, validation errors |
| `404` | Not Found | Doctor/session/resource not found |
| `422` | Unprocessable Entity | Pydantic validation failure (body shape wrong) |
| `500` | Internal Server Error | Unexpected server error, dependency unavailable |

**422 response shape** (FastAPI Pydantic validation error):

```json
{
  "detail": [
    {
      "type": "string_too_short",
      "loc": ["body", "campaign_text"],
      "msg": "String should have at least 1 character",
      "input": "",
      "ctx": { "min_length": 1 }
    }
  ]
}
```
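
For form display, the `detail` array can be flattened into one message per field. The helper below is a minimal sketch with hypothetical names (`ValidationIssue`, `fieldErrors`). It returns an empty map when `detail` is not an array, which is the case for FastAPI `HTTPException` responses that carry a plain string `detail`.

```typescript
// One entry of the 422 `detail` array shown above.
interface ValidationIssue {
  type: string;
  loc: Array<string | number>;
  msg: string;
  input?: unknown;
  ctx?: Record<string, unknown>;
}

// Maps each failing field path to its first message, e.g. { campaign_text: "..." }.
function fieldErrors(body: { detail?: unknown }): Record<string, string> {
  const out: Record<string, string> = {};
  if (!Array.isArray(body.detail)) return out; // string detail: no per-field errors
  for (const issue of body.detail as ValidationIssue[]) {
    // Drop the leading "body" segment; join nested paths with dots.
    const path = issue.loc[0] === "body" ? issue.loc.slice(1) : issue.loc;
    const key = path.join(".");
    if (!(key in out)) out[key] = issue.msg;
  }
  return out;
}
```

Applied to the example above, this yields a single `campaign_text` entry whose value is the `msg` string.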

**Graceful degradation**: When Ollama, Redis, Neo4j, or ChromaDB is unavailable, the backend generally avoids returning 500 and instead falls back to heuristics or returns empty/mock responses. The exceptions are listed in §13; for example, graph endpoints may return 404/500 when Neo4j is down. The frontend should always render something: for instance, show the heuristic optimization result even when `reason` is `"heuristic_fallback_no_api_key"`.
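
One way to surface a fallback without hiding the result is to map the optimization `reason` to a badge. The three reason values come from the `OptimizationResult` type in §12; the badge wording and the `reasonBadge` name are illustrative.

```typescript
type OptimizationReason =
  | "heuristic_fallback_no_api_key"
  | "heuristic_fallback_llm_error"
  | "llm_rewrite";

interface ReasonBadge {
  label: string;
  degraded: boolean; // true when the result came from a heuristic fallback
}

// Badge wording is illustrative; only the reason values come from the backend.
function reasonBadge(reason: OptimizationReason): ReasonBadge {
  switch (reason) {
    case "llm_rewrite":
      return { label: "LLM rewrite", degraded: false };
    case "heuristic_fallback_no_api_key":
      return { label: "Heuristic (no API key)", degraded: true };
    case "heuristic_fallback_llm_error":
      return { label: "Heuristic (LLM error)", degraded: true };
  }
}
```

The result is rendered in all three cases; `degraded` only controls whether the badge is styled as a warning.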

---

## 11. Environment Variables

The backend reads from `.env` (all have development defaults):

| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | `7860` | Server port |
| `ENVIRONMENT` | `development` | `development` / `staging` / `production` |
| `REDIS_URL` | `redis://localhost:6379/0` | Redis connection |
| `NEO4J_URI` | `bolt://localhost:7687` | Neo4j bolt endpoint |
| `NEO4J_USER` | `neo4j` | Neo4j username |
| `NEO4J_PASSWORD` | `password` | Neo4j password |
| `CHROMA_HOST` | `localhost` | ChromaDB host |
| `CHROMA_PORT` | `8100` | ChromaDB port |
| `OLLAMA_HOST` | `https://ollama.com` | Ollama API endpoint |
| `OLLAMA_API_KEY` | *(empty)* | Ollama Cloud API key — required for LLM features |
| `OLLAMA_MODEL` | `gemma4:31b-cloud` | LLM model name |
| `EMBEDDING_MODEL` | `onnx-minilm` | Embedding model (`onnx-minilm` runs locally, no key needed) |
| `STRATEGY_REJECTION_DISTANCE_THRESHOLD` | `3.0` | Default campaign rejection threshold |
| `DID_API_KEY` | *(empty)* | D-ID API key — required for real WebRTC avatar streams |
| `HUME_API_KEY` | *(empty)* | Hume API key — required for prosody analysis |
| `CORS_ALLOWED_ORIGINS` | `*` | Comma-separated allowed CORS origins |
| `OUTBOX_TRANSPORT` | `redis` | Outbox transport: `redis` or `amqp` |
| `RABBITMQ_URL` | *(empty)* | RabbitMQ URL when `OUTBOX_TRANSPORT=amqp` |

---

## 12. Frontend Next.js Configuration

When building a Next.js frontend on port 3001, configure API proxying in `next.config.ts`:

```typescript
// next.config.ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  async rewrites() {
    return [
      {
        source: "/api/:path*",
        destination: "http://localhost:7860/api/:path*",
      },
      {
        source: "/admin/:path*",
        destination: "http://localhost:7860/admin/:path*",
      },
    ];
  },
};

export default nextConfig;
```

With this setup, every browser call such as `fetch("/api/strategy/full-evaluate")` goes to the Next.js origin and is proxied to the backend, so no CORS configuration is needed and no backend URL is hardcoded in client code.

### TypeScript Types Quick Reference

```typescript
// Campaign feature keys (snake_case) — use these for strategy/vectorize and full-evaluate
type CampaignFeatureKey =
  | "therapeutic_focus" | "messaging_tone" | "target_seniority"
  | "channel_preference" | "kol_alignment" | "trial_phase_relevance"
  | "formulary_impact" | "patient_population_size" | "competitive_positioning"
  | "regulatory_stage" | "budget_tier" | "urgency_score";

// Persona trait axes (display names) — use these for persona radar charts
type PersonaTraitAxis =
  | "Scientific Rigor" | "Innovation Appetite" | "Guideline Adherence"
  | "Price Sensitivity" | "Risk Tolerance" | "Peer Influence"
  | "Evidence Threshold" | "Formulary Weight" | "Patient Centricity"
  | "Digital Readiness" | "KOL Alignment" | "Trial Participation";

// Cluster IDs
type ClusterId = 0 | 1 | 2 | 3;

// Cluster card id format
type ClusterCardId = "c0" | "c1" | "c2" | "c3";

interface HeatmapRankingEntry {
  cluster_id: ClusterId;
  label: string;
  distance: number;   // Mahalanobis, lower = better
  probability: number; // softmax, higher = better
}

interface ClusterCard {
  id: ClusterCardId;       // string: "c0", "c1", etc.
  cluster_id: ClusterId;
  name: string;
  score: number;           // 0–100 float (probability × 100)
  distance: number;
  probability: number;     // 0–1
}

interface FullEvaluateResponse {
  campaign_vector_12d: number[];
  campaign_vector_pca: number[];
  gmm: {
    k: number;
    data_source: string;
    n_pca_components: number;
    pca_explained_variance_ratio: number[];
    centroids: number[][];
    covariances: number[][][];
    assignments: number[];
    member_points_2d: [number, number][];
  };
  heatmap: { ranking: HeatmapRankingEntry[] };  // NOTE: object, not array
  cluster_cards: ClusterCard[];
  rejection_distance_threshold: number;
  rejected: boolean;
  optimized: OptimizationResult | null;
  vectorization_model: string;
}

interface OptimizationResult {
  original_text: string;
  optimized_text: string;
  improvements: string[];
  reason: "heuristic_fallback_no_api_key" | "heuristic_fallback_llm_error" | "llm_rewrite";
  reasoning: string;
  rewrite_rationale: string;
  optimized_vector: Record<CampaignFeatureKey, number>;
  optimized_normalized_vector: Record<CampaignFeatureKey, number>;
  retrieved_examples: string[];
  retrieval_diagnostics: Array<{ campaign_id: string; campaign_text: string; metadata: Record<string, unknown>; distance: number }>;
  target_cluster: ClusterId | null;
  target_centroid_vector: number[] | null;
  alignment_score: number | null;
}
```
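
As a usage sketch for these types, the helpers below order cluster cards for display and format the score badge. `rankCards` and `scoreBadge` are hypothetical names, and the `ClusterCard` type is repeated with loosened id types so the block stands alone.

```typescript
// Same fields as ClusterCard above, with the id types loosened for a standalone example.
interface ClusterCard {
  id: string;
  cluster_id: number;
  name: string;
  score: number; // 0-100 (probability x 100)
  distance: number; // Mahalanobis, lower = better
  probability: number; // 0-1, higher = better
}

// Returns a new array, best fit first: highest probability, ties broken by smaller distance.
function rankCards(cards: readonly ClusterCard[]): ClusterCard[] {
  return [...cards].sort((a, b) => b.probability - a.probability || a.distance - b.distance);
}

// Formats the 0-100 score with one decimal place for a badge.
function scoreBadge(card: ClusterCard): string {
  return `${card.score.toFixed(1)}%`;
}
```

Copying before sorting keeps the stored `result` untouched, which matters when the same object is also persisted to `localStorage`.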

---

## 13. Fallback Behaviour Reference

| Dependency | Missing/down | Backend behaviour |
|-----------|-------------|-------------------|
| Ollama API key | Not set | Campaign vectorization uses heuristic keyword scoring; optimization uses deterministic heuristic rewrite |
| Ollama model | Unavailable | Same as above — heuristic fallback |
| D-ID API key | Not set | Simulation returns a mock WebRTC offer; `stream_mode: "mock"` |
| Hume API key | Not set | `prosody` field in turn response is `null`; `hume_prosody_enabled: false` in start response |
| Redis | Down | Session persistence fails silently; analytics endpoints return 404 for sessions |
| Neo4j | Down | Graph endpoints return 404/500; persona endpoints fall back to cluster profile data |
| ChromaDB | Down | RAG retrieval fails silently; optimization uses empty example pool |
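
Two of these fallbacks are visible in the simulation start response, so the frontend can derive user-facing notices from it. In this sketch only `stream_mode` and `hume_prosody_enabled` come from the table above; the notice wording and the `degradedNotices` name are illustrative.

```typescript
// Fields named in the table above; the rest of the start response is omitted.
interface StartResponseFlags {
  stream_mode?: string;
  hume_prosody_enabled?: boolean;
}

// Returns one notice per degraded capability; an empty array means nothing is degraded.
function degradedNotices(start: StartResponseFlags): string[] {
  const notices: string[] = [];
  if (start.stream_mode === "mock") {
    notices.push("Avatar stream is mocked (no D-ID API key).");
  }
  if (start.hume_prosody_enabled === false) {
    notices.push("Prosody analysis is disabled (no Hume API key).");
  }
  return notices;
}
```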

---

*End of FRONTEND_INTEGRATION.md — all shapes verified against live backend source code.*