<!DOCTYPE html>
<html>
<head>
<title>BACKEND_WORKFLOW.md</title>
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
<style>
/* https://github.com/microsoft/vscode/blob/master/extensions/markdown-language-features/media/markdown.css */
/*---------------------------------------------------------------------------------------------
* Copyright (c) Microsoft Corporation. All rights reserved.
* Licensed under the MIT License. See License.txt in the project root for license information.
*--------------------------------------------------------------------------------------------*/
body {
font-family: var(--vscode-markdown-font-family, -apple-system, BlinkMacSystemFont, "Segoe WPC", "Segoe UI", "Ubuntu", "Droid Sans", sans-serif);
font-size: var(--vscode-markdown-font-size, 14px);
padding: 0 26px;
line-height: var(--vscode-markdown-line-height, 22px);
word-wrap: break-word;
}
#code-csp-warning {
position: fixed;
top: 0;
right: 0;
color: white;
margin: 16px;
text-align: center;
font-size: 12px;
font-family: sans-serif;
background-color:#444444;
cursor: pointer;
padding: 6px;
box-shadow: 1px 1px 1px rgba(0,0,0,.25);
}
#code-csp-warning:hover {
text-decoration: none;
background-color:#007acc;
box-shadow: 2px 2px 2px rgba(0,0,0,.25);
}
body.scrollBeyondLastLine {
margin-bottom: calc(100vh - 22px);
}
body.showEditorSelection .code-line {
position: relative;
}
body.showEditorSelection .code-active-line:before,
body.showEditorSelection .code-line:hover:before {
content: "";
display: block;
position: absolute;
top: 0;
left: -12px;
height: 100%;
}
body.showEditorSelection li.code-active-line:before,
body.showEditorSelection li.code-line:hover:before {
left: -30px;
}
.vscode-light.showEditorSelection .code-active-line:before {
border-left: 3px solid rgba(0, 0, 0, 0.15);
}
.vscode-light.showEditorSelection .code-line:hover:before {
border-left: 3px solid rgba(0, 0, 0, 0.40);
}
.vscode-light.showEditorSelection .code-line .code-line:hover:before {
border-left: none;
}
.vscode-dark.showEditorSelection .code-active-line:before {
border-left: 3px solid rgba(255, 255, 255, 0.4);
}
.vscode-dark.showEditorSelection .code-line:hover:before {
border-left: 3px solid rgba(255, 255, 255, 0.60);
}
.vscode-dark.showEditorSelection .code-line .code-line:hover:before {
border-left: none;
}
.vscode-high-contrast.showEditorSelection .code-active-line:before {
border-left: 3px solid rgba(255, 160, 0, 0.7);
}
.vscode-high-contrast.showEditorSelection .code-line:hover:before {
border-left: 3px solid rgba(255, 160, 0, 1);
}
.vscode-high-contrast.showEditorSelection .code-line .code-line:hover:before {
border-left: none;
}
img {
max-width: 100%;
max-height: 100%;
}
a {
text-decoration: none;
}
a:hover {
text-decoration: underline;
}
a:focus,
input:focus,
select:focus,
textarea:focus {
outline: 1px solid -webkit-focus-ring-color;
outline-offset: -1px;
}
hr {
border: 0;
height: 2px;
border-bottom: 2px solid;
}
h1 {
padding-bottom: 0.3em;
line-height: 1.2;
border-bottom-width: 1px;
border-bottom-style: solid;
}
h1, h2, h3 {
font-weight: normal;
}
table {
border-collapse: collapse;
}
table > thead > tr > th {
text-align: left;
border-bottom: 1px solid;
}
table > thead > tr > th,
table > thead > tr > td,
table > tbody > tr > th,
table > tbody > tr > td {
padding: 5px 10px;
}
table > tbody > tr + tr > td {
border-top: 1px solid;
}
blockquote {
margin: 0 7px 0 5px;
padding: 0 16px 0 10px;
border-left-width: 5px;
border-left-style: solid;
}
code {
font-family: Menlo, Monaco, Consolas, "Droid Sans Mono", "Courier New", monospace, "Droid Sans Fallback";
font-size: 1em;
line-height: 1.357em;
}
body.wordWrap pre {
white-space: pre-wrap;
}
pre:not(.hljs),
pre.hljs code > div {
padding: 16px;
border-radius: 3px;
overflow: auto;
}
pre code {
color: var(--vscode-editor-foreground);
tab-size: 4;
}
/** Theming */
.vscode-light pre {
background-color: rgba(220, 220, 220, 0.4);
}
.vscode-dark pre {
background-color: rgba(10, 10, 10, 0.4);
}
.vscode-high-contrast pre {
background-color: rgb(0, 0, 0);
}
.vscode-high-contrast h1 {
border-color: rgb(0, 0, 0);
}
.vscode-light table > thead > tr > th {
border-color: rgba(0, 0, 0, 0.69);
}
.vscode-dark table > thead > tr > th {
border-color: rgba(255, 255, 255, 0.69);
}
.vscode-light h1,
.vscode-light hr,
.vscode-light table > tbody > tr + tr > td {
border-color: rgba(0, 0, 0, 0.18);
}
.vscode-dark h1,
.vscode-dark hr,
.vscode-dark table > tbody > tr + tr > td {
border-color: rgba(255, 255, 255, 0.18);
}
</style>
<style>
/* Tomorrow Theme */
/* http://jmblog.github.com/color-themes-for-google-code-highlightjs */
/* Original theme - https://github.com/chriskempson/tomorrow-theme */
/* Tomorrow Comment */
.hljs-comment,
.hljs-quote {
color: #8e908c;
}
/* Tomorrow Red */
.hljs-variable,
.hljs-template-variable,
.hljs-tag,
.hljs-name,
.hljs-selector-id,
.hljs-selector-class,
.hljs-regexp,
.hljs-deletion {
color: #c82829;
}
/* Tomorrow Orange */
.hljs-number,
.hljs-built_in,
.hljs-builtin-name,
.hljs-literal,
.hljs-type,
.hljs-params,
.hljs-meta,
.hljs-link {
color: #f5871f;
}
/* Tomorrow Yellow */
.hljs-attribute {
color: #eab700;
}
/* Tomorrow Green */
.hljs-string,
.hljs-symbol,
.hljs-bullet,
.hljs-addition {
color: #718c00;
}
/* Tomorrow Blue */
.hljs-title,
.hljs-section {
color: #4271ae;
}
/* Tomorrow Purple */
.hljs-keyword,
.hljs-selector-tag {
color: #8959a8;
}
.hljs {
display: block;
overflow-x: auto;
color: #4d4d4c;
padding: 0.5em;
}
.hljs-emphasis {
font-style: italic;
}
.hljs-strong {
font-weight: bold;
}
</style>
<style>
/*
* Markdown PDF CSS
*/
body {
font-family: -apple-system, BlinkMacSystemFont, "Segoe WPC", "Segoe UI", "Ubuntu", "Droid Sans", sans-serif, "Meiryo";
padding: 0 12px;
}
pre {
background-color: #f8f8f8;
border: 1px solid #cccccc;
border-radius: 3px;
overflow-x: auto;
white-space: pre-wrap;
overflow-wrap: break-word;
}
pre:not(.hljs) {
padding: 23px;
line-height: 19px;
}
blockquote {
background: rgba(127, 127, 127, 0.1);
border-color: rgba(0, 122, 204, 0.5);
}
.emoji {
height: 1.4em;
}
code {
font-size: 14px;
line-height: 19px;
}
/* for inline code */
:not(pre):not(.hljs) > code {
color: #C9AE75; /* Change the old color so it seems less like an error */
font-size: inherit;
}
/* Page Break : use <div class="page"/> or <div class="page"></div> to insert page break
-------------------------------------------------------- */
.page {
page-break-after: always;
}
</style>
<script src="https://unpkg.com/mermaid/dist/mermaid.min.js"></script>
</head>
<body>
<script>
mermaid.initialize({
startOnLoad: true,
theme: document.body.classList.contains('vscode-dark') || document.body.classList.contains('vscode-high-contrast')
? 'dark'
: 'default'
});
</script>
<h1 id="orsync-scenarist-backend-workflow">Orsync Scenarist Backend Workflow</h1>
<h2 id="overview">Overview</h2>
<p>The backend is a FastAPI application that combines three main systems:</p>
<ol>
<li>A strategy engine for campaign analysis and optimization.</li>
<li>A knowledge graph layer for doctor, institution, and topic exploration.</li>
<li>A simulation and analytics system for persona-based sales training.</li>
</ol>
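<p>At runtime, the backend also integrates with Redis, Neo4j, ChromaDB, Ollama, D-ID, and Hume, plus RabbitMQ as an optional outbox transport. Each of the first six has a fallback, described under Fallback and Degradation Model, so a missing dependency degrades features rather than blocking startup.</p>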
<h2 id="high-level-architecture">High-Level Architecture</h2>
<p>Standalone Mermaid source for direct Mermaid Preview:</p>
<ul>
<li><a href="backend/diagrams/high_level_architecture.mmd">backend/diagrams/high_level_architecture.mmd</a></li>
</ul>
<pre><code class="language-mermaid"><div class="mermaid">flowchart LR
A[run.py] --> B[FastAPI app in app/main.py]
subgraph Runtime[Startup and background runtime]
direction TB
C[Startup model preload]
D[Outbox worker thread]
E[Event consumer thread]
end
subgraph Routes[API surface]
direction TB
F[Pipeline routes]
G[Strategy routes]
H[Graph routes]
I[Persona routes]
J[Simulation routes]
K[MOHP routes]
L[Analytics routes]
M[Stats and admin routes]
end
subgraph Services[Core services]
direction TB
N[Redis outbox]
O[Redis Streams or RabbitMQ]
P[Neo4j ingestion]
Q[Campaign vectorizer]
R[GMM clustering]
S[Heatmap scoring]
T[RAG optimizer]
U[Neo4j graph queries]
V[Redis session storage]
W[Ollama responses]
X[D-ID WebRTC stream]
Y[Hume prosody]
Z[Rule-based or LLM objection analysis]
AA[ChromaDB campaign memory]
end
B --> Runtime
B --> Routes
D --> N --> O
E --> P
F --> N
G --> Q
G --> R
G --> S
G --> T --> AA
H --> U
I --> U
J --> V
J --> W
J --> X
J --> Y
K --> Z
L --> V
</div></code></pre>
<h2 id="main-runtime-entry-points">Main Runtime Entry Points</h2>
<h3 id="1-local-bootstrap">1. Local bootstrap</h3>
<p>The server starts from <a href="backend/run.py#L1">backend/run.py</a>.</p>
<p>Responsibilities:</p>
<ul>
<li>Ensures first-time setup has been run.</li>
<li>Tries to start embedded Redis if available.</li>
<li>Tries to start embedded Neo4j if available.</li>
<li>Tries to start embedded ChromaDB if available.</li>
<li>Launches the FastAPI app.</li>
</ul>
<p>Startup flags include:</p>
<ul>
<li>--reload</li>
<li>--no-redis</li>
<li>--no-neo4j</li>
<li>--no-chroma</li>
</ul>
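<p>A minimal sketch of how the four flags above could be parsed with <code>argparse</code>. Only the flag names come from this document; the helper names <code>build_parser</code> and <code>services_to_start</code>, the help strings, and the meaning attached to <code>--reload</code> are assumptions, and the real <code>run.py</code> may be organized differently.</p>

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser for the four documented startup flags."""
    parser = argparse.ArgumentParser(description="Start the backend (illustrative sketch).")
    # Assumed meaning: restart the server when source files change.
    parser.add_argument("--reload", action="store_true", help="development auto-reload")
    parser.add_argument("--no-redis", action="store_true", help="skip embedded Redis")
    parser.add_argument("--no-neo4j", action="store_true", help="skip embedded Neo4j")
    parser.add_argument("--no-chroma", action="store_true", help="skip embedded ChromaDB")
    return parser


def services_to_start(args: argparse.Namespace) -> list[str]:
    """Return the embedded services that were not disabled on the command line."""
    disabled = {"redis": args.no_redis, "neo4j": args.no_neo4j, "chroma": args.no_chroma}
    return [name for name, is_disabled in disabled.items() if not is_disabled]


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--reload", "--no-neo4j"])
print(services_to_start(args))
```

<p>Keeping the flag-to-service mapping in one dictionary makes it easy to add another embedded service later without touching the launch logic.</p>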
<h3 id="2-fastapi-application">2. FastAPI application</h3>
<p>The app is created in <a href="backend/app/main.py#L1">backend/app/main.py</a>.</p>
<p>The lifespan startup sequence does three important things:</p>
<ol>
<li>Preloads the embedding model and projection weights.</li>
<li>Starts the transactional outbox worker thread.</li>
<li>Starts the event consumer thread.</li>
</ol>
<p>Routers mounted in the app:</p>
<ul>
<li>pipeline</li>
<li>math_engine</li>
<li>strategy</li>
<li>simulation</li>
<li>graph</li>
<li>analytics</li>
<li>persona</li>
<li>mohp</li>
<li>stats</li>
<li>admin</li>
</ul>
<p>Also mounted directly:</p>
<ul>
<li>GET /healthz</li>
<li>GET /</li>
<li>GET /admin/dlq</li>
<li>POST /admin/dlq/replay</li>
</ul>
<h3 id="important-implementation-note">Important implementation note</h3>
<p>There is an auth router in <a href="backend/app/api/routes/auth.py#L24">backend/app/api/routes/auth.py</a>, but it is not included in <a href="backend/app/main.py#L68">backend/app/main.py</a>. That means auth endpoints exist in code but are not active in the running app.</p>
<h2 id="configuration-and-infrastructure">Configuration and Infrastructure</h2>
<p>Configuration is defined in <a href="backend/app/core/config.py#L1">backend/app/core/config.py</a>.</p>
<p>Important settings:</p>
<ul>
<li>app_name</li>
<li>environment</li>
<li>port</li>
<li>redis_url</li>
<li>rabbitmq_url</li>
<li>neo4j_uri</li>
<li>neo4j_username</li>
<li>neo4j_password</li>
<li>chroma_host</li>
<li>chroma_port</li>
<li>outbox_transport</li>
<li>ollama_host</li>
<li>ollama_api_key</li>
<li>ollama_model</li>
<li>embedding_model</li>
<li>jwt_secret_key</li>
<li>hume_api_key</li>
<li>did_api_key</li>
<li>cors_allowed_origins</li>
<li>projection_weights_path</li>
</ul>
<p>Production validation fails fast if placeholder secrets are still configured.</p>
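<p>The fail-fast check can be sketched as follows. The setting names <code>environment</code>, <code>jwt_secret_key</code>, and <code>neo4j_password</code> come from the list above; the placeholder values, the <code>Settings</code> dataclass, and the function name <code>validate_for_production</code> are hypothetical stand-ins for whatever <code>config.py</code> actually does.</p>

```python
from dataclasses import dataclass

# Hypothetical placeholder values; the real list is not shown in this document.
PLACEHOLDERS = {"", "changeme", "change-me", "secret"}


@dataclass
class Settings:
    environment: str = "development"
    jwt_secret_key: str = "changeme"
    neo4j_password: str = "changeme"


def validate_for_production(settings: Settings) -> None:
    """Raise at startup if a production deployment still uses a placeholder secret."""
    if settings.environment != "production":
        return  # placeholders are tolerated outside production
    offending = [
        name
        for name in ("jwt_secret_key", "neo4j_password")
        if getattr(settings, name).strip().lower() in PLACEHOLDERS
    ]
    if offending:
        raise ValueError("placeholder secrets in production: " + ", ".join(offending))


validate_for_production(Settings())  # development: passes silently
```

<p>Raising during startup, rather than logging a warning, means a misconfigured production deployment never begins serving requests.</p>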
<h2 id="fallback-and-degradation-model">Fallback and Degradation Model</h2>
<p>The backend is designed to boot even when some dependencies are missing.</p>
<h3 id="redis">Redis</h3>
<p>Defined in <a href="backend/app/db/redis_client.py#L1">backend/app/db/redis_client.py</a>.</p>
<p>If Redis is unavailable, the app falls back to an in-memory stub. This keeps the app running, but sessions, cache, outbox, and analytics become ephemeral.</p>
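<p>A rough sketch of this fallback pattern, assuming the client is obtained through a factory that may raise. <code>InMemoryRedisStub</code>, <code>get_client</code>, and the small set of methods shown are illustrative names, not the contents of <code>redis_client.py</code>.</p>

```python
from typing import Callable, Optional


class InMemoryRedisStub:
    """Tiny stand-in for a Redis client; data lives only in this process."""

    def __init__(self) -> None:
        self._values: dict[str, str] = {}
        self._hashes: dict[str, dict[str, str]] = {}

    def ping(self) -> bool:
        return True

    def set(self, key: str, value: str) -> bool:
        self._values[key] = value
        return True

    def get(self, key: str) -> Optional[str]:
        return self._values.get(key)

    def hset(self, name: str, key: str, value: str) -> int:
        bucket = self._hashes.setdefault(name, {})
        is_new = key not in bucket
        bucket[key] = value
        return int(is_new)  # 1 when the field was created, 0 when overwritten

    def hgetall(self, name: str) -> dict[str, str]:
        return dict(self._hashes.get(name, {}))


def get_client(connect: Callable[[], object]):
    """Try the real connection factory; fall back to the stub on any failure."""
    try:
        client = connect()
        client.ping()  # fail early if the server is unreachable
        return client
    except Exception:
        return InMemoryRedisStub()


def unreachable():
    """Simulates a connection factory when no Redis server is running."""
    raise ConnectionError("no Redis server")


client = get_client(unreachable)
client.hset("outbox", "evt-1", "{}")
print(type(client).__name__, client.hgetall("outbox"))
```

<p>Because the stub keeps everything in process memory, anything written to it disappears on restart, which is why sessions, cache, outbox, and analytics become ephemeral in this mode.</p>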
<h3 id="neo4j">Neo4j</h3>
<p>Defined in <a href="backend/app/db/neo4j_client.py#L1">backend/app/db/neo4j_client.py</a>.</p>
<p>If Neo4j is unavailable, the app falls back to a no-op driver. Graph queries then return empty results.</p>
<h3 id="chromadb">ChromaDB</h3>
<p>Defined in <a href="backend/app/db/chroma_client.py#L1">backend/app/db/chroma_client.py</a>.</p>
<p>If ChromaDB is unavailable, a temporary no-op client is used. Optimization and semantic retrieval still work in degraded mode but without real vector search.</p>
<h3 id="ollama">Ollama</h3>
<p>Defined in <a href="backend/app/core/llm_client.py#L1">backend/app/core/llm_client.py</a>.</p>
<p>If an Ollama API key is not set, the backend switches several features to deterministic fallback behavior:</p>
<ul>
<li>campaign feature extraction uses heuristics</li>
<li>campaign optimization uses heuristic rewrite logic</li>
<li>MOHP uses rule-based objections</li>
<li>simulation replies use a fallback text response instead of an LLM</li>
</ul>
<h3 id="d-id-and-hume">D-ID and Hume</h3>
<p>Defined in <a href="backend/app/services/webrtc_simulation.py#L1">backend/app/services/webrtc_simulation.py</a>.</p>
<ul>
<li>Without D-ID credentials, simulation start returns a mock stream.</li>
<li>Without Hume credentials, prosody analysis is skipped.</li>
</ul>
<h2 id="core-backend-workflows">Core Backend Workflows</h2>
<h2 id="workflow-1-data-seeding-and-graph-setup">Workflow 1: Data Seeding and Graph Setup</h2>
<p>This is the first operational step when bringing the backend online.</p>
<h3 id="endpoint">Endpoint</h3>
<ul>
<li>POST /api/pipeline/seed</li>
</ul>
<h3 id="code-path">Code path</h3>
<p><a href="backend/app/api/routes/pipeline.py#L46">backend/app/api/routes/pipeline.py</a></p>
<h3 id="what-happens">What happens</h3>
<ol>
<li>The route looks for doctors_unified.json in the gold data locations.</li>
<li>It loads the doctor records from disk.</li>
<li>It calls ingest_doctors from <a href="backend/app/services/neo4j_graph.py#L102">backend/app/services/neo4j_graph.py</a>.</li>
<li>Neo4j schema constraints are created if missing.</li>
<li>Doctor nodes are merged.</li>
<li>Institution nodes are merged.</li>
<li>Topic nodes are merged.</li>
<li>Relationships are created:
<ul>
<li>Doctor -&gt; Institution via AFFILIATED_WITH</li>
<li>Doctor -&gt; Topic via RESEARCHES</li>
</ul>
</li>
</ol>
<h3 id="result">Result</h3>
<p>The knowledge graph is now queryable by graph and persona endpoints.</p>
<h2 id="workflow-2-transactional-outbox-and-async-ingestion">Workflow 2: Transactional Outbox and Async Ingestion</h2>
<p>This is the event-driven ingestion path.</p>
<h3 id="endpoints">Endpoints</h3>
<ul>
<li>POST /api/pipeline/ingest</li>
<li>POST /api/pipeline/dispatch</li>
</ul>
<h3 id="code-path-1">Code path</h3>
<ul>
<li><a href="backend/app/api/routes/pipeline.py#L18">backend/app/api/routes/pipeline.py</a></li>
<li><a href="backend/app/core/outbox.py#L1">backend/app/core/outbox.py</a></li>
<li><a href="backend/app/services/event_consumer.py#L1">backend/app/services/event_consumer.py</a></li>
</ul>
<h3 id="what-happens-1">What happens</h3>
<ol>
<li>A client posts an ingestion payload to /api/pipeline/ingest.</li>
<li>The route adds an event to the Redis-backed outbox hash.</li>
<li>The background outbox worker running from <a href="backend/app/main.py#L101">backend/app/main.py</a> polls the outbox.</li>
<li>When an event is due, it publishes the event to:
<ul>
<li>Redis Streams by default</li>
<li>RabbitMQ if AMQP transport is configured</li>
</ul>
</li>
<li>The consumer thread reads events from the stream.</li>
<li>The gold.ingest handler sends the records into Neo4j.</li>
</ol>
<h3 id="reliability-model">Reliability model</h3>
<ul>
<li>publish-first dispatch pattern</li>
<li>idempotency tracking</li>
<li>retry with exponential backoff</li>
<li>dead letter queue after repeated failures</li>
</ul>
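<p>The reliability properties above can be illustrated with a small in-memory outbox. This is a sketch under stated assumptions: the class and field names, the attempt limit of 3, and the 1-second backoff base are all hypothetical, and the real <code>outbox.py</code> stores events in a Redis hash rather than a Python dictionary.</p>

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class OutboxEvent:
    event_id: str
    payload: dict
    attempts: int = 0
    next_attempt_at: float = 0.0


@dataclass
class Outbox:
    publish: Callable[[OutboxEvent], None]
    max_attempts: int = 3      # assumed limit before dead-lettering
    base_delay: float = 1.0    # assumed backoff base, in seconds
    pending: dict = field(default_factory=dict)
    published_ids: set = field(default_factory=set)
    dead_letters: list = field(default_factory=list)

    def add(self, event: OutboxEvent) -> None:
        self.pending[event.event_id] = event

    def dispatch_due(self, now: float) -> None:
        """Publish every due event; retry failures with exponential backoff."""
        for event in list(self.pending.values()):
            if event.next_attempt_at > now:
                continue  # not due yet
            if event.event_id in self.published_ids:
                del self.pending[event.event_id]  # idempotency: already sent
                continue
            try:
                self.publish(event)  # publish first, remove from the outbox after
            except Exception:
                event.attempts += 1
                if event.attempts >= self.max_attempts:
                    self.dead_letters.append(self.pending.pop(event.event_id))
                else:
                    # The wait doubles after each failure.
                    delay = self.base_delay * 2 ** (event.attempts - 1)
                    event.next_attempt_at = now + delay
            else:
                self.published_ids.add(event.event_id)
                del self.pending[event.event_id]


def always_fails(event: OutboxEvent) -> None:
    raise RuntimeError("broker down")


box = Outbox(publish=always_fails)
box.add(OutboxEvent("evt-1", {"kind": "gold.ingest"}))
for now in (0.0, 1.0, 3.0):
    box.dispatch_due(now)
print(len(box.dead_letters), len(box.pending))
```

<p>Passing <code>now</code> in explicitly, instead of reading the clock inside the method, keeps the retry schedule deterministic and easy to test.</p>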
<h2 id="workflow-3-campaign-strategy-evaluation">Workflow 3: Campaign Strategy Evaluation</h2>
<p>This is the main analytical workflow.</p>
<h3 id="primary-endpoint">Primary endpoint</h3>
<ul>
<li>POST /api/strategy/full-evaluate</li>
</ul>
<h3 id="code-path-2">Code path</h3>
<p><a href="backend/app/api/routes/strategy.py#L235">backend/app/api/routes/strategy.py</a></p>
<h3 id="step-by-step-flow">Step-by-step flow</h3>
<ol>
<li>The route loads doctor records from local gold JSON.</li>
<li>If the local dataset is missing or too small, it creates a synthetic seed population.</li>
<li>It runs GMM clustering using <a href="backend/app/services/gmm_engine.py#L57">backend/app/services/gmm_engine.py</a>.</li>
<li>It extracts a 12-feature campaign vector using <a href="backend/app/services/campaign_vectorizer.py#L131">backend/app/services/campaign_vectorizer.py</a>.</li>
<li>It projects the campaign into the same PCA subspace used during clustering.</li>
<li>It computes Mahalanobis distances using <a href="backend/app/services/heatmap.py#L18">backend/app/services/heatmap.py</a>.</li>
<li>It ranks clusters by fit and builds cluster cards.</li>
<li>It checks whether the best fit distance exceeds the rejection threshold.</li>
<li>If rejected, it attempts campaign optimization using <a href="backend/app/services/rag_optimizer.py#L258">backend/app/services/rag_optimizer.py</a>.</li>
</ol>
<h3 id="important-design-point">Important design point</h3>
<p>The strategy pipeline does not depend on Neo4j for scoring. It operates on local datasets and local ML logic.</p>
<h3 id="strategy-endpoint-family">Strategy endpoint family</h3>
<p>Defined in <a href="backend/app/api/routes/strategy.py#L29">backend/app/api/routes/strategy.py</a>.</p>
<ul>
<li>POST /api/strategy/vectorize</li>
<li>POST /api/strategy/heatmap</li>
<li>POST /api/strategy/optimize</li>
<li>POST /api/strategy/evaluate</li>
<li>POST /api/strategy/memory/store</li>
<li>POST /api/strategy/full-evaluate</li>
<li>GET /api/strategy/blueprint/{segment_id}</li>
<li>POST /api/strategy/blueprint/{segment_id}</li>
<li>GET /api/strategy/cluster/{cluster_id}/doctors</li>
</ul>
<h2 id="how-campaign-vectorization-works">How campaign vectorization works</h2>
<p>Implemented in <a href="backend/app/services/campaign_vectorizer.py#L1">backend/app/services/campaign_vectorizer.py</a>.</p>
<p>The service works in two modes:</p>
<h3 id="heuristic-mode">Heuristic mode</h3>
<p>If no Ollama API key is configured:</p>
<ul>
<li>tokenizes the text</li>
<li>counts keyword hints for each feature</li>
<li>computes feature signals</li>
<li>normalizes them into a 12-feature vector</li>
<li>also generates an embedding</li>
</ul>
<h3 id="llm-assisted-mode">LLM-assisted mode</h3>
<p>If Ollama is configured:</p>
<ul>
<li>asks the model to produce a strict JSON object with 12 feature values</li>
<li>validates and normalizes those values</li>
<li>falls back to heuristics if parsing fails</li>
</ul>
<h3 id="feature-keys">Feature keys</h3>
<ul>
<li>therapeutic_focus</li>
<li>messaging_tone</li>
<li>target_seniority</li>
<li>channel_preference</li>
<li>kol_alignment</li>
<li>trial_phase_relevance</li>
<li>formulary_impact</li>
<li>patient_population_size</li>
<li>competitive_positioning</li>
<li>regulatory_stage</li>
<li>budget_tier</li>
<li>urgency_score</li>
</ul>
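<p>To make the heuristic mode concrete, here is a minimal sketch that counts keyword hints and scales the counts into a 12-value vector. The feature keys are taken from the list above; the keyword sets and the scale-by-maximum normalization are assumptions, since this document does not specify either, and the embedding step is omitted.</p>

```python
import re

FEATURE_KEYS = [
    "therapeutic_focus", "messaging_tone", "target_seniority", "channel_preference",
    "kol_alignment", "trial_phase_relevance", "formulary_impact", "patient_population_size",
    "competitive_positioning", "regulatory_stage", "budget_tier", "urgency_score",
]

# Hypothetical keyword hints; only three features are filled in for brevity.
KEYWORD_HINTS = {
    "trial_phase_relevance": {"trial", "phase", "endpoint"},
    "urgency_score": {"urgent", "now", "deadline"},
    "kol_alignment": {"kol", "expert", "opinion"},
}


def heuristic_vector(text: str) -> list[float]:
    """Count keyword hits per feature and scale so the strongest feature is 1.0."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = [
        sum(token in KEYWORD_HINTS.get(key, set()) for token in tokens)
        for key in FEATURE_KEYS
    ]
    peak = max(counts)
    if peak == 0:
        return [0.0] * len(FEATURE_KEYS)  # no hints matched at all
    return [count / peak for count in counts]


vector = heuristic_vector("Urgent phase 3 trial update: new endpoint data for expert review")
print(dict(zip(FEATURE_KEYS, vector)))
```

<p>The output always has one value per feature key in a fixed order, so it can be compared directly with a vector produced by the LLM-assisted mode.</p>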
<h2 id="how-doctor-clustering-works">How doctor clustering works</h2>
<p>Implemented in <a href="backend/app/services/gmm_engine.py#L1">backend/app/services/gmm_engine.py</a>.</p>
<p>Pipeline:</p>
<ol>
<li>Extract numeric columns from doctor records.</li>
<li>Apply robust scaling.</li>
<li>Run PCA and keep enough components to explain about 90 percent of variance.</li>
<li>Select the best cluster count using BIC.</li>
<li>Fit a Gaussian Mixture Model.</li>
<li>Return:
<ul>
<li>centroids</li>
<li>covariance matrices</li>
<li>cluster probabilities</li>
<li>cluster assignments</li>
<li>2D member points for visualization</li>
<li>PCA transform metadata</li>
</ul>
</li>
</ol>
<h2 id="how-heatmap-scoring-works">How heatmap scoring works</h2>
<p>Implemented in <a href="backend/app/services/heatmap.py#L1">backend/app/services/heatmap.py</a>.</p>
<p>For each cluster:</p>
<ol>
<li>compute Mahalanobis distance between campaign vector and cluster centroid</li>
<li>convert inverse distances into probabilities</li>
<li>sort clusters by lowest distance</li>
</ol>
<p>The best cluster is the closest cluster. A campaign is rejected when that best distance is above the configured rejection threshold.</p>
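<p>The scoring steps can be sketched in plain Python. The Mahalanobis distance is the standard formula; the way inverse distances are normalized into probabilities, the small epsilon, and the names <code>solve</code>, <code>mahalanobis</code>, and <code>score_clusters</code> are assumptions about details this document leaves open. Covariance matrices are assumed to be positive definite.</p>

```python
import math


def solve(matrix: list[list[float]], rhs: list[float]) -> list[float]:
    """Solve matrix @ y = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    rows = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def mahalanobis(x: list[float], mean: list[float], cov: list[list[float]]) -> float:
    """Return sqrt((x - mean)^T cov^-1 (x - mean)) without forming the inverse."""
    diff = [a - b for a, b in zip(x, mean)]
    weighted = solve(cov, diff)
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


def score_clusters(x: list[float], clusters: dict, rejection_threshold: float) -> dict:
    """Rank clusters by distance and turn inverse distances into probabilities."""
    distances = {cid: mahalanobis(x, mean, cov) for cid, (mean, cov) in clusters.items()}
    inverse = {cid: 1.0 / (d + 1e-9) for cid, d in distances.items()}  # epsilon avoids 1/0
    total = sum(inverse.values())
    ranking = sorted(distances, key=distances.get)  # closest cluster first
    return {
        "ranking": ranking,
        "probabilities": {cid: inverse[cid] / total for cid in ranking},
        "rejected": distances[ranking[0]] > rejection_threshold,
    }


clusters = {
    0: ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),
    1: ([5.0, 0.0], [[4.0, 0.0], [0.0, 1.0]]),
}
print(score_clusters([1.0, 0.0], clusters, rejection_threshold=3.0))
```

<p>Unlike Euclidean distance, this measure accounts for each cluster's spread: a campaign that is far from a wide cluster in raw units can still score as a close fit.</p>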
<h2 id="how-campaign-optimization-works">How campaign optimization works</h2>
<p>Implemented in <a href="backend/app/services/rag_optimizer.py#L1">backend/app/services/rag_optimizer.py</a>.</p>
<h3 id="if-ollama-is-configured">If Ollama is configured</h3>
<ol>
<li>Retrieve similar campaigns from ChromaDB campaign memory.</li>
<li>Prefer successful examples.</li>
<li>Build a prompt with target cluster context and retrieved examples.</li>
<li>Ask the model to rewrite the campaign.</li>
<li>Re-vectorize the optimized output.</li>
</ol>
<h3 id="if-ollama-is-not-configured">If Ollama is not configured</h3>
<ol>
<li>Compute feature gaps against the target profile.</li>
<li>Apply deterministic rewrite guidance.</li>
<li>Return an optimized text with improvement notes.</li>
</ol>
<h3 id="campaign-memory">Campaign memory</h3>
<p>The endpoint POST /api/strategy/memory/store stores campaign embeddings in ChromaDB so future optimizations can retrieve similar examples.</p>
<h2 id="workflow-4-cluster-doctors-and-segment-targeting">Workflow 4: Cluster Doctors and Segment Targeting</h2>
<p>This is how the frontend gets doctors for a selected strategy segment.</p>
<h3 id="endpoint-1">Endpoint</h3>
<ul>
<li>GET /api/strategy/cluster/{cluster_id}/doctors</li>
</ul>
<h3 id="code-path-3">Code path</h3>
<p><a href="backend/app/api/routes/strategy.py#L614">backend/app/api/routes/strategy.py</a></p>
<h3 id="what-happens-2">What happens</h3>
<ol>
<li>The route first attempts to cluster the gold JSON doctor dataset.</li>
<li>If gold data is unavailable, it falls back to the bronze master CSV.</li>
<li>It returns frontend-friendly doctor rows for the selected cluster.</li>
<li>It can optionally filter by region.</li>
</ol>
<h3 id="important-design-point-1">Important design point</h3>
<p>This route is separate from the Neo4j graph query path. It is driven by local clustering datasets, not the graph database.</p>
<h2 id="workflow-5-persona-retrieval">Workflow 5: Persona Retrieval</h2>
<p>Personas are managed by <a href="backend/app/api/routes/persona.py#L1">backend/app/api/routes/persona.py</a>.</p>
<h3 id="endpoints-1">Endpoints</h3>
<ul>
<li>GET /api/persona/from-cluster/{cluster_id}</li>
<li>GET /api/persona/{code_name}</li>
</ul>
<h3 id="how-from-cluster-works">How from-cluster works</h3>
<ol>
<li>Look up a hardcoded cluster profile.</li>
<li>Build a trait list with slight random jitter.</li>
<li>Pull a representative doctor code from the cluster if available.</li>
<li>Return a synthetic persona response.</li>
</ol>
<h3 id="how-by-name-works">How by-name works</h3>
<ol>
<li>Try to fetch the doctor from Neo4j.</li>
<li>Derive cluster ID from the code name if needed.</li>
<li>If the doctor is missing, synthesize a persona from the hardcoded cluster profile.</li>
<li>If the doctor exists, add doctor metrics like h_index and works_count.</li>
</ol>
<h3 id="important-design-point-2">Important design point</h3>
<p>The cluster personality system is hardcoded. It is not dynamically learned from the current GMM output.</p>
<h2 id="workflow-6-graph-exploration">Workflow 6: Graph Exploration</h2>
<p>Graph querying lives in <a href="backend/app/api/routes/graph.py#L1">backend/app/api/routes/graph.py</a> and <a href="backend/app/services/neo4j_graph.py#L1">backend/app/services/neo4j_graph.py</a>.</p>
<h3 id="endpoints-2">Endpoints</h3>
<ul>
<li>POST /api/graph/ingest</li>
<li>GET /api/graph/doctor/{code_name}</li>
<li>GET /api/graph/cluster/{cluster_id}/doctors</li>
<li>GET /api/graph/institution/{institution_name}/doctors</li>
<li>GET /api/graph/topic/{topic_name}/doctors</li>
<li>GET /api/graph/institutions/summary</li>
<li>GET /api/graph/overlap</li>
</ul>
<h3 id="what-the-graph-stores">What the graph stores</h3>
<ul>
<li>Doctor nodes</li>
<li>Institution nodes</li>
<li>Topic nodes</li>
<li>doctor to institution edges</li>
<li>doctor to topic edges</li>
</ul>
<h3 id="common-graph-use-cases">Common graph use cases</h3>
<ul>
<li>list doctors in a cluster</li>
<li>inspect a single doctor node</li>
<li>list institution doctors</li>
<li>list topic researchers</li>
<li>compute shared topics between two doctors</li>
</ul>
<h2 id="workflow-7-simulation-and-roleplay">Workflow 7: Simulation and Roleplay</h2>
<p>Simulation lives in <a href="backend/app/api/routes/simulation.py#L1">backend/app/api/routes/simulation.py</a> and <a href="backend/app/services/webrtc_simulation.py#L1">backend/app/services/webrtc_simulation.py</a>.</p>
<h3 id="endpoints-3">Endpoints</h3>
<ul>
<li>POST /api/simulation/start</li>
<li>POST /api/simulation/handshake</li>
<li>POST /api/simulation/ice-candidate</li>
<li>POST /api/simulation/turn</li>
<li>GET /api/simulation/cache/{cache_key}</li>
</ul>
<h3 id="simulation-start-flow">Simulation start flow</h3>
<ol>
<li>The client sends persona_id and optional campaign context.</li>
<li>The backend creates a session ID.</li>
<li>It tries to create a D-ID stream.</li>
<li>If D-ID is unavailable, it returns a mock stream offer.</li>
<li>It stores simulation session state in Redis.</li>
<li>It also creates a persistent analytics session in the session store.</li>
</ol>
<h3 id="webrtc-flow">WebRTC flow</h3>
<ul>
<li>handshake sends the SDP answer</li>
<li>ice-candidate sends NAT traversal candidates</li>
</ul>
<h3 id="turn-processing-flow">Turn processing flow</h3>
<p>The turn loop in <a href="backend/app/services/webrtc_simulation.py#L424">backend/app/services/webrtc_simulation.py</a> does this:</p>
<ol>
<li>Load the simulation session from Redis.</li>
<li>Add the user turn to the persistent session store.</li>
<li>Check exact semantic cache in Redis.</li>
<li>Check approximate semantic cache in ChromaDB.</li>
<li>If cache miss, generate a doctor response.</li>
<li>Save the assistant turn.</li>
<li>Estimate emotion metrics and append them to the session timeline.</li>
<li>Optionally send audio to Hume for prosody analysis.</li>
<li>Return the response and cache metadata.</li>
</ol>
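<p>Steps 3 to 5 of the turn loop describe a two-tier cache. The sketch below models that lookup order with a dictionary standing in for Redis and a list of vectors standing in for ChromaDB. The key format, the 0.9 similarity cutoff, the per-persona filtering, and the toy letter-frequency embedding are all hypothetical; the real service uses its configured embedding model.</p>

```python
import hashlib
import math
from typing import Callable, Optional


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TwoTierCache:
    """Exact lookup by hashed key first, then the nearest stored embedding."""

    def __init__(self, embed: Callable[[str], list[float]], min_similarity: float = 0.9):
        self.embed = embed
        self.min_similarity = min_similarity  # assumed cutoff for an approximate hit
        self.exact: dict[str, str] = {}       # stands in for Redis
        self.vectors: list[tuple[str, list[float], str]] = []  # stands in for ChromaDB

    @staticmethod
    def key(persona_id: str, text: str) -> str:
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(f"{persona_id}:{normalized}".encode()).hexdigest()

    def store(self, persona_id: str, text: str, reply: str) -> None:
        self.exact[self.key(persona_id, text)] = reply
        self.vectors.append((persona_id, self.embed(text), reply))

    def lookup(self, persona_id: str, text: str) -> tuple[Optional[str], str]:
        reply = self.exact.get(self.key(persona_id, text))
        if reply is not None:
            return reply, "exact"
        query = self.embed(text)
        candidates = [item for item in self.vectors if item[0] == persona_id]
        best = max(candidates, key=lambda item: cosine(query, item[1]), default=None)
        if best is not None and cosine(query, best[1]) >= self.min_similarity:
            return best[2], "approximate"
        return None, "miss"  # the caller would now generate a fresh response


def toy_embed(text: str) -> list[float]:
    """Letter-frequency vector; a placeholder for a real embedding model."""
    return [float(text.lower().count(ch)) for ch in "abcdefghijklmnopqrstuvwxyz"]


cache = TwoTierCache(toy_embed)
cache.store("dr-a", "What about side effects?", "Tell me about the safety data.")
print(cache.lookup("dr-a", "what about  side effects?"))
print(cache.lookup("dr-a", "What about the side effects"))
print(cache.lookup("dr-a", "zzz qqq"))
```

<p>Checking the exact tier first keeps the common case cheap, since it needs only a hash lookup and no embedding call.</p>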
<h3 id="how-response-generation-works">How response generation works</h3>
<p>If Ollama is configured:</p>
<ul>
<li>build a persona-aware system prompt</li>
<li>attach recent conversation history</li>
<li>call the LLM</li>
</ul>
<p>If Ollama is not configured:</p>
<ul>
<li>return a deterministic fallback physician reply</li>
</ul>
<h3 id="important-implementation-note-1">Important implementation note</h3>
<p>Simulation turn handling does not automatically invoke MOHP. Compliance analysis is a separate endpoint.</p>
<h2 id="workflow-8-mohp-compliance-and-objection-analysis">Workflow 8: MOHP Compliance and Objection Analysis</h2>
<p>MOHP lives in <a href="backend/app/api/routes/mohp.py#L1">backend/app/api/routes/mohp.py</a>.</p>
<h3 id="endpoint-2">Endpoint</h3>
<ul>
<li>POST /api/mohp/evaluate</li>
</ul>
<h3 id="what-happens-3">What happens</h3>
<ol>
<li>Accept session_id, input_text, cluster_id, and optional persona_id.</li>
<li>If Ollama is configured, ask the model to generate structured objections.</li>
<li>Otherwise, use rule-based keyword matching against a hardcoded guideline database.</li>
<li>Add objections to the session store.</li>
<li>Return the objection list.</li>
</ol>
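<p>The rule-based branch in step 3 might look roughly like this. The guideline entries, field names, and substring matching are hypothetical; the actual hardcoded guideline database is not reproduced in this document.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guideline:
    rule_id: str
    keywords: tuple
    objection: str


# Hypothetical guideline entries used only for illustration.
GUIDELINES = (
    Guideline("off-label", ("off-label", "unapproved"), "Claim may promote off-label use."),
    Guideline("superlative", ("best", "safest", "guaranteed"), "Unsupported superlative claim."),
)


def rule_based_objections(input_text: str) -> list[dict]:
    """Return one objection per guideline whose keywords appear in the text."""
    lowered = input_text.lower()
    found = []
    for rule in GUIDELINES:
        # Plain substring matching: simple, but "best" also matches "bestow".
        hits = [word for word in rule.keywords if word in lowered]
        if hits:
            found.append({"rule_id": rule.rule_id, "matched": hits, "objection": rule.objection})
    return found


print(rule_based_objections("This is the safest option, even for off-label use."))
```

<p>Because the matching is deterministic, the same input text always yields the same objections, which makes this path easy to test when no LLM is configured.</p>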
<h3 id="important-design-point-3">Important design point</h3>
<p>MOHP is not inside the simulation turn loop. Clients need to call it separately if they want live compliance analysis.</p>
<h2 id="workflow-9-session-analytics">Workflow 9: Session Analytics</h2>
<p>Analytics lives in <a href="backend/app/api/routes/analytics.py#L1">backend/app/api/routes/analytics.py</a> and <a href="backend/app/services/session_store.py#L1">backend/app/services/session_store.py</a>.</p>
<h3 id="endpoints-4">Endpoints</h3>
<ul>
<li>GET /api/analytics/sessions</li>
<li>GET /api/analytics/session/{session_id}</li>
<li>DELETE /api/analytics/session/{session_id}</li>
</ul>
<h3 id="what-is-stored-for-each-session">What is stored for each session</h3>
<ul>
<li>session_id</li>
<li>persona_id</li>
<li>campaign_id</li>
<li>cluster_id</li>
<li>created_at and ended_at</li>
<li>conversation turns</li>
<li>objections</li>
<li>emotion timeline</li>
<li>adherence_score</li>
<li>total_points and delivered_points</li>
<li>campaign_snapshot</li>
</ul>
<h3 id="how-analytics-are-built">How analytics are built</h3>
<p>The analytics route reads the stored session, resolves duration, resolves score fields, and returns a frontend-ready session detail object.</p>
<h2 id="workflow-10-stats-and-admin-operations">Workflow 10: Stats and Admin Operations</h2>
<h3 id="stats-endpoints">Stats endpoints</h3>
<p>Defined in <a href="backend/app/api/routes/stats.py#L1">backend/app/api/routes/stats.py</a>.</p>
<ul>
<li>GET /api/stats/embedding</li>
<li>GET /api/stats/projection</li>
<li>GET /api/stats/cache</li>
<li>GET /api/stats/dlq</li>
<li>GET /api/stats/outbox</li>
</ul>
<p>These expose model info, projection bridge status, cache counts, DLQ depth, and outbox depth.</p>
<h3 id="admin-endpoints">Admin endpoints</h3>
<p>Defined in <a href="backend/app/api/routes/admin.py#L1">backend/app/api/routes/admin.py</a>.</p>
<ul>
<li>GET /admin/embeddings/status</li>
<li>POST /admin/embeddings/swap</li>
<li>POST /admin/embeddings/reindex</li>
</ul>
<p>These manage the active embedding model and reindex ChromaDB collections if the vector space changes.</p>
<h3 id="embedding-registry">Embedding registry</h3>
<p>Implemented in <a href="backend/app/core/embedder.py#L1">backend/app/core/embedder.py</a>.</p>
<p>Supported embedding backends:</p>
<ul>
<li>built-in ONNX MiniLM</li>
<li>Ollama embedding API</li>
<li>sentence-transformers based models</li>
</ul>
<h2 id="third-party-services-summary">Third-Party Services Summary</h2>
<h3 id="redis-1">Redis</h3>
<p>Used for:</p>
<ul>
<li>session storage</li>
<li>analytics</li>
<li>simulation state</li>
<li>outbox</li>
<li>dead letter queue</li>
<li>semantic cache</li>
<li>event stream transport</li>
</ul>
<h3 id="neo4j-1">Neo4j</h3>
<p>Used for:</p>
<ul>
<li>doctor graph storage</li>
<li>graph exploration queries</li>
<li>doctor-backed persona lookups</li>
</ul>
<h3 id="chromadb-1">ChromaDB</h3>
<p>Used for:</p>
<ul>
<li>campaign memory retrieval</li>
<li>semantic cache retrieval</li>
<li>embedding-backed optimization</li>
</ul>
<h3 id="ollama-cloud">Ollama Cloud</h3>
<p>Used for:</p>
<ul>
<li>campaign vector extraction when configured</li>
<li>campaign optimization</li>
<li>simulation physician responses</li>
<li>MOHP objection generation</li>
</ul>
<p>All outgoing LLM content is scrubbed through <a href="backend/app/core/pii_scrubber.py#L1">backend/app/core/pii_scrubber.py</a>.</p>
<h3 id="d-id">D-ID</h3>
<p>Used for:</p>
<ul>
<li>WebRTC stream creation</li>
<li>SDP handshake exchange</li>
<li>ICE candidate transport</li>
</ul>
<h3 id="hume">Hume</h3>
<p>Used for:</p>
<ul>
<li>prosody and emotion analysis on input audio</li>
</ul>
<h3 id="rabbitmq">RabbitMQ</h3>
<p>Optional transport for outbox publishing. Default behavior uses Redis Streams.</p>
<h2 id="end-to-end-api-call-sequence">End-to-End API Call Sequence</h2>
<p>This is the most practical order of calls for exercising the backend from start to finish.</p>
<h3 id="1-health-check">1. Health check</h3>
<ul>
<li>GET /healthz</li>
</ul>
<h3 id="2-seed-the-graph">2. Seed the graph</h3>
<ul>
<li>POST /api/pipeline/seed</li>
</ul>
<h3 id="3-evaluate-a-campaign">3. Evaluate a campaign</h3>
<ul>
<li>POST /api/strategy/full-evaluate</li>
</ul>
<p>Typical output includes:</p>
<ul>
<li>campaign_vector_12d</li>
<li>campaign_vector_pca</li>
<li>gmm metadata</li>
<li>heatmap ranking</li>
<li>cluster_cards</li>
<li>rejected flag</li>
<li>optimized output if rejected</li>
</ul>
<h3 id="4-get-doctors-for-the-selected-cluster">4. Get doctors for the selected cluster</h3>
<ul>
<li>GET /api/strategy/cluster/{cluster_id}/doctors</li>
</ul>
<h3 id="5-get-a-persona">5. Get a persona</h3>
<ul>
<li>GET /api/persona/{code_name}, or</li>
<li>GET /api/persona/from-cluster/{cluster_id}</li>
</ul>
<h3 id="6-start-simulation">6. Start simulation</h3>
<ul>
<li>POST /api/simulation/start</li>
</ul>
<h3 id="7-complete-handshake">7. Complete handshake</h3>
<ul>
<li>POST /api/simulation/handshake</li>
<li>POST /api/simulation/ice-candidate</li>
</ul>
<h3 id="8-run-conversation-turns">8. Run conversation turns</h3>
<ul>
<li>POST /api/simulation/turn</li>
</ul>
<h3 id="9-run-compliance-analysis-if-needed">9. Run compliance analysis if needed</h3>
<ul>
<li>POST /api/mohp/evaluate</li>
</ul>
<h3 id="10-review-session-analytics">10. Review session analytics</h3>
<ul>
<li>GET /api/analytics/session/{session_id}</li>
</ul>
<h3 id="11-inspect-runtime-status-if-needed">11. Inspect runtime status if needed</h3>
<ul>
<li>GET /api/stats/embedding</li>
<li>GET /api/stats/projection</li>
<li>GET /api/stats/cache</li>
<li>GET /api/stats/dlq</li>
<li>GET /api/stats/outbox</li>
</ul>
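<p>The sequence above can be driven from a small script. The following is a minimal sketch using only the Python standard library. The endpoint paths are taken from the steps above, while the helper names <code>build_url</code> and <code>call</code>, the JSON request and response encoding, and any base URL are assumptions. The optional stats endpoints from step 11 are left out.</p>
<pre><code class="language-python">import json
import urllib.request

# Ordered (method, path template) pairs taken from steps 1 to 10 above.
WORKFLOW = [
    ("GET", "/healthz"),
    ("POST", "/api/pipeline/seed"),
    ("POST", "/api/strategy/full-evaluate"),
    ("GET", "/api/strategy/cluster/{cluster_id}/doctors"),
    ("GET", "/api/persona/{code_name}"),
    ("POST", "/api/simulation/start"),
    ("POST", "/api/simulation/handshake"),
    ("POST", "/api/simulation/ice-candidate"),
    ("POST", "/api/simulation/turn"),
    ("POST", "/api/mohp/evaluate"),
    ("GET", "/api/analytics/session/{session_id}"),
]


def build_url(base_url, template, **params):
    """Fill path parameters such as {cluster_id} and join with the base URL."""
    return base_url.rstrip("/") + template.format(**params)


def call(base_url, method, template, body=None, **params):
    """Send one JSON request and decode the JSON reply (assumes JSON responses)."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        build_url(base_url, template, **params),
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))
</code></pre>
<p>For example, <code>call("http://localhost:8000", "GET", "/healthz")</code> would perform the health check, assuming the backend listens on that address. Request bodies for the POST endpoints are not specified in this section and would have to follow the actual route schemas.</p>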
<h2 id="real-request-flow-summary">Real Request Flow Summary</h2>
<p>The standalone Mermaid source, for viewing directly in Mermaid Preview:</p>
<ul>
<li><a href="backend/diagrams/end_to_end_sequence.mmd">backend/diagrams/end_to_end_sequence.mmd</a></li>
</ul>
<pre><code class="language-mermaid"><div class="mermaid">sequenceDiagram
participant Client
participant API as FastAPI Backend
participant Strategy as Strategy Services
participant Graph as Neo4j
participant Redis as Redis
participant Chroma as ChromaDB
participant Ollama as Ollama
participant DID as D-ID
participant Hume as Hume
Client->>API: GET /healthz
API-->>Client: status ok
Client->>API: POST /api/pipeline/seed
API->>Graph: ingest doctors
Graph-->>API: seeded result
API-->>Client: graph ready
Client->>API: POST /api/strategy/full-evaluate
API->>Strategy: load records and cluster
Strategy->>Ollama: feature extraction or optimization if configured
Strategy->>Chroma: retrieve campaign examples if needed
Strategy-->>API: heatmap and cluster result
API-->>Client: strategy response
Client->>API: GET /api/strategy/cluster/{cluster_id}/doctors
API-->>Client: doctors in chosen segment
Client->>API: GET /api/persona/{code_name}
API->>Graph: doctor lookup
Graph-->>API: doctor data or empty
API-->>Client: persona payload
Client->>API: POST /api/simulation/start
API->>DID: create stream if configured
API->>Redis: store session
API-->>Client: session and stream details
Client->>API: POST /api/simulation/turn
API->>Redis: read and update session
API->>Chroma: semantic cache lookup
API->>Ollama: generate reply if needed
API->>Hume: analyze prosody if audio present
API-->>Client: simulation response
Client->>API: POST /api/mohp/evaluate
API->>Ollama: generate objections if configured
API->>Redis: store objections
API-->>Client: objections
Client->>API: GET /api/analytics/session/{session_id}
API->>Redis: load session analytics
API-->>Client: full review
</div></code></pre>
<h2 id="notable-implementation-details-and-discrepancies">Notable Implementation Details and Discrepancies</h2>
<ol>
<li>Auth routes exist in code but are not mounted in the app.</li>
<li>Strategy scoring uses local datasets and local ML, not the graph database.</li>
<li>Persona cluster profiles are hardcoded rather than learned dynamically.</li>
<li>Simulation turn processing does not automatically call MOHP.</li>
<li>Redis, Neo4j, and Chroma each have graceful degradation modes.</li>
<li>RabbitMQ is optional. Default outbox transport is Redis Streams.</li>
<li>Projection weights are preloaded at startup and can fall back to generated defaults if the weights file is missing.</li>
</ol>
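<p>Item 7 describes a load-or-generate fallback. A minimal sketch of that pattern is shown below, assuming a JSON weights file and a 12 by 2 matrix. The function name <code>load_projection_weights</code>, the file format, the matrix shape, and the seeded random defaults are all hypothetical and are not taken from the codebase.</p>
<pre><code class="language-python">import json
import os
import random


def load_projection_weights(path, rows=12, cols=2, seed=0):
    """Load a weight matrix from JSON, or generate seeded defaults if the file is missing."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as handle:
            return json.load(handle), "file"
    # Fallback: deterministic pseudo-random weights so restarts stay reproducible.
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    return weights, "generated"
</code></pre>
<p>Returning the source alongside the weights lets a stats endpoint report whether the service is running on real or generated weights.</p>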
<h2 id="recommended-mental-model-for-the-project">Recommended Mental Model for the Project</h2>
<p>The backend is best understood as three connected subsystems rather than one monolithic pipeline:</p>
<h3 id="subsystem-1-strategy-intelligence">Subsystem 1: Strategy intelligence</h3>
<ul>
<li>local doctor dataset</li>
<li>campaign vectorization</li>
<li>clustering</li>
<li>distance-based fit scoring</li>
<li>optimization and blueprint generation</li>
</ul>
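<p>To make "distance-based fit scoring" concrete, the sketch below ranks clusters by how close their centroid lies to a campaign vector. The Euclidean metric, the <code>1 / (1 + distance)</code> mapping, and the function name <code>fit_scores</code> are assumptions chosen for illustration; the backend's actual scoring formula is not described in this section.</p>
<pre><code class="language-python">import math


def fit_scores(campaign_vector, centroids):
    """Rank clusters by closeness of their centroid to the campaign vector."""
    scores = {}
    for cluster_id, centroid in centroids.items():
        distance = math.dist(campaign_vector, centroid)
        # Map distance to the range (0, 1]; identical vectors score 1.0.
        scores[cluster_id] = 1.0 / (1.0 + distance)
    # Best-fitting cluster first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
</code></pre>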
<h3 id="subsystem-2-knowledge-graph">Subsystem 2: Knowledge graph</h3>
<ul>
<li>doctor graph persistence in Neo4j</li>
<li>institution and topic exploration</li>
<li>graph-backed doctor lookups</li>
</ul>
<h3 id="subsystem-3-simulation-and-analytics">Subsystem 3: Simulation and analytics</h3>
<ul>
<li>persona-driven simulation sessions</li>
<li>WebRTC integration</li>
<li>caching and session persistence</li>
<li>MOHP objections</li>
<li>analytics and history</li>
</ul>
<p>Together, these subsystems support the full product flow from doctor data setup to campaign evaluation to live roleplay and post-session review.</p>
</body>
</html>