Orsync Scenarist Backend Workflow
Overview
The backend is a FastAPI application that combines three main systems:
- A strategy engine for campaign analysis and optimization.
- A knowledge graph layer for doctor, institution, and topic exploration.
- A simulation and analytics system for persona-based sales training.
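At runtime, the backend can also use Redis, Neo4j, ChromaDB, Ollama, D-ID, Hume, and optionally RabbitMQ. Redis, Neo4j, ChromaDB, Ollama, D-ID, and Hume each have a fallback when unavailable or unconfigured; RabbitMQ is used only when it is selected as the outbox transport.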
High-Level Architecture
Standalone Mermaid source for direct Mermaid Preview:
flowchart LR
A[run.py] --> B[FastAPI app in app/main.py]
subgraph Runtime[Startup and background runtime]
direction TB
C[Startup model preload]
D[Outbox worker thread]
E[Event consumer thread]
end
subgraph Routes[API surface]
direction TB
F[Pipeline routes]
G[Strategy routes]
H[Graph routes]
I[Persona routes]
J[Simulation routes]
K[MOHP routes]
L[Analytics routes]
M[Stats and admin routes]
end
subgraph Services[Core services]
direction TB
N[Redis outbox]
O[Redis Streams or RabbitMQ]
P[Neo4j ingestion]
Q[Campaign vectorizer]
R[GMM clustering]
S[Heatmap scoring]
T[RAG optimizer]
U[Neo4j graph queries]
V[Redis session storage]
W[Ollama responses]
X[D-ID WebRTC stream]
Y[Hume prosody]
Z[Rule-based or LLM objection analysis]
AA[ChromaDB campaign memory]
end
B --> Runtime
B --> Routes
D --> N --> O
E --> P
F --> N
G --> Q
G --> R
G --> S
G --> T --> AA
H --> U
I --> U
J --> V
J --> W
J --> X
J --> Y
K --> Z
L --> V
Main Runtime Entry Points
1. Local bootstrap
The server starts from backend/run.py.
Responsibilities:
- Ensures first-time setup has been run.
- Tries to start embedded Redis if available.
- Tries to start embedded Neo4j if available.
- Tries to start embedded ChromaDB if available.
- Launches the FastAPI app.
Startup flags include:
- --reload
- --no-redis
- --no-neo4j
- --no-chroma
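As an illustration of how these flags could be parsed, here is a minimal sketch using argparse. The function names and help texts are hypothetical and the real backend/run.py may be structured differently; only the four flag names come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the four startup flags listed above."""
    parser = argparse.ArgumentParser(description="Backend launcher (illustrative sketch)")
    parser.add_argument("--reload", action="store_true", help="enable auto-reload (assumed meaning)")
    parser.add_argument("--no-redis", action="store_true", help="skip embedded Redis")
    parser.add_argument("--no-neo4j", action="store_true", help="skip embedded Neo4j")
    parser.add_argument("--no-chroma", action="store_true", help="skip embedded ChromaDB")
    return parser


def services_to_start(args: argparse.Namespace) -> list:
    """Return the embedded services that were not disabled by a flag."""
    enabled = {
        "redis": not args.no_redis,
        "neo4j": not args.no_neo4j,
        "chroma": not args.no_chroma,
    }
    return [name for name, wanted in enabled.items() if wanted]


args = build_parser().parse_args(["--reload", "--no-neo4j"])
print(services_to_start(args))  # ['redis', 'chroma']
```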
2. FastAPI application
The app is created in backend/app/main.py.
The lifespan startup sequence does three important things:
- Preloads the embedding model and projection weights.
- Starts the transactional outbox worker thread.
- Starts the event consumer thread.
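The startup sequence above could be sketched roughly as follows. This is an assumption-laden outline, not the code in backend/app/main.py: preload_models, worker_loop, and the shutdown handling are hypothetical placeholders, and only the three startup steps come from the list above. A context manager of this shape is what FastAPI accepts through its lifespan parameter.

```python
import asyncio
import threading
from contextlib import asynccontextmanager

stop_event = threading.Event()
started = []  # records what ran, for illustration only


def preload_models() -> None:
    # Placeholder for loading the embedding model and projection weights.
    started.append("models")


def worker_loop(name: str) -> None:
    # Placeholder for a polling loop; here it only waits for shutdown.
    started.append(name)
    stop_event.wait()


@asynccontextmanager
async def lifespan(app):
    preload_models()
    threads = [
        threading.Thread(target=worker_loop, args=(name,), daemon=True)
        for name in ("outbox-worker", "event-consumer")
    ]
    for thread in threads:
        thread.start()
    try:
        yield  # the application serves requests while suspended here
    finally:
        stop_event.set()  # assumed shutdown behaviour
        for thread in threads:
            thread.join(timeout=2)


async def demo() -> list:
    started.clear()
    stop_event.clear()
    async with lifespan(app=None):
        await asyncio.sleep(0.05)
    return sorted(started)
```

In a real application the function would be passed as FastAPI(lifespan=lifespan) rather than driven by demo().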
Routers mounted in the app:
- pipeline
- math_engine
- strategy
- simulation
- graph
- analytics
- persona
- mohp
- stats
- admin
Also mounted directly:
- GET /healthz
- GET /
- GET /admin/dlq
- POST /admin/dlq/replay
Important implementation note
There is an auth router in backend/app/api/routes/auth.py, but it is not included in backend/app/main.py. That means auth endpoints exist in code but are not active in the running app.
Configuration and Infrastructure
Configuration is defined in backend/app/core/config.py.
Important settings:
- app_name
- environment
- port
- redis_url
- rabbitmq_url
- neo4j_uri
- neo4j_username
- neo4j_password
- chroma_host
- chroma_port
- outbox_transport
- ollama_host
- ollama_api_key
- ollama_model
- embedding_model
- jwt_secret_key
- hume_api_key
- did_api_key
- cors_allowed_origins
- projection_weights_path
Production validation fails fast if placeholder secrets are still configured.
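A fail-fast check of this kind might look like the sketch below. The placeholder values, the choice of checked fields, and the validate method are assumptions for illustration; the actual rules live in backend/app/core/config.py.

```python
from dataclasses import dataclass

# Hypothetical placeholder values; the real list is not documented here.
PLACEHOLDER_VALUES = {"", "changeme", "change-me", "secret", "your-api-key"}


@dataclass
class Settings:
    environment: str = "development"
    jwt_secret_key: str = "changeme"
    neo4j_password: str = "changeme"

    def validate(self) -> None:
        """Raise if production is configured with placeholder secrets."""
        if self.environment != "production":
            return
        bad = [
            name
            for name in ("jwt_secret_key", "neo4j_password")
            if getattr(self, name).strip().lower() in PLACEHOLDER_VALUES
        ]
        if bad:
            raise ValueError("placeholder secrets in production: " + ", ".join(bad))
```

Calling validate() once at import or startup time means a misconfigured production deployment stops before it serves any request.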
Fallback and Degradation Model
The backend is designed to boot even when some dependencies are missing.
Redis
Defined in backend/app/db/redis_client.py.
If Redis is unavailable, the app falls back to an in-memory stub. This keeps the app running, but sessions, cache, outbox, and analytics become ephemeral.
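The fallback could be sketched as below. InMemoryRedisStub and get_redis are hypothetical names, and the stub implements only a handful of Redis-like methods; the real client in backend/app/db/redis_client.py may cover more commands.

```python
from typing import Callable


class InMemoryRedisStub:
    """Dict-backed stand-in for a few Redis commands; data is lost on restart."""

    def __init__(self) -> None:
        self._kv = {}
        self._hashes = {}

    def ping(self) -> bool:
        return True

    def get(self, key):
        return self._kv.get(key)

    def set(self, key, value) -> bool:
        self._kv[key] = value
        return True

    def hset(self, name, field, value) -> int:
        bucket = self._hashes.setdefault(name, {})
        is_new = field not in bucket
        bucket[field] = value
        return int(is_new)

    def hgetall(self, name) -> dict:
        return dict(self._hashes.get(name, {}))


def get_redis(connect: Callable[[], object]):
    """Return a live client if it answers ping, otherwise the in-memory stub."""
    try:
        client = connect()
        client.ping()
        return client
    except Exception:
        return InMemoryRedisStub()
```

Because callers see the same method names either way, the rest of the app does not need to know which client it received.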
Neo4j
Defined in backend/app/db/neo4j_client.py.
If Neo4j is unavailable, the app falls back to a no-op driver. Graph queries then return empty results.
ChromaDB
Defined in backend/app/db/chroma_client.py.
If ChromaDB is unavailable, a temporary no-op client is used. Optimization and semantic retrieval still work in degraded mode but without real vector search.
Ollama
Defined in backend/app/core/llm_client.py.
If an Ollama API key is not set, the backend switches several features to deterministic fallback behavior:
- campaign feature extraction uses heuristics
- campaign optimization uses heuristic rewrite logic
- MOHP uses rule-based objections
- simulation replies use a fallback text response instead of an LLM
D-ID and Hume
Defined in backend/app/services/webrtc_simulation.py.
- Without D-ID credentials, simulation start returns a mock stream.
- Without Hume credentials, prosody analysis is skipped.
Core Backend Workflows
Workflow 1: Data Seeding and Graph Setup
This is the first operational step when bringing the backend online.
Endpoint
- POST /api/pipeline/seed
Code path
backend/app/api/routes/pipeline.py
What happens
- The route looks for doctors_unified.json in the gold data locations.
- It loads the doctor records from disk.
- It calls ingest_doctors from backend/app/services/neo4j_graph.py.
- Neo4j schema constraints are created if missing.
- Doctor nodes are merged.
- Institution nodes are merged.
- Topic nodes are merged.
- Relationships are created:
  - Doctor -> Institution via AFFILIATED_WITH
  - Doctor -> Topic via RESEARCHES
Result
The knowledge graph is now queryable by graph and persona endpoints.
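The merge steps above might translate into Cypher along these lines. This is a sketch under assumptions: the property names (code_name, name), the record fields (institution, topics), and the to_rows helper are hypothetical, and the statements in backend/app/services/neo4j_graph.py may differ. Only the labels and relationship types come from the description above.

```python
# Uniqueness constraints, created only if missing.
CONSTRAINTS = [
    "CREATE CONSTRAINT doctor_code IF NOT EXISTS FOR (d:Doctor) REQUIRE d.code_name IS UNIQUE",
    "CREATE CONSTRAINT institution_name IF NOT EXISTS FOR (i:Institution) REQUIRE i.name IS UNIQUE",
    "CREATE CONSTRAINT topic_name IF NOT EXISTS FOR (t:Topic) REQUIRE t.name IS UNIQUE",
]

# One batched statement that merges nodes and both relationship types.
MERGE_DOCTORS = """
UNWIND $rows AS row
MERGE (d:Doctor {code_name: row.code_name})
FOREACH (name IN row.institutions |
  MERGE (i:Institution {name: name})
  MERGE (d)-[:AFFILIATED_WITH]->(i))
FOREACH (name IN row.topics |
  MERGE (t:Topic {name: name})
  MERGE (d)-[:RESEARCHES]->(t))
"""


def to_rows(records: list) -> list:
    """Shape raw doctor records into the $rows parameter used above."""
    rows = []
    for record in records:
        code_name = record.get("code_name")
        if not code_name:
            continue  # skip records that cannot be keyed
        institution = record.get("institution")
        rows.append({
            "code_name": code_name,
            "institutions": [institution] if institution else [],
            "topics": sorted(set(record.get("topics") or [])),
        })
    return rows
```

With the official Neo4j Python driver, the rows would be passed as a query parameter, for example session.run(MERGE_DOCTORS, rows=to_rows(records)). Using MERGE rather than CREATE keeps re-seeding idempotent.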
Workflow 2: Transactional Outbox and Async Ingestion
This is the event-driven ingestion path.
Endpoints
- POST /api/pipeline/ingest
- POST /api/pipeline/dispatch
Code path
What happens
- A client posts ingestion payload to /api/pipeline/ingest.
- The route adds an event to the Redis-backed outbox hash.
- The background outbox worker running from backend/app/main.py polls the outbox.
- When an event is due, it publishes the event to:
  - Redis Streams by default
  - RabbitMQ if AMQP transport is configured
- The consumer thread reads events from the stream.
- The gold.ingest handler sends the records into Neo4j.
Reliability model
- publish-first dispatch pattern
- idempotency tracking
- retry with exponential backoff
- dead letter queue after repeated failures
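The reliability model can be sketched in memory as follows. The class and constant names, the retry limit, and the base delay are hypothetical, and the sketch reads "publish-first" as "an event leaves the outbox only after a successful publish"; the real worker stores its state in Redis rather than in Python dictionaries.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 3   # hypothetical retry limit
BASE_DELAY = 1.0   # hypothetical base delay in seconds


@dataclass
class Event:
    event_id: str
    payload: dict
    attempts: int = 0
    due_at: float = 0.0


class Outbox:
    def __init__(self, publish) -> None:
        self.publish = publish        # callable that sends one event to the transport
        self.pending = {}             # event_id -> Event awaiting dispatch
        self.published_ids = set()    # idempotency tracking
        self.dead_letters = []        # events that exhausted their retries

    def add(self, event: Event) -> bool:
        """Queue an event unless its ID is already pending or published."""
        if event.event_id in self.published_ids or event.event_id in self.pending:
            return False
        self.pending[event.event_id] = event
        return True

    def dispatch_due(self, now: float) -> None:
        """Publish every due event; back off or dead-letter on failure."""
        for event in list(self.pending.values()):
            if event.due_at > now:
                continue
            try:
                self.publish(event)
            except Exception:
                event.attempts += 1
                if event.attempts >= MAX_ATTEMPTS:
                    self.dead_letters.append(self.pending.pop(event.event_id))
                else:
                    # Delay doubles with each failed attempt: 1s, 2s, 4s, ...
                    event.due_at = now + BASE_DELAY * 2 ** (event.attempts - 1)
                continue
            self.published_ids.add(event.event_id)
            del self.pending[event.event_id]
```

Passing now explicitly keeps the retry schedule deterministic, which makes the backoff easy to reason about and test.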
Workflow 3: Campaign Strategy Evaluation
This is the main analytical workflow.
Primary endpoint
- POST /api/strategy/full-evaluate
Code path
backend/app/api/routes/strategy.py
Step-by-step flow
- The route loads doctor records from local gold JSON.
- If the local dataset is missing or too small, it creates a synthetic seed population.
- It runs GMM clustering using backend/app/services/gmm_engine.py.
- It extracts a 12-feature campaign vector using backend/app/services/campaign_vectorizer.py.
- It projects the campaign into the same PCA subspace used during clustering.
- It computes Mahalanobis distances using backend/app/services/heatmap.py.
- It ranks clusters by fit and builds cluster cards.
- It checks whether the best fit distance exceeds the rejection threshold.
- If rejected, it attempts campaign optimization using backend/app/services/rag_optimizer.py.
Important design point
The strategy pipeline does not depend on Neo4j for scoring. It operates on local datasets and local ML logic.
Strategy endpoint family
Defined in backend/app/api/routes/strategy.py.
- POST /api/strategy/vectorize
- POST /api/strategy/heatmap
- POST /api/strategy/optimize
- POST /api/strategy/evaluate
- POST /api/strategy/memory/store
- POST /api/strategy/full-evaluate
- GET /api/strategy/blueprint/{segment_id}
- POST /api/strategy/blueprint/{segment_id}
- GET /api/strategy/cluster/{cluster_id}/doctors
How campaign vectorization works
Implemented in backend/app/services/campaign_vectorizer.py.
The service works in two modes:
Heuristic mode
If no Ollama API key is configured:
- tokenizes the text
- counts keyword hints for each feature
- computes feature signals
- normalizes them into a 12-feature vector
- also generates an embedding
LLM-assisted mode
If Ollama is configured:
- asks the model to produce a strict JSON object with 12 feature values
- validates and normalizes those values
- falls back to heuristics if parsing fails
Feature keys
- therapeutic_focus
- messaging_tone
- target_seniority
- channel_preference
- kol_alignment
- trial_phase_relevance
- formulary_impact
- patient_population_size
- competitive_positioning
- regulatory_stage
- budget_tier
- urgency_score
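The two modes could be sketched like this. The keyword hints, the max-based normalization, and the clamping to the range 0 to 1 are assumptions made for illustration; the feature keys are the ones listed above. Embedding generation is omitted.

```python
import json
import re

FEATURE_KEYS = [
    "therapeutic_focus", "messaging_tone", "target_seniority", "channel_preference",
    "kol_alignment", "trial_phase_relevance", "formulary_impact", "patient_population_size",
    "competitive_positioning", "regulatory_stage", "budget_tier", "urgency_score",
]

# Hypothetical keyword hints; the real service defines hints for every feature.
KEYWORD_HINTS = {
    "urgency_score": {"urgent", "now", "immediately", "deadline"},
    "kol_alignment": {"kol", "expert", "opinion", "leader"},
    "trial_phase_relevance": {"trial", "phase", "study"},
}


def heuristic_vector(text: str) -> dict:
    """Count keyword hits per feature and scale by the largest count."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = {
        key: sum(token in KEYWORD_HINTS.get(key, ()) for token in tokens)
        for key in FEATURE_KEYS
    }
    peak = max(counts.values()) or 1  # avoid dividing by zero when nothing matches
    return {key: counts[key] / peak for key in FEATURE_KEYS}


def parse_llm_vector(raw: str, text: str) -> dict:
    """Validate the model's JSON; fall back to heuristics if it is unusable."""
    try:
        data = json.loads(raw)
        return {key: min(1.0, max(0.0, float(data[key]))) for key in FEATURE_KEYS}
    except (ValueError, KeyError, TypeError):
        return heuristic_vector(text)
```

Requiring every key and clamping each value means a partially valid model response is discarded rather than mixed with heuristic values.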
How doctor clustering works
Implemented in backend/app/services/gmm_engine.py.
Pipeline:
- Extract numeric columns from doctor records.
- Apply robust scaling.
- Run PCA and keep enough components to explain about 90 percent of variance.
- Select the best cluster count using BIC.
- Fit a Gaussian Mixture Model.
- Return:
  - centroids
  - covariance matrices
  - cluster probabilities
  - cluster assignments
  - 2D member points for visualization
  - PCA transform metadata
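The two selection rules in this pipeline, the variance cutoff and the BIC comparison, can be illustrated in isolation. The sketch below assumes full covariance matrices when counting parameters and takes explained-variance ratios and log-likelihoods as inputs rather than fitting anything; all function names are hypothetical.

```python
import math


def components_for_variance(explained_ratios: list, target: float = 0.90) -> int:
    """Smallest number of leading components whose variance ratios reach target."""
    total = 0.0
    for count, ratio in enumerate(explained_ratios, start=1):
        total += ratio
        if total >= target:
            return count
    return len(explained_ratios)


def gmm_param_count(k: int, dims: int) -> int:
    """Free parameters of a k-component GMM with full covariances (assumed)."""
    means = k * dims
    covariances = k * dims * (dims + 1) // 2
    weights = k - 1
    return means + covariances + weights


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def best_cluster_count(log_likelihoods: dict, dims: int, n_samples: int) -> int:
    """Pick the candidate k with the lowest BIC."""
    return min(
        log_likelihoods,
        key=lambda k: bic(log_likelihoods[k], gmm_param_count(k, dims), n_samples),
    )


print(components_for_variance([0.5, 0.3, 0.15, 0.05]))  # 3
print(best_cluster_count({2: -500.0, 3: -420.0, 4: -415.0}, dims=2, n_samples=200))  # 3
```

In the second example, four clusters fit slightly better than three, but the extra parameters cost more than the likelihood gain, so BIC prefers three.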
How heatmap scoring works
Implemented in backend/app/services/heatmap.py.
For each cluster:
- compute the Mahalanobis distance between the projected campaign vector and the cluster centroid, using that cluster's covariance matrix
- normalize the inverse distances into probabilities
- sort clusters by ascending distance
The best cluster is the closest cluster. A campaign is rejected when that best distance is above the configured rejection threshold.
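A dependency-free sketch of this scoring step follows. The function names, the epsilon guard, and the result shape are assumptions; the real service in backend/app/services/heatmap.py may use a numerical library instead of the small linear solver shown here.

```python
import math


def solve(matrix: list, rhs: list) -> list:
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mahalanobis(point: list, centroid: list, covariance: list) -> float:
    """sqrt(diff^T * inverse(covariance) * diff), without forming the inverse."""
    diff = [p - c for p, c in zip(point, centroid)]
    return math.sqrt(sum(d * s for d, s in zip(diff, solve(covariance, diff))))


def rank_clusters(point: list, clusters: dict, reject_above: float, eps: float = 1e-9) -> dict:
    """Rank clusters by distance and flag rejection against a threshold."""
    distances = {
        cid: mahalanobis(point, c["centroid"], c["covariance"])
        for cid, c in clusters.items()
    }
    inverse = {cid: 1.0 / (d + eps) for cid, d in distances.items()}
    total = sum(inverse.values())
    ranking = sorted(
        (
            {"cluster_id": cid, "distance": distances[cid], "probability": inverse[cid] / total}
            for cid in distances
        ),
        key=lambda row: row["distance"],
    )
    return {"ranking": ranking, "rejected": ranking[0]["distance"] > reject_above}
```

For example, with a unit-covariance cluster at the origin and a cluster at (3, 4) with covariance 4 times the identity, the point (3, 0) is at distance 3 from the first and 2 from the second, so the second cluster ranks first even though its centroid is farther away in Euclidean terms.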
How campaign optimization works
Implemented in backend/app/services/rag_optimizer.py.
- Retrieve similar campaigns from ChromaDB campaign memory.
- Prefer successful examples.
- Build a prompt with target cluster context and retrieved examples.
- Ask the model to rewrite the campaign.
- Re-vectorize the optimized output.
- Compute feature gaps against the target profile.
- Apply deterministic rewrite guidance.
- Return an optimized text with improvement notes.
Campaign memory
The endpoint POST /api/strategy/memory/store stores campaign embeddings in ChromaDB so future optimizations can retrieve similar examples.
Workflow 4: Cluster Doctors and Segment Targeting
This is how the frontend gets doctors for a selected strategy segment.
Endpoint
- GET /api/strategy/cluster/{cluster_id}/doctors
Code path
backend/app/api/routes/strategy.py
What happens
- The route first attempts to cluster the gold JSON doctor dataset.
- If gold data is unavailable, it falls back to the bronze master CSV.
- It returns frontend-friendly doctor rows for the selected cluster.
- It can optionally filter by region.
Important design point
This route is separate from the Neo4j graph query path. It is driven by local clustering datasets, not the graph database.
Workflow 5: Persona Retrieval
Personas are managed by backend/app/api/routes/persona.py.
Endpoints
- GET /api/persona/from-cluster/{cluster_id}
- GET /api/persona/{code_name}
How from-cluster works
- Look up a hardcoded cluster profile.
- Build a trait list with slight random jitter.
- Pull a representative doctor code from the cluster if available.
- Return a synthetic persona response.
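A sketch of the from-cluster path, assuming a profile table keyed by cluster ID. The profile contents, trait names, jitter size, and response fields are all hypothetical; only the sequence of steps follows the list above.

```python
import random

# Hypothetical hardcoded profiles; the real table lives in the persona route module.
CLUSTER_PROFILES = {
    0: {"label": "Evidence-driven skeptic", "traits": {"skepticism": 0.8, "openness": 0.3}},
    1: {"label": "Early adopter", "traits": {"skepticism": 0.2, "openness": 0.9}},
}


def persona_from_cluster(cluster_id: int, representative_code=None, rng=None, jitter: float = 0.05) -> dict:
    """Build a synthetic persona by jittering the cluster's base trait scores."""
    rng = rng or random.Random()
    profile = CLUSTER_PROFILES[cluster_id]
    traits = [
        {
            "name": name,
            # Jitter is uniform in [-jitter, +jitter], then clamped to [0, 1].
            "score": round(min(1.0, max(0.0, base + rng.uniform(-jitter, jitter))), 3),
        }
        for name, base in profile["traits"].items()
    ]
    return {
        "cluster_id": cluster_id,
        "label": profile["label"],
        "code_name": representative_code,
        "traits": traits,
        "synthetic": True,
    }
```

Accepting an optional rng makes the jitter reproducible, for instance persona_from_cluster(0, rng=random.Random(7)).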
How by-name works
- Try to fetch the doctor from Neo4j.
- Derive cluster ID from the code name if needed.
- If the doctor is missing, synthesize a persona from the hardcoded cluster profile.
- If the doctor exists, add doctor metrics like h_index and works_count.
Important design point
The cluster personality system is hardcoded. It is not dynamically learned from the current GMM output.
Workflow 6: Graph Exploration
Graph querying lives in backend/app/api/routes/graph.py and backend/app/services/neo4j_graph.py.
Endpoints
- POST /api/graph/ingest
- GET /api/graph/doctor/{code_name}
- GET /api/graph/cluster/{cluster_id}/doctors
- GET /api/graph/institution/{institution_name}/doctors
- GET /api/graph/topic/{topic_name}/doctors
- GET /api/graph/institutions/summary
- GET /api/graph/overlap
What the graph stores
- Doctor nodes
- Institution nodes
- Topic nodes
- doctor to institution edges
- doctor to topic edges
Common graph use cases
- list doctors in a cluster
- inspect a single doctor node
- list institution doctors
- list topic researchers
- compute shared topics between two doctors
Workflow 7: Simulation and Roleplay
Simulation lives in backend/app/api/routes/simulation.py and backend/app/services/webrtc_simulation.py.
Endpoints
- POST /api/simulation/start
- POST /api/simulation/handshake
- POST /api/simulation/ice-candidate
- POST /api/simulation/turn
- GET /api/simulation/cache/{cache_key}
Simulation start flow
- The client sends persona_id and optional campaign context.
- The backend creates a session ID.
- It tries to create a D-ID stream.
- If D-ID is unavailable, it returns a mock stream offer.
- It stores simulation session state in Redis.
- It also creates a persistent analytics session in the session store.
WebRTC flow
- handshake forwards the client's SDP answer to D-ID
- ice-candidate forwards the client's ICE candidates, which are used for NAT traversal
Turn processing flow
The turn loop in backend/app/services/webrtc_simulation.py does this:
- Load the simulation session from Redis.
- Add the user turn to the persistent session store.
- Check the exact-match semantic cache in Redis.
- Check the approximate (similarity-based) semantic cache in ChromaDB.
- If cache miss, generate a doctor response.
- Save the assistant turn.
- Estimate emotion metrics and append them to the session timeline.
- Optionally send audio to Hume for prosody analysis.
- Return the response and cache metadata.
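The two cache checks in the turn loop could be sketched as below, with plain Python containers standing in for Redis and ChromaDB. The key scheme, the similarity threshold, and the function names are assumptions, not the actual implementation.

```python
import hashlib
import math


def cache_key(persona_id: str, text: str) -> str:
    """Exact-match key: persona plus whitespace- and case-normalized text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(f"{persona_id}:{normalized}".encode()).hexdigest()


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lookup(persona_id: str, text: str, embedding: list,
           exact_cache: dict, vector_cache: list, threshold: float = 0.92) -> tuple:
    """Return (reply, source) where source is 'exact', 'semantic', or 'miss'."""
    key = cache_key(persona_id, text)
    if key in exact_cache:
        return exact_cache[key], "exact"
    best = max(vector_cache, key=lambda entry: cosine(embedding, entry["embedding"]), default=None)
    if best is not None and cosine(embedding, best["embedding"]) >= threshold:
        return best["reply"], "semantic"
    return None, "miss"
```

Checking the exact tier first avoids computing similarities for repeated inputs; only a miss on both tiers would trigger response generation.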
How response generation works
If Ollama is configured:
- build a persona-aware system prompt
- attach recent conversation history
- call the LLM
If Ollama is not configured:
- return a deterministic fallback physician reply
Important implementation note
Simulation turn handling does not automatically invoke MOHP. Compliance analysis is a separate endpoint.
Workflow 8: MOHP Compliance and Objection Analysis
MOHP lives in backend/app/api/routes/mohp.py.
Endpoint
- POST /api/mohp/evaluate
What happens
- Accept session_id, input_text, cluster_id, and optional persona_id.
- If Ollama is configured, ask the model to generate structured objections.
- Otherwise, use rule-based keyword matching against a hardcoded guideline database.
- Add objections to the session store.
- Return the objection list.
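The rule-based branch might work along the lines of this sketch. The guideline entries, keyword sets, and result fields are invented for illustration, and the llm parameter is a hypothetical stand-in for the Ollama-backed path.

```python
import re

# Hypothetical guideline entries; the real hardcoded database is not shown here.
GUIDELINES = [
    {"id": "off-label", "keywords": {"off-label", "unapproved"},
     "objection": "Promotion outside the approved indication is not permitted."},
    {"id": "superlative", "keywords": {"best", "safest", "guaranteed"},
     "objection": "Absolute or superlative claims require supporting evidence."},
]


def rule_based_objections(input_text: str) -> list:
    """Return one objection per guideline whose keywords appear in the text."""
    words = set(re.findall(r"[a-z\-]+", input_text.lower()))
    return [
        {"guideline_id": g["id"], "objection": g["objection"], "matched": sorted(words & g["keywords"])}
        for g in GUIDELINES
        if words & g["keywords"]
    ]


def evaluate(session_id: str, input_text: str, session_store: dict, llm=None) -> list:
    """Use the LLM callable if provided, else rules; record results on the session."""
    objections = llm(input_text) if llm else rule_based_objections(input_text)
    session_store.setdefault(session_id, []).extend(objections)
    return objections
```

For example, evaluate("s1", "This is the safest, guaranteed option", {}) returns a single objection for the "superlative" guideline with both matched keywords listed.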
Important design point
MOHP is not inside the simulation turn loop. Clients need to call it separately if they want live compliance analysis.
Workflow 9: Session Analytics
Analytics lives in backend/app/api/routes/analytics.py and backend/app/services/session_store.py.
Endpoints
- GET /api/analytics/sessions
- GET /api/analytics/session/{session_id}
- DELETE /api/analytics/session/{session_id}
What is stored for each session
- session_id
- persona_id
- campaign_id
- cluster_id
- created_at and ended_at
- conversation turns
- objections
- emotion timeline
- adherence_score
- total_points and delivered_points
- campaign_snapshot
How analytics are built
The analytics route reads the stored session, resolves duration, resolves score fields, and returns a frontend-ready session detail object.
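How the route resolves duration and score is not spelled out above, so the following is only one plausible reading: timestamps are assumed to be ISO 8601 strings, an open session is measured up to the current time, and a missing adherence_score is derived from delivered_points over total_points. The function name and output fields are hypothetical.

```python
from datetime import datetime, timezone


def session_detail(session: dict, now=None) -> dict:
    """Build a frontend-ready summary from a stored session record."""
    created = datetime.fromisoformat(session["created_at"])
    ended_raw = session.get("ended_at")
    # Sessions without ended_at are treated as still running (assumption).
    ended = datetime.fromisoformat(ended_raw) if ended_raw else (now or datetime.now(timezone.utc))

    score = session.get("adherence_score")
    if score is None:
        total = session.get("total_points") or 0
        delivered = session.get("delivered_points") or 0
        score = delivered / total if total else 0.0

    return {
        "session_id": session["session_id"],
        "duration_seconds": (ended - created).total_seconds(),
        "adherence_score": round(score, 3),
        "turn_count": len(session.get("turns", [])),
        "objection_count": len(session.get("objections", [])),
    }
```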
Workflow 10: Stats and Admin Operations
Stats endpoints
Defined in backend/app/api/routes/stats.py.
- GET /api/stats/embedding
- GET /api/stats/projection
- GET /api/stats/cache
- GET /api/stats/dlq
- GET /api/stats/outbox
These expose model info, projection bridge status, cache counts, DLQ depth, and outbox depth.
Admin endpoints
Defined in backend/app/api/routes/admin.py.
- GET /admin/embeddings/status
- POST /admin/embeddings/swap
- POST /admin/embeddings/reindex
These manage the active embedding model and reindex ChromaDB collections if the vector space changes.
Embedding registry
Implemented in backend/app/core/embedder.py.
Supported embedding backends:
- built-in ONNX MiniLM
- Ollama embedding API
- sentence-transformers based models
Third-Party Services Summary
Redis
Used for:
- session storage
- analytics
- simulation state
- outbox
- dead letter queue
- semantic cache
- event stream transport
Neo4j
Used for:
- doctor graph storage
- graph exploration queries
- doctor-backed persona lookups
ChromaDB
Used for:
- campaign memory retrieval
- semantic cache retrieval
- embedding-backed optimization
Ollama Cloud
Used for:
- campaign vector extraction when configured
- campaign optimization
- simulation physician responses
- MOHP objection generation
All outgoing LLM content is scrubbed through backend/app/core/pii_scrubber.py.
D-ID
Used for:
- WebRTC stream creation
- SDP handshake exchange
- ICE candidate transport
Hume
Used for:
- prosody and emotion analysis on input audio
RabbitMQ
Optional transport for outbox publishing. Default behavior uses Redis Streams.
End-to-End API Call Sequence
This is the most practical sequence for exercising the backend from start to finish.
1. Health check
- GET /healthz
2. Seed the graph
- POST /api/pipeline/seed
3. Evaluate a campaign
- POST /api/strategy/full-evaluate
Typical output includes:
- campaign_vector_12d
- campaign_vector_pca
- gmm metadata
- heatmap ranking
- cluster_cards
- rejected flag
- optimized output if rejected
4. Get doctors for the selected cluster
- GET /api/strategy/cluster/{cluster_id}/doctors
5. Get a persona
- GET /api/persona/{code_name}
or
- GET /api/persona/from-cluster/{cluster_id}
6. Start simulation
- POST /api/simulation/start
7. Complete handshake
- POST /api/simulation/handshake
- POST /api/simulation/ice-candidate
8. Run conversation turns
- POST /api/simulation/turn
9. Run compliance analysis if needed
- POST /api/mohp/evaluate
10. Review session analytics
- GET /api/analytics/session/{session_id}
11. Inspect runtime status if needed
- GET /api/stats/embedding
- GET /api/stats/projection
- GET /api/stats/cache
- GET /api/stats/dlq
- GET /api/stats/outbox
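For scripting the main sequence, the ordered calls can be captured as data and expanded for a given cluster and session. This sketch only builds the method and URL pairs; request bodies are omitted because their schemas are not described here, and the default base URL is an assumption (the actual port comes from configuration).

```python
# Ordered (method, path template) pairs for the main flow described above.
FLOW = [
    ("GET", "/healthz"),
    ("POST", "/api/pipeline/seed"),
    ("POST", "/api/strategy/full-evaluate"),
    ("GET", "/api/strategy/cluster/{cluster_id}/doctors"),
    ("GET", "/api/persona/from-cluster/{cluster_id}"),
    ("POST", "/api/simulation/start"),
    ("POST", "/api/simulation/handshake"),
    ("POST", "/api/simulation/ice-candidate"),
    ("POST", "/api/simulation/turn"),
    ("POST", "/api/mohp/evaluate"),
    ("GET", "/api/analytics/session/{session_id}"),
]


def plan(cluster_id, session_id, base_url: str = "http://localhost:8000") -> list:
    """Expand the path templates into concrete (method, url) steps."""
    return [
        (method, base_url + path.format(cluster_id=cluster_id, session_id=session_id))
        for method, path in FLOW
    ]


for method, url in plan(cluster_id=2, session_id="abc"):
    print(method, url)
```

In practice the cluster ID comes from the strategy response and the session ID from the simulation start response, so a real client would fill them in as the flow progresses rather than up front.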
Real Request Flow Summary
Standalone Mermaid source for direct Mermaid Preview:
sequenceDiagram
participant Client
participant API as FastAPI Backend
participant Strategy as Strategy Services
participant Graph as Neo4j
participant Redis as Redis
participant Chroma as ChromaDB
participant Ollama as Ollama
participant DID as D-ID
participant Hume as Hume
Client->>API: GET /healthz
API-->>Client: status ok
Client->>API: POST /api/pipeline/seed
API->>Graph: ingest doctors
Graph-->>API: seeded result
API-->>Client: graph ready
Client->>API: POST /api/strategy/full-evaluate
API->>Strategy: load records and cluster
Strategy->>Ollama: feature extraction or optimization if configured
Strategy->>Chroma: retrieve campaign examples if needed
Strategy-->>API: heatmap and cluster result
API-->>Client: strategy response
Client->>API: GET /api/strategy/cluster/{cluster_id}/doctors
API-->>Client: doctors in chosen segment
Client->>API: GET /api/persona/{code_name}
API->>Graph: doctor lookup
Graph-->>API: doctor data or empty
API-->>Client: persona payload
Client->>API: POST /api/simulation/start
API->>DID: create stream if configured
API->>Redis: store session
API-->>Client: session and stream details
Client->>API: POST /api/simulation/turn
API->>Redis: read and update session
API->>Chroma: semantic cache lookup
API->>Ollama: generate reply if needed
API->>Hume: analyze prosody if audio present
API-->>Client: simulation response
Client->>API: POST /api/mohp/evaluate
API->>Ollama: generate objections if configured
API->>Redis: store objections
API-->>Client: objections
Client->>API: GET /api/analytics/session/{session_id}
API->>Redis: load session analytics
API-->>Client: full review
Notable Implementation Details and Discrepancies
- Auth routes exist in code but are not mounted in the app.
- Strategy scoring uses local datasets and local ML, not the graph database.
- Persona cluster profiles are hardcoded rather than learned dynamically.
- Simulation turn processing does not automatically call MOHP.
- Redis, Neo4j, and ChromaDB each have graceful degradation modes.
- RabbitMQ is optional. Default outbox transport is Redis Streams.
- Projection weights are preloaded at startup and can fall back to generated defaults if the weights file is missing.
Recommended Mental Model for the Project
The backend is best understood as three connected subsystems rather than one monolithic pipeline:
Subsystem 1: Strategy intelligence
- local doctor dataset
- campaign vectorization
- clustering
- distance-based fit scoring
- optimization and blueprint generation
Subsystem 2: Knowledge graph
- doctor graph persistence in Neo4j
- institution and topic exploration
- graph-backed doctor lookups
Subsystem 3: Simulation and analytics
- persona-driven simulation sessions
- WebRTC integration
- caching and session persistence
- MOHP objections
- analytics and history
Together, these subsystems support the full product flow from doctor data setup to campaign evaluation to live roleplay and post-session review.