Orsync Scenarist Backend Workflow

Overview

The backend is a FastAPI application that combines three main systems:

  1. A strategy engine for campaign analysis and optimization.
  2. A knowledge graph layer for doctor, institution, and topic exploration.
  3. A simulation and analytics system for persona-based sales training.

At runtime, the backend can also use Redis, Neo4j, ChromaDB, Ollama, D-ID, Hume, and optionally RabbitMQ. Some of these are optional and have fallbacks.

High-Level Architecture

Standalone Mermaid source for direct Mermaid Preview:

flowchart LR A[run.py] --> B[FastAPI app in app/main.py] subgraph Runtime[Startup and background runtime] direction TB C[Startup model preload] D[Outbox worker thread] E[Event consumer thread] end subgraph Routes[API surface] direction TB F[Pipeline routes] G[Strategy routes] H[Graph routes] I[Persona routes] J[Simulation routes] K[MOHP routes] L[Analytics routes] M[Stats and admin routes] end subgraph Services[Core services] direction TB N[Redis outbox] O[Redis Streams or RabbitMQ] P[Neo4j ingestion] Q[Campaign vectorizer] R[GMM clustering] S[Heatmap scoring] T[RAG optimizer] U[Neo4j graph queries] V[Redis session storage] W[Ollama responses] X[D-ID WebRTC stream] Y[Hume prosody] Z[Rule-based or LLM objection analysis] AA[ChromaDB campaign memory] end B --> Runtime B --> Routes D --> N --> O E --> P F --> N G --> Q G --> R G --> S G --> T --> AA H --> U I --> U J --> V J --> W J --> X J --> Y K --> Z L --> V

Main Runtime Entry Points

1. Local bootstrap

The server starts from backend/run.py.

Responsibilities:

Startup flags include:

2. FastAPI application

The app is created in backend/app/main.py.

The lifespan startup sequence does three important things:

  1. Preloads the embedding model and projection weights.
  2. Starts the transactional outbox worker thread.
  3. Starts the event consumer thread.

Routers mounted in the app:

Also mounted directly:

Important implementation note

There is an auth router in backend/app/api/routes/auth.py, but it is not included in backend/app/main.py. That means auth endpoints exist in code but are not active in the running app.

Configuration and Infrastructure

Configuration is defined in backend/app/core/config.py.

Important settings:

Production validation fails fast if placeholder secrets are still configured.

Fallback and Degradation Model

The backend is designed to boot even when some dependencies are missing.

Redis

Defined in backend/app/db/redis_client.py.

If Redis is unavailable, the app falls back to an in-memory stub. This keeps the app running, but sessions, cache, outbox, and analytics become ephemeral.

Neo4j

Defined in backend/app/db/neo4j_client.py.

If Neo4j is unavailable, the app falls back to a no-op driver. Graph queries then return empty results.

ChromaDB

Defined in backend/app/db/chroma_client.py.

If ChromaDB is unavailable, a temporary no-op client is used. Optimization and semantic retrieval still work in degraded mode but without real vector search.

Ollama

Defined in backend/app/core/llm_client.py.

If an Ollama API key is not set, the backend switches several features to deterministic fallback behavior:

D-ID and Hume

Defined in backend/app/services/webrtc_simulation.py.

Core Backend Workflows

Workflow 1: Data Seeding and Graph Setup

This is the first operational step when bringing the backend online.

Endpoint

Code path

backend/app/api/routes/pipeline.py

What happens

  1. The route looks for doctors_unified.json in the gold data locations.
  2. It loads the doctor records from disk.
  3. It calls ingest_doctors from backend/app/services/neo4j_graph.py.
  4. Neo4j schema constraints are created if missing.
  5. Doctor nodes are merged.
  6. Institution nodes are merged.
  7. Topic nodes are merged.
  8. Relationships are created:

Result

The knowledge graph is now queryable by graph and persona endpoints.

Workflow 2: Transactional Outbox and Async Ingestion

This is the event-driven ingestion path.

Endpoints

Code path

What happens

  1. A client posts ingestion payload to /api/pipeline/ingest.
  2. The route adds an event to the Redis-backed outbox hash.
  3. The background outbox worker running from backend/app/main.py polls the outbox.
  4. When an event is due, it publishes the event to:
  5. The consumer thread reads events from the stream.
  6. The gold.ingest handler sends the records into Neo4j.

Reliability model

Workflow 3: Campaign Strategy Evaluation

This is the main analytical workflow.

Primary endpoint

Code path

backend/app/api/routes/strategy.py

Step-by-step flow

  1. The route loads doctor records from local gold JSON.
  2. If the local dataset is missing or too small, it creates a synthetic seed population.
  3. It runs GMM clustering using backend/app/services/gmm_engine.py.
  4. It extracts a 12-feature campaign vector using backend/app/services/campaign_vectorizer.py.
  5. It projects the campaign into the same PCA subspace used during clustering.
  6. It computes Mahalanobis distances using backend/app/services/heatmap.py.
  7. It ranks clusters by fit and builds cluster cards.
  8. It checks whether the best fit distance exceeds the rejection threshold.
  9. If rejected, it attempts campaign optimization using backend/app/services/rag_optimizer.py.

Important design point

The strategy pipeline does not depend on Neo4j for scoring. It operates on local datasets and local ML logic.

Strategy endpoint family

Defined in backend/app/api/routes/strategy.py.

How campaign vectorization works

Implemented in backend/app/services/campaign_vectorizer.py.

The service works in two modes:

Heuristic mode

If no Ollama API key is configured:

LLM-assisted mode

If Ollama is configured:

Feature keys

How doctor clustering works

Implemented in backend/app/services/gmm_engine.py.

Pipeline:

  1. Extract numeric columns from doctor records.
  2. Apply robust scaling.
  3. Run PCA and keep enough components to explain about 90 percent of variance.
  4. Select the best cluster count using BIC.
  5. Fit a Gaussian Mixture Model.
  6. Return:

How heatmap scoring works

Implemented in backend/app/services/heatmap.py.

For each cluster:

  1. compute Mahalanobis distance between campaign vector and cluster centroid
  2. convert inverse distances into probabilities
  3. sort clusters by lowest distance

The best cluster is the closest cluster. A campaign is rejected when that best distance is above the configured rejection threshold.

How campaign optimization works

Implemented in backend/app/services/rag_optimizer.py.

If Ollama is configured

  1. Retrieve similar campaigns from ChromaDB campaign memory.
  2. Prefer successful examples.
  3. Build a prompt with target cluster context and retrieved examples.
  4. Ask the model to rewrite the campaign.
  5. Re-vectorize the optimized output.

If Ollama is not configured

  1. Compute feature gaps against the target profile.
  2. Apply deterministic rewrite guidance.
  3. Return an optimized text with improvement notes.

Campaign memory

The endpoint POST /api/strategy/memory/store stores campaign embeddings in ChromaDB so future optimizations can retrieve similar examples.

Workflow 4: Cluster Doctors and Segment Targeting

This is how the frontend gets doctors for a selected strategy segment.

Endpoint

Code path

backend/app/api/routes/strategy.py

What happens

  1. The route first attempts to cluster the gold JSON doctor dataset.
  2. If gold data is unavailable, it falls back to the bronze master CSV.
  3. It returns frontend-friendly doctor rows for the selected cluster.
  4. It can optionally filter by region.

Important design point

This route is separate from the Neo4j graph query path. It is driven by local clustering datasets, not the graph database.

Workflow 5: Persona Retrieval

Personas are managed by backend/app/api/routes/persona.py.

Endpoints

How from-cluster works

  1. Look up a hardcoded cluster profile.
  2. Build a trait list with slight random jitter.
  3. Pull a representative doctor code from the cluster if available.
  4. Return a synthetic persona response.

How by-name works

  1. Try to fetch the doctor from Neo4j.
  2. Derive cluster ID from the code name if needed.
  3. If the doctor is missing, synthesize a persona from the hardcoded cluster profile.
  4. If the doctor exists, add doctor metrics like h_index and works_count.

Important design point

The cluster personality system is hardcoded. It is not dynamically learned from the current GMM output.

Workflow 6: Graph Exploration

Graph querying lives in backend/app/api/routes/graph.py and backend/app/services/neo4j_graph.py.

Endpoints

What the graph stores

Common graph use cases

Workflow 7: Simulation and Roleplay

Simulation lives in backend/app/api/routes/simulation.py and backend/app/services/webrtc_simulation.py.

Endpoints

Simulation start flow

  1. The client sends persona_id and optional campaign context.
  2. The backend creates a session ID.
  3. It tries to create a D-ID stream.
  4. If D-ID is unavailable, it returns a mock stream offer.
  5. It stores simulation session state in Redis.
  6. It also creates a persistent analytics session in the session store.

WebRTC flow

Turn processing flow

The turn loop in backend/app/services/webrtc_simulation.py does this:

  1. Load the simulation session from Redis.
  2. Add the user turn to the persistent session store.
  3. Check exact semantic cache in Redis.
  4. Check approximate semantic cache in ChromaDB.
  5. If cache miss, generate a doctor response.
  6. Save the assistant turn.
  7. Estimate emotion metrics and append them to the session timeline.
  8. Optionally send audio to Hume for prosody analysis.
  9. Return the response and cache metadata.

How response generation works

If Ollama is configured:

If Ollama is not configured:

Important implementation note

Simulation turn handling does not automatically invoke MOHP. Compliance analysis is a separate endpoint.

Workflow 8: MOHP Compliance and Objection Analysis

MOHP lives in backend/app/api/routes/mohp.py.

Endpoint

What happens

  1. Accept session_id, input_text, cluster_id, and optional persona_id.
  2. If Ollama is configured, ask the model to generate structured objections.
  3. Otherwise, use rule-based keyword matching against a hardcoded guideline database.
  4. Add objections to the session store.
  5. Return the objection list.

Important design point

MOHP is not inside the simulation turn loop. Clients need to call it separately if they want live compliance analysis.

Workflow 9: Session Analytics

Analytics lives in backend/app/api/routes/analytics.py and backend/app/services/session_store.py.

Endpoints

What is stored for each session

How analytics are built

The analytics route reads the stored session, resolves duration, resolves score fields, and returns a frontend-ready session detail object.

Workflow 10: Stats and Admin Operations

Stats endpoints

Defined in backend/app/api/routes/stats.py.

These expose model info, projection bridge status, cache counts, DLQ depth, and outbox depth.

Admin endpoints

Defined in backend/app/api/routes/admin.py.

These manage the active embedding model and reindex ChromaDB collections if the vector space changes.

Embedding registry

Implemented in backend/app/core/embedder.py.

Supported embedding backends:

Third-Party Services Summary

Redis

Used for:

Neo4j

Used for:

ChromaDB

Used for:

Ollama Cloud

Used for:

All outgoing LLM content is scrubbed through backend/app/core/pii_scrubber.py.

D-ID

Used for:

Hume

Used for:

RabbitMQ

Optional transport for outbox publishing. Default behavior uses Redis Streams.

End-to-End API Call Sequence

This is the most practical sequence for using the backend from start to end.

1. Health check

2. Seed the graph

3. Evaluate a campaign

Typical output includes:

4. Get doctors for the selected cluster

5. Get a persona

6. Start simulation

7. Complete handshake

8. Run conversation turns

9. Run compliance analysis if needed

10. Review session analytics

11. Inspect runtime status if needed

Real Request Flow Summary

Standalone Mermaid source for direct Mermaid Preview:

sequenceDiagram participant Client participant API as FastAPI Backend participant Strategy as Strategy Services participant Graph as Neo4j participant Redis as Redis participant Chroma as ChromaDB participant Ollama as Ollama participant DID as D-ID participant Hume as Hume Client->>API: GET /healthz API-->>Client: status ok Client->>API: POST /api/pipeline/seed API->>Graph: ingest doctors Graph-->>API: seeded result API-->>Client: graph ready Client->>API: POST /api/strategy/full-evaluate API->>Strategy: load records and cluster Strategy->>Ollama: feature extraction or optimization if configured Strategy->>Chroma: retrieve campaign examples if needed Strategy-->>API: heatmap and cluster result API-->>Client: strategy response Client->>API: GET /api/strategy/cluster/{cluster_id}/doctors API-->>Client: doctors in chosen segment Client->>API: GET /api/persona/{code_name} API->>Graph: doctor lookup Graph-->>API: doctor data or empty API-->>Client: persona payload Client->>API: POST /api/simulation/start API->>DID: create stream if configured API->>Redis: store session API-->>Client: session and stream details Client->>API: POST /api/simulation/turn API->>Redis: read and update session API->>Chroma: semantic cache lookup API->>Ollama: generate reply if needed API->>Hume: analyze prosody if audio present API-->>Client: simulation response Client->>API: POST /api/mohp/evaluate API->>Ollama: generate objections if configured API->>Redis: store objections API-->>Client: objections Client->>API: GET /api/analytics/session/{session_id} API->>Redis: load session analytics API-->>Client: full review

Notable Implementation Details and Discrepancies

  1. Auth routes exist in code but are not mounted in the app.
  2. Strategy scoring uses local datasets and local ML, not the graph database.
  3. Persona cluster profiles are hardcoded rather than learned dynamically.
  4. Simulation turn processing does not automatically call MOHP.
  5. Redis, Neo4j, and Chroma each have graceful degradation modes.
  6. RabbitMQ is optional. Default outbox transport is Redis Streams.
  7. Projection weights are preloaded at startup and can fall back to generated defaults if the weights file is missing.

The backend is best understood as three connected subsystems rather than one monolithic pipeline:

Subsystem 1: Strategy intelligence

Subsystem 2: Knowledge graph

Subsystem 3: Simulation and analytics

Together, these subsystems support the full product flow from doctor data setup to campaign evaluation to live roleplay and post-session review.