---
title: Pharmaspine Backend
emoji: ⚕️
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
---

# PharmaSpine AI
Welcome to the AI Knowledge Spine project. This repository contains the infrastructure for a medical-grade AI assistant: a Governance Gateway, a multi-database architecture (PostgreSQL, Qdrant, and Neo4j), and RAG pipelines tuned for low latency.

---

## 🏗️ Current Architecture (As of June 2026)
<img width="1536" height="1024" alt="pharmaspine_AI" src="https://github.com/user-attachments/assets/8ce901ef-420c-4598-beaf-0ac11ccf3271" />

### 🗂️ Directory Structure & Code Layout

#### 🎨 Frontend (`/frontend/`)
* **`src/App.tsx` & `src/App.css`**: Main entry point and global styling.
* **`src/components/ChatInterface.tsx`**: Manages the Chat state, auto-scrolling, and the slide-out **Settings Sidebar** (featuring active System Modules and Database-backed Chat History).
* **`src/components/MessageBubble.tsx`**: Renders messages with a typing effect and interactive Governance Metadata (Citations JSON, Retrieval Scores, Decision Tags).

#### 🛡️ Backend (`/services/governance-gateway/`)
* **`app/main.py` & `routes/gateway.py`**: Initializes the FastAPI server and exposes endpoints (`/answer`, `/history`, `/metrics`).
* **`app/services/orchestrator.py`**: The "brain" of the backend that orchestrates Qdrant, the CRAG AI grader, off-label policies, and final synthesis.
* **`app/services/memory_client.py`**: Connects to Qdrant and Neo4j for Hybrid Search (Dense + Sparse embeddings).
* **`app/services/crag.py`**: The Corrective RAG Grader using local Ollama (`phi3.5`).
* **`app/services/gateway_answer_store.py`**: Connects to Postgres to save chat logs and fetch the latest past queries (`list_history`) for the UI.

#### 🧠 Medical Data Injection (`/src/`)
* **`src/embedding.py`**: Loads `fastembed` (SPLADE) and `MedCPT` models for vectorizing text.
* **`src/retrieval.py`**: The raw math engine behind Hybrid Search: `(0.45 * lexical) + (0.20 * vector)...`
* **`KAGGLE_INGESTION_GUIDE.md`**: The master notebook code, run on Kaggle GPUs, that processes millions of FDA documents for ingestion.

### Databases
* **PostgreSQL (`Ai_knowledge_spine_DB`)**: Stores relational metadata, application state, and strict immutable audit logs. The tables and compliance triggers are fully managed by Alembic migrations.
* **Qdrant Cloud**: A dedicated high-speed Vector Database for mathematical text embeddings. Fully populated via our GPU-accelerated Kaggle ingestion pipeline.
* **Neo4j Aura**: A Knowledge Graph for complex relationships between molecules, diseases, and side effects. Fully integrated into the retrieval layer and populated via the internal Python pipeline.

### Governance Gateway (`services/governance-gateway/`)
The Gateway is a rigorous security and optimization layer that intercepts all traffic to and from the LLM.

* **Semantic Caching**: Serves repeated questions from an in-memory LRU cache on an exact match, skipping retrieval and generation entirely.
* **Pre-RAG Intent Classifier**: Bypasses the vector DB for simple conversational greetings and strictly blocks out-of-domain prompts.
* **Parallel RAG Execution**: Runs Self-RAG query refinement and the baseline Vector DB lookup simultaneously to minimize latency.
* **Adversarial Scanning**: Uses `llm-guard` to instantly block prompt injections, fake citation requests, and banned topics (e.g., off-label regimens, "cure" claims).
* **Pharmacovigilance (Adverse Event) Detection**: Automatically flags mentions of injury or side effects, injecting an emergency warning for the user and recording the flag in the audit database.
* **Strict Off-Label Enforcement**: Enforces that any requests related to `"dose"` or `"line_of_therapy"` strictly cite an official drug Label (`"LBL"`).
* **Output Guardrails**: Post-generation toxicity scanning and automated medical disclaimers for patient-facing queries.
* **Immutable Audit Logging**: Every gateway interaction is recorded permanently to PostgreSQL via an Alembic-managed table equipped with anti-mutation triggers.
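
Two of the rules above are simple enough to sketch in a few lines: the exact-match LRU cache and the off-label citation check. The block below is a minimal illustration, not the gateway's actual code. The class name `ExactMatchCache`, the function `passes_off_label_policy`, the key normalization, and the `max_entries` default are all assumptions made for this example; only the `"dose"`, `"line_of_therapy"`, and `"LBL"` values come from this README.

```python
from collections import OrderedDict
from typing import Optional, Sequence


class ExactMatchCache:
    """Tiny in-memory LRU cache keyed on the normalized question text (hypothetical)."""

    def __init__(self, max_entries: int = 256) -> None:
        self.max_entries = max_entries
        self._store: "OrderedDict[str, str]" = OrderedDict()

    @staticmethod
    def _key(question: str) -> str:
        # Assumption: letter case and repeated whitespace do not matter.
        return " ".join(question.lower().split())

    def get(self, question: str) -> Optional[str]:
        key = self._key(question)
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, question: str, answer: str) -> None:
        key = self._key(question)
        self._store[key] = answer
        self._store.move_to_end(key)
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict the least recently used entry


# Intents that must be backed by an official label citation.
LABEL_REQUIRED_INTENTS = {"dose", "line_of_therapy"}


def passes_off_label_policy(intent: str, citation_types: Sequence[str]) -> bool:
    """Return False when a label-restricted intent has no "LBL" citation."""
    if intent not in LABEL_REQUIRED_INTENTS:
        return True
    return "LBL" in citation_types
```

In this sketch a cache hit returns the stored answer without touching retrieval, and a `"dose"` answer that cites only non-label sources fails the policy check and would be blocked.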

### Intelligence Layer & Retrieval
* **Generation**: `llama-3.3-70b-versatile` (via Groq Cloud) for primary synthesis.
* **Routing/Grading**: `phi3.5:latest` (via local Ollama).
* **Dense Embedding**: `ncbi/MedCPT-Query-Encoder` (Medical-specific embeddings via HuggingFace).
* **Sparse Search (SPLADE)**: `prithivida/Splade_PP_en_v1` (via fastembed), a learned sparse encoder used for lexical keyword matching.
* **Retrieval Scoring**: Uses a strict deterministic Heuristic Formula instead of a neural Re-Ranker to ensure mathematical predictability:
  `final_score = (0.45 * lexical) + (0.20 * vector) + (0.25 * evidence) + (0.10 * graph_bonus)`
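
The formula above translates directly into a weighted sum. The sketch below is illustrative only: the function names, the dictionary-based input, and the choice to treat a missing signal as `0.0` are assumptions, and the candidate values are made up. Only the four weights come from this README.

```python
from typing import Dict, List

# Weights copied from the formula above; they sum to 1.0.
WEIGHTS: Dict[str, float] = {
    "lexical": 0.45,
    "vector": 0.20,
    "evidence": 0.25,
    "graph_bonus": 0.10,
}


def final_score(signals: Dict[str, float]) -> float:
    """Weighted sum of the four signals (assumption: a missing signal counts as 0.0)."""
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())


def rank(candidates: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Return candidates ordered from highest to lowest final score."""
    return sorted(candidates, key=final_score, reverse=True)


# Hypothetical candidates with made-up signal values in the range [0, 1].
label_chunk = {"lexical": 0.9, "vector": 0.4, "evidence": 1.0, "graph_bonus": 0.0}
paper_chunk = {"lexical": 0.5, "vector": 0.9, "evidence": 0.5, "graph_bonus": 1.0}
ranked = rank([paper_chunk, label_chunk])
```

With these values the label chunk scores about 0.735 and the paper chunk about 0.63, so the label chunk ranks first even though its vector similarity is lower. Because the score is a fixed linear combination, the same inputs always produce the same ordering.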

---

## 🚀 Getting Started (How to Run the Application)

The project features a **React Vite Frontend** and a **FastAPI Governance Gateway Backend**.

### 1. Prerequisites
Ensure you have the following running on your local machine:
* **PostgreSQL Server**: Running locally on port `5432`, with the `Ai_knowledge_spine_DB` database created.
* **Ollama**: Running locally with the following models pulled:
  * `ollama pull ncbi/MedCPT-Query-Encoder`
  * `ollama pull phi3.5:latest`
  * `ollama pull qwen3.5:9b`
* **API Keys**: Ensure your `.env` file is populated with your `GROQ_API_KEY`, `QDRANT_API_KEY`, and `NEO4J_PASSWORD`.
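
A quick pre-flight check can catch a missing key before the server starts. The helper below is a hypothetical sketch, not part of the repository. It assumes the values from `.env` have already been loaded into the process environment (it does not parse the `.env` file itself), and the function name `missing_env_vars` is made up for this example.

```python
import os
from typing import List, Mapping

# The three settings named in the prerequisites above.
REQUIRED_ENV_VARS = ("GROQ_API_KEY", "QDRANT_API_KEY", "NEO4J_PASSWORD")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required settings that are unset or blank in `env`."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name, "").strip()]


# Example usage with an explicit mapping instead of the real environment:
example_env = {"GROQ_API_KEY": "placeholder", "QDRANT_API_KEY": ""}
example_missing = missing_env_vars(example_env)
```

Here `example_missing` lists `QDRANT_API_KEY` (blank) and `NEO4J_PASSWORD` (absent). Calling `missing_env_vars()` with no argument checks the real environment instead.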

### 2. Start the Governance Gateway
Open your terminal, navigate to the Gateway service directory, and start the FastAPI server:
```bash
cd services/governance-gateway
uvicorn app.main:app --reload --port 8000
```

### 3. Start the React Frontend UI
Open a new terminal window, navigate to the frontend directory, and start the Vite development server:
```bash
cd frontend
npm run dev
```

### 4. Interact via the Application
Once both servers are running, open your web browser and navigate to:
**👉 http://localhost:5173**

You can now ask medical questions through the auto-scrolling chat interface. The UI connects to the Governance Gateway backend, which runs Hybrid Search, Adverse Event detection, and Policy Guardrails. The backend API docs are available at `http://127.0.0.1:8000/docs`.

**Example Request Payload:**
```json
{
  "question": "What is the recommended dosage of Pemetrexed?",
  "user_role": "Doctor",
  "audience": "Professional",
  "therapy_area": "Oncology",
  "geography": "US",
  "policy_profile": "strict_medical"
}
```
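
The payload above can also be sent from a script instead of the UI. The sketch below uses only the Python standard library. The URL path `/gateway/answer` and the use of `POST` are assumptions inferred from the route names mentioned in this README; check `http://127.0.0.1:8000/docs` for the actual route before relying on it. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: the answer route lives under a /gateway prefix on the local server.
GATEWAY_URL = "http://127.0.0.1:8000/gateway/answer"


def build_answer_request(payload: dict, url: str = GATEWAY_URL) -> urllib.request.Request:
    """Wrap the payload in a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_gateway(payload: dict, url: str = GATEWAY_URL, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response (requires a running gateway)."""
    with urllib.request.urlopen(build_answer_request(payload, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


payload = {
    "question": "What is the recommended dosage of Pemetrexed?",
    "user_role": "Doctor",
    "audience": "Professional",
    "therapy_area": "Oncology",
    "geography": "US",
    "policy_profile": "strict_medical",
}
request = build_answer_request(payload)
```

Calling `ask_gateway(payload)` with the gateway running returns the decoded response body; its exact fields are not documented here, so inspect the result or the API docs rather than assuming a schema.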

The Gateway runs one Self-RAG loop, queries Neo4j and Qdrant (ranked with the Hybrid Heuristic Formula), scans for Adverse Events, and returns a governed, cited answer.
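
To illustrate the Parallel RAG Execution step described in the Governance Gateway section, the sketch below runs query refinement and the baseline lookup concurrently with `asyncio.gather`. Both coroutines are hypothetical stand-ins that only sleep and return placeholder data; the real orchestrator's function names and return types are not shown in this README.

```python
import asyncio
from typing import List, Tuple


async def refine_query(question: str) -> str:
    """Hypothetical stand-in for the Self-RAG query refinement call."""
    await asyncio.sleep(0.05)  # simulates model latency
    return question.strip().rstrip("?") + " (refined)"


async def baseline_lookup(question: str) -> List[str]:
    """Hypothetical stand-in for the baseline vector DB lookup."""
    await asyncio.sleep(0.05)  # simulates network latency
    return [f"chunk for: {question}"]


async def prepare_context(question: str) -> Tuple[str, List[str]]:
    """Run refinement and the baseline lookup concurrently, then return both results."""
    refined, chunks = await asyncio.gather(
        refine_query(question),
        baseline_lookup(question),
    )
    return refined, chunks


refined_query, baseline_chunks = asyncio.run(prepare_context("What is X?"))
```

Because the two awaits overlap, the total wait is roughly the slower of the two calls rather than their sum, which is the latency benefit the parallel design is after.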

---

## 🚨 Pending Next Steps

1. **Production Deployment (Dockerization)**: Create `Dockerfile`s and `docker-compose.yml` to containerize the FastAPI backend, React frontend, and infrastructure for 1-click cloud deployments.
2. **Automated Data Ingestion Pipeline**: Transition the manual `KAGGLE_INGESTION_GUIDE.md` notebook into an automated pipeline (e.g., Apache Airflow or GitHub Actions) to continuously ingest new FDA labels into Qdrant and Neo4j.
3. **Frontend Authentication & Profiles**: Add user login screens to allow switching personas (e.g., Doctor vs Patient), so the Gateway automatically adapts policy rules and answer formatting based on the authenticated profile.
4. **Analytics & Auditing Dashboard**: The foundational `GET /gateway/history` API is complete. The next step is to build a dedicated React Dashboard tab that visualizes the Postgres `gateway_answers` and `audit_logs` tables (e.g., total queries, blocked off-label requests, and AI confidence scores over time).