---
title: Pharmaspine Backend
emoji: ⚕️
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
---

# PharmaSpine AI

Welcome to the AI Knowledge Spine project. This repository contains the infrastructure for a medical-grade AI assistant: a Governance Gateway, a multi-database architecture, and optimized RAG pipelines.

---

## 🏗️ Current Architecture (As of June 2026)

pharmaspine_AI

### 🗂️ Directory Structure & Code Layout

#### 🎨 Frontend (`/frontend/`)
* **`src/App.tsx` & `src/App.css`**: Main entry point and global styling.
* **`src/components/ChatInterface.tsx`**: Manages the chat state, auto-scrolling, and the slide-out **Settings Sidebar** (featuring active System Modules and database-backed Chat History).
* **`src/components/MessageBubble.tsx`**: Renders messages with a typing effect and interactive Governance Metadata (Citations JSON, Retrieval Scores, Decision Tags).

#### 🛡️ Backend (`/services/governance-gateway/`)
* **`app/main.py` & `routes/gateway.py`**: Initialize the FastAPI server and expose the endpoints (`/answer`, `/history`, `/metrics`).
* **`app/services/orchestrator.py`**: The "brain" of the backend. It orchestrates Qdrant retrieval, the CRAG AI grader, off-label policies, and final synthesis.
* **`app/services/memory_client.py`**: Connects to Qdrant and Neo4j for Hybrid Search (Dense + Sparse embeddings).
* **`app/services/crag.py`**: The Corrective RAG (CRAG) grader, which uses local Ollama (`phi3.5`).
* **`app/services/gateway_answer_store.py`**: Connects to Postgres to save chat logs and fetch the most recent queries (`list_history`) for the UI.

#### 🧠 Medical Data Ingestion (`/src/`)
* **`src/embedding.py`**: Loads the `fastembed` (SPLADE) and `MedCPT` models for vectorizing text.
* **`src/retrieval.py`**: The raw math engine behind Hybrid Search: `(0.45 * lexical) + (0.20 * vector)...`
* **`KAGGLE_INGESTION_GUIDE.md`**: Master Jupyter Notebook code used on Kaggle to process millions of FDA documents via GPU.
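
The hybrid score attributed to `src/retrieval.py` can be sketched as a plain weighted sum. This is a minimal illustration, not the repository's actual code: the weights are taken from the full formula given in the Intelligence Layer & Retrieval section, while `CandidateSignals`, `final_score`, and `rank` are hypothetical names, and each signal is assumed to be pre-normalized to the range 0 to 1.

```python
from dataclasses import dataclass

# Weights copied from the README's retrieval formula; they sum to 1.0.
WEIGHTS = {"lexical": 0.45, "vector": 0.20, "evidence": 0.25, "graph_bonus": 0.10}


@dataclass(frozen=True)
class CandidateSignals:
    """Hypothetical per-document signals, each assumed normalized to [0, 1]."""
    lexical: float
    vector: float
    evidence: float
    graph_bonus: float


def final_score(signals: CandidateSignals) -> float:
    """Deterministic weighted sum of the four retrieval signals."""
    return sum(weight * getattr(signals, name) for name, weight in WEIGHTS.items())


def rank(candidates: dict[str, CandidateSignals]) -> list[tuple[str, float]]:
    """Return (doc_id, score) pairs sorted from highest to lowest score."""
    scored = [(doc_id, final_score(sig)) for doc_id, sig in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    demo = {
        "label_doc": CandidateSignals(lexical=0.9, vector=0.5, evidence=1.0, graph_bonus=0.0),
        "other_doc": CandidateSignals(lexical=0.4, vector=0.9, evidence=0.2, graph_bonus=1.0),
    }
    for doc_id, score in rank(demo):
        print(f"{doc_id}: {score:.3f}")
```

Because the score is a fixed linear combination, the same inputs always produce the same ranking, which is the predictability the heuristic is chosen for.
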
### Databases
* **PostgreSQL (`Ai_knowledge_spine_DB`)**: Stores relational metadata, application state, and immutable audit logs. The tables and compliance triggers are managed by Alembic migrations.
* **Qdrant Cloud**: A dedicated vector database that stores the text embeddings. It is populated via our GPU-accelerated Kaggle ingestion pipeline.
* **Neo4j Aura**: A knowledge graph of the relationships between molecules, diseases, and side effects. It is integrated into the retrieval layer and populated via the internal Python pipeline.

### Governance Gateway (`services/governance-gateway/`)
The Gateway is a security and optimization layer that intercepts all traffic to and from the LLM.

* **Semantic Caching**: Serves repeated questions from an in-memory LRU cache on exact matches, skipping retrieval and generation entirely.
* **Pre-RAG Intent Classifier**: Bypasses the vector DB for simple conversational greetings and blocks out-of-domain prompts.
* **Parallel RAG Execution**: Runs Self-RAG query refinement and the baseline vector DB lookup concurrently to minimize latency.
* **Adversarial Scanning**: Uses `llm-guard` to block prompt injections, fake citation requests, and banned topics (e.g., off-label regimens, "cure" claims).
* **Pharmacovigilance (Adverse Event) Detection**: Automatically flags mentions of injury or side effects, injects an emergency warning for the user, and records the flag in the audit database.
* **Strict Off-Label Enforcement**: Requires that any request related to `"dose"` or `"line_of_therapy"` cites an official drug Label (`"LBL"`).
* **Output Guardrails**: Post-generation toxicity scanning and automated medical disclaimers for patient-facing queries.
* **Immutable Audit Logging**: Every gateway interaction is recorded permanently to PostgreSQL via an Alembic-managed table equipped with anti-mutation triggers.
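
The exact-match LRU cache described under Semantic Caching can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not the Gateway's implementation: `ExactMatchCache`, its methods, and the whitespace and case normalization of the key are all hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class ExactMatchCache:
    """Small in-memory LRU cache keyed on a normalized question string."""

    def __init__(self, max_entries: int = 256) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._entries: "OrderedDict[str, str]" = OrderedDict()

    @staticmethod
    def _key(question: str) -> str:
        # Assumption: letter case and extra whitespace do not change the question.
        return " ".join(question.lower().split())

    def get(self, question: str) -> Optional[str]:
        """Return the cached answer, or None on a cache miss."""
        key = self._key(question)
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, question: str, answer: str) -> None:
        """Store an answer and evict the least recently used entry if full."""
        key = self._key(question)
        self._entries[key] = answer
        self._entries.move_to_end(key)
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # drop the oldest entry
```

In a governed setting the key would likely also need to include request fields such as `user_role`, `audience`, and `policy_profile`, so that an answer prepared for one profile is never served to another.
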
### Intelligence Layer & Retrieval
* **Generation**: `llama-3.3-70b-versatile` (via Groq Cloud) for primary synthesis.
* **Routing/Grading**: `phi3.5:latest` (via local Ollama).
* **Dense Embedding**: `ncbi/MedCPT-Query-Encoder` (medical-specific embeddings via HuggingFace).
* **Sparse Search (SPLADE)**: `prithivida/Splade_PP_en_v1` (via fastembed), a learned sparse model used for lexical keyword matching.
* **Retrieval Scoring**: Uses a deterministic heuristic formula instead of a neural re-ranker, so that scores are reproducible and easy to audit:
  `final_score = (0.45 * lexical) + (0.20 * vector) + (0.25 * evidence) + (0.10 * graph_bonus)`

---

## 🚀 Getting Started (How to Run the Application)

The project features a **React Vite Frontend** and a **FastAPI Governance Gateway Backend**.

### 1. Prerequisites
Ensure you have the following running on your local machine:
* **PostgreSQL Server**: Running locally on port `5432` with your `Ai_knowledge_spine_DB`.
* **Ollama**: Running locally with the following models pulled:
  * `ollama pull phi3.5:latest`
  * `ollama pull qwen3.5:9b`
* **Embedding model**: `ncbi/MedCPT-Query-Encoder` is a HuggingFace model (see Dense Embedding above) that is loaded by `src/embedding.py` rather than pulled through Ollama.
* **API Keys**: Ensure your `.env` file is populated with your `GROQ_API_KEY`, `QDRANT_API_KEY`, and `NEO4J_PASSWORD`.

### 2. Start the Governance Gateway
Open your terminal, navigate to the Gateway service directory, and start the FastAPI server:
```bash
cd services/governance-gateway
uvicorn app.main:app --reload --port 8000
```

### 3. Start the React Frontend UI
Open a new terminal window, navigate to the frontend directory, and start the Vite development server:
```bash
cd frontend
npm run dev
```

### 4. Interact via the Application
Once both servers are running, open your web browser and navigate to:
**👉 http://localhost:5173**

You can now ask complex medical questions directly in the auto-scrolling chat interface. The UI connects to the Governance Gateway backend, which executes Hybrid Search, Adverse Event detection, and Policy Guardrails.
You can also view the backend API docs at `http://127.0.0.1:8000/docs`.

**Example Request Payload:**
```json
{
  "question": "What is the recommended dosage of Pemetrexed?",
  "user_role": "Doctor",
  "audience": "Professional",
  "therapy_area": "Oncology",
  "geography": "US",
  "policy_profile": "strict_medical"
}
```

The Gateway will run the one-loop Self-RAG, query Neo4j and Qdrant (using the Hybrid Heuristic Formula), scan for Adverse Events, and return a governed, cited answer.

---

## 🚨 Pending Next Steps

1. **Production Deployment (Dockerization)**: Create `Dockerfile`s and a `docker-compose.yml` to containerize the FastAPI backend, React frontend, and infrastructure for 1-click cloud deployments.
2. **Automated Data Ingestion Pipeline**: Turn the manual `KAGGLE_INGESTION_GUIDE.md` notebook into an automated pipeline (e.g., Apache Airflow or GitHub Actions) that continuously ingests new FDA labels into Qdrant and Neo4j.
3. **Frontend Authentication & Profiles**: Add user login screens to allow switching personas (e.g., Doctor vs Patient), so the Gateway automatically adapts policy rules and answer formatting based on the authenticated profile.
4. **Analytics & Auditing Dashboard**: The foundational `GET /gateway/history` API is now complete. The next step is to build a dedicated React Dashboard tab that visualizes the Postgres `gateway_answers` and `audit_logs` tables (e.g., tracking total queries, blocked off-label requests, and AI confidence scores over time).