title: Pharmaspine Backend
emoji: ⚕️
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860

PharmaSpine AI

Welcome to the AI Knowledge Spine project. This repository contains the complete infrastructure for a medical AI assistant: a Governance Gateway, a multi-database architecture (PostgreSQL, Qdrant, Neo4j), and retrieval-augmented generation (RAG) pipelines.


🏗️ Current Architecture (As of June 2026)

[Architecture diagram: pharmaspine_AI]

🗂️ Directory Structure & Code Layout

🎨 Frontend (/frontend/)

  • src/App.tsx & src/App.css: Main entry point and global styling.
  • src/components/ChatInterface.tsx: Manages the Chat state, auto-scrolling, and the slide-out Settings Sidebar (featuring active System Modules and Database-backed Chat History).
  • src/components/MessageBubble.tsx: Renders messages with a typing effect and interactive Governance Metadata (Citations JSON, Retrieval Scores, Decision Tags).

🛡️ Backend (/services/governance-gateway/)

  • app/main.py & routes/gateway.py: Initialize the FastAPI server and expose the endpoints (/answer, /history, /metrics).
  • app/services/orchestrator.py: The "brain" of the backend that orchestrates Qdrant, the CRAG AI grader, off-label policies, and final synthesis.
  • app/services/memory_client.py: Connects to Qdrant and Neo4j for Hybrid Search (Dense + Sparse embeddings).
  • app/services/crag.py: The Corrective RAG Grader using local Ollama (phi3.5).
  • app/services/gateway_answer_store.py: Connects to Postgres to save chat logs and fetch the latest past queries (list_history) for the UI.

🧠 Medical Data Ingestion (/src/)

  • src/embedding.py: Loads fastembed (SPLADE) and MedCPT models for vectorizing text.
  • src/retrieval.py: The raw math engine behind Hybrid Search: (0.45 * lexical) + (0.20 * vector)...
  • KAGGLE_INGESTION_GUIDE.md: Master Jupyter Notebook code used on Kaggle to process millions of FDA documents via GPU.

Databases

  • PostgreSQL (Ai_knowledge_spine_DB): Stores relational metadata, application state, and strict immutable audit logs. The tables and compliance triggers are fully managed by Alembic migrations.
  • Qdrant Cloud: A dedicated high-speed Vector Database for mathematical text embeddings. Fully populated via our GPU-accelerated Kaggle ingestion pipeline.
  • Neo4j Aura: A Knowledge Graph for complex relationships between molecules, diseases, and side effects. Fully integrated into the retrieval layer and populated via the internal Python pipeline.

Governance Gateway (services/governance-gateway/)

The Gateway is a rigorous security and optimization layer that intercepts all traffic to and from the LLM.

  • Semantic Caching: Near-instant responses for exact-match queries, served from an in-memory LRU cache without touching the retrieval or generation layers.
  • Pre-RAG Intent Classifier: Bypasses the vector DB for simple conversational greetings and strictly blocks out-of-domain prompts.
  • Parallel RAG Execution: Runs Self-RAG query refinement and the baseline Vector DB lookup simultaneously to minimize latency.
  • Adversarial Scanning: Uses llm-guard to instantly block prompt injections, fake citation requests, and banned topics (e.g., off-label regimens, "cure" claims).
  • Pharmacovigilance (Adverse Event) Detection: Automatically flags mentions of injury or side effects, injecting an emergency warning for the user and recording the flag in the audit database.
  • Strict Off-Label Enforcement: Requires that any answer to a request involving "dose" or "line_of_therapy" cites an official drug Label ("LBL").
  • Output Guardrails: Post-generation toxicity scanning and automated medical disclaimers for patient-facing queries.
  • Immutable Audit Logging: Every gateway interaction is recorded permanently to PostgreSQL via an Alembic-managed table equipped with anti-mutation triggers.
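The exact-match caching step can be pictured with a minimal sketch. This is not the Gateway's actual code: the class name `ExactMatchCache`, the normalization rule (lowercase, collapsed whitespace), and the default size are all assumptions made for illustration.

```python
from collections import OrderedDict
from typing import Optional


class ExactMatchCache:
    """Hypothetical in-memory LRU cache keyed on the normalized question text."""

    def __init__(self, max_entries: int = 256) -> None:
        self._store = OrderedDict()  # normalized question -> cached answer
        self._max_entries = max_entries

    @staticmethod
    def _key(question: str) -> str:
        # Assumption: queries differing only in case or spacing count as the same.
        return " ".join(question.lower().split())

    def get(self, question: str) -> Optional[str]:
        key = self._key(question)
        if key not in self._store:
            return None
        # Mark the entry as most recently used.
        self._store.move_to_end(key)
        return self._store[key]

    def put(self, question: str, answer: str) -> None:
        key = self._key(question)
        self._store[key] = answer
        self._store.move_to_end(key)
        # Evict the least recently used entry once the cache is over capacity.
        if len(self._store) > self._max_entries:
            self._store.popitem(last=False)
```

A cache hit returns before any retrieval or LLM call is made, which is where the latency saving comes from; a miss returns `None` and the request continues through the rest of the pipeline.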

Intelligence Layer & Retrieval

  • Generation: llama-3.3-70b-versatile (via Groq Cloud) for primary synthesis.
  • Routing/Grading: phi3.5:latest (via local Ollama).
  • Dense Embedding: ncbi/MedCPT-Query-Encoder (Medical-specific embeddings via HuggingFace).
  • Sparse Search: prithivida/Splade_PP_en_v1 (via fastembed), a SPLADE learned-sparse model (rather than classic BM25), for exact lexical keyword matching.
  • Retrieval Scoring: Uses a strict deterministic Heuristic Formula instead of a neural Re-Ranker to ensure mathematical predictability: final_score = (0.45 * lexical) + (0.20 * vector) + (0.25 * evidence) + (0.10 * graph_bonus)
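As a rough illustration of the heuristic formula, the sketch below combines the four signals with the weights quoted above and sorts candidates by the result. The `Signals` dataclass, the `rank` helper, and the assumption that each signal is already normalized to [0, 1] are illustrative and not taken from src/retrieval.py.

```python
from dataclasses import dataclass

# Weights copied from the formula above; they sum to 1.0.
WEIGHTS = {"lexical": 0.45, "vector": 0.20, "evidence": 0.25, "graph_bonus": 0.10}


@dataclass
class Signals:
    """Per-document signals (assumed here to be normalized to [0, 1])."""
    lexical: float
    vector: float
    evidence: float
    graph_bonus: float


def final_score(s: Signals) -> float:
    """Weighted sum of the four signals; no learned re-ranker involved."""
    return (
        WEIGHTS["lexical"] * s.lexical
        + WEIGHTS["vector"] * s.vector
        + WEIGHTS["evidence"] * s.evidence
        + WEIGHTS["graph_bonus"] * s.graph_bonus
    )


def rank(candidates: dict) -> list:
    """Return (doc_id, score) pairs sorted from highest to lowest score."""
    scored = [(doc_id, final_score(sig)) for doc_id, sig in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because the score is a fixed linear combination, the same inputs always produce the same ranking, and the contribution of each signal to a given score can be read off directly.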

🚀 Getting Started (How to Run the Application)

The project features a React Vite Frontend and a FastAPI Governance Gateway Backend.

1. Prerequisites

Ensure you have the following running on your local machine:

  • PostgreSQL Server: Running locally on port 5432 with your Ai_knowledge_spine_DB.
  • Ollama: Running locally with the following models pulled:
    • ollama pull ncbi/MedCPT-Query-Encoder
    • ollama pull phi3.5:latest
    • ollama pull qwen3.5:9b
  • API Keys: Ensure your .env file is populated with your GROQ_API_KEY, QDRANT_API_KEY, and NEO4J_PASSWORD.
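Before starting the servers, it can help to confirm that the three keys are actually visible to the process. The sketch below is a hypothetical helper, not part of the repository. It inspects environment variables only, so it assumes the .env values have already been loaded or exported.

```python
import os
from typing import Mapping

# Key names taken from the prerequisites list above.
REQUIRED_KEYS = ("GROQ_API_KEY", "QDRANT_API_KEY", "NEO4J_PASSWORD")


def missing_keys(env: Mapping[str, str] = os.environ) -> list:
    """Return the required keys that are absent or blank in the given mapping."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

Calling `missing_keys()` with no arguments checks the current process environment; an empty list means all three keys are set.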

2. Start the Governance Gateway

Open your terminal, navigate to the Gateway service directory, and start the FastAPI server:

cd services/governance-gateway
uvicorn app.main:app --reload --port 8000

3. Start the React Frontend UI

Open a new terminal window, navigate to the frontend directory, and start the Vite development server:

cd frontend
npm run dev

4. Interact via the Application

Once both servers are running, open your web browser and navigate to: 👉 http://localhost:5173

You can now ask complex medical questions through the chat interface. The UI connects to the Governance Gateway backend, which executes Hybrid Search, Adverse Event detection, and Policy Guardrails for each query. The backend API docs are available at http://127.0.0.1:8000/docs.

Example Request Payload:

{
  "question": "What is the recommended dosage of Pemetrexed?",
  "user_role": "Doctor",
  "audience": "Professional",
  "therapy_area": "Oncology",
  "geography": "US",
  "policy_profile": "strict_medical"
}

The Gateway will run the 1-loop Self-RAG, query Neo4j and Qdrant (using the Hybrid Heuristic Formula), scan for Adverse Events, and return a strictly governed and cited answer!
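The same payload can also be sent without the UI. The sketch below only builds the HTTP request using the standard library. The path `/gateway/answer` is an assumption inferred from the `/answer` route and the `GET /gateway/history` endpoint mentioned in this README; check http://127.0.0.1:8000/docs for the real path and schema.

```python
import json
import urllib.request

# Base URL matches the uvicorn command above (port 8000).
BASE_URL = "http://127.0.0.1:8000"
# Assumption: the answer route lives under the /gateway prefix.
ANSWER_PATH = "/gateway/answer"


def build_answer_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the answer endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url + ANSWER_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "question": "What is the recommended dosage of Pemetrexed?",
    "user_role": "Doctor",
    "audience": "Professional",
    "therapy_area": "Oncology",
    "geography": "US",
    "policy_profile": "strict_medical",
}
request = build_answer_request(payload)
```

With the Gateway running, the request can be sent with `urllib.request.urlopen(request)` and the response body parsed with `json.load`.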


🚨 Pending Next Steps

  1. Production Deployment (Dockerization): Create Dockerfiles and docker-compose.yml to containerize the FastAPI backend, React frontend, and infrastructure for 1-click cloud deployments.
  2. Automated Data Ingestion Pipeline: Transition the manual KAGGLE_INGESTION_GUIDE.md notebook into an automated pipeline (e.g., Apache Airflow or GitHub Actions) to continuously ingest new FDA labels into Qdrant and Neo4j.
  3. Frontend Authentication & Profiles: Add user login screens to allow switching personas (e.g., Doctor vs Patient), so the Gateway automatically adapts policy rules and answer formatting based on the authenticated profile.
  4. Analytics & Auditing Dashboard: The foundational GET /gateway/history API is now complete. The next step is to build a dedicated React Dashboard tab to visualize the Postgres gateway_answers and audit_logs tables (e.g., tracking total queries, blocked off-label requests, and AI confidence scores over time).