
BioFlow - AI-Powered Drug Discovery Platform

BioFlow is a unified AI platform for drug discovery, combining molecular encoding, protein analysis, and drug-target interaction prediction in a modern web interface.

🏗️ Architecture

┌──────────────────────────────────────────────────────────────┐
│                    Next.js Frontend                          │
│                   (React 19 + Tailwind)                      │
│                     localhost:3000                           │
└───────────────────────┬──────────────────────────────────────┘
                        │ HTTP/REST
                        ▼
┌──────────────────────────────────────────────────────────────┐
│                   FastAPI Backend                            │
│                    localhost:8000                            │
│  ┌──────────────┐  ┌─────────────┐  ┌──────────────────┐     │
│  │ ModelService │  │QdrantService│  │ DTI Predictor    │     │
│  │ (Encoders)   │  │ (VectorDB)  │  │ (DeepPurpose)    │     │
│  └──────────────┘  └─────────────┘  └──────────────────┘     │
└───────────────────────┬──────────────────────────────────────┘
                        │
                        ▼
┌──────────────────────────────────────────────────────────────┐
│                    OpenBioMed Core                           │
│  ┌────────────┐  ┌────────────┐  ┌─────────────────────┐     │
│  │   Models   │  │  Datasets  │  │       Tasks         │     │
│  │ BioT5,ESM  │  │ DAVIS,KIBA │  │ Property Prediction │     │
│  └────────────┘  └────────────┘  └─────────────────────┘     │
└──────────────────────────────────────────────────────────────┘

🚀 Quick Start

Prerequisites

  • Python 3.10+
  • Node.js 18+ with pnpm
  • (Optional) CUDA-compatible GPU

Installation

# Clone the repository
git clone https://github.com/hamzasammoud11-dotcom/lacoste001.git
cd lacoste001

# Install Python dependencies
pip install -r bioflow/api/requirements.txt

# Install frontend dependencies
cd lacoste001/ui
pnpm install
cd ../..

Running

Option 1: Using the launch script (Windows)

launch_bioflow_full.bat

Option 2: Manual start

# Terminal 1: Start FastAPI backend
python -m uvicorn bioflow.api.server:app --reload --port 8000

# Terminal 2: Start Next.js frontend
cd lacoste001/ui
pnpm dev

Access

  • Frontend: http://localhost:3000
  • Backend API: http://localhost:8000

📁 Project Structure

OpenBioMed/
├── bioflow/                    # BioFlow Platform
│   ├── api/                    # FastAPI Backend
│   │   ├── server.py           # Main API server
│   │   ├── model_service.py    # Unified model access
│   │   ├── qdrant_service.py   # Vector database
│   │   └── dti_predictor.py    # DTI prediction
│   ├── core/                   # Core abstractions
│   ├── plugins/                # Encoders & retrievers
│   └── workflows/              # Pipeline definitions
│
├── lacoste001/
│   └── ui/                     # Next.js Frontend
│       ├── app/
│       │   ├── api/            # API routes
│       │   └── dashboard/      # UI pages
│       ├── components/         # React components
│       └── lib/                # Services & utilities
│
├── open_biomed/                # OpenBioMed Research Engine
│   ├── models/                 # BioT5, ESM, GraphMVP
│   ├── datasets/               # Dataset loaders
│   └── tasks/                  # Task implementations
│
└── configs/                    # YAML configurations

🔌 API Endpoints

Discovery Pipeline

  • POST /api/discovery - Start discovery job
  • GET /api/discovery/{job_id} - Get job status

Predictions

  • POST /api/predict - DTI prediction
  • POST /api/encode - Encode molecule/protein/text

Data Management

  • POST /api/ingest - Add data to vector DB
  • GET /api/molecules - List molecules
  • GET /api/proteins - List proteins
  • GET /api/collections - List vector collections

Visualization

  • GET /api/explorer/embeddings - Get 2D projections
  • GET /api/similarity - Compute similarity scores
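
This README does not state which metric backs /api/similarity. Cosine similarity is a common choice for embedding vectors, so the sketch below uses it as an assumption; the function name `cosine_similarity` is illustrative and not part of the BioFlow API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Identical directions score 1, orthogonal vectors score 0, and opposite directions score -1.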

🧪 Features

Drug Discovery Pipeline

  • Natural language, SMILES, or FASTA input
  • Automatic modality detection
  • Vector similarity search
  • Property prediction (MW, LogP, TPSA)
  • Binding affinity prediction

Molecular Analysis

  • 2D/3D molecule visualization
  • SMILES validation
  • Property calculation via RDKit

Protein Analysis

  • 3D protein structure viewing
  • Sequence embedding
  • DTI prediction

Explorer

  • UMAP/t-SNE embedding visualization
  • Cluster analysis
  • Interactive filtering

🔧 Configuration

Environment Variables

# .env file
NEXT_PUBLIC_API_URL=http://localhost:8000
QDRANT_URL=http://localhost:6333  # Optional: remote Qdrant
QDRANT_PATH=./qdrant_data          # Local Qdrant storage
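
The comments above imply a choice between a remote Qdrant server and local storage. A minimal sketch of that choice follows; the function name `resolve_qdrant_settings` is hypothetical, and the rule that a non-empty QDRANT_URL takes precedence over QDRANT_PATH is an assumption, not confirmed behavior of qdrant_service.py.

```python
import os


def resolve_qdrant_settings(env=None):
    """Pick remote or local Qdrant settings from environment variables.

    Assumption: a non-empty QDRANT_URL takes precedence over QDRANT_PATH.
    """
    env = os.environ if env is None else env
    url = env.get("QDRANT_URL", "").strip()
    if url:
        return {"mode": "remote", "url": url}
    # Fall back to the local storage path shown in the .env example.
    path = env.get("QDRANT_PATH", "").strip() or "./qdrant_data"
    return {"mode": "local", "path": path}
```

With the qdrant-client package, the result would map onto `QdrantClient(url=...)` for the remote case or `QdrantClient(path=...)` for local storage.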

API Configuration

Edit lacoste001/ui/config/api.config.ts:

export const API_CONFIG = {
  baseUrl: process.env.NEXT_PUBLIC_API_URL || "http://localhost:8000",
  // ...
}

🧬 Model Support

| Model       | Type             | Use Case             |
|-------------|------------------|----------------------|
| ChemBERTa   | Molecule Encoder | SMILES embeddings    |
| ESM-2       | Protein Encoder  | Sequence embeddings  |
| PubMedBERT  | Text Encoder     | Biomedical text      |
| DeepPurpose | DTI              | Binding prediction   |
| GraphMVP    | Property         | Molecular properties |
| BioT5       | Generation       | Molecule generation  |

📊 Development

Verify Installation

python scripts/verify_phase3.py

Run Tests

pytest tests/

Type Checking (Frontend)

cd lacoste001/ui
pnpm tsc --noEmit

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request

📄 License

Apache 2.0 - See LICENSE

🙏 Acknowledgments