--- language: - en tags: - paligemma2 - lora - multi-lora - serving - openai-api - visual-inspection - peft - google license: apache-2.0 --- # Anchor — PaliGemma2 Multi-LoRA Server **Load multiple LoRA adapters once. Switch between them at inference time — 216ms, no reload.** → **GitHub:** [recursia-lab/anchor](https://github.com/recursia-lab/anchor) ## What is this? Anchor is a lightweight serving server for PaliGemma2 with multiple LoRA adapters. Unlike frameworks that load adapters per-request from disk, Anchor keeps all adapters in GPU memory simultaneously — switching is a pointer swap. ``` Request: model="open_circuit" → set_adapter() → generate() → 216ms Request: model="missing_hole" → set_adapter() → generate() → 216ms Request: model="base" → disable_adapters() → generate() ``` ## Quick Start ```bash git clone https://github.com/recursia-lab/anchor docker build -t anchor . docker run --gpus all -v /model:/model -v /lora:/lora -p 8080:8080 anchor ``` ## API (OpenAI-compatible) ```bash curl http://localhost:8080/v1/chat/completions \ -d '{"model": "your_adapter", "messages": [...]}' ``` ## Framework Support | Framework | PaliGemma2 LoRA | |---|---| | **Anchor** | ✅ pre-loaded, 216ms switch | | vLLM | ✅ per-request load | | SGLang | 🚧 [PR #24034](https://github.com/sgl-project/sglang/pull/24034) | ## Community Adapters See [recursia-lab/paligemma2-adapters](https://github.com/recursia-lab/paligemma2-adapters) for a curated index of community fine-tuned PaliGemma2 LoRA adapters. --- Built by [Recursia Lab](https://github.com/recursia-lab) • Apache 2.0