| --- |
| title: BirdScope AI - MCP Multi-Agent System |
| emoji: 🦅 |
| colorFrom: green |
| colorTo: blue |
| sdk: gradio |
| sdk_version: 6.0.1 |
| python_version: 3.11 |
| app_file: app.py |
| pinned: false |
| license: mit |
| short_description: AI-powered bird identification with MCP multi-agent system |
| tags: |
| - building-mcp-track-enterprise |
| - building-mcp-track-consumer |
| - building-mcp-track-creative |
| - mcp-in-action-track-enterprise |
| - mcp-in-action-track-consumer |
| - mcp-in-action-track-creative |
| --- |
| |
| # 🦅 BirdScope AI - MCP Multi-Agent System |
|
|
| **AI-powered bird identification with specialized MCP agents** |
|
|
| Built for the [MCP 1st Birthday Hackathon](https://huggingface.co/MCP-1st-Birthday) |
|
|
| --- |
|
|
| ## 📢 Hackathon Submission |
|
|
| **Social Media:** [Twitter/X Post](https://x.com/zulucoconuts/status/1995255281064755708) |
|
|
| **Demo Video:** [Watch on YouTube](https://youtu.be/V_ZoOkyjEyU) |
|
|
| **Track Submissions:** |
| - 🔧 **Track 1 (Building MCP)**: Two custom MCP servers |
| - **Nuthatch MCP Server** - 7 tools for the bird species database, including search, species info, images, audio, family search, and conservation filtering |
| - **Modal Bird Classifier MCP** - 2 Modal-hosted GPU-powered image classification tools (base64 & URL inputs) |
| - Categories: Enterprise (wildlife conservation) | Consumer (bird enthusiasts and education) | Creative (multimedia exploration) |
| - 🤖 **Track 2 (MCP in Action)**: Full multi-agent system with supervisor routing |
| - LangGraph-based supervisor orchestrating 3 specialized subagents |
| - Integrates both MCP servers with intelligent tool routing |
| - Categories: Enterprise (conservation orgs) | Consumer (bird watchers) | Creative (educational multimedia) |
|
|
| **Author:** [@facemelter](https://huggingface.co/facemelter) |
|
|
| **Built with:** Gradio 6 | LangGraph | FastMCP | Modal (GPU) | OpenAI/Anthropic/HuggingFace LLMs |
|
|
| --- |
|
|
| ## Project Overview |
|
|
| BirdScope AI showcases an advanced multi-agent system powered by **Gradio 6** and **LangGraph**, designed to identify bird species, explore multimedia content, and provide educational information about birds worldwide. |
|
|
| **Our innovation:** We built **two complete systems in one**: |
| - 🔧 **Two Custom MCP Servers** (Track 1): Nuthatch species database (7 tools) + Modal GPU classifier (2 tools) |
| - 🤖 **Multi-Agent Application** (Track 2): Supervisor-orchestrated specialist agents |
|
|
| This dual approach demonstrates both **building MCP infrastructure** and **leveraging MCP for autonomous agents**. |
|
|
| --- |
|
|
| ## ✨ Key Features |
|
|
| ### 🤖 Multi-Agent Orchestration |
| - **LangGraph Supervisor Pattern** with intelligent LLM-based routing |
| - **3 Specialized Subagents** (Image Identifier, Species Explorer, Taxonomy Specialist) |
| - **Session-based Agent Caching** - Agents are reused within a user session, for up to 10x faster responses |
| - **Provider-Specific Prompts** - Optimized system prompts for OpenAI, Anthropic, and HuggingFace |
|
|
| ### 🔧 Dual MCP Server Architecture |
| - **Modal Bird Classifier** ([modal.com](https://modal.com)) |
| - [prithivMLmods/Bird-Species-Classifier-526](https://huggingface.co/prithivMLmods/Bird-Species-Classifier-526) from HuggingFace |
| - 526 bird species classification on Modal T4 GPU |
| - Serverless GPU deployment for on-demand classification |
| - Streamable HTTP transport with base64 and URL input support |
| - **Nuthatch MCP Server** (Custom Built - Track 1) |
| - FastMCP framework with 7 specialized tools |
| - Integrates [Nuthatch API](https://nuthatch.lastelm.software) (1000+ species) |
| - **Dual Transport Support**: STDIO (subprocess) for HF Spaces + HTTP for local debugging |
| - Data sources: Nuthatch DB, Unsplash (images), xeno-canto (audio) |
|
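One way to picture the dual transport setup is a helper that returns either a subprocess (STDIO) configuration or an HTTP configuration depending on an environment flag. This is only a sketch: the `NUTHATCH_USE_HTTP` variable, the script name, the local URL, and the dictionary shape are all assumptions made for illustration, not the project's actual settings.

```python
import sys
from typing import Dict, Mapping


def nuthatch_server_config(
    use_http: bool,
    server_script: str = "nuthatch_server.py",    # hypothetical script name
    http_url: str = "http://localhost:8000/mcp",  # hypothetical debug URL
) -> Dict[str, object]:
    """Describe how a client should reach the Nuthatch MCP server."""
    if use_http:
        # Local debugging: talk to an already-running server over HTTP.
        return {"transport": "http", "url": http_url}
    # Hosted deployment: launch the server as a subprocess and use STDIO.
    return {"transport": "stdio", "command": sys.executable, "args": [server_script]}


def config_from_env(env: Mapping[str, str]) -> Dict[str, object]:
    """Pick the transport from a (hypothetical) NUTHATCH_USE_HTTP flag."""
    use_http = env.get("NUTHATCH_USE_HTTP", "").strip().lower() in {"1", "true", "yes"}
    return nuthatch_server_config(use_http)


stdio_config = config_from_env({})
http_config = config_from_env({"NUTHATCH_USE_HTTP": "true"})
```

Defaulting to STDIO keeps the hosted path free of extra ports, while the flag makes it easy to attach a debugger to a separately started server.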
|
| ### 📡 Dual Streaming Output |
| - **Chat Response Stream** - Real-time markdown rendering with embedded media |
| - **Tool Execution Log Stream** - Parallel visibility into MCP tool calls (inputs/outputs) |
| - **Async Progress Indicators** - Immediate user feedback before processing begins |
|
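The dual streaming pattern can be sketched as an async generator that turns a single event stream into paired snapshots, one for the chat pane and one for the tool log. The event format, the `search_birds` tool name, and the `fake_agent_events` source are hypothetical stand-ins; a real agent stream would look different.

```python
import asyncio
from typing import AsyncIterator, List, Tuple

STARTING = "⏳ Starting..."


async def fake_agent_events() -> AsyncIterator[Tuple[str, str]]:
    """Stand-in for an agent stream of ('tool', text) or ('token', text) events."""
    events = [
        ("tool", "search_birds(name='cardinal')"),
        ("token", "The Northern "),
        ("tool", "search_birds -> 1 result"),
        ("token", "Cardinal is a songbird."),
    ]
    for event in events:
        await asyncio.sleep(0)  # yield control, as real network I/O would
        yield event


async def dual_stream(
    events: AsyncIterator[Tuple[str, str]],
) -> AsyncIterator[Tuple[str, str]]:
    """Yield (chat_markdown, tool_log) snapshots, one per incoming event."""
    chat = ""
    log_lines: List[str] = []
    yield STARTING, ""  # immediate feedback before any agent output arrives
    async for kind, text in events:
        if kind == "token":
            chat += text
        else:
            log_lines.append(text)
        # Keep the progress indicator visible until the first token shows up.
        yield (chat or STARTING), "\n".join(log_lines)


async def collect() -> List[Tuple[str, str]]:
    return [snapshot async for snapshot in dual_stream(fake_agent_events())]


snapshots = asyncio.run(collect())
```

A UI handler could forward the first element of each pair to the chat component and the second to the log component, so both update from the same loop.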
|
| ### 🎨 Structured Output Parsing |
| - **LlamaIndex Pydantic Models** - Type-safe response formatting |
| - **Regex URL Extraction** - Automatic detection of image and audio URLs |
| - **Smart Audio Normalization** - xeno-canto links converted to browser-friendly format (`/download` → playable) |
| - **Markdown Media Embedding** - Images and audio automatically formatted |
|
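As a rough illustration of the URL extraction and media embedding steps, the sketch below finds URLs with a regular expression, classifies them, and appends embeds to the response text. The pattern, the extension lists, the embed markup, and the rule that a xeno-canto `/download` link counts as audio are assumptions for this example rather than the project's actual parsing logic.

```python
import re
from typing import List

URL_RE = re.compile(r'https?://[^\s<>()\[\]"]+')
IMAGE_EXT = (".jpg", ".jpeg", ".png", ".gif", ".webp")
AUDIO_EXT = (".mp3", ".wav", ".ogg")


def classify_url(url: str) -> str:
    """Return 'image', 'audio', or 'link' for a URL."""
    path = url.split("?", 1)[0].lower()  # ignore any query string
    if path.endswith(IMAGE_EXT):
        return "image"
    # Assumption: a xeno-canto link ending in /download points at a recording.
    if path.endswith(AUDIO_EXT) or ("xeno-canto.org" in path and path.endswith("/download")):
        return "audio"
    return "link"


def embed_media(text: str) -> str:
    """Append an embed for every image or audio URL found in the text."""
    # Strip sentence punctuation that the regex may have swallowed.
    urls = [u.rstrip(".,;:!?") for u in URL_RE.findall(text)]
    embeds: List[str] = []
    for url in dict.fromkeys(urls):  # de-duplicate while keeping order
        kind = classify_url(url)
        if kind == "image":
            embeds.append(f"![bird image]({url})")
        elif kind == "audio":
            embeds.append(f'<audio controls src="{url}"></audio>')
    return text if not embeds else text + "\n\n" + "\n".join(embeds)


reply = (
    "Photo: https://images.example.com/cardinal.jpg and "
    "call: https://xeno-canto.org/123456/download."
)
rendered = embed_media(reply)
```

Appending embeds instead of rewriting the original sentence keeps the model's text intact, which makes the transformation easy to inspect.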
|
| ### Multi-Provider LLM Support |
| - **OpenAI** (GPT-4o-mini) - Recommended for reliability |
| - **Anthropic** (Claude Sonnet 4) - Best for complex reasoning |
| - **HuggingFace Inference API** - Open-source models (limited tool calling) |
| - **User-Provided Keys** - No backend API key required, users supply their own |
|
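A minimal sketch of how user-supplied keys and provider-specific prompts might be combined is shown here. The prompt texts, the `LLMSettings` dataclass, and `build_llm_settings` are hypothetical; the project's real prompts live in `prompts.py` and are not reproduced.

```python
from dataclasses import dataclass

# Illustrative prompts only, not the project's actual system prompts.
PROVIDER_PROMPTS = {
    "openai": "You are a bird expert. Call tools whenever facts are needed.",
    "anthropic": "You are a bird expert. Reason carefully, then call tools.",
    "huggingface": "You are a bird expert. Use at most one tool per turn.",
}


@dataclass(frozen=True)
class LLMSettings:
    provider: str
    api_key: str
    system_prompt: str


def build_llm_settings(provider: str, user_api_key: str) -> LLMSettings:
    """Validate the user's choice and attach the matching system prompt."""
    provider = provider.strip().lower()
    if provider not in PROVIDER_PROMPTS:
        raise ValueError(f"Unsupported provider: {provider!r}")
    key = user_api_key.strip()
    if not key:
        # No backend key exists to fall back on, so fail early and clearly.
        raise ValueError("An API key is required; none is stored on the backend.")
    return LLMSettings(provider, key, PROVIDER_PROMPTS[provider])


settings = build_llm_settings(" OpenAI ", "sk-test-key")
```

Failing before any agent is built means a missing key surfaces as a clear message in the UI instead of an authentication error deep inside a tool call.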
|
| ### Production UI/UX |
| - **Gradio 6.0 SSR** - Server-side rendering for enhanced performance |
| - **Custom Cloud Theme** - Sky-inspired CSS with mobile-responsive design |
| - **Dynamic Examples** - Example queries adapt to selected agent mode |
| - **Instant Feedback** - "⏳ Starting..." indicator appears immediately on submit |
|
|
| --- |
|
|
| ## Data Sources & MCP Servers |
|
|
| We built **two custom MCP servers** that integrate with bird data APIs and GPU-powered classification: |
|
|
| **Data Sources:** |
| - **Nuthatch API** ([nuthatch.lastelm.software](https://nuthatch.lastelm.software)) - 1000+ bird species database by Last Elm Software |
| - **Unsplash** - High-quality reference images for visual identification |
| - **xeno-canto.org** - Community-contributed bird audio recordings worldwide |
| - **HuggingFace Model** - [prithivMLmods/Bird-Species-Classifier-526](https://huggingface.co/prithivMLmods/Bird-Species-Classifier-526) for GPU classification |
|
|
| **MCP Servers:** |
| 1. **Nuthatch MCP Server** (Track 1 - Building MCP) |
| - 7 specialized tools, including search, species info, images, audio, family search, and conservation filtering |
| - STDIO transport for HF Spaces, HTTP option for local debugging |
| - FastMCP framework with async API integration |
|
|
| 2. **Modal Bird Classifier** (GPU-powered) |
| - Image classification tools: URL and base64 input support |
| - Serverless GPU deployment via Modal |
| - Streamable HTTP transport |
|
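The two classifier input modes (base64 and URL) could be prepared on the client side roughly as follows. The tool names `classify_from_base64` and `classify_from_url` and the argument keys are hypothetical placeholders; the source does not state the actual names exposed by the Modal server.

```python
import base64
from typing import Dict


def payload_from_bytes(image_bytes: bytes) -> Dict[str, str]:
    """Build arguments for a (hypothetical) base64 classification tool."""
    if not image_bytes:
        raise ValueError("image_bytes must not be empty")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"tool": "classify_from_base64", "image_data": encoded}


def payload_from_url(image_url: str) -> Dict[str, str]:
    """Build arguments for a (hypothetical) URL classification tool."""
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url must be an http(s) URL")
    return {"tool": "classify_from_url", "image_url": image_url}


upload_payload = payload_from_bytes(b"\x89PNG fake image bytes")
link_payload = payload_from_url("https://images.example.com/cardinal.jpg")
```

The base64 path suits images uploaded through the UI, while the URL path avoids sending image bytes through the agent at all.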
|
| --- |
|
|
| ## 🧩 Core Components |
|
|
| **Multi-Agent Orchestration:** |
| - **LangGraph Supervisor Pattern** - LLM-based routing between specialist agents |
| - **3 Specialized Subagents** - Each with focused tool subset (image ID, species exploration, taxonomy) |
| - **Session-based Caching** - Agent instances reused within user sessions for performance |
| - **Dual Streaming** - Parallel chat response + tool execution log streams |
|
|
| **Agent Architecture:** |
| - `subagent_supervisor.py` - Creates supervisor workflow with LangGraph |
| - `subagent_factory.py` - Builds specialists with filtered tool access |
| - `subagent_config.py` - Defines agent modes and tool allocations |
| - `prompts.py` - Provider-specific system prompts (OpenAI, Anthropic, HuggingFace) |
|
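Filtered tool access, as attributed to `subagent_factory.py` and `subagent_config.py`, might be pictured as an allow-list per specialist. The tool names and allocations in this sketch are invented for illustration and are not the project's real configuration.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Tool(NamedTuple):
    name: str
    server: str  # which MCP server exposes the tool


# Hypothetical tool names; the real ones come from the two MCP servers.
ALL_TOOLS: List[Tool] = [
    Tool("classify_from_url", "modal"),
    Tool("classify_from_base64", "modal"),
    Tool("search_birds", "nuthatch"),
    Tool("get_bird_info", "nuthatch"),
    Tool("get_bird_images", "nuthatch"),
    Tool("get_bird_audio", "nuthatch"),
    Tool("search_by_family", "nuthatch"),
    Tool("filter_by_status", "nuthatch"),
]

# Hypothetical allocations, standing in for what subagent_config.py defines.
TOOL_ALLOCATIONS: Dict[str, FrozenSet[str]] = {
    "image_identifier": frozenset(
        {"classify_from_url", "classify_from_base64", "get_bird_info"}
    ),
    "species_explorer": frozenset(
        {"search_birds", "get_bird_info", "get_bird_images", "get_bird_audio"}
    ),
    "taxonomy_specialist": frozenset(
        {"search_by_family", "filter_by_status", "get_bird_info"}
    ),
}


def tools_for(agent: str, all_tools: List[Tool]) -> List[Tool]:
    """Return only the tools a given specialist is allowed to call."""
    allowed = TOOL_ALLOCATIONS[agent]
    return [tool for tool in all_tools if tool.name in allowed]


for agent in TOOL_ALLOCATIONS:
    print(agent, [tool.name for tool in tools_for(agent, ALL_TOOLS)])
```

Giving each specialist a narrow tool subset keeps its prompt short and reduces the chance that the model picks an irrelevant tool.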
|
| **UI & UX:** |
| - **Gradio 6.0** with SSR for enhanced performance |
| - Custom cloud-themed CSS with mobile-responsive design |
| - Dynamic examples that adapt to agent mode selection |
| - Immediate processing feedback with async streaming updates |
|
|
| --- |
|
|
| ## Quick Start |
|
|
| **Try the Live Demo:** Just provide your LLM API key (OpenAI, Anthropic, or HuggingFace) in the sidebar and start exploring! |
|
|
| **For Developers:** |
| ```bash |
| # Clone and install |
| git clone <repo-url> |
| cd hackathon_draft |
| pip install -r requirements.txt |
| |
| # Configure environment |
| cp .env.example .env |
| # Edit .env with your API keys |
| |
| # Run locally |
| python app.py |
| ``` |
|
|
| **Deploy to HuggingFace Spaces:** |
| ```bash |
| python upload_to_space.py |
| # Configure Secrets in Space Settings (see docs/dev/main-README.md) |
| ``` |
|
|
| **Full Setup Guide:** See [docs/dev/main-README.md](docs/dev/main-README.md) for comprehensive deployment instructions. |
|
|
| --- |
|
|
| ## Credits & License |
|
|
| Built for the [HuggingFace MCP 1st Birthday Hackathon](https://huggingface.co/MCP-1st-Birthday) |
|
|
| **Data Sources:** [Nuthatch API](https://nuthatch.lastelm.software) (Last Elm Software) | [xeno-canto.org](https://xeno-canto.org) | [Unsplash](https://unsplash.com) |
|
|
| **Technology:** [Model Context Protocol](https://github.com/anthropics/mcp) | [LangGraph](https://github.com/langchain-ai/langgraph) | [Gradio 6](https://gradio.app) | [Modal](https://modal.com) |
|
|
| MIT License - Educational and research purposes |
|
|