---
title: BirdScope AI - MCP Multi-Agent System
emoji: 🦅
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 6.0.1
python_version: 3.11
app_file: app.py
pinned: false
license: mit
short_description: AI-powered bird identification with MCP multi-agent system
tags:
- building-mcp-track-enterprise
- building-mcp-track-consumer
- building-mcp-track-creative
- mcp-in-action-track-enterprise
- mcp-in-action-track-consumer
- mcp-in-action-track-creative
---
# 🦅 BirdScope AI - MCP Multi-Agent System
**AI-powered bird identification with specialized MCP agents**
Built for the [MCP 1st Birthday Hackathon](https://huggingface.co/MCP-1st-Birthday)
---
## 📢 Hackathon Submission
**Social Media:** [Twitter/X Post](https://x.com/zulucoconuts/status/1995255281064755708)
**Demo Video:** [Watch on YouTube](https://youtu.be/V_ZoOkyjEyU)
**Track Submissions:**
- 🔧 **Track 1 (Building MCP)**: Two custom MCP servers
  - **Nuthatch MCP Server** - 7 tools for bird species database (search, species info, images, audio, family search, conservation filtering)
  - **Modal Bird Classifier MCP** - 2 Modal-hosted GPU-powered image classification tools (base64 & URL inputs)
  - Categories: Enterprise (wildlife conservation) | Consumer (bird enthusiasts and education) | Creative (multimedia exploration)
- 🤖 **Track 2 (MCP in Action)**: Full multi-agent system with supervisor routing
  - LangGraph-based supervisor orchestrating 3 specialized subagents
  - Integrates both MCP servers with intelligent tool routing
  - Categories: Enterprise (conservation orgs) | Consumer (bird watchers) | Creative (educational multimedia)
**Author:** [@facemelter](https://huggingface.co/facemelter)
**Built with:** Gradio 6 | LangGraph | FastMCP | Modal (GPU) | OpenAI/Anthropic/HuggingFace LLMs
---
## Project Overview
BirdScope AI is a multi-agent system built with **Gradio 6** and **LangGraph** that identifies bird species, surfaces related images and audio recordings, and provides educational information about birds worldwide.
**Our innovation:** We built **two complete systems in one**:
- 🔧 **Two Custom MCP Servers** (Track 1): Nuthatch species database (7 tools) + Modal GPU classifier (2 tools)
- 🤖 **Multi-Agent Application** (Track 2): Supervisor-orchestrated specialist agents
This dual approach demonstrates both **building MCP infrastructure** and **leveraging MCP for autonomous agents**.
---
## ✨ Key Features
### 🤖 Multi-Agent Orchestration
- **LangGraph Supervisor Pattern** with intelligent LLM-based routing
- **3 Specialized Subagents** (Image Identifier, Species Explorer, Taxonomy Specialist)
- **Session-based Agent Caching** - Agents reused within user sessions for 10x faster responses
- **Provider-Specific Prompts** - Optimized system prompts for OpenAI, Anthropic, and HuggingFace
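
The session-based caching described above can be pictured as a dictionary keyed by session, provider, and agent mode. The following is a minimal sketch under that assumption, not the project's actual code: `AgentCache`, the `build_agent` factory, and the key layout are all hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict, Tuple

# Assumed key layout: (session_id, provider, mode).
CacheKey = Tuple[str, str, str]


class AgentCache:
    """Hypothetical per-session cache so agents are built once and then reused."""

    def __init__(self, build_agent: Callable[[str, str], Any]) -> None:
        self._build_agent = build_agent  # expensive factory (assumption)
        self._agents: Dict[CacheKey, Any] = {}

    def get(self, session_id: str, provider: str, mode: str) -> Any:
        key = (session_id, provider, mode)
        if key not in self._agents:
            # Cache miss: build the agent once for this session/provider/mode.
            self._agents[key] = self._build_agent(provider, mode)
        return self._agents[key]

    def drop_session(self, session_id: str) -> None:
        """Forget every agent that belongs to one session."""
        for key in [k for k in self._agents if k[0] == session_id]:
            del self._agents[key]
```

Keying on provider and mode as well as session means that switching the LLM provider or agent mode mid-session builds a fresh agent instead of reusing a mismatched one.
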
### 🔧 Dual MCP Server Architecture
- **Modal Bird Classifier** ([modal.com](https://modal.com))
  - [prithivMLmods/Bird-Species-Classifier-526](https://huggingface.co/prithivMLmods/Bird-Species-Classifier-526) from HuggingFace
  - 526 bird species classification on Modal T4 GPU
  - Serverless GPU deployment for on-demand classification
  - Streamable HTTP transport with base64 and URL input support
- **Nuthatch MCP Server** (Custom Built - Track 1)
  - FastMCP framework with 7 specialized tools
  - Integrates [Nuthatch API](https://nuthatch.lastelm.software) (1000+ species)
  - **Dual Transport Support**: STDIO (subprocess) for HF Spaces + HTTP for local debugging
  - Data sources: Nuthatch DB, Unsplash (images), xeno-canto (audio)
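
The dual-transport setup for the Nuthatch server could be driven by a single setting. The sketch below only builds a connection description and does not start or contact a server. The `MCP_TRANSPORT` and `MCP_SERVER_URL` variable names, the default URL, and the `nuthatch_server.py` script name are assumptions made for illustration, not names taken from the project.

```python
import os
import sys
from typing import Dict, Mapping, Optional


def nuthatch_connection(env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Describe how to reach the Nuthatch MCP server (hypothetical helper)."""
    env = os.environ if env is None else env
    transport = env.get("MCP_TRANSPORT", "stdio").lower()  # assumed variable name

    if transport == "stdio":
        # Launch the server as a subprocess and talk over stdin/stdout.
        return {
            "transport": "stdio",
            "command": sys.executable,
            "args": ["nuthatch_server.py"],  # hypothetical script name
        }
    if transport == "http":
        # Connect to a server that is already running locally.
        url = env.get("MCP_SERVER_URL", "http://localhost:8000/mcp")  # assumed default
        return {"transport": "http", "url": url}
    raise ValueError(f"Unsupported MCP transport: {transport!r}")
```

Defaulting to STDIO matches the hosted case, where the server runs as a subprocess of the app; HTTP is opt-in for local debugging.
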
### 📡 Dual Streaming Output
- **Chat Response Stream** - Real-time markdown rendering with embedded media
- **Tool Execution Log Stream** - Parallel visibility into MCP tool calls (inputs/outputs)
- **Async Progress Indicators** - Immediate user feedback before processing begins
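
One way to picture the dual stream is an async generator that yields a `(chat, tool_log)` pair on every update, so both panels refresh from the same event loop. This is a simplified sketch rather than the app's implementation: `stream_turn`, `collect`, and the pre-baked `chunks` and `tool_events` lists stand in for real LLM tokens and MCP tool calls.

```python
import asyncio
from typing import AsyncIterator, List, Tuple


async def stream_turn(
    chunks: List[str], tool_events: List[str]
) -> AsyncIterator[Tuple[str, str]]:
    """Yield (chat_markdown, tool_log) pairs; each yield refreshes both panels."""
    chat, log = "Starting...", ""
    # Immediate feedback before any real work happens.
    yield chat, log
    for event in tool_events:
        log += event + "\n"
        await asyncio.sleep(0)  # stand-in for awaiting a real tool call
        yield chat, log
    chat = ""  # replace the progress indicator once the answer begins
    for chunk in chunks:
        chat += chunk
        await asyncio.sleep(0)  # stand-in for awaiting the next LLM token
        yield chat, log


async def collect(chunks: List[str], tool_events: List[str]) -> List[Tuple[str, str]]:
    """Drain the stream into a list (handy for testing)."""
    return [update async for update in stream_turn(chunks, tool_events)]
```

In Gradio, a generator handler that yields a tuple can update one output component per tuple element, which is how a single generator like this could feed both the chat panel and the tool log.
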
### 🎨 Structured Output Parsing
- **LlamaIndex Pydantic Models** - Type-safe response formatting
- **Regex URL Extraction** - Automatic detection of image and audio URLs
- **Smart Audio Normalization** - xeno-canto links converted to browser-friendly format (`/download` → playable)
- **Markdown Media Embedding** - Images and audio automatically formatted
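
The URL extraction and media embedding steps might look roughly like the sketch below. It covers only regex extraction and extension-based formatting. The pattern, the extension lists, and the `extract_urls` and `embed_media` helpers are illustrative assumptions, not the project's parser.

```python
import re
from typing import List

# Assumed pattern: stop at whitespace, parentheses, and angle brackets.
URL_RE = re.compile(r"https?://[^\s()<>]+")
IMAGE_EXT = (".jpg", ".jpeg", ".png", ".gif", ".webp")
AUDIO_EXT = (".mp3", ".wav", ".ogg")


def extract_urls(text: str) -> List[str]:
    """Return unique URLs in order of first appearance, minus trailing punctuation."""
    seen, urls = set(), []
    for match in URL_RE.findall(text):
        url = match.rstrip(".,;:!?")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def embed_media(url: str) -> str:
    """Format one URL as markdown or HTML based on its file extension."""
    path = url.split("?", 1)[0].lower()  # ignore any query string
    if path.endswith(IMAGE_EXT):
        return f"![bird image]({url})"
    if path.endswith(AUDIO_EXT):
        return f'<audio controls src="{url}"></audio>'
    return f"[{url}]({url})"  # fall back to a plain link
```

Extension matching is a deliberately simple heuristic; links without a file extension fall through to a plain markdown link.
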
### Multi-Provider LLM Support
- **OpenAI** (GPT-4o-mini) - Recommended for reliability
- **Anthropic** (Claude Sonnet 4) - Best for complex reasoning
- **HuggingFace Inference API** - Open-source models (limited tool calling)
- **User-Provided Keys** - No backend API key required, users supply their own
### Production UI/UX
- **Gradio 6.0 SSR** - Server-side rendering for enhanced performance
- **Custom Cloud Theme** - Sky-inspired CSS with mobile-responsive design
- **Dynamic Examples** - Example queries adapt to selected agent mode
- **Instant Feedback** - "⏳ Starting..." indicator appears immediately on submit
---
## Data Sources & MCP Servers
We built **two custom MCP servers** that integrate with bird data APIs and GPU-powered classification:
**Data Sources:**
- **Nuthatch API** ([nuthatch.lastelm.software](https://nuthatch.lastelm.software)) - 1000+ bird species database by Last Elm Software
- **Unsplash** - High-quality reference images for visual identification
- **xeno-canto.org** - Community-contributed bird audio recordings worldwide
- **HuggingFace Model** - [prithivMLmods/Bird-Species-Classifier-526](https://huggingface.co/prithivMLmods/Bird-Species-Classifier-526) for GPU classification
**MCP Servers:**
1. **Nuthatch MCP Server** (Track 1 - Building MCP)
   - 7 specialized tools: search, species info, images, audio, family search, conservation filtering
   - STDIO transport for HF Spaces, HTTP option for local debugging
   - FastMCP framework with async API integration
2. **Modal Bird Classifier** (GPU-powered)
   - Image classification tools: URL and base64 input support
   - Serverless GPU deployment via Modal
   - Streamable HTTP transport
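
Because the classifier accepts both URL and base64 inputs, a client needs some way to tell them apart before choosing a tool. The helper below is a hedged sketch of one such check; `normalize_image_input` is a hypothetical name, and the data-URI handling is an assumption about what callers might send.

```python
import base64
import binascii
from typing import Tuple
from urllib.parse import urlparse


def normalize_image_input(value: str) -> Tuple[str, str]:
    """Classify input as ("url", url) or ("base64", payload); hypothetical helper."""
    value = value.strip()
    if not value:
        raise ValueError("Empty image input")

    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url", value

    # Accept an optional data-URI prefix such as "data:image/png;base64,...".
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]
    try:
        base64.b64decode(value, validate=True)  # decode only to validate
    except (binascii.Error, ValueError) as exc:
        raise ValueError("Input is neither an http(s) URL nor valid base64") from exc
    return "base64", value
```

The returned kind can then be used to route the request to the URL-based or base64-based classification tool.
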
---
## 🧩 Core Components
**Multi-Agent Orchestration:**
- **LangGraph Supervisor Pattern** - LLM-based routing between specialist agents
- **3 Specialized Subagents** - Each with focused tool subset (image ID, species exploration, taxonomy)
- **Session-based Caching** - Agent instances reused within user sessions for performance
- **Dual Streaming** - Parallel chat response + tool execution log streams
**Agent Architecture:**
- `subagent_supervisor.py` - Creates supervisor workflow with LangGraph
- `subagent_factory.py` - Builds specialists with filtered tool access
- `subagent_config.py` - Defines agent modes and tool allocations
- `prompts.py` - Provider-specific system prompts (OpenAI, Anthropic, HuggingFace)
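
Filtered tool access can be illustrated with a plain mapping from agent name to allowed tool names. The sketch below is not the contents of `subagent_config.py` or `subagent_factory.py`: the agent keys, the tool names, and the `tools_for` helper are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical tool names and allocations, for illustration only.
TOOL_ALLOCATION: Dict[str, List[str]] = {
    "image_identifier": ["classify_from_url", "classify_from_base64"],
    "species_explorer": ["search_birds", "get_bird_info", "get_bird_images", "get_bird_audio"],
    "taxonomy_specialist": ["search_by_family", "filter_by_status"],
}


def tools_for(agent: str, available: Dict[str, Callable]) -> Dict[str, Callable]:
    """Return only the tools allocated to `agent`, failing loudly on missing ones."""
    wanted = TOOL_ALLOCATION[agent]
    missing = [name for name in wanted if name not in available]
    if missing:
        raise KeyError(f"{agent} expects tools that are not loaded: {missing}")
    return {name: available[name] for name in wanted}
```

Giving each specialist a small tool subset keeps its prompt short and reduces the chance that the LLM picks an irrelevant tool.
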
**UI & UX:**
- **Gradio 6.0** with SSR for enhanced performance
- Custom cloud-themed CSS with mobile-responsive design
- Dynamic examples that adapt to agent mode selection
- Immediate processing feedback with async streaming updates
---
## Quick Start
**Try the Live Demo:** Just provide your LLM API key (OpenAI, Anthropic, or HuggingFace) in the sidebar and start exploring!
**For Developers:**
```bash
# Clone and install
git clone <repo-url>
cd hackathon_draft
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env with your API keys
# Run locally
python app.py
```
**Deploy to HuggingFace Spaces:**
```bash
python upload_to_space.py
# Configure Secrets in Space Settings (see docs/dev/main-README.md)
```
**Full Setup Guide:** See [docs/dev/main-README.md](docs/dev/main-README.md) for comprehensive deployment instructions.
---
## Credits & License
Built for the [HuggingFace MCP 1st Birthday Hackathon](https://huggingface.co/MCP-1st-Birthday)
**Data Sources:** [Nuthatch API](https://nuthatch.lastelm.software) (Last Elm Software) | [xeno-canto.org](https://xeno-canto.org) | [Unsplash](https://unsplash.com)
**Technology:** [Model Context Protocol](https://github.com/anthropics/mcp) | [LangGraph](https://github.com/langchain-ai/langgraph) | [Gradio 6](https://gradio.app) | [Modal](https://modal.com)
MIT License - Educational and research purposes