Commit bb04104 · Parent(s): 905c229
Add seamless LangChain integration
- Add HeadroomChatModel wrapper with auto provider detection (OpenAI, Anthropic, Google)
- Add HeadroomChatMessageHistory for automatic conversation compression
- Add HeadroomDocumentCompressor for retriever integration
- Add wrap_tools_with_headroom() for agent tool output compression
- Add async support (ainvoke, astream)
- Add LangSmith integration for observability
- Restructure integrations package into nested langchain/ and mcp/ subpackages
- Fix Pydantic v2 deprecation warning
- Add comprehensive docs/langchain.md guide with real-world examples
- Update README with LangChain quickstart and framework integrations
Bump version to 0.2.3
- README.md +73 -18
- docs/README.md +7 -0
- docs/langchain.md +622 -0
- headroom/cache/compression_store.py +2 -1
- headroom/cache/dynamic_detector.py +12 -3
- headroom/ccr/mcp_server.py +4 -3
- headroom/integrations/__init__.py +84 -5
- headroom/integrations/langchain/__init__.py +106 -0
- headroom/integrations/langchain/agents.py +326 -0
- headroom/integrations/{langchain.py → langchain/chat_model.py} +117 -27
- headroom/integrations/langchain/langsmith.py +324 -0
- headroom/integrations/langchain/memory.py +319 -0
- headroom/integrations/langchain/providers.py +200 -0
- headroom/integrations/langchain/retriever.py +371 -0
- headroom/integrations/langchain/streaming.py +341 -0
- headroom/integrations/mcp/__init__.py +37 -0
- headroom/integrations/{mcp.py → mcp/server.py} +0 -0
- headroom/transforms/llmlingua_compressor.py +2 -1
- pyproject.toml +1 -1
- tests/test_integrations/langchain/__init__.py +0 -0
- tests/test_integrations/{test_langchain.py → langchain/test_chat_model.py} +3 -3
- tests/test_integrations/{test_langchain_evals.py → langchain/test_evals.py} +0 -0
- tests/test_integrations/langchain/test_extended.py +646 -0
- tests/test_integrations/mcp/__init__.py +0 -0
- tests/test_integrations/{test_mcp.py → mcp/test_server.py} +0 -0
- uv.lock +208 -3
README.md
CHANGED
|
@@ -27,45 +27,89 @@
|
|
| 27 |
|
| 28 |
## What It Does
|
| 29 |
|
| 30 |
-
Headroom is a **smart compression
|
| 31 |
|
| 32 |
- **Compresses tool outputs** — 1000 search results → 15 items (keeps errors, anomalies, relevant items)
|
| 33 |
- **Enables provider caching** — Stabilizes prefixes so cache hits actually happen
|
| 34 |
- **Manages context windows** — Prevents token limit failures without breaking tool calls
|
| 35 |
- **Reversible compression** — LLM can retrieve original data if needed ([CCR architecture](docs/ccr.md))
|
| 36 |
|
| 37 |
-
|
| 38 |
|
| 39 |
---
|
| 40 |
|
| 41 |
## 30-Second Quickstart
|
| 42 |
|
|
|
|
|
|
|
| 43 |
```bash
|
| 44 |
-
# Install
|
| 45 |
pip install "headroom-ai[proxy]"
|
| 46 |
-
|
| 47 |
-
# Start proxy
|
| 48 |
headroom proxy --port 8787
|
| 49 |
-
|
| 50 |
-
# Verify
|
| 51 |
-
curl http://localhost:8787/health
|
| 52 |
```
|
| 53 |
|
| 54 |
-
|
| 55 |
|
| 56 |
```bash
|
| 57 |
# Claude Code
|
| 58 |
ANTHROPIC_BASE_URL=http://localhost:8787 claude
|
| 59 |
|
| 60 |
-
#
|
| 61 |
OPENAI_BASE_URL=http://localhost:8787/v1 cursor
|
|
|
|
|
|
|
|
|
|
| 62 |
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
python your_script.py
|
| 66 |
```
|
| 67 |
|
| 68 |
-
|
| 69 |
|
| 70 |
---
|
| 71 |
|
|
@@ -82,13 +126,21 @@ curl http://localhost:8787/stats
|
|
| 82 |
}
|
| 83 |
```
|
| 84 |
|
| 85 |
---
|
| 86 |
|
| 87 |
## Installation
|
| 88 |
|
| 89 |
```bash
|
| 90 |
-
pip install "headroom-ai[proxy]" # Proxy server (recommended)
|
| 91 |
pip install headroom-ai # SDK only
|
|
|
|
|
|
|
| 92 |
pip install "headroom-ai[code]" # AST-based code compression
|
| 93 |
pip install "headroom-ai[llmlingua]" # ML-based compression
|
| 94 |
pip install "headroom-ai[all]" # Everything
|
|
@@ -106,10 +158,10 @@ pip install "headroom-ai[all]" # Everything
|
|
| 106 |
| **CacheAligner** | Stabilizes prefixes for provider caching | [Transforms](docs/transforms.md) |
|
| 107 |
| **RollingWindow** | Manages context limits without breaking tools | [Transforms](docs/transforms.md) |
|
| 108 |
| **CCR** | Reversible compression with automatic retrieval | [CCR Guide](docs/ccr.md) |
|
|
|
|
| 109 |
| **Text Utilities** | Opt-in compression for search/logs | [Text Compression](docs/text-compression.md) |
|
| 110 |
| **LLMLingua-2** | ML-based 20x compression (opt-in) | [LLMLingua](docs/llmlingua.md) |
|
| 111 |
| **Code-Aware** | AST-based code compression (tree-sitter) | [Transforms](docs/transforms.md) |
|
| 112 |
-
| **ContentRouter** | Auto-routes content to optimal compressor | [Transforms](docs/transforms.md) |
|
| 113 |
|
| 114 |
---
|
| 115 |
|
|
@@ -123,7 +175,7 @@ pip install "headroom-ai[all]" # Everything
|
|
| 123 |
| Cohere | Official API | - |
|
| 124 |
| Mistral | Official tokenizer | - |
|
| 125 |
|
| 126 |
-
**New models auto-supported** — Unknown models get sensible defaults based on naming patterns
|
| 127 |
|
| 128 |
---
|
| 129 |
|
|
@@ -134,6 +186,7 @@ pip install "headroom-ai[all]" # Everything
|
|
| 134 |
| Search results (1000 items) | 45,000 tokens | 4,500 tokens | 90% |
|
| 135 |
| Log analysis (500 entries) | 22,000 tokens | 3,300 tokens | 85% |
|
| 136 |
| Long conversation (50 turns) | 80,000 tokens | 32,000 tokens | 60% |
|
|
|
|
| 137 |
|
| 138 |
Overhead: ~1-5ms per request.
|
| 139 |
|
|
@@ -152,13 +205,13 @@ Overhead: ~1-5ms per request.
|
|
| 152 |
|
| 153 |
| Guide | Description |
|
| 154 |
|-------|-------------|
|
|
|
|
| 155 |
| [SDK Guide](docs/sdk.md) | Wrap your client for fine-grained control |
|
| 156 |
| [Proxy Guide](docs/proxy.md) | Production deployment |
|
| 157 |
| [Configuration](docs/configuration.md) | All configuration options |
|
| 158 |
| [CCR Guide](docs/ccr.md) | Reversible compression architecture |
|
| 159 |
| [Metrics](docs/metrics.md) | Monitoring and observability |
|
| 160 |
| [Troubleshooting](docs/troubleshooting.md) | Common issues |
|
| 161 |
-
| [Architecture](docs/ARCHITECTURE.md) | How it works internally |
|
| 162 |
|
| 163 |
---
|
| 164 |
|
|
@@ -168,6 +221,8 @@ See [`examples/`](examples/) for runnable code:
|
|
| 168 |
|
| 169 |
- `basic_usage.py` — Simple SDK usage
|
| 170 |
- `proxy_integration.py` — Using with different clients
|
|
|
|
|
|
|
| 171 |
- `ccr_demo.py` — CCR architecture demonstration
|
| 172 |
|
| 173 |
---
|
|
|
|
| 27 |
|
| 28 |
## What It Does
|
| 29 |
|
| 30 |
+
Headroom is a **smart compression layer** for LLM applications:
|
| 31 |
|
| 32 |
- **Compresses tool outputs** — 1000 search results → 15 items (keeps errors, anomalies, relevant items)
|
| 33 |
- **Enables provider caching** — Stabilizes prefixes so cache hits actually happen
|
| 34 |
- **Manages context windows** — Prevents token limit failures without breaking tool calls
|
| 35 |
- **Reversible compression** — LLM can retrieve original data if needed ([CCR architecture](docs/ccr.md))
|
| 36 |
|
| 37 |
+
Works as a **proxy** (zero code changes) or **SDK** (fine-grained control).
|
| 38 |
|
| 39 |
---
|
| 40 |
|
| 41 |
## 30-Second Quickstart
|
| 42 |
|
| 43 |
+
### Option 1: Proxy (Zero Code Changes)
|
| 44 |
+
|
| 45 |
```bash
|
|
|
|
| 46 |
pip install "headroom-ai[proxy]"
|
|
|
|
|
|
|
| 47 |
headroom proxy --port 8787
|
|
|
|
|
|
|
|
|
|
| 48 |
```
|
| 49 |
|
| 50 |
+
Point your tools at the proxy:
|
| 51 |
|
| 52 |
```bash
|
| 53 |
# Claude Code
|
| 54 |
ANTHROPIC_BASE_URL=http://localhost:8787 claude
|
| 55 |
|
| 56 |
+
# Any OpenAI-compatible client
|
| 57 |
OPENAI_BASE_URL=http://localhost:8787/v1 cursor
|
| 58 |
+
```
|
| 59 |
+
|
| 60 |
+
### Option 2: LangChain Integration
|
| 61 |
|
| 62 |
+
```bash
|
| 63 |
+
pip install "headroom-ai[langchain]"
|
|
|
|
| 64 |
```
|
| 65 |
|
| 66 |
+
```python
|
| 67 |
+
from langchain_openai import ChatOpenAI
|
| 68 |
+
from headroom.integrations import HeadroomChatModel
|
| 69 |
+
|
| 70 |
+
# Wrap your model - that's it!
|
| 71 |
+
llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))
|
| 72 |
+
|
| 73 |
+
# Use exactly like before
|
| 74 |
+
response = llm.invoke("Hello!")
|
| 75 |
+
```
|
| 76 |
+
|
| 77 |
+
See the full [LangChain Integration Guide](docs/langchain.md) for memory, retrievers, agents, and more.
|
| 78 |
+
|
| 79 |
+
---
|
| 80 |
+
|
| 81 |
+
## Framework Integrations
|
| 82 |
+
|
| 83 |
+
| Framework | Integration | Docs |
|
| 84 |
+
|-----------|-------------|------|
|
| 85 |
+
| **LangChain** | `HeadroomChatModel`, memory, retrievers, agents | [Guide](docs/langchain.md) |
|
| 86 |
+
| **MCP** | Tool output compression for Claude | [Guide](docs/ccr.md) |
|
| 87 |
+
| **Any OpenAI Client** | Proxy server | [Guide](docs/proxy.md) |
|
| 88 |
+
|
| 89 |
+
### LangChain Highlights
|
| 90 |
+
|
| 91 |
+
```python
|
| 92 |
+
from headroom.integrations import (
|
| 93 |
+
    HeadroomChatModel,  # Wrap any chat model
|
| 94 |
+
    HeadroomChatMessageHistory,  # Auto-compress conversation history
|
| 95 |
+
    HeadroomDocumentCompressor,  # Filter retrieved documents
|
| 96 |
+
    wrap_tools_with_headroom,  # Compress agent tool outputs
|
| 97 |
+
)
|
| 98 |
+
|
| 99 |
+
# Memory that auto-compresses when over 4K tokens
|
| 100 |
+
memory = ConversationBufferMemory(
|
| 101 |
+
    chat_memory=HeadroomChatMessageHistory(base_history)
|
| 102 |
+
)
|
| 103 |
+
|
| 104 |
+
# Retriever that keeps only relevant docs
|
| 105 |
+
retriever = ContextualCompressionRetriever(
|
| 106 |
+
    base_compressor=HeadroomDocumentCompressor(max_documents=10),
|
| 107 |
+
    base_retriever=vectorstore.as_retriever(search_kwargs={"k": 50}),
|
| 108 |
+
)
|
| 109 |
+
|
| 110 |
+
# Agent tools with automatic output compression
|
| 111 |
+
tools = wrap_tools_with_headroom([search_tool, database_tool])
|
| 112 |
+
```
|
| 113 |
|
| 114 |
---
|
| 115 |
|
|
|
|
| 126 |
}
|
| 127 |
```
|
| 128 |
|
| 129 |
+
Or in Python:
|
| 130 |
+
|
| 131 |
+
```python
|
| 132 |
+
print(llm.get_metrics())
|
| 133 |
+
# {'tokens_saved': 12500, 'savings_percent': 45.2}
|
| 134 |
+
```
|
| 135 |
+
|
| 136 |
---
|
| 137 |
|
| 138 |
## Installation
|
| 139 |
|
| 140 |
```bash
|
|
|
|
| 141 |
pip install headroom-ai # SDK only
|
| 142 |
+
pip install "headroom-ai[proxy]" # Proxy server
|
| 143 |
+
pip install "headroom-ai[langchain]" # LangChain integration
|
| 144 |
pip install "headroom-ai[code]" # AST-based code compression
|
| 145 |
pip install "headroom-ai[llmlingua]" # ML-based compression
|
| 146 |
pip install "headroom-ai[all]" # Everything
|
|
|
|
| 158 |
| **CacheAligner** | Stabilizes prefixes for provider caching | [Transforms](docs/transforms.md) |
|
| 159 |
| **RollingWindow** | Manages context limits without breaking tools | [Transforms](docs/transforms.md) |
|
| 160 |
| **CCR** | Reversible compression with automatic retrieval | [CCR Guide](docs/ccr.md) |
|
| 161 |
+
| **LangChain** | Memory, retrievers, agents, streaming | [LangChain](docs/langchain.md) |
|
| 162 |
| **Text Utilities** | Opt-in compression for search/logs | [Text Compression](docs/text-compression.md) |
|
| 163 |
| **LLMLingua-2** | ML-based 20x compression (opt-in) | [LLMLingua](docs/llmlingua.md) |
|
| 164 |
| **Code-Aware** | AST-based code compression (tree-sitter) | [Transforms](docs/transforms.md) |
|
|
|
|
| 165 |
|
| 166 |
---
|
| 167 |
|
|
|
|
| 175 |
| Cohere | Official API | - |
|
| 176 |
| Mistral | Official tokenizer | - |
|
| 177 |
|
| 178 |
+
**New models auto-supported** — Unknown models get sensible defaults based on naming patterns.
|
| 179 |
|
| 180 |
---
|
| 181 |
|
|
|
|
| 186 |
| Search results (1000 items) | 45,000 tokens | 4,500 tokens | 90% |
|
| 187 |
| Log analysis (500 entries) | 22,000 tokens | 3,300 tokens | 85% |
|
| 188 |
| Long conversation (50 turns) | 80,000 tokens | 32,000 tokens | 60% |
|
| 189 |
+
| Agent with tools (10 calls) | 100,000 tokens | 15,000 tokens | 85% |
|
| 190 |
|
| 191 |
Overhead: ~1-5ms per request.
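
The savings column in the table above can be re-derived from the two token counts. A minimal sketch, assuming the two numeric columns are tokens before and after compression (the table header is not visible in this hunk, and `savings_percent` is a hypothetical helper, not part of Headroom):

```python
def savings_percent(before: int, after: int) -> float:
    """Share of tokens removed, as a percentage of the original count."""
    return 100 * (before - after) / before


# Rows copied from the table; column meaning is an assumption.
rows = {
    "Search results (1000 items)": (45_000, 4_500),
    "Log analysis (500 entries)": (22_000, 3_300),
    "Long conversation (50 turns)": (80_000, 32_000),
    "Agent with tools (10 calls)": (100_000, 15_000),
}

for name, (before, after) in rows.items():
    print(f"{name}: {savings_percent(before, after):.0f}%")
```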
|
| 192 |
|
|
|
|
| 205 |
|
| 206 |
| Guide | Description |
|
| 207 |
|-------|-------------|
|
| 208 |
+
| [LangChain Integration](docs/langchain.md) | Full LangChain support |
|
| 209 |
| [SDK Guide](docs/sdk.md) | Wrap your client for fine-grained control |
|
| 210 |
| [Proxy Guide](docs/proxy.md) | Production deployment |
|
| 211 |
| [Configuration](docs/configuration.md) | All configuration options |
|
| 212 |
| [CCR Guide](docs/ccr.md) | Reversible compression architecture |
|
| 213 |
| [Metrics](docs/metrics.md) | Monitoring and observability |
|
| 214 |
| [Troubleshooting](docs/troubleshooting.md) | Common issues |
|
|
|
|
| 215 |
|
| 216 |
---
|
| 217 |
|
|
|
|
| 221 |
|
| 222 |
- `basic_usage.py` — Simple SDK usage
|
| 223 |
- `proxy_integration.py` — Using with different clients
|
| 224 |
+
- `langchain_agent.py` — LangChain ReAct agent with Headroom
|
| 225 |
+
- `rag_pipeline.py` — RAG with document compression
|
| 226 |
- `ccr_demo.py` — CCR architecture demonstration
|
| 227 |
|
| 228 |
---
|
docs/README.md
CHANGED
|
@@ -10,6 +10,13 @@ Welcome to the Headroom documentation.
|
|
| 10 |
| [SDK Guide](sdk.md) | Python SDK usage |
|
| 11 |
| [Proxy Guide](proxy.md) | Proxy server deployment |
|
| 12 |
|
| 13 |
## Core Concepts
|
| 14 |
|
| 15 |
| Topic | Description |
|
|
|
|
| 10 |
| [SDK Guide](sdk.md) | Python SDK usage |
|
| 11 |
| [Proxy Guide](proxy.md) | Proxy server deployment |
|
| 12 |
|
| 13 |
+
## Framework Integrations
|
| 14 |
+
|
| 15 |
+
| Framework | Description |
|
| 16 |
+
|-----------|-------------|
|
| 17 |
+
| [LangChain](langchain.md) | Chat models, memory, retrievers, agents, streaming |
|
| 18 |
+
| MCP | See [CCR Guide](ccr.md) for tool compression |
|
| 19 |
+
|
| 20 |
## Core Concepts
|
| 21 |
|
| 22 |
| Topic | Description |
|
docs/langchain.md
ADDED
|
@@ -0,0 +1,622 @@
|
|
| 1 |
+
# LangChain Integration
|
| 2 |
+
|
| 3 |
+
Headroom integrates with LangChain to optimize context automatically across chat models, memory, retrievers, agents, and observability.
|
| 4 |
+
|
| 5 |
+
## Installation
|
| 6 |
+
|
| 7 |
+
```bash
|
| 8 |
+
pip install "headroom-ai[langchain]"
|
| 9 |
+
```
|
| 10 |
+
|
| 11 |
+
This installs Headroom with LangChain dependencies (`langchain-core`).
|
| 12 |
+
|
| 13 |
+
## Quick Start
|
| 14 |
+
|
| 15 |
+
### Wrap Any Chat Model (1 Line)
|
| 16 |
+
|
| 17 |
+
```python
|
| 18 |
+
from langchain_openai import ChatOpenAI
|
| 19 |
+
from headroom.integrations import HeadroomChatModel
|
| 20 |
+
|
| 21 |
+
# Wrap your model - that's it!
|
| 22 |
+
llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))
|
| 23 |
+
|
| 24 |
+
# Use exactly like before
|
| 25 |
+
response = llm.invoke("Hello!")
|
| 26 |
+
```
|
| 27 |
+
|
| 28 |
+
Headroom automatically:
|
| 29 |
+
- Detects the provider (OpenAI, Anthropic, Google)
|
| 30 |
+
- Compresses tool outputs in conversation history
|
| 31 |
+
- Optimizes for provider caching
|
| 32 |
+
- Tracks token savings
|
| 33 |
+
|
| 34 |
+
### Check Your Savings
|
| 35 |
+
|
| 36 |
+
```python
|
| 37 |
+
# After some usage
|
| 38 |
+
print(llm.get_metrics())
|
| 39 |
+
# {'tokens_saved': 12500, 'savings_percent': 45.2, 'requests': 50}
|
| 40 |
+
```
|
| 41 |
+
|
| 42 |
+
---
|
| 43 |
+
|
| 44 |
+
## Integration Patterns
|
| 45 |
+
|
| 46 |
+
### 1. Chat Model Wrapper
|
| 47 |
+
|
| 48 |
+
The `HeadroomChatModel` wraps any LangChain `BaseChatModel`:
|
| 49 |
+
|
| 50 |
+
```python
|
| 51 |
+
from langchain_openai import ChatOpenAI
|
| 52 |
+
from langchain_anthropic import ChatAnthropic
|
| 53 |
+
from headroom.integrations import HeadroomChatModel
|
| 54 |
+
|
| 55 |
+
# OpenAI
|
| 56 |
+
llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))
|
| 57 |
+
|
| 58 |
+
# Anthropic (auto-detected)
|
| 59 |
+
llm = HeadroomChatModel(ChatAnthropic(model="claude-3-5-sonnet-20241022"))
|
| 60 |
+
|
| 61 |
+
# Custom configuration
|
| 62 |
+
from headroom import HeadroomConfig, HeadroomMode
|
| 63 |
+
|
| 64 |
+
config = HeadroomConfig(
|
| 65 |
+
    default_mode=HeadroomMode.OPTIMIZE,
|
| 66 |
+
    smart_crusher_target_ratio=0.3,  # Target 70% compression
|
| 67 |
+
)
|
| 68 |
+
llm = HeadroomChatModel(
|
| 69 |
+
    ChatOpenAI(model="gpt-4o"),
|
| 70 |
+
    headroom_config=config,
|
| 71 |
+
)
|
| 72 |
+
```
|
| 73 |
+
|
| 74 |
+
#### Async Support
|
| 75 |
+
|
| 76 |
+
Full async support for `ainvoke` and `astream`:
|
| 77 |
+
|
| 78 |
+
```python
|
| 79 |
+
# Async invoke
|
| 80 |
+
response = await llm.ainvoke("Hello!")
|
| 81 |
+
|
| 82 |
+
# Async streaming
|
| 83 |
+
async for chunk in llm.astream("Tell me a story"):
|
| 84 |
+
    print(chunk.content, end="", flush=True)
|
| 85 |
+
```
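
The snippet above uses top-level `await`, which only works inside a coroutine or an async-aware REPL or notebook. A minimal sketch of driving the same two calls from a plain script follows. `FakeChatModel` and `Chunk` are hypothetical stand-ins (not part of Headroom or LangChain) so the example runs without API keys; in real code the wrapped model would take their place.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a streamed message chunk."""
    content: str


class FakeChatModel:
    """Hypothetical stub exposing ainvoke/astream like a chat model."""

    async def ainvoke(self, prompt: str) -> Chunk:
        await asyncio.sleep(0)  # yield control, as a network call would
        return Chunk(content=f"echo: {prompt}")

    async def astream(self, prompt: str):
        for word in prompt.split():
            await asyncio.sleep(0)
            yield Chunk(content=word + " ")


async def main() -> tuple[str, str]:
    llm = FakeChatModel()  # swap in the wrapped model in real code
    response = await llm.ainvoke("Hello!")
    streamed = ""
    async for chunk in llm.astream("Tell me a story"):
        streamed += chunk.content
    return response.content, streamed


# asyncio.run creates the event loop that top-level scripts lack.
reply, story = asyncio.run(main())
print(reply)
print(story)
```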
|
| 86 |
+
|
| 87 |
+
#### Tool Calling
|
| 88 |
+
|
| 89 |
+
Works seamlessly with LangChain tool calling:
|
| 90 |
+
|
| 91 |
+
```python
|
| 92 |
+
from langchain_core.tools import tool
|
| 93 |
+
|
| 94 |
+
@tool
|
| 95 |
+
def search(query: str) -> dict:
|
| 96 |
+
    """Search the web."""
|
| 97 |
+
    return {"results": [...]}  # Large JSON response
|
| 98 |
+
|
| 99 |
+
llm_with_tools = llm.bind_tools([search])
|
| 100 |
+
response = llm_with_tools.invoke("Search for Python tutorials")
|
| 101 |
+
# Tool outputs are automatically compressed in subsequent turns
|
| 102 |
+
```
|
| 103 |
+
|
| 104 |
+
---
|
| 105 |
+
|
| 106 |
+
### 2. Memory Integration
|
| 107 |
+
|
| 108 |
+
`HeadroomChatMessageHistory` wraps any chat history with automatic compression:
|
| 109 |
+
|
| 110 |
+
```python
|
| 111 |
+
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
|
| 112 |
+
from langchain_community.chat_message_histories import ChatMessageHistory
|
| 113 |
+
from headroom.integrations import HeadroomChatMessageHistory
|
| 114 |
+
|
| 115 |
+
# Wrap any history
|
| 116 |
+
base_history = ChatMessageHistory()
|
| 117 |
+
compressed_history = HeadroomChatMessageHistory(
|
| 118 |
+
    base_history,
|
| 119 |
+
    compress_threshold_tokens=4000,  # Compress when over 4K tokens
|
| 120 |
+
    keep_recent_turns=5,  # Always keep last 5 turns
|
| 121 |
+
)
|
| 122 |
+
|
| 123 |
+
# Use with any memory class
|
| 124 |
+
memory = ConversationBufferMemory(chat_memory=compressed_history)
|
| 125 |
+
|
| 126 |
+
# Zero changes to your chain!
|
| 127 |
+
chain = ConversationChain(llm=llm, memory=memory)
|
| 128 |
+
```
|
| 129 |
+
|
| 130 |
+
**Why this matters**: Long conversations can blow up to 50K+ tokens. HeadroomChatMessageHistory automatically compresses older turns while preserving recent context.
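
To make the idea concrete, here is a minimal sketch of threshold-based history compression. It is not Headroom's implementation: `approx_tokens` and `compress_history` are hypothetical helpers, token counts are estimated by whitespace splitting, and older messages are simply collapsed into a placeholder rather than summarized.

```python
def approx_tokens(text: str) -> int:
    """Very rough token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


def compress_history(messages, threshold_tokens=4000, keep_recent_turns=5):
    """Collapse older messages into one placeholder once the total exceeds the threshold.

    `messages` is a list of (role, text) pairs; one turn is a user message
    plus an assistant message.
    """
    total = sum(approx_tokens(text) for _, text in messages)
    keep = keep_recent_turns * 2  # two messages per turn
    if total <= threshold_tokens or len(messages) <= keep:
        return messages  # under budget, or nothing old enough to drop
    split = len(messages) - keep
    older, recent = messages[:split], messages[split:]
    placeholder = ("system", f"[{len(older)} earlier messages compressed]")
    return [placeholder] + recent


history = []
for i in range(20):
    history.append(("user", f"question {i} " + "word " * 50))
    history.append(("assistant", f"answer {i} " + "word " * 50))

compact = compress_history(history, threshold_tokens=1000, keep_recent_turns=5)
print(len(history), "->", len(compact))
```

The recent turns are passed through unchanged, which is the property the `keep_recent_turns` parameter above is described as guaranteeing.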
|
| 131 |
+
|
| 132 |
+
```python
|
| 133 |
+
# Check compression stats
|
| 134 |
+
print(compressed_history.get_compression_stats())
|
| 135 |
+
# {'compression_count': 12, 'total_tokens_saved': 28000}
|
| 136 |
+
```
|
| 137 |
+
|
| 138 |
+
---
|
| 139 |
+
|
| 140 |
+
### 3. Retriever Integration
|
| 141 |
+
|
| 142 |
+
`HeadroomDocumentCompressor` filters retrieved documents by relevance:
|
| 143 |
+
|
| 144 |
+
```python
|
| 145 |
+
from langchain.retrievers import ContextualCompressionRetriever
|
| 146 |
+
from langchain_community.vectorstores import FAISS
|
| 147 |
+
from headroom.integrations import HeadroomDocumentCompressor
|
| 148 |
+
|
| 149 |
+
# Create vector store retriever (retrieve many for recall)
|
| 150 |
+
vectorstore = FAISS.from_documents(documents, embeddings)
|
| 151 |
+
base_retriever = vectorstore.as_retriever(search_kwargs={"k": 50})
|
| 152 |
+
|
| 153 |
+
# Wrap with Headroom compression (keep best for precision)
|
| 154 |
+
compressor = HeadroomDocumentCompressor(
|
| 155 |
+
    max_documents=10,  # Keep top 10
|
| 156 |
+
    min_relevance=0.3,  # Minimum relevance score
|
| 157 |
+
    prefer_diverse=True,  # MMR-style diversity
|
| 158 |
+
)
|
| 159 |
+
|
| 160 |
+
retriever = ContextualCompressionRetriever(
|
| 161 |
+
    base_compressor=compressor,
|
| 162 |
+
    base_retriever=base_retriever,
|
| 163 |
+
)
|
| 164 |
+
|
| 165 |
+
# Retrieves 50 docs, returns best 10
|
| 166 |
+
docs = retriever.invoke("What is Python?")
|
| 167 |
+
```
|
| 168 |
+
|
| 169 |
+
**Why this matters**: Vector search often returns many marginally relevant documents. HeadroomDocumentCompressor uses BM25-style scoring to keep only the most relevant ones, reducing context size while improving answer quality.
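
For readers unfamiliar with BM25, the following is a minimal sketch of standard Okapi BM25 ranking in plain Python. It illustrates the scoring family named above, not Headroom's own code: `bm25_scores` and `top_documents` are hypothetical helpers, tokenization is a naive lowercase split, and `k1` and `b` use commonly cited defaults.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_documents(query: str, docs: list[str], max_documents: int = 2) -> list[str]:
    """Keep the highest-scoring documents, dropping any with a zero score."""
    ranked = sorted(zip(bm25_scores(query, docs), docs), key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:max_documents] if score > 0]


docs = [
    "python is a programming language",
    "the weather is sunny today",
    "python tutorials for beginners learning python",
]
print(top_documents("python tutorials", docs))
```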
|
| 170 |
+
|
| 171 |
+
---
|
| 172 |
+
|
| 173 |
+
### 4. Agent Tool Wrapping
|
| 174 |
+
|
| 175 |
+
`wrap_tools_with_headroom` compresses tool outputs for agents:
|
| 176 |
+
|
| 177 |
+
```python
|
| 178 |
+
import json
from langchain.agents import create_openai_tools_agent, AgentExecutor
|
| 179 |
+
from langchain_core.tools import tool
|
| 180 |
+
from headroom.integrations import wrap_tools_with_headroom
|
| 181 |
+
|
| 182 |
+
@tool
|
| 183 |
+
def search_database(query: str) -> str:
|
| 184 |
+
    """Search the database."""
|
| 185 |
+
    # Returns 1000 results as JSON
|
| 186 |
+
    return json.dumps({"results": [...], "total": 1000})
|
| 187 |
+
|
| 188 |
+
@tool
|
| 189 |
+
def fetch_logs(service: str) -> str:
|
| 190 |
+
    """Fetch service logs."""
|
| 191 |
+
    # Returns 500 log entries
|
| 192 |
+
    return json.dumps({"logs": [...]})
|
| 193 |
+
|
| 194 |
+
# Wrap tools with compression
|
| 195 |
+
tools = [search_database, fetch_logs]
|
| 196 |
+
wrapped_tools = wrap_tools_with_headroom(
|
| 197 |
+
    tools,
|
| 198 |
+
    min_chars_to_compress=1000,  # Only compress large outputs
|
| 199 |
+
)
|
| 200 |
+
|
| 201 |
+
# Create agent with wrapped tools
|
| 202 |
+
agent = create_openai_tools_agent(llm, wrapped_tools, prompt)
|
| 203 |
+
executor = AgentExecutor(agent=agent, tools=wrapped_tools)
|
| 204 |
+
|
| 205 |
+
# Tool outputs are automatically compressed
|
| 206 |
+
result = executor.invoke({"input": "Find users who logged in yesterday"})
|
| 207 |
+
```
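
The effect of a size threshold like `min_chars_to_compress` can be sketched with a plain decorator. This is an illustration under stated assumptions, not how `wrap_tools_with_headroom` works internally: `compress_large_output` is a hypothetical helper, and "compression" here is simple truncation of long JSON lists with a count of omitted items.

```python
import functools
import json


def compress_large_output(min_chars_to_compress: int = 1000, max_items: int = 3):
    """Decorator: truncate long JSON list fields in a tool's string output."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            output = func(*args, **kwargs)
            if len(output) < min_chars_to_compress:
                return output  # small outputs pass through untouched
            try:
                data = json.loads(output)
            except json.JSONDecodeError:
                return output  # not JSON: left as-is in this sketch
            if isinstance(data, dict):
                # Iterate over a copy because keys are added below.
                for key, value in list(data.items()):
                    if isinstance(value, list) and len(value) > max_items:
                        data[key] = value[:max_items]
                        data[f"{key}_omitted"] = len(value) - max_items
            return json.dumps(data)

        return wrapper

    return decorator


@compress_large_output(min_chars_to_compress=200, max_items=3)
def search_database(query: str) -> str:
    """Hypothetical tool returning a large JSON payload."""
    return json.dumps({"results": [{"id": i, "query": query} for i in range(100)], "total": 100})


compressed = json.loads(search_database("users"))
print(len(compressed["results"]), compressed["results_omitted"])
```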
|
| 208 |
+
|
| 209 |
+
**Per-tool metrics:**
|
| 210 |
+
|
| 211 |
+
```python
|
| 212 |
+
from headroom.integrations import get_tool_metrics
|
| 213 |
+
|
| 214 |
+
metrics = get_tool_metrics()
|
| 215 |
+
print(metrics.get_summary())
|
| 216 |
+
# {
|
| 217 |
+
# 'total_invocations': 25,
|
| 218 |
+
# 'total_compressions': 18,
|
| 219 |
+
# 'total_chars_saved': 450000,
|
| 220 |
+
# 'by_tool': {
|
| 221 |
+
# 'search_database': {'invocations': 15, 'chars_saved': 320000},
|
| 222 |
+
# 'fetch_logs': {'invocations': 10, 'chars_saved': 130000},
|
| 223 |
+
# }
|
| 224 |
+
# }
|
| 225 |
+
```
|
| 226 |
+
|
| 227 |
+
---
|
| 228 |
+
|
| 229 |
+
### 5. Streaming Metrics
|
| 230 |
+
|
| 231 |
+
Track output tokens during streaming:
|
| 232 |
+
|
| 233 |
+
```python
|
| 234 |
+
from headroom.integrations import StreamingMetricsTracker
|
| 235 |
+
|
| 236 |
+
tracker = StreamingMetricsTracker(model="gpt-4o")
|
| 237 |
+
|
| 238 |
+
for chunk in llm.stream("Write a poem about coding"):
|
| 239 |
+
    tracker.add_chunk(chunk)
|
| 240 |
+
    print(chunk.content, end="", flush=True)
|
| 241 |
+
|
| 242 |
+
metrics = tracker.finish()
|
| 243 |
+
print(f"\nOutput tokens: {metrics.output_tokens}")
|
| 244 |
+
print(f"Duration: {metrics.duration_ms:.0f}ms")
|
| 245 |
+
```
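
What such a tracker has to do can be sketched in a few lines. `SimpleStreamTracker` and `StreamMetrics` below are hypothetical classes, not Headroom's `StreamingMetricsTracker`; they take plain strings instead of message chunks and estimate tokens by whitespace splitting rather than with a real tokenizer.

```python
import time
from dataclasses import dataclass


@dataclass
class StreamMetrics:
    output_tokens: int
    chunk_count: int
    duration_ms: float


class SimpleStreamTracker:
    """Hypothetical tracker: counts chunks and estimates tokens by words."""

    def __init__(self) -> None:
        self._start = time.perf_counter()
        self._parts: list[str] = []

    def add_chunk(self, text: str) -> None:
        self._parts.append(text)

    def finish(self) -> StreamMetrics:
        elapsed_ms = (time.perf_counter() - self._start) * 1000
        full_text = "".join(self._parts)
        return StreamMetrics(
            output_tokens=len(full_text.split()),  # crude estimate only
            chunk_count=len(self._parts),
            duration_ms=elapsed_ms,
        )


tracker = SimpleStreamTracker()
for piece in ["Roses ", "are ", "red, ", "code ", "is ", "read."]:
    tracker.add_chunk(piece)
metrics = tracker.finish()
print(metrics.output_tokens, metrics.chunk_count)
```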
|
| 246 |
+
|
| 247 |
+
**Context manager style:**
|
| 248 |
+
|
| 249 |
+
```python
|
| 250 |
+
from headroom.integrations import StreamingMetricsCallback
|
| 251 |
+
|
| 252 |
+
with StreamingMetricsCallback(model="gpt-4o") as tracker:
|
| 253 |
+
    for chunk in llm.stream(messages):
|
| 254 |
+
        tracker.add_chunk(chunk)
|
| 255 |
+
        print(chunk.content, end="")
|
| 256 |
+
|
| 257 |
+
print(f"Metrics: {tracker.metrics}")
|
| 258 |
+
```
|
| 259 |
+
|
| 260 |
+
---
|
| 261 |
+
|
| 262 |
+
### 6. LangSmith Integration
|
| 263 |
+
|
| 264 |
+
Add Headroom metrics to LangSmith traces:
|
| 265 |
+
|
| 266 |
+
```python
|
| 267 |
+
from headroom.integrations import HeadroomLangSmithCallbackHandler
|
| 268 |
+
|
| 269 |
+
# Create callback handler
|
| 270 |
+
langsmith_handler = HeadroomLangSmithCallbackHandler()
|
| 271 |
+
|
| 272 |
+
# Use with your LLM
|
| 273 |
+
llm = HeadroomChatModel(
|
| 274 |
+
    ChatOpenAI(model="gpt-4o"),
|
| 275 |
+
    callbacks=[langsmith_handler],
|
| 276 |
+
)
|
| 277 |
+
|
| 278 |
+
# After calls, metrics appear in LangSmith traces:
|
| 279 |
+
# - headroom.tokens_before
|
| 280 |
+
# - headroom.tokens_after
|
| 281 |
+
# - headroom.tokens_saved
|
| 282 |
+
# - headroom.compression_ratio
|
| 283 |
+
```
|
| 284 |
+
|
| 285 |
+
---
|
| 286 |
+
|
| 287 |
+
## Real-World Examples
|
| 288 |
+
|
| 289 |
+
### Example 1: LangGraph ReAct Agent
|
| 290 |
+
|
| 291 |
+
The ReAct pattern is one of the most common agent architectures. Here's how to optimize it:
|
| 292 |
+
|
| 293 |
+
```python
|
| 294 |
+
import json
from langchain_openai import ChatOpenAI
|
| 295 |
+
from langchain_core.tools import tool
|
| 296 |
+
from langgraph.prebuilt import create_react_agent
|
| 297 |
+
from headroom.integrations import HeadroomChatModel, wrap_tools_with_headroom
|
| 298 |
+
|
| 299 |
+
# Define tools that return large outputs
|
| 300 |
+
@tool
|
| 301 |
+
def search_web(query: str) -> str:
|
| 302 |
+
    """Search the web for information."""
|
| 303 |
+
    # Simulating large search results
|
| 304 |
+
    return json.dumps({
|
| 305 |
+
        "results": [
|
| 306 |
+
            {"title": f"Result {i}", "snippet": "..." * 100, "url": f"https://..."}
|
| 307 |
+
            for i in range(100)
|
| 308 |
+
        ],
|
| 309 |
+
        "total": 1000,
|
| 310 |
+
    })
|
| 311 |
+
|
| 312 |
+
@tool
|
| 313 |
+
def query_database(sql: str) -> str:
|
| 314 |
+
    """Execute SQL query."""
|
| 315 |
+
    return json.dumps({
|
| 316 |
+
        "rows": [{"id": i, "data": "..." * 50} for i in range(500)],
|
| 317 |
+
        "total": 500,
|
| 318 |
+
    })
|
| 319 |
+
|
| 320 |
+
# Wrap model with Headroom
|
| 321 |
+
llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))
|
| 322 |
+
|
| 323 |
+
# Wrap tools with compression
|
| 324 |
+
tools = wrap_tools_with_headroom([search_web, query_database])
|
| 325 |
+
|
| 326 |
+
# Create ReAct agent
|
| 327 |
+
agent = create_react_agent(llm, tools)
|
| 328 |
+
|
| 329 |
+
# Run - tool outputs are automatically compressed between iterations
|
| 330 |
+
result = agent.invoke({
|
| 331 |
+
    "messages": [("user", "Find all users who signed up last week and their activity")]
|
| 332 |
+
})
|
| 333 |
+
|
| 334 |
+
# Check savings
|
| 335 |
+
print(f"Tokens saved: {llm.get_metrics()['tokens_saved']}")
|
| 336 |
+
```
|
| 337 |
+
|
| 338 |
+
**Without Headroom**: Each tool call adds 10-50K tokens to context.
|
| 339 |
+
**With Headroom**: Tool outputs are compressed to 1-2K tokens, so the agent runs faster and costs less.
|
| 340 |
+
|
| 341 |
+
---
|
| 342 |
+
|
| 343 |
+
### Example 2: RAG Pipeline with Document Filtering
|
| 344 |
+
|
| 345 |
+
```python
|
| 346 |
+
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
|
| 347 |
+
from langchain_community.vectorstores import Chroma
|
| 348 |
+
from langchain.chains import RetrievalQA
|
| 349 |
+
from langchain.retrievers import ContextualCompressionRetriever
|
| 350 |
+
from headroom.integrations import HeadroomChatModel, HeadroomDocumentCompressor
|
| 351 |
+
|
| 352 |
+
# Setup vector store
|
| 353 |
+
embeddings = OpenAIEmbeddings()
|
| 354 |
+
vectorstore = Chroma.from_documents(documents, embeddings)
|
| 355 |
+
|
| 356 |
+
# High-recall retriever (get many candidates)
|
| 357 |
+
base_retriever = vectorstore.as_retriever(search_kwargs={"k": 50})
|
| 358 |
+
|
| 359 |
+
# Headroom compressor for precision
|
| 360 |
+
compressor = HeadroomDocumentCompressor(
|
| 361 |
+
    max_documents=5,  # Keep only top 5
|
| 362 |
+
    min_relevance=0.4,  # Must be 40%+ relevant
|
| 363 |
+
    prefer_diverse=True,  # Avoid redundant docs
|
| 364 |
+
)
|
| 365 |
+
|
| 366 |
+
# Combine into compression retriever
|
| 367 |
+
retriever = ContextualCompressionRetriever(
|
| 368 |
+
    base_compressor=compressor,
|
| 369 |
+
    base_retriever=base_retriever,
|
| 370 |
+
)
|
| 371 |
+
|
| 372 |
+
# Wrap LLM
|
| 373 |
+
llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))
|
| 374 |
+
|
| 375 |
+
# Create QA chain
|
| 376 |
+
qa_chain = RetrievalQA.from_chain_type(
|
| 377 |
+
    llm=llm,
|
| 378 |
+
    retriever=retriever,
|
| 379 |
+
    return_source_documents=True,
|
| 380 |
+
)
|
| 381 |
+
|
| 382 |
+
# Query - retrieves 50 docs, uses best 5
|
| 383 |
+
result = qa_chain.invoke({"query": "How do I configure authentication?"})
|
| 384 |
+
print(f"Answer: {result['result']}")
|
| 385 |
+
print(f"Sources: {len(result['source_documents'])} docs")
|
| 386 |
+
```
|
| 387 |
+
|
| 388 |
+
**Impact**:
|
| 389 |
+
- Without filtering: 50 docs × ~500 tokens = 25K context tokens
|
| 390 |
+
- With Headroom: 5 docs × ~500 tokens = 2.5K context tokens (90% reduction)
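
The arithmetic above can be checked with a short calculation. This is only a back-of-the-envelope sketch: the 500-token average is the estimate used in the bullets, and `context_tokens` and `reduction` are illustrative helper names, not part of Headroom.

```python
def context_tokens(num_docs: int, avg_tokens_per_doc: int) -> int:
    """Estimate context size as document count times average document length."""
    return num_docs * avg_tokens_per_doc


def reduction(before: int, after: int) -> float:
    """Fraction of tokens removed when going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 1 - after / before


unfiltered = context_tokens(50, 500)  # all retrieved candidates
filtered = context_tokens(5, 500)     # only the documents kept by the compressor
print(f"{unfiltered} -> {filtered} tokens ({reduction(unfiltered, filtered):.0%} reduction)")
```

Real savings depend on how long the retained documents are, so treat the 90% figure as an estimate for evenly sized chunks.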

---

### Example 3: Conversational Agent with Memory

```python
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain.chains import ConversationChain
from headroom.integrations import HeadroomChatModel, HeadroomChatMessageHistory

# Wrap LLM
llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))

# Wrap memory with auto-compression
base_history = ChatMessageHistory()
compressed_history = HeadroomChatMessageHistory(
    base_history,
    compress_threshold_tokens=8000,  # Compress when over 8K
    keep_recent_turns=10,            # Always keep last 10 turns
)

memory = ConversationBufferMemory(
    chat_memory=compressed_history,
    return_messages=True,
)

# Create conversation chain
chain = ConversationChain(llm=llm, memory=memory)

# Long conversation - memory auto-compresses
for i in range(100):
    response = chain.invoke({"input": f"Tell me about topic {i}"})
    print(f"Turn {i}: {len(response['response'])} chars")

# Check memory stats
print(compressed_history.get_compression_stats())
# Example output (values will vary):
# {'compression_count': 8, 'total_tokens_saved': 45000}
```

**Impact**: Without compression, a 100-turn conversation can grow past 100K tokens. With `HeadroomChatMessageHistory`, it stays under 8K tokens while preserving recent context.
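
To make the two knobs concrete, here is a toy model of threshold-triggered trimming. It is an assumption-laden sketch, not the algorithm `HeadroomChatMessageHistory` uses: `estimate_tokens` and `trim_history` are hypothetical helpers, token counts come from a crude four-characters-per-token heuristic, and older messages are replaced by a placeholder rather than actually compressed.

```python
def estimate_tokens(text: str) -> int:
    """Crude heuristic (about 4 characters per token); not a real tokenizer."""
    return max(1, len(text) // 4)


def trim_history(
    messages: list[str],
    threshold_tokens: int,
    keep_recent_turns: int,
) -> list[str]:
    """Collapse older messages once the history exceeds the token threshold."""
    total = sum(estimate_tokens(m) for m in messages)
    if total <= threshold_tokens:
        return list(messages)  # Under budget: nothing to do

    keep = keep_recent_turns * 2  # Assume one turn = user message + reply
    split = max(0, len(messages) - keep)
    older, recent = messages[:split], messages[split:]
    if not older:
        return list(messages)  # Everything is "recent"; cannot trim further

    placeholder = f"[{len(older)} earlier messages summarized]"
    return [placeholder, *recent]


history = [f"message {i}: " + "x" * 400 for i in range(30)]
trimmed = trim_history(history, threshold_tokens=1000, keep_recent_turns=2)
print(len(history), "->", len(trimmed), "messages")
```

The sketch shows why a high `keep_recent_turns` combined with a low threshold can leave little to compress: the recent window is always preserved.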

---

### Example 4: Multi-Tool Research Agent

```python
import json

from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from headroom.integrations import (
    HeadroomChatModel,
    wrap_tools_with_headroom,
    get_tool_metrics,
    reset_tool_metrics,
)

@tool
def search_arxiv(query: str) -> str:
    """Search arXiv for papers."""
    return json.dumps({"papers": [{"title": f"Paper {i}", "abstract": "..." * 200} for i in range(50)]})

@tool
def search_github(query: str) -> str:
    """Search GitHub repositories."""
    return json.dumps({"repos": [{"name": f"repo-{i}", "description": "..." * 100, "stars": i * 100} for i in range(100)]})

@tool
def fetch_documentation(url: str) -> str:
    """Fetch documentation from URL."""
    return "..." * 5000  # Large doc content

# Wrap everything
llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))
tools = wrap_tools_with_headroom([search_arxiv, search_github, fetch_documentation])

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a research assistant. Use tools to gather information."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Reset metrics for this session
reset_tool_metrics()

# Run complex research task
result = executor.invoke({
    "input": "Research the latest advances in LLM context compression and find relevant GitHub projects"
})

# Check per-tool metrics
metrics = get_tool_metrics().get_summary()
print(f"Total chars saved: {metrics['total_chars_saved']:,}")
print(f"Per-tool breakdown: {metrics['by_tool']}")
```

---

## Configuration Options

### HeadroomChatModel

```python
HeadroomChatModel(
    wrapped_model,                      # Any LangChain BaseChatModel
    headroom_config=HeadroomConfig(),   # Headroom configuration
    auto_detect_provider=True,          # Auto-detect from wrapped model
)
```

### HeadroomChatMessageHistory

```python
HeadroomChatMessageHistory(
    base_history,                       # Any BaseChatMessageHistory
    compress_threshold_tokens=4000,     # Token threshold for compression
    keep_recent_turns=5,                # Minimum turns to preserve
    model="gpt-4o",                     # Model for token counting
)
```

### HeadroomDocumentCompressor

```python
HeadroomDocumentCompressor(
    max_documents=10,                   # Maximum docs to return
    min_relevance=0.0,                  # Minimum relevance score (0-1)
    prefer_diverse=False,               # Use MMR for diversity
)
```

### wrap_tools_with_headroom

```python
wrap_tools_with_headroom(
    tools,                              # List of LangChain tools
    min_chars_to_compress=1000,         # Minimum output size
    smart_crusher_config=None,          # SmartCrusher configuration
)
```

---

## Import Reference

```python
from headroom.integrations import (
    # Chat Model
    HeadroomChatModel,

    # Memory
    HeadroomChatMessageHistory,

    # Retrievers
    HeadroomDocumentCompressor,

    # Agents
    HeadroomToolWrapper,
    wrap_tools_with_headroom,
    get_tool_metrics,
    reset_tool_metrics,

    # Streaming
    StreamingMetricsTracker,
    StreamingMetricsCallback,
    track_streaming_response,

    # LangSmith
    HeadroomLangSmithCallbackHandler,

    # Provider Detection
    detect_provider,
    get_headroom_provider,
)

# Or import from subpackage directly
from headroom.integrations.langchain import HeadroomChatModel
from headroom.integrations.langchain.memory import HeadroomChatMessageHistory
```

---

## Troubleshooting

### LangChain not detected

```python
from headroom.integrations import langchain_available

if not langchain_available():
    print("Install with: pip install headroom-ai[langchain]")
```
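
If you want to confirm which optional packages are importable without going through Headroom, a standard-library check works as well. This is an independent sketch, not how `langchain_available()` is implemented; `is_installed` is a hypothetical helper, and the module names checked are the usual import names for `langchain-core` and `langsmith`.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


for name in ("langchain_core", "langsmith"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Because `find_spec` only locates the module, this check does not trigger import-time side effects in the packages being probed.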

### Provider detection failing

```python
# Force a specific provider
from langchain_anthropic import ChatAnthropic
from headroom.integrations import HeadroomChatModel
from headroom.providers import AnthropicProvider

llm = HeadroomChatModel(
    ChatAnthropic(model="claude-3-5-sonnet-20241022"),
    auto_detect_provider=False,
)
llm._provider = AnthropicProvider()
```

### Memory not compressing

Check that the conversation's token count exceeds `compress_threshold_tokens`. If it does not, lower the threshold:

```python
# `base_history` is any BaseChatMessageHistory, as in Example 3
history = HeadroomChatMessageHistory(
    base_history,
    compress_threshold_tokens=1000,  # Lower threshold
    keep_recent_turns=2,             # Fewer preserved turns
)
```

---

## Performance Tips

1. **Use tool wrapping for agents** - Agents with tools benefit most from compression
2. **Set appropriate thresholds** - Don't compress small conversations
3. **Enable diversity for RAG** - `prefer_diverse=True` can improve answer quality by avoiding redundant documents
4. **Monitor with LangSmith** - Use the callback handler to track savings over time
5. **Batch similar requests** - Provider caching works better with stable prefixes
|
headroom/cache/compression_store.py
CHANGED
|
@@ -292,7 +292,8 @@ class CompressionStore:
|
|
| 292 |
tool_signature_hash=entry.tool_signature_hash,
|
| 293 |
)
|
| 294 |
|
| 295 |
-
# CRITICAL: Make a deep copy to return
|
|
|
|
| 296 |
# The entry contains mutable fields (search_queries list) that must be copied
|
| 297 |
result_entry = replace(entry, search_queries=list(entry.search_queries))
|
| 298 |
|
|
|
|
| 292 |
tool_signature_hash=entry.tool_signature_hash,
|
| 293 |
)
|
| 294 |
|
| 295 |
+
# CRITICAL: Make a deep copy to return
|
| 296 |
+
# (entry could be modified/evicted after lock release)
|
| 297 |
# The entry contains mutable fields (search_queries list) that must be copied
|
| 298 |
result_entry = replace(entry, search_queries=list(entry.search_queries))
|
| 299 |
|
headroom/cache/dynamic_detector.py
CHANGED
|
@@ -588,13 +588,19 @@ class NERDetector:
|
|
| 588 |
self._load_error: str | None = None
|
| 589 |
|
| 590 |
if not _SPACY_AVAILABLE:
|
| 591 |
-
self._load_error =
|
| 592 |
return
|
| 593 |
|
| 594 |
try:
|
| 595 |
self._nlp = spacy.load(config.spacy_model)
|
| 596 |
except OSError:
|
| 597 |
-
self._load_error =
|
| 598 |
|
| 599 |
@property
|
| 600 |
def is_available(self) -> bool:
|
|
@@ -704,7 +710,10 @@ class SemanticDetector:
|
|
| 704 |
self._load_error: str | None = None
|
| 705 |
|
| 706 |
if not _SENTENCE_TRANSFORMERS_AVAILABLE:
|
| 707 |
-
self._load_error =
|
| 708 |
return
|
| 709 |
|
| 710 |
try:
|
|
|
|
| 588 |
self._load_error: str | None = None
|
| 589 |
|
| 590 |
if not _SPACY_AVAILABLE:
|
| 591 |
+
self._load_error = (
|
| 592 |
+
"spaCy not installed. Install with: "
|
| 593 |
+
"pip install spacy && python -m spacy download en_core_web_sm"
|
| 594 |
+
)
|
| 595 |
return
|
| 596 |
|
| 597 |
try:
|
| 598 |
self._nlp = spacy.load(config.spacy_model)
|
| 599 |
except OSError:
|
| 600 |
+
self._load_error = (
|
| 601 |
+
f"spaCy model '{config.spacy_model}' not found. "
|
| 602 |
+
f"Install with: python -m spacy download {config.spacy_model}"
|
| 603 |
+
)
|
| 604 |
|
| 605 |
@property
|
| 606 |
def is_available(self) -> bool:
|
|
|
|
| 710 |
self._load_error: str | None = None
|
| 711 |
|
| 712 |
if not _SENTENCE_TRANSFORMERS_AVAILABLE:
|
| 713 |
+
self._load_error = (
|
| 714 |
+
"sentence-transformers not installed. "
|
| 715 |
+
"Install with: pip install sentence-transformers"
|
| 716 |
+
)
|
| 717 |
return
|
| 718 |
|
| 719 |
try:
|
headroom/ccr/mcp_server.py
CHANGED
|
@@ -109,9 +109,10 @@ class CCRMCPServer:
|
|
| 109 |
Tool(
|
| 110 |
name=CCR_TOOL_NAME,
|
| 111 |
description=(
|
| 112 |
-
"Retrieve original uncompressed content that was compressed
|
| 113 |
-
"Use this when you need more data than what's
|
| 114 |
-
"The hash is provided in
|
|
|
|
| 115 |
),
|
| 116 |
inputSchema={
|
| 117 |
"type": "object",
|
|
|
|
| 109 |
Tool(
|
| 110 |
name=CCR_TOOL_NAME,
|
| 111 |
description=(
|
| 112 |
+
"Retrieve original uncompressed content that was compressed "
|
| 113 |
+
"to save tokens. Use this when you need more data than what's "
|
| 114 |
+
"shown in compressed tool results. The hash is provided in "
|
| 115 |
+
"compression markers like [N items compressed... hash=abc123]."
|
| 116 |
),
|
| 117 |
inputSchema={
|
| 118 |
"type": "object",
|
headroom/integrations/__init__.py
CHANGED
|
@@ -1,18 +1,69 @@
|
|
| 1 |
"""Headroom integrations with popular LLM frameworks.
|
| 2 |
|
| 3 |
Available integrations:
|
| 4 |
-
- LangChain: HeadroomChatModel, HeadroomCallbackHandler, optimize_messages
|
| 5 |
-
- MCP: HeadroomMCPCompressor, compress_tool_result, HeadroomMCPClientWrapper
|
| 6 |
|
| 7 |
-
|
| 8 |
"""
|
| 9 |
|
|
|
|
| 10 |
from .langchain import (
|
| 11 |
HeadroomCallbackHandler,
|
|
|
|
|
|
|
| 12 |
HeadroomChatModel,
|
| 13 |
HeadroomRunnable,
|
| 14 |
optimize_messages,
|
| 15 |
)
|
|
|
|
|
|
|
| 16 |
from .mcp import (
|
| 17 |
DEFAULT_MCP_PROFILES,
|
| 18 |
HeadroomMCPClientWrapper,
|
|
@@ -25,11 +76,39 @@ from .mcp import (
|
|
| 25 |
)
|
| 26 |
|
| 27 |
__all__ = [
|
| 28 |
-
# LangChain
|
| 29 |
"HeadroomChatModel",
|
| 30 |
"HeadroomCallbackHandler",
|
| 31 |
-
"optimize_messages",
|
| 32 |
"HeadroomRunnable",
|
| 33 |
# MCP
|
| 34 |
"HeadroomMCPCompressor",
|
| 35 |
"HeadroomMCPClientWrapper",
|
|
|
|
| 1 |
"""Headroom integrations with popular LLM frameworks.
|
| 2 |
|
| 3 |
Available integrations:
|
|
|
|
|
|
|
| 4 |
|
| 5 |
+
LangChain (pip install headroom[langchain]):
|
| 6 |
+
- HeadroomChatModel: Drop-in wrapper for any LangChain chat model
|
| 7 |
+
- HeadroomChatMessageHistory: Automatic conversation compression
|
| 8 |
+
- HeadroomDocumentCompressor: Relevance-based document filtering
|
| 9 |
+
- HeadroomToolWrapper: Tool output compression for agents
|
| 10 |
+
- StreamingMetricsTracker: Token counting during streaming
|
| 11 |
+
- HeadroomLangSmithCallbackHandler: LangSmith trace enrichment
|
| 12 |
+
|
| 13 |
+
MCP (Model Context Protocol):
|
| 14 |
+
- HeadroomMCPCompressor: Compress MCP tool results
|
| 15 |
+
- compress_tool_result: Simple function for tool compression
|
| 16 |
+
|
| 17 |
+
Example:
|
| 18 |
+
# LangChain integration
|
| 19 |
+
from headroom.integrations import HeadroomChatModel
|
| 20 |
+
# or explicitly:
|
| 21 |
+
from headroom.integrations.langchain import HeadroomChatModel
|
| 22 |
+
|
| 23 |
+
# MCP integration
|
| 24 |
+
from headroom.integrations import compress_tool_result
|
| 25 |
+
# or explicitly:
|
| 26 |
+
from headroom.integrations.mcp import compress_tool_result
|
| 27 |
"""
|
| 28 |
|
| 29 |
+
# Re-export from langchain subpackage for backwards compatibility
|
| 30 |
from .langchain import (
|
| 31 |
+
# Retrievers
|
| 32 |
+
CompressionMetrics,
|
| 33 |
+
# Core
|
| 34 |
HeadroomCallbackHandler,
|
| 35 |
+
# Memory
|
| 36 |
+
HeadroomChatMessageHistory,
|
| 37 |
HeadroomChatModel,
|
| 38 |
+
HeadroomDocumentCompressor,
|
| 39 |
+
# LangSmith
|
| 40 |
+
HeadroomLangSmithCallbackHandler,
|
| 41 |
HeadroomRunnable,
|
| 42 |
+
# Agents
|
| 43 |
+
HeadroomToolWrapper,
|
| 44 |
+
OptimizationMetrics,
|
| 45 |
+
# Streaming
|
| 46 |
+
StreamingMetrics,
|
| 47 |
+
StreamingMetricsCallback,
|
| 48 |
+
StreamingMetricsTracker,
|
| 49 |
+
ToolCompressionMetrics,
|
| 50 |
+
ToolMetricsCollector,
|
| 51 |
+
# Provider Detection
|
| 52 |
+
detect_provider,
|
| 53 |
+
get_headroom_provider,
|
| 54 |
+
get_model_name_from_langchain,
|
| 55 |
+
get_tool_metrics,
|
| 56 |
+
is_langsmith_available,
|
| 57 |
+
is_langsmith_tracing_enabled,
|
| 58 |
+
langchain_available,
|
| 59 |
optimize_messages,
|
| 60 |
+
reset_tool_metrics,
|
| 61 |
+
track_async_streaming_response,
|
| 62 |
+
track_streaming_response,
|
| 63 |
+
wrap_tools_with_headroom,
|
| 64 |
)
|
| 65 |
+
|
| 66 |
+
# Re-export from mcp subpackage for backwards compatibility
|
| 67 |
from .mcp import (
|
| 68 |
DEFAULT_MCP_PROFILES,
|
| 69 |
HeadroomMCPClientWrapper,
|
|
|
|
| 76 |
)
|
| 77 |
|
| 78 |
__all__ = [
|
| 79 |
+
# LangChain Core
|
| 80 |
"HeadroomChatModel",
|
| 81 |
"HeadroomCallbackHandler",
|
|
|
|
| 82 |
"HeadroomRunnable",
|
| 83 |
+
"OptimizationMetrics",
|
| 84 |
+
"optimize_messages",
|
| 85 |
+
"langchain_available",
|
| 86 |
+
# Provider Detection
|
| 87 |
+
"detect_provider",
|
| 88 |
+
"get_headroom_provider",
|
| 89 |
+
"get_model_name_from_langchain",
|
| 90 |
+
# Memory
|
| 91 |
+
"HeadroomChatMessageHistory",
|
| 92 |
+
# Retrievers
|
| 93 |
+
"HeadroomDocumentCompressor",
|
| 94 |
+
"CompressionMetrics",
|
| 95 |
+
# Agents
|
| 96 |
+
"HeadroomToolWrapper",
|
| 97 |
+
"ToolCompressionMetrics",
|
| 98 |
+
"ToolMetricsCollector",
|
| 99 |
+
"wrap_tools_with_headroom",
|
| 100 |
+
"get_tool_metrics",
|
| 101 |
+
"reset_tool_metrics",
|
| 102 |
+
# LangSmith
|
| 103 |
+
"HeadroomLangSmithCallbackHandler",
|
| 104 |
+
"is_langsmith_available",
|
| 105 |
+
"is_langsmith_tracing_enabled",
|
| 106 |
+
# Streaming
|
| 107 |
+
"StreamingMetricsTracker",
|
| 108 |
+
"StreamingMetricsCallback",
|
| 109 |
+
"StreamingMetrics",
|
| 110 |
+
"track_streaming_response",
|
| 111 |
+
"track_async_streaming_response",
|
| 112 |
# MCP
|
| 113 |
"HeadroomMCPCompressor",
|
| 114 |
"HeadroomMCPClientWrapper",
|
headroom/integrations/langchain/__init__.py
ADDED
|
@@ -0,0 +1,106 @@
|
|
| 1 |
+
"""LangChain integration for Headroom.
|
| 2 |
+
|
| 3 |
+
This package provides seamless integration with LangChain, including:
|
| 4 |
+
- HeadroomChatModel: Drop-in wrapper for any LangChain chat model
|
| 5 |
+
- HeadroomChatMessageHistory: Automatic conversation compression
|
| 6 |
+
- HeadroomDocumentCompressor: Relevance-based document filtering
|
| 7 |
+
- HeadroomToolWrapper: Tool output compression for agents
|
| 8 |
+
- StreamingMetricsTracker: Token counting during streaming
|
| 9 |
+
- HeadroomLangSmithCallbackHandler: LangSmith trace enrichment
|
| 10 |
+
|
| 11 |
+
Example:
|
| 12 |
+
from langchain_openai import ChatOpenAI
|
| 13 |
+
from headroom.integrations.langchain import HeadroomChatModel
|
| 14 |
+
|
| 15 |
+
# Wrap any LangChain model
|
| 16 |
+
llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))
|
| 17 |
+
|
| 18 |
+
# Use like normal - optimization happens automatically
|
| 19 |
+
response = llm.invoke("Hello!")
|
| 20 |
+
|
| 21 |
+
Install: pip install headroom[langchain]
|
| 22 |
+
"""
|
| 23 |
+
|
| 24 |
+
# Core chat model wrapper
|
| 25 |
+
# Agent tool wrapping
|
| 26 |
+
from .agents import (
|
| 27 |
+
HeadroomToolWrapper,
|
| 28 |
+
ToolCompressionMetrics,
|
| 29 |
+
ToolMetricsCollector,
|
| 30 |
+
get_tool_metrics,
|
| 31 |
+
reset_tool_metrics,
|
| 32 |
+
wrap_tools_with_headroom,
|
| 33 |
+
)
|
| 34 |
+
from .chat_model import (
|
| 35 |
+
HeadroomCallbackHandler,
|
| 36 |
+
HeadroomChatModel,
|
| 37 |
+
HeadroomRunnable,
|
| 38 |
+
OptimizationMetrics,
|
| 39 |
+
langchain_available,
|
| 40 |
+
optimize_messages,
|
| 41 |
+
)
|
| 42 |
+
|
| 43 |
+
# LangSmith integration
|
| 44 |
+
from .langsmith import (
|
| 45 |
+
HeadroomLangSmithCallbackHandler,
|
| 46 |
+
is_langsmith_available,
|
| 47 |
+
is_langsmith_tracing_enabled,
|
| 48 |
+
)
|
| 49 |
+
|
| 50 |
+
# Memory integration
|
| 51 |
+
from .memory import HeadroomChatMessageHistory
|
| 52 |
+
|
| 53 |
+
# Provider auto-detection
|
| 54 |
+
from .providers import (
|
| 55 |
+
detect_provider,
|
| 56 |
+
get_headroom_provider,
|
| 57 |
+
get_model_name_from_langchain,
|
| 58 |
+
)
|
| 59 |
+
|
| 60 |
+
# Retriever integration
|
| 61 |
+
from .retriever import CompressionMetrics, HeadroomDocumentCompressor
|
| 62 |
+
|
| 63 |
+
# Streaming metrics
|
| 64 |
+
from .streaming import (
|
| 65 |
+
StreamingMetrics,
|
| 66 |
+
StreamingMetricsCallback,
|
| 67 |
+
StreamingMetricsTracker,
|
| 68 |
+
track_async_streaming_response,
|
| 69 |
+
track_streaming_response,
|
| 70 |
+
)
|
| 71 |
+
|
| 72 |
+
__all__ = [
|
| 73 |
+
# Core
|
| 74 |
+
"HeadroomChatModel",
|
| 75 |
+
"HeadroomCallbackHandler",
|
| 76 |
+
"HeadroomRunnable",
|
| 77 |
+
"OptimizationMetrics",
|
| 78 |
+
"optimize_messages",
|
| 79 |
+
"langchain_available",
|
| 80 |
+
# Provider Detection
|
| 81 |
+
"detect_provider",
|
| 82 |
+
"get_headroom_provider",
|
| 83 |
+
"get_model_name_from_langchain",
|
| 84 |
+
# Memory
|
| 85 |
+
"HeadroomChatMessageHistory",
|
| 86 |
+
# Retrievers
|
| 87 |
+
"HeadroomDocumentCompressor",
|
| 88 |
+
"CompressionMetrics",
|
| 89 |
+
# Agents
|
| 90 |
+
"HeadroomToolWrapper",
|
| 91 |
+
"ToolCompressionMetrics",
|
| 92 |
+
"ToolMetricsCollector",
|
| 93 |
+
"wrap_tools_with_headroom",
|
| 94 |
+
"get_tool_metrics",
|
| 95 |
+
"reset_tool_metrics",
|
| 96 |
+
# LangSmith
|
| 97 |
+
"HeadroomLangSmithCallbackHandler",
|
| 98 |
+
"is_langsmith_available",
|
| 99 |
+
"is_langsmith_tracing_enabled",
|
| 100 |
+
# Streaming
|
| 101 |
+
"StreamingMetricsTracker",
|
| 102 |
+
"StreamingMetricsCallback",
|
| 103 |
+
"StreamingMetrics",
|
| 104 |
+
"track_streaming_response",
|
| 105 |
+
"track_async_streaming_response",
|
| 106 |
+
]
|
headroom/integrations/langchain/agents.py
ADDED
|
@@ -0,0 +1,326 @@
|
|
| 1 |
+
"""Agent tool integration for LangChain with output compression.
|
| 2 |
+
|
| 3 |
+
This module provides HeadroomToolWrapper and wrap_tools_with_headroom
|
| 4 |
+
for wrapping LangChain tools to automatically compress their outputs
|
| 5 |
+
and track per-tool compression metrics.
|
| 6 |
+
|
| 7 |
+
Example:
|
| 8 |
+
from langchain.agents import create_openai_tools_agent
|
| 9 |
+
from langchain.tools import Tool
|
| 10 |
+
from headroom.integrations import wrap_tools_with_headroom
|
| 11 |
+
|
| 12 |
+
# Define tools
|
| 13 |
+
tools = [
|
| 14 |
+
Tool(name="search", func=search_func, description="Search"),
|
| 15 |
+
Tool(name="database", func=db_func, description="Query DB"),
|
| 16 |
+
]
|
| 17 |
+
|
| 18 |
+
# Wrap with Headroom compression
|
| 19 |
+
wrapped_tools = wrap_tools_with_headroom(tools)
|
| 20 |
+
|
| 21 |
+
# Use in agent - outputs are automatically compressed
|
| 22 |
+
agent = create_openai_tools_agent(llm, wrapped_tools, prompt)
|
| 23 |
+
"""
|
| 24 |
+
|
| 25 |
+
from __future__ import annotations
|
| 26 |
+
|
| 27 |
+
import logging
|
| 28 |
+
from dataclasses import dataclass, field
|
| 29 |
+
from datetime import datetime
|
| 30 |
+
from typing import Any
|
| 31 |
+
|
| 32 |
+
# LangChain imports - these are optional dependencies
|
| 33 |
+
try:
|
| 34 |
+
from langchain_core.tools import BaseTool, StructuredTool, Tool
|
| 35 |
+
|
| 36 |
+
LANGCHAIN_AVAILABLE = True
|
| 37 |
+
except ImportError:
|
| 38 |
+
LANGCHAIN_AVAILABLE = False
|
| 39 |
+
BaseTool = object # type: ignore[misc,assignment]
|
| 40 |
+
StructuredTool = object # type: ignore[misc,assignment]
|
| 41 |
+
Tool = object # type: ignore[misc,assignment]
|
| 42 |
+
|
| 43 |
+
from headroom.integrations.mcp import compress_tool_result
|
| 44 |
+
|
| 45 |
+
logger = logging.getLogger(__name__)
|
| 46 |
+
|
| 47 |
+
|
| 48 |
+
def _check_langchain_available() -> None:
|
| 49 |
+
"""Raise ImportError if LangChain is not installed."""
|
| 50 |
+
if not LANGCHAIN_AVAILABLE:
|
| 51 |
+
raise ImportError(
|
| 52 |
+
"LangChain is required for this integration. "
|
| 53 |
+
"Install with: pip install headroom[langchain] "
|
| 54 |
+
"or: pip install langchain-core"
|
| 55 |
+
)
|
| 56 |
+
|
| 57 |
+
|
| 58 |
+
@dataclass
|
| 59 |
+
class ToolCompressionMetrics:
|
| 60 |
+
"""Metrics from a single tool compression."""
|
| 61 |
+
|
| 62 |
+
tool_name: str
|
| 63 |
+
timestamp: datetime
|
| 64 |
+
chars_before: int
|
| 65 |
+
chars_after: int
|
| 66 |
+
chars_saved: int
|
| 67 |
+
compression_ratio: float
|
| 68 |
+
was_compressed: bool
|
| 69 |
+
|
| 70 |
+
|
| 71 |
+
@dataclass
|
| 72 |
+
class ToolMetricsCollector:
|
| 73 |
+
"""Collects compression metrics across all tool invocations."""
|
| 74 |
+
|
| 75 |
+
metrics: list[ToolCompressionMetrics] = field(default_factory=list)
|
| 76 |
+
|
| 77 |
+
def add(self, metric: ToolCompressionMetrics) -> None:
|
| 78 |
+
"""Add a metric entry."""
|
| 79 |
+
self.metrics.append(metric)
|
| 80 |
+
# Keep only last 1000
|
| 81 |
+
if len(self.metrics) > 1000:
|
| 82 |
+
self.metrics = self.metrics[-1000:]
|
| 83 |
+
|
| 84 |
+
def get_summary(self) -> dict[str, Any]:
|
| 85 |
+
"""Get summary statistics."""
|
| 86 |
+
if not self.metrics:
|
| 87 |
+
return {
|
| 88 |
+
"total_invocations": 0,
|
| 89 |
+
"total_compressions": 0,
|
| 90 |
+
"total_chars_saved": 0,
|
| 91 |
+
}
|
| 92 |
+
|
| 93 |
+
compressed = [m for m in self.metrics if m.was_compressed]
|
| 94 |
+
return {
|
| 95 |
+
"total_invocations": len(self.metrics),
|
| 96 |
+
"total_compressions": len(compressed),
|
| 97 |
+
"total_chars_saved": sum(m.chars_saved for m in self.metrics),
|
| 98 |
+
"average_compression_ratio": (
|
| 99 |
+
sum(m.compression_ratio for m in compressed) / len(compressed) if compressed else 0
|
| 100 |
+
),
|
| 101 |
+
"by_tool": self._get_by_tool_stats(),
|
| 102 |
+
}
|
| 103 |
+
|
| 104 |
+
def _get_by_tool_stats(self) -> dict[str, dict[str, Any]]:
|
| 105 |
+
"""Get per-tool statistics."""
|
| 106 |
+
by_tool: dict[str, list[ToolCompressionMetrics]] = {}
|
| 107 |
+
for m in self.metrics:
|
| 108 |
+
if m.tool_name not in by_tool:
|
| 109 |
+
by_tool[m.tool_name] = []
|
| 110 |
+
by_tool[m.tool_name].append(m)
|
| 111 |
+
|
| 112 |
+
result = {}
|
| 113 |
+
for name, tool_metrics in by_tool.items():
|
| 114 |
+
compressed = [m for m in tool_metrics if m.was_compressed]
|
| 115 |
+
result[name] = {
|
| 116 |
+
"invocations": len(tool_metrics),
|
| 117 |
+
"compressions": len(compressed),
|
| 118 |
+
"chars_saved": sum(m.chars_saved for m in tool_metrics),
|
| 119 |
+
}
|
| 120 |
+
return result
|
| 121 |
+
|
| 122 |
+
|
| 123 |
+
# Global metrics collector
|
| 124 |
+
_global_metrics = ToolMetricsCollector()
|
| 125 |
+
|
| 126 |
+
|
| 127 |
+
def get_tool_metrics() -> ToolMetricsCollector:
|
| 128 |
+
"""Get the global tool metrics collector."""
|
| 129 |
+
return _global_metrics
|
| 130 |
+
|
| 131 |
+
|
| 132 |
+
def reset_tool_metrics() -> None:
|
| 133 |
+
"""Reset global tool metrics."""
|
| 134 |
+
global _global_metrics
|
| 135 |
+
_global_metrics = ToolMetricsCollector()
|
| 136 |
+
|
| 137 |
+
|
| 138 |
+
class HeadroomToolWrapper:
|
| 139 |
+
"""Wraps a LangChain tool to compress its output.
|
| 140 |
+
|
| 141 |
+
Applies SmartCrusher compression to tool outputs, particularly
|
| 142 |
+
useful for tools that return large JSON arrays (search results,
|
| 143 |
+
database queries, etc.).
|
| 144 |
+
|
| 145 |
+
Example:
|
| 146 |
+
from langchain.tools import Tool
|
| 147 |
+
from headroom.integrations import HeadroomToolWrapper
|
| 148 |
+
|
| 149 |
+
def search(query: str) -> str:
|
| 150 |
+
# Returns large JSON with 1000 results
|
| 151 |
+
return json.dumps({"results": [...1000 items...]})
|
| 152 |
+
|
| 153 |
+
search_tool = Tool(name="search", func=search, description="Search")
|
| 154 |
+
wrapped = HeadroomToolWrapper(search_tool)
|
| 155 |
+
|
| 156 |
+
# Use wrapped tool - output automatically compressed
|
| 157 |
+
result = wrapped("python tutorials")
|
| 158 |
+
|
| 159 |
+
Attributes:
|
| 160 |
+
tool: The wrapped LangChain tool
|
| 161 |
+
min_chars_to_compress: Minimum output size to trigger compression
|
| 162 |
+
metrics_collector: Collector for compression metrics
|
| 163 |
+
"""
|
| 164 |
+
|
| 165 |
+
def __init__(
|
| 166 |
+
self,
|
| 167 |
+
tool: BaseTool,
|
| 168 |
+
min_chars_to_compress: int = 1000,
|
| 169 |
+
metrics_collector: ToolMetricsCollector | None = None,
|
| 170 |
+
):
|
| 171 |
+
"""Initialize HeadroomToolWrapper.
|
| 172 |
+
|
| 173 |
+
Args:
|
| 174 |
+
tool: The LangChain BaseTool to wrap.
|
| 175 |
+
min_chars_to_compress: Minimum character count for output
|
| 176 |
+
before compression is applied. Default 1000.
|
| 177 |
+
metrics_collector: Collector for metrics. Uses global
|
| 178 |
+
collector if not specified.
|
| 179 |
+
"""
|
| 180 |
+
_check_langchain_available()
|
| 181 |
+
|
| 182 |
+
self.tool = tool
|
| 183 |
+
self.min_chars_to_compress = min_chars_to_compress
|
| 184 |
+
self._metrics = metrics_collector or _global_metrics
|
| 185 |
+
|
| 186 |
+
# Copy tool metadata
|
| 187 |
+
self.name = tool.name
|
| 188 |
+
self.description = tool.description
|
| 189 |
+
|
| 190 |
+
def __call__(self, *args: Any, **kwargs: Any) -> str:
|
| 191 |
+
"""Invoke the tool and compress output.
|
| 192 |
+
|
| 193 |
+
Args:
|
| 194 |
+
*args: Arguments to pass to the tool.
|
| 195 |
+
**kwargs: Keyword arguments to pass to the tool.
|
| 196 |
+
|
| 197 |
+
Returns:
|
| 198 |
+
Compressed tool output as string.
|
| 199 |
+
"""
|
| 200 |
+
# Invoke underlying tool
|
| 201 |
+
result = self.tool.invoke(*args, **kwargs)
|
| 202 |
+
|
| 203 |
+
# Convert to string if needed
|
| 204 |
+
if not isinstance(result, str):
|
| 205 |
+
result = str(result)
|
| 206 |
+
|
| 207 |
+
# Check if compression is needed
|
| 208 |
+
if len(result) < self.min_chars_to_compress:
|
| 209 |
+
self._record_metrics(result, result, was_compressed=False)
|
| 210 |
+
return result
|
| 211 |
+
|
| 212 |
+
# Try to compress
|
| 213 |
+
compressed = self._compress_output(result)
|
| 214 |
+
self._record_metrics(result, compressed, was_compressed=True)
|
| 215 |
+
|
| 216 |
+
return compressed
|
| 217 |
+
|
| 218 |
+
def invoke(self, *args: Any, **kwargs: Any) -> str:
|
| 219 |
+
"""Invoke the tool (alias for __call__)."""
|
| 220 |
+
return self(*args, **kwargs)
|
| 221 |
+
|
| 222 |
+
def _compress_output(self, output: str) -> str:
|
| 223 |
+
"""Apply compression to tool output.
|
| 224 |
+
|
| 225 |
+
Args:
|
| 226 |
+
output: Tool output string.
|
| 227 |
+
|
| 228 |
+
Returns:
|
| 229 |
+
Compressed output.
|
| 230 |
+
"""
|
| 231 |
+
try:
|
| 232 |
+
return compress_tool_result(
|
| 233 |
+
content=output,
|
| 234 |
+
tool_name=self.name,
|
| 235 |
+
)
|
| 236 |
+
except Exception as e:
|
| 237 |
+
logger.debug(f"Tool compression failed: {e}")
|
| 238 |
+
return output
|
| 239 |
+
|
| 240 |
+
def _record_metrics(self, original: str, compressed: str, was_compressed: bool) -> None:
|
| 241 |
+
"""Record compression metrics.
|
| 242 |
+
|
| 243 |
+
Args:
|
| 244 |
+
original: Original output.
|
| 245 |
+
compressed: Compressed output.
|
| 246 |
+
was_compressed: Whether compression was applied.
|
| 247 |
+
"""
|
| 248 |
+
chars_before = len(original)
|
| 249 |
+
chars_after = len(compressed)
|
| 250 |
+
chars_saved = chars_before - chars_after
|
| 251 |
+
|
| 252 |
+
metric = ToolCompressionMetrics(
|
| 253 |
+
tool_name=self.name,
|
| 254 |
+
timestamp=datetime.now(),
|
| 255 |
+
chars_before=chars_before,
|
| 256 |
+
chars_after=chars_after,
|
| 257 |
+
chars_saved=max(0, chars_saved),
|
| 258 |
+
compression_ratio=chars_after / chars_before if chars_before > 0 else 1.0,
|
| 259 |
+
was_compressed=was_compressed and chars_saved > 0,
|
| 260 |
+
)
|
| 261 |
+
|
| 262 |
+
self._metrics.add(metric)
|
| 263 |
+
|
| 264 |
+
if was_compressed and chars_saved > 0:
|
| 265 |
+
logger.info(
|
| 266 |
+
f"HeadroomToolWrapper[{self.name}]: {chars_before} -> {chars_after} chars "
|
| 267 |
+
f"({chars_saved} saved, {metric.compression_ratio:.1%} of original)"
|
| 268 |
+
)
|
| 269 |
+
|
| 270 |
+
def as_langchain_tool(self) -> StructuredTool:
|
| 271 |
+
"""Convert wrapper back to a LangChain tool.
|
| 272 |
+
|
| 273 |
+
Useful when you need to pass the wrapped tool to APIs
|
| 274 |
+
that expect a LangChain tool type.
|
| 275 |
+
|
| 276 |
+
Returns:
|
| 277 |
+
StructuredTool that wraps this wrapper.
|
| 278 |
+
"""
|
| 279 |
+
return StructuredTool.from_function(
|
| 280 |
+
func=self.__call__,
|
| 281 |
+
name=self.name,
|
| 282 |
+
description=self.description,
|
| 283 |
+
)
|
| 284 |
+
|
| 285 |
+
|
| 286 |
+
def wrap_tools_with_headroom(
|
| 287 |
+
tools: list[BaseTool],
|
| 288 |
+
min_chars_to_compress: int = 1000,
|
| 289 |
+
metrics_collector: ToolMetricsCollector | None = None,
|
| 290 |
+
) -> list[StructuredTool]:
|
| 291 |
+
"""Wrap multiple LangChain tools with Headroom compression.
|
| 292 |
+
|
| 293 |
+
Convenience function to wrap all tools in a list at once.
|
| 294 |
+
|
| 295 |
+
Args:
|
| 296 |
+
tools: List of LangChain tools to wrap.
|
| 297 |
+
min_chars_to_compress: Minimum output size for compression.
|
| 298 |
+
metrics_collector: Shared metrics collector for all tools.
|
| 299 |
+
|
| 300 |
+
Returns:
|
| 301 |
+
List of wrapped tools as StructuredTools.
|
| 302 |
+
|
| 303 |
+
Example:
|
| 304 |
+
from langchain.agents import create_openai_tools_agent
|
| 305 |
+
from headroom.integrations import wrap_tools_with_headroom
|
| 306 |
+
|
| 307 |
+
tools = [search_tool, database_tool, api_tool]
|
| 308 |
+
wrapped = wrap_tools_with_headroom(tools)
|
| 309 |
+
|
| 310 |
+
# Use wrapped tools in agent
|
| 311 |
+
agent = create_openai_tools_agent(llm, wrapped, prompt)
|
| 312 |
+
"""
|
| 313 |
+
_check_langchain_available()
|
| 314 |
+
|
| 315 |
+
collector = metrics_collector or _global_metrics
|
| 316 |
+
|
| 317 |
+
wrapped = []
|
| 318 |
+
for tool in tools:
|
| 319 |
+
wrapper = HeadroomToolWrapper(
|
| 320 |
+
tool=tool,
|
| 321 |
+
min_chars_to_compress=min_chars_to_compress,
|
| 322 |
+
metrics_collector=collector,
|
| 323 |
+
)
|
| 324 |
+
wrapped.append(wrapper.as_langchain_tool())
|
| 325 |
+
|
| 326 |
+
return wrapped
|
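The wrapper above follows a simple pattern: call the tool, pass small outputs through, compress large ones, fall back to the original on failure, and record a metric either way. A minimal sketch of that pattern without LangChain, assuming a hypothetical `truncate_middle` compressor and `CompressionRecord` type that are not part of Headroom:

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class CompressionRecord:
    """One metrics entry per tool call (hypothetical, for illustration)."""

    chars_before: int
    chars_after: int
    was_compressed: bool


def truncate_middle(text: str, keep: int = 200) -> str:
    """Stand-in compressor: keep the first and last `keep` characters."""
    if len(text) <= 2 * keep:
        return text
    return text[:keep] + "\n...[truncated]...\n" + text[-keep:]


def wrap_with_compression(
    func: Callable[..., Any],
    records: list,
    compress: Callable[[str], str] = truncate_middle,
    min_chars: int = 1000,
) -> Callable[..., str]:
    """Return a callable that compresses large outputs of `func`."""

    def wrapped(*args: Any, **kwargs: Any) -> str:
        result = str(func(*args, **kwargs))
        if len(result) < min_chars:
            # Small outputs pass through untouched but are still recorded.
            records.append(CompressionRecord(len(result), len(result), False))
            return result
        try:
            compressed = compress(result)
        except Exception:
            # Fall back to the original output if compression fails.
            compressed = result
        saved = len(result) - len(compressed)
        records.append(CompressionRecord(len(result), len(compressed), saved > 0))
        return compressed

    return wrapped
```

Passing one shared `records` list to several wrapped functions mirrors the shared `metrics_collector` argument of `wrap_tools_with_headroom`.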
headroom/integrations/{langchain.py → langchain/chat_model.py}
RENAMED
|
@@ -27,9 +27,10 @@ Example:
|
|
| 27 |
|
| 28 |
from __future__ import annotations
|
| 29 |
|
|
|
|
| 30 |
import json
|
| 31 |
import logging
|
| 32 |
-
from collections.abc import Iterator, Sequence
|
| 33 |
from dataclasses import dataclass
|
| 34 |
from datetime import datetime
|
| 35 |
from typing import Any
|
|
@@ -48,13 +49,14 @@ try:
|
|
| 48 |
)
|
| 49 |
from langchain_core.outputs import ChatGeneration, ChatResult
|
| 50 |
from langchain_core.runnables import RunnableLambda
|
| 51 |
-
from pydantic import Field, PrivateAttr
|
| 52 |
|
| 53 |
LANGCHAIN_AVAILABLE = True
|
| 54 |
except ImportError:
|
| 55 |
LANGCHAIN_AVAILABLE = False
|
| 56 |
BaseChatModel = object
|
| 57 |
BaseCallbackHandler = object
|
|
|
|
| 58 |
Field = lambda **kwargs: None # type: ignore[assignment] # noqa: E731
|
| 59 |
PrivateAttr = lambda **kwargs: None # type: ignore[assignment] # noqa: E731
|
| 60 |
|
|
@@ -62,10 +64,12 @@ from headroom import HeadroomConfig, HeadroomMode
|
|
| 62 |
from headroom.providers import OpenAIProvider
|
| 63 |
from headroom.transforms import TransformPipeline
|
| 64 |
|
|
|
|
|
|
|
| 65 |
logger = logging.getLogger(__name__)
|
| 66 |
|
| 67 |
|
| 68 |
-
def _check_langchain_available():
|
| 69 |
"""Raise ImportError if LangChain is not installed."""
|
| 70 |
if not LANGCHAIN_AVAILABLE:
|
| 71 |
raise ImportError(
|
|
@@ -133,6 +137,10 @@ class HeadroomChatModel(BaseChatModel):
|
|
| 133 |
wrapped_model: Any = Field(description="The wrapped LangChain chat model")
|
| 134 |
headroom_config: Any = Field(default=None, description="Headroom configuration")
|
| 135 |
mode: HeadroomMode = Field(default=HeadroomMode.OPTIMIZE, description="Headroom mode")
|
| 136 |
|
| 137 |
# Private attributes (not serialized)
|
| 138 |
_metrics_history: list = PrivateAttr(default_factory=list)
|
|
@@ -140,24 +148,27 @@ class HeadroomChatModel(BaseChatModel):
|
|
| 140 |
_pipeline: Any = PrivateAttr(default=None)
|
| 141 |
_provider: Any = PrivateAttr(default=None)
|
| 142 |
|
| 143 |
-
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
arbitrary_types_allowed = True
|
| 147 |
|
| 148 |
def __init__(
|
| 149 |
self,
|
| 150 |
wrapped_model: BaseChatModel,
|
| 151 |
config: HeadroomConfig | None = None,
|
| 152 |
mode: HeadroomMode = HeadroomMode.OPTIMIZE,
|
| 153 |
-
|
| 154 |
-
|
|
|
|
| 155 |
"""Initialize HeadroomChatModel.
|
| 156 |
|
| 157 |
Args:
|
| 158 |
wrapped_model: Any LangChain BaseChatModel to wrap
|
| 159 |
config: HeadroomConfig for optimization settings
|
| 160 |
mode: HeadroomMode (AUDIT, OPTIMIZE, or SIMULATE)
|
| 161 |
**kwargs: Additional arguments passed to BaseChatModel
|
| 162 |
"""
|
| 163 |
_check_langchain_available()
|
|
@@ -166,6 +177,7 @@ class HeadroomChatModel(BaseChatModel):
|
|
| 166 |
wrapped_model=wrapped_model,
|
| 167 |
headroom_config=config or HeadroomConfig(),
|
| 168 |
mode=mode,
|
|
|
|
| 169 |
**kwargs,
|
| 170 |
)
|
| 171 |
self._metrics_history = []
|
|
@@ -188,9 +200,17 @@ class HeadroomChatModel(BaseChatModel):
|
|
| 188 |
|
| 189 |
@property
|
| 190 |
def pipeline(self) -> TransformPipeline:
|
| 191 |
-
"""Lazily initialize TransformPipeline.
|
| 192 |
if self._pipeline is None:
|
| 193 |
-
self.
|
| 194 |
self._pipeline = TransformPipeline(
|
| 195 |
config=self.headroom_config,
|
| 196 |
provider=self._provider,
|
|
@@ -290,10 +310,11 @@ class HeadroomChatModel(BaseChatModel):
|
|
| 290 |
# Convert to OpenAI format
|
| 291 |
openai_messages = self._convert_messages_to_openai(messages)
|
| 292 |
|
| 293 |
-
# Get model name
|
| 294 |
-
model =
|
| 295 |
-
|
| 296 |
-
|
|
|
|
| 297 |
|
| 298 |
# Get model context limit from provider
|
| 299 |
model_limit = self._provider.get_context_limit(model) if self._provider else 128000
|
|
@@ -342,7 +363,7 @@ class HeadroomChatModel(BaseChatModel):
|
|
| 342 |
messages: list[BaseMessage],
|
| 343 |
stop: list[str] | None = None,
|
| 344 |
run_manager: Any = None,
|
| 345 |
-
**kwargs,
|
| 346 |
) -> ChatResult:
|
| 347 |
"""Generate response with Headroom optimization.
|
| 348 |
|
|
@@ -371,14 +392,15 @@ class HeadroomChatModel(BaseChatModel):
|
|
| 371 |
messages: list[BaseMessage],
|
| 372 |
stop: list[str] | None = None,
|
| 373 |
run_manager: Any = None,
|
| 374 |
-
**kwargs,
|
| 375 |
) -> Iterator[ChatGeneration]:
|
| 376 |
"""Stream response with Headroom optimization."""
|
| 377 |
# Optimize messages
|
| 378 |
optimized_messages, metrics = self._optimize_messages(messages)
|
| 379 |
|
| 380 |
logger.info(
|
| 381 |
-
f"Headroom optimized (streaming): {metrics.tokens_before} ->
|
|
|
|
| 382 |
)
|
| 383 |
|
| 384 |
# Stream from wrapped model
|
|
@@ -389,13 +411,78 @@ class HeadroomChatModel(BaseChatModel):
|
|
| 389 |
**kwargs,
|
| 390 |
)
|
| 391 |
|
| 392 |
-
def
|
| 393 |
"""Bind tools to the wrapped model."""
|
| 394 |
new_wrapped = self.wrapped_model.bind_tools(tools, **kwargs)
|
| 395 |
return HeadroomChatModel(
|
| 396 |
wrapped_model=new_wrapped,
|
| 397 |
config=self.headroom_config,
|
| 398 |
mode=self.mode,
|
|
|
|
| 399 |
)
|
| 400 |
|
| 401 |
def get_savings_summary(self) -> dict[str, Any]:
|
|
@@ -494,7 +581,7 @@ class HeadroomCallbackHandler(BaseCallbackHandler):
|
|
| 494 |
self,
|
| 495 |
serialized: dict[str, Any],
|
| 496 |
prompts: list[str],
|
| 497 |
-
**kwargs,
|
| 498 |
) -> None:
|
| 499 |
"""Called when LLM starts processing."""
|
| 500 |
self._current_request = {
|
|
@@ -511,7 +598,7 @@ class HeadroomCallbackHandler(BaseCallbackHandler):
|
|
| 511 |
self,
|
| 512 |
serialized: dict[str, Any],
|
| 513 |
messages: list[list[BaseMessage]],
|
| 514 |
-
**kwargs,
|
| 515 |
) -> None:
|
| 516 |
"""Called when chat model starts processing."""
|
| 517 |
# Estimate tokens from messages
|
|
@@ -532,7 +619,10 @@ class HeadroomCallbackHandler(BaseCallbackHandler):
|
|
| 532 |
|
| 533 |
# Check token alert
|
| 534 |
if self.token_alert_threshold and estimated_tokens > self.token_alert_threshold:
|
| 535 |
-
alert =
|
|
|
|
|
|
|
|
|
|
| 536 |
self._alerts.append(alert)
|
| 537 |
logger.warning(alert)
|
| 538 |
|
|
@@ -542,7 +632,7 @@ class HeadroomCallbackHandler(BaseCallbackHandler):
|
|
| 542 |
f"Chat model request: ~{estimated_tokens} input tokens",
|
| 543 |
)
|
| 544 |
|
| 545 |
-
def on_llm_end(self, response: Any, **kwargs) -> None:
|
| 546 |
"""Called when LLM finishes processing."""
|
| 547 |
if self._current_request is None:
|
| 548 |
return
|
|
@@ -579,7 +669,7 @@ class HeadroomCallbackHandler(BaseCallbackHandler):
|
|
| 579 |
|
| 580 |
self._current_request = None
|
| 581 |
|
| 582 |
-
def on_llm_error(self, error: Exception, **kwargs) -> None:
|
| 583 |
"""Called when LLM encounters an error."""
|
| 584 |
if self._current_request:
|
| 585 |
self._current_request["error"] = str(error)
|
|
@@ -677,19 +767,19 @@ class HeadroomRunnable:
|
|
| 677 |
)
|
| 678 |
return self._pipeline
|
| 679 |
|
| 680 |
-
def __or__(self, other):
|
| 681 |
"""Support pipe operator for LCEL composition."""
|
| 682 |
from langchain_core.runnables import RunnableSequence
|
| 683 |
|
| 684 |
return RunnableSequence(first=self.as_runnable(), last=other)
|
| 685 |
|
| 686 |
-
def __ror__(self, other):
|
| 687 |
"""Support reverse pipe operator."""
|
| 688 |
from langchain_core.runnables import RunnableSequence
|
| 689 |
|
| 690 |
return RunnableSequence(first=other, last=self.as_runnable())
|
| 691 |
|
| 692 |
-
def as_runnable(self):
|
| 693 |
"""Convert to LangChain Runnable."""
|
| 694 |
return RunnableLambda(self._optimize)
|
| 695 |
|
|
|
|
| 27 |
|
| 28 |
from __future__ import annotations
|
| 29 |
|
| 30 |
+
import asyncio
|
| 31 |
import json
|
| 32 |
import logging
|
| 33 |
+
from collections.abc import AsyncIterator, Iterator, Sequence
|
| 34 |
from dataclasses import dataclass
|
| 35 |
from datetime import datetime
|
| 36 |
from typing import Any
|
|
|
|
| 49 |
)
|
| 50 |
from langchain_core.outputs import ChatGeneration, ChatResult
|
| 51 |
from langchain_core.runnables import RunnableLambda
|
| 52 |
+
from pydantic import ConfigDict, Field, PrivateAttr
|
| 53 |
|
| 54 |
LANGCHAIN_AVAILABLE = True
|
| 55 |
except ImportError:
|
| 56 |
LANGCHAIN_AVAILABLE = False
|
| 57 |
BaseChatModel = object
|
| 58 |
BaseCallbackHandler = object
|
| 59 |
+
ConfigDict = lambda **kwargs: {} # type: ignore[assignment,misc] # noqa: E731
|
| 60 |
Field = lambda **kwargs: None # type: ignore[assignment] # noqa: E731
|
| 61 |
PrivateAttr = lambda **kwargs: None # type: ignore[assignment] # noqa: E731
|
| 62 |
|
|
|
|
| 64 |
from headroom.providers import OpenAIProvider
|
| 65 |
from headroom.transforms import TransformPipeline
|
| 66 |
|
| 67 |
+
from .providers import get_headroom_provider, get_model_name_from_langchain
|
| 68 |
+
|
| 69 |
logger = logging.getLogger(__name__)
|
| 70 |
|
| 71 |
|
| 72 |
+
def _check_langchain_available() -> None:
|
| 73 |
"""Raise ImportError if LangChain is not installed."""
|
| 74 |
if not LANGCHAIN_AVAILABLE:
|
| 75 |
raise ImportError(
|
|
|
|
| 137 |
wrapped_model: Any = Field(description="The wrapped LangChain chat model")
|
| 138 |
headroom_config: Any = Field(default=None, description="Headroom configuration")
|
| 139 |
mode: HeadroomMode = Field(default=HeadroomMode.OPTIMIZE, description="Headroom mode")
|
| 140 |
+
auto_detect_provider: bool = Field(
|
| 141 |
+
default=True,
|
| 142 |
+
description="Auto-detect provider from wrapped model (OpenAI, Anthropic, Google)",
|
| 143 |
+
)
|
| 144 |
|
| 145 |
# Private attributes (not serialized)
|
| 146 |
_metrics_history: list = PrivateAttr(default_factory=list)
|
|
|
|
| 148 |
_pipeline: Any = PrivateAttr(default=None)
|
| 149 |
_provider: Any = PrivateAttr(default=None)
|
| 150 |
|
| 151 |
+
# Pydantic v2 config for LangChain compatibility
|
| 152 |
+
model_config = ConfigDict(arbitrary_types_allowed=True)
|
|
|
|
|
|
|
| 153 |
|
| 154 |
def __init__(
|
| 155 |
self,
|
| 156 |
wrapped_model: BaseChatModel,
|
| 157 |
config: HeadroomConfig | None = None,
|
| 158 |
mode: HeadroomMode = HeadroomMode.OPTIMIZE,
|
| 159 |
+
auto_detect_provider: bool = True,
|
| 160 |
+
**kwargs: Any,
|
| 161 |
+
) -> None:
|
| 162 |
"""Initialize HeadroomChatModel.
|
| 163 |
|
| 164 |
Args:
|
| 165 |
wrapped_model: Any LangChain BaseChatModel to wrap
|
| 166 |
config: HeadroomConfig for optimization settings
|
| 167 |
mode: HeadroomMode (AUDIT, OPTIMIZE, or SIMULATE)
|
| 168 |
+
auto_detect_provider: Auto-detect provider from wrapped model.
|
| 169 |
+
When True (default), automatically detects if the wrapped model
|
| 170 |
+
is OpenAI, Anthropic, Google, etc. and uses the appropriate
|
| 171 |
+
Headroom provider for accurate token counting.
|
| 172 |
**kwargs: Additional arguments passed to BaseChatModel
|
| 173 |
"""
|
| 174 |
_check_langchain_available()
|
|
|
|
| 177 |
wrapped_model=wrapped_model,
|
| 178 |
headroom_config=config or HeadroomConfig(),
|
| 179 |
mode=mode,
|
| 180 |
+
auto_detect_provider=auto_detect_provider,
|
| 181 |
**kwargs,
|
| 182 |
)
|
| 183 |
self._metrics_history = []
|
|
|
|
| 200 |
|
| 201 |
@property
|
| 202 |
def pipeline(self) -> TransformPipeline:
|
| 203 |
+
"""Lazily initialize TransformPipeline.
|
| 204 |
+
|
| 205 |
+
When auto_detect_provider is True, automatically detects the provider
|
| 206 |
+
from the wrapped model's class path (e.g., ChatAnthropic -> AnthropicProvider).
|
| 207 |
+
"""
|
| 208 |
if self._pipeline is None:
|
| 209 |
+
if self.auto_detect_provider:
|
| 210 |
+
self._provider = get_headroom_provider(self.wrapped_model)
|
| 211 |
+
logger.debug(f"Auto-detected provider: {self._provider.__class__.__name__}")
|
| 212 |
+
else:
|
| 213 |
+
self._provider = OpenAIProvider()
|
| 214 |
self._pipeline = TransformPipeline(
|
| 215 |
config=self.headroom_config,
|
| 216 |
provider=self._provider,
|
|
|
|
| 310 |
# Convert to OpenAI format
|
| 311 |
openai_messages = self._convert_messages_to_openai(messages)
|
| 312 |
|
| 313 |
+
# Get model name from wrapped model
|
| 314 |
+
model = get_model_name_from_langchain(self.wrapped_model)
|
| 315 |
+
|
| 316 |
+
# Ensure pipeline is initialized (this also sets up provider)
|
| 317 |
+
_ = self.pipeline
|
| 318 |
|
| 319 |
# Get model context limit from provider
|
| 320 |
model_limit = self._provider.get_context_limit(model) if self._provider else 128000
|
|
|
|
| 363 |
messages: list[BaseMessage],
|
| 364 |
stop: list[str] | None = None,
|
| 365 |
run_manager: Any = None,
|
| 366 |
+
**kwargs: Any,
|
| 367 |
) -> ChatResult:
|
| 368 |
"""Generate response with Headroom optimization.
|
| 369 |
|
|
|
|
| 392 |
messages: list[BaseMessage],
|
| 393 |
stop: list[str] | None = None,
|
| 394 |
run_manager: Any = None,
|
| 395 |
+
**kwargs: Any,
|
| 396 |
) -> Iterator[ChatGeneration]:
|
| 397 |
"""Stream response with Headroom optimization."""
|
| 398 |
# Optimize messages
|
| 399 |
optimized_messages, metrics = self._optimize_messages(messages)
|
| 400 |
|
| 401 |
logger.info(
|
| 402 |
+
f"Headroom optimized (streaming): {metrics.tokens_before} -> "
|
| 403 |
+
f"{metrics.tokens_after} tokens"
|
| 404 |
)
|
| 405 |
|
| 406 |
# Stream from wrapped model
|
|
|
|
| 411 |
**kwargs,
|
| 412 |
)
|
| 413 |
|
| 414 |
+
async def _agenerate(
|
| 415 |
+
self,
|
| 416 |
+
messages: list[BaseMessage],
|
| 417 |
+
stop: list[str] | None = None,
|
| 418 |
+
run_manager: Any = None,
|
| 419 |
+
**kwargs: Any,
|
| 420 |
+
) -> ChatResult:
|
| 421 |
+
"""Async generate response with Headroom optimization.
|
| 422 |
+
|
| 423 |
+
This enables `await model.ainvoke(messages)` to work correctly.
|
| 424 |
+
The optimization step runs in a thread executor since it's CPU-bound.
|
| 425 |
+
"""
|
| 426 |
+
# Run optimization in executor (CPU-bound)
|
| 427 |
+
loop = asyncio.get_running_loop()
|
| 428 |
+
optimized_messages, metrics = await loop.run_in_executor(
|
| 429 |
+
None, self._optimize_messages, messages
|
| 430 |
+
)
|
| 431 |
+
|
| 432 |
+
logger.info(
|
| 433 |
+
f"Headroom optimized (async): {metrics.tokens_before} -> {metrics.tokens_after} tokens "
|
| 434 |
+
f"({metrics.savings_percent:.1f}% saved)"
|
| 435 |
+
)
|
| 436 |
+
|
| 437 |
+
# Call wrapped model's async generate
|
| 438 |
+
result = await self.wrapped_model._agenerate(
|
| 439 |
+
optimized_messages,
|
| 440 |
+
stop=stop,
|
| 441 |
+
run_manager=run_manager,
|
| 442 |
+
**kwargs,
|
| 443 |
+
)
|
| 444 |
+
|
| 445 |
+
return result
|
| 446 |
+
|
| 447 |
+
async def _astream(
|
| 448 |
+
self,
|
| 449 |
+
messages: list[BaseMessage],
|
| 450 |
+
stop: list[str] | None = None,
|
| 451 |
+
run_manager: Any = None,
|
| 452 |
+
**kwargs: Any,
|
| 453 |
+
) -> AsyncIterator[ChatGeneration]:
|
| 454 |
+
"""Async stream response with Headroom optimization.
|
| 455 |
+
|
| 456 |
+
This enables `async for chunk in model.astream(messages)` to work correctly.
|
| 457 |
+
"""
|
| 458 |
+
# Run optimization in executor (CPU-bound)
|
| 459 |
+
loop = asyncio.get_running_loop()
|
| 460 |
+
optimized_messages, metrics = await loop.run_in_executor(
|
| 461 |
+
None, self._optimize_messages, messages
|
| 462 |
+
)
|
| 463 |
+
|
| 464 |
+
logger.info(
|
| 465 |
+
f"Headroom optimized (async streaming): {metrics.tokens_before} -> "
|
| 466 |
+
f"{metrics.tokens_after} tokens"
|
| 467 |
+
)
|
| 468 |
+
|
| 469 |
+
# Async stream from wrapped model
|
| 470 |
+
async for chunk in self.wrapped_model._astream(
|
| 471 |
+
optimized_messages,
|
| 472 |
+
stop=stop,
|
| 473 |
+
run_manager=run_manager,
|
| 474 |
+
**kwargs,
|
| 475 |
+
):
|
| 476 |
+
yield chunk
|
| 477 |
+
|
| 478 |
+
def bind_tools(self, tools: Sequence[Any], **kwargs: Any) -> HeadroomChatModel:
|
| 479 |
"""Bind tools to the wrapped model."""
|
| 480 |
new_wrapped = self.wrapped_model.bind_tools(tools, **kwargs)
|
| 481 |
return HeadroomChatModel(
|
| 482 |
wrapped_model=new_wrapped,
|
| 483 |
config=self.headroom_config,
|
| 484 |
mode=self.mode,
|
| 485 |
+
auto_detect_provider=self.auto_detect_provider,
|
| 486 |
)
|
| 487 |
|
| 488 |
def get_savings_summary(self) -> dict[str, Any]:
|
|
|
|
| 581 |
self,
|
| 582 |
serialized: dict[str, Any],
|
| 583 |
prompts: list[str],
|
| 584 |
+
**kwargs: Any,
|
| 585 |
) -> None:
|
| 586 |
"""Called when LLM starts processing."""
|
| 587 |
self._current_request = {
|
|
|
|
| 598 |
self,
|
| 599 |
serialized: dict[str, Any],
|
| 600 |
messages: list[list[BaseMessage]],
|
| 601 |
+
**kwargs: Any,
|
| 602 |
) -> None:
|
| 603 |
"""Called when chat model starts processing."""
|
| 604 |
# Estimate tokens from messages
|
|
|
|
| 619 |
|
| 620 |
# Check token alert
|
| 621 |
if self.token_alert_threshold and estimated_tokens > self.token_alert_threshold:
|
| 622 |
+
alert = (
|
| 623 |
+
f"Token alert: {estimated_tokens} tokens exceeds "
|
| 624 |
+
f"threshold {self.token_alert_threshold}"
|
| 625 |
+
)
|
| 626 |
self._alerts.append(alert)
|
| 627 |
logger.warning(alert)
|
| 628 |
|
|
|
|
| 632 |
f"Chat model request: ~{estimated_tokens} input tokens",
|
| 633 |
)
|
| 634 |
|
| 635 |
+
def on_llm_end(self, response: Any, **kwargs: Any) -> None:
|
| 636 |
"""Called when LLM finishes processing."""
|
| 637 |
if self._current_request is None:
|
| 638 |
return
|
|
|
|
| 669 |
|
| 670 |
self._current_request = None
|
| 671 |
|
| 672 |
+
def on_llm_error(self, error: Exception, **kwargs: Any) -> None:
|
| 673 |
"""Called when LLM encounters an error."""
|
| 674 |
if self._current_request:
|
| 675 |
self._current_request["error"] = str(error)
|
|
|
|
| 767 |
)
|
| 768 |
return self._pipeline
|
| 769 |
|
| 770 |
+
def __or__(self, other: Any) -> Any:
|
| 771 |
"""Support pipe operator for LCEL composition."""
|
| 772 |
from langchain_core.runnables import RunnableSequence
|
| 773 |
|
| 774 |
return RunnableSequence(first=self.as_runnable(), last=other)
|
| 775 |
|
| 776 |
+
def __ror__(self, other: Any) -> Any:
|
| 777 |
"""Support reverse pipe operator."""
|
| 778 |
from langchain_core.runnables import RunnableSequence
|
| 779 |
|
| 780 |
return RunnableSequence(first=other, last=self.as_runnable())
|
| 781 |
|
| 782 |
+
def as_runnable(self) -> RunnableLambda:
|
| 783 |
"""Convert to LangChain Runnable."""
|
| 784 |
return RunnableLambda(self._optimize)
|
| 785 |
|
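The async methods in this file run the synchronous optimization step in a thread executor before awaiting the wrapped model, so the event loop stays responsive. A minimal sketch of that offloading pattern, assuming a hypothetical `optimize` function as a stand-in for `_optimize_messages`; it uses `asyncio.get_running_loop()`, which the Python documentation recommends inside coroutines:

```python
import asyncio


def optimize(messages: list) -> tuple:
    """Hypothetical stand-in for a synchronous, CPU-bound optimization step."""
    trimmed = [m.strip() for m in messages if m.strip()]
    metrics = {"before": len(messages), "after": len(trimmed)}
    return trimmed, metrics


async def optimize_async(messages: list) -> tuple:
    """Run the blocking step in the default thread pool and await the result."""
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, optimize, messages)
```

Note that a thread executor keeps the loop free to schedule other tasks, but pure-Python CPU work still contends for the GIL, so this improves responsiveness rather than raw throughput.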
headroom/integrations/langchain/langsmith.py
ADDED
|
@@ -0,0 +1,324 @@
|
|
| 1 |
+
"""LangSmith integration for Headroom compression metrics.
|
| 2 |
+
|
| 3 |
+
This module provides HeadroomLangSmithCallbackHandler, a LangChain callback
|
| 4 |
+
handler that adds Headroom compression metrics to LangSmith traces.
|
| 5 |
+
|
| 6 |
+
When used with HeadroomChatModel, it automatically captures:
|
| 7 |
+
- Tokens before/after optimization
|
| 8 |
+
- Savings percentage
|
| 9 |
+
- Transforms applied
|
| 10 |
+
- Per-request compression details
|
| 11 |
+
|
| 12 |
+
Example:
|
| 13 |
+
import os
|
| 14 |
+
from langchain_openai import ChatOpenAI
|
| 15 |
+
from headroom.integrations import (
|
| 16 |
+
HeadroomChatModel,
|
| 17 |
+
HeadroomLangSmithCallbackHandler,
|
| 18 |
+
)
|
| 19 |
+
|
| 20 |
+
# Enable LangSmith tracing
|
| 21 |
+
os.environ["LANGCHAIN_TRACING_V2"] = "true"
|
| 22 |
+
os.environ["LANGCHAIN_API_KEY"] = "..."
|
| 23 |
+
|
| 24 |
+
# Create handler
|
| 25 |
+
handler = HeadroomLangSmithCallbackHandler()
|
| 26 |
+
|
| 27 |
+
# Use with HeadroomChatModel
|
| 28 |
+
llm = HeadroomChatModel(
|
| 29 |
+
ChatOpenAI(model="gpt-4o"),
|
| 30 |
+
callbacks=[handler],
|
| 31 |
+
)
|
| 32 |
+
|
| 33 |
+
# Traces will include headroom.* metadata
|
| 34 |
+
response = llm.invoke("Hello!")
|
| 35 |
+
"""
|
| 36 |
+
|
| 37 |
+
from __future__ import annotations
|
| 38 |
+
|
| 39 |
+
import logging
|
| 40 |
+
import os
|
| 41 |
+
from dataclasses import dataclass, field
|
| 42 |
+
from datetime import datetime
|
| 43 |
+
from typing import Any
|
| 44 |
+
from uuid import UUID
|
| 45 |
+
|
| 46 |
+
# LangChain imports - these are optional dependencies
|
| 47 |
+
try:
|
| 48 |
+
from langchain_core.callbacks import BaseCallbackHandler
|
| 49 |
+
from langchain_core.messages import BaseMessage
|
| 50 |
+
from langchain_core.outputs import LLMResult
|
| 51 |
+
|
| 52 |
+
LANGCHAIN_AVAILABLE = True
|
| 53 |
+
except ImportError:
|
| 54 |
+
LANGCHAIN_AVAILABLE = False
|
| 55 |
+
BaseCallbackHandler = object # type: ignore[misc,assignment]
|
| 56 |
+
LLMResult = object # type: ignore[misc,assignment]
|
| 57 |
+
|
| 58 |
+
# LangSmith imports - optional
|
| 59 |
+
try:
|
| 60 |
+
from langsmith import Client as LangSmithClient
|
| 61 |
+
|
| 62 |
+
LANGSMITH_AVAILABLE = True
|
| 63 |
+
except ImportError:
|
| 64 |
+
LANGSMITH_AVAILABLE = False
|
| 65 |
+
LangSmithClient = None # type: ignore[misc,assignment]
|
| 66 |
+
|
| 67 |
+
logger = logging.getLogger(__name__)
|
| 68 |
+
|
| 69 |
+
|
| 70 |
+
def _check_langchain_available() -> None:
|
| 71 |
+
"""Raise ImportError if LangChain is not installed."""
|
| 72 |
+
if not LANGCHAIN_AVAILABLE:
|
| 73 |
+
raise ImportError(
|
| 74 |
+
"LangChain is required for this integration. "
|
| 75 |
+
"Install with: pip install headroom[langchain] "
|
| 76 |
+
"or: pip install langchain-core"
|
| 77 |
+
)
|
| 78 |
+
|
| 79 |
+
|
| 80 |
+
@dataclass
|
| 81 |
+
class PendingMetrics:
|
| 82 |
+
"""Metrics pending attachment to a LangSmith run."""
|
| 83 |
+
|
| 84 |
+
tokens_before: int
|
| 85 |
+
tokens_after: int
|
| 86 |
+
tokens_saved: int
|
| 87 |
+
savings_percent: float
|
| 88 |
+
transforms_applied: list[str]
|
| 89 |
+
timestamp: datetime = field(default_factory=datetime.now)
|
| 90 |
+
|
| 91 |
+
|
| 92 |
+
class HeadroomLangSmithCallbackHandler(BaseCallbackHandler):
|
| 93 |
+
"""Callback handler that adds Headroom metrics to LangSmith traces.
|
| 94 |
+
|
| 95 |
+
Integrates with LangSmith to provide visibility into context
|
| 96 |
+
optimization within traces. Metrics appear as metadata with
|
| 97 |
+
the `headroom.` prefix.
|
| 98 |
+
|
| 99 |
+
Works automatically when:
|
| 100 |
+
1. LANGCHAIN_TRACING_V2=true is set
|
| 101 |
+
2. Used as a callback with HeadroomChatModel
|
| 102 |
+
3. LangSmith API key is configured
|
| 103 |
+
|
| 104 |
+
Example:
|
| 105 |
+
from headroom.integrations import (
|
| 106 |
+
HeadroomChatModel,
|
| 107 |
+
HeadroomLangSmithCallbackHandler,
|
| 108 |
+
)
|
| 109 |
+
|
| 110 |
+
handler = HeadroomLangSmithCallbackHandler()
|
| 111 |
+
llm = HeadroomChatModel(
|
| 112 |
+
ChatOpenAI(model="gpt-4o"),
|
| 113 |
+
callbacks=[handler],
|
| 114 |
+
)
|
| 115 |
+
|
| 116 |
+
response = llm.invoke("Hello!")
|
| 117 |
+
# LangSmith trace now includes:
|
| 118 |
+
# - headroom.tokens_before
|
| 119 |
+
# - headroom.tokens_after
|
| 120 |
+
# - headroom.tokens_saved
|
| 121 |
+
# - headroom.savings_percent
|
| 122 |
+
# - headroom.transforms_applied
|
| 123 |
+
|
| 124 |
+
Attributes:
|
| 125 |
+
langsmith_client: LangSmith client for updating runs.
|
| 126 |
+
pending_metrics: Metrics waiting to be attached to runs.
|
| 127 |
+
"""
|
| 128 |
+
|
| 129 |
+
def __init__(
|
| 130 |
+
self,
|
| 131 |
+
langsmith_client: Any = None,
|
| 132 |
+
auto_update_runs: bool = True,
|
| 133 |
+
):
|
| 134 |
+
"""Initialize HeadroomLangSmithCallbackHandler.
|
| 135 |
+
|
| 136 |
+
Args:
|
| 137 |
+
langsmith_client: LangSmith client instance. Auto-creates
|
| 138 |
+
one if not provided and LangSmith is available.
|
| 139 |
+
auto_update_runs: If True, automatically updates LangSmith
|
| 140 |
+
runs with Headroom metadata. Default True.
|
| 141 |
+
"""
|
| 142 |
+
_check_langchain_available()
|
| 143 |
+
|
| 144 |
+
self._client = langsmith_client
|
| 145 |
+
self._auto_update = auto_update_runs
|
| 146 |
+
self._pending_metrics: dict[str, PendingMetrics] = {}
|
| 147 |
+
self._run_metrics: dict[str, dict[str, Any]] = {}
|
| 148 |
+
|
| 149 |
+
# Initialize LangSmith client if available and not provided
|
| 150 |
+
if self._client is None and LANGSMITH_AVAILABLE and auto_update_runs:
|
| 151 |
+
try:
|
| 152 |
+
if os.environ.get("LANGCHAIN_API_KEY"):
|
| 153 |
+
self._client = LangSmithClient()
|
| 154 |
+
except Exception as e:
|
| 155 |
+
logger.debug(f"Could not initialize LangSmith client: {e}")
|
| 156 |
+
|
| 157 |
+
def set_headroom_metrics(
|
| 158 |
+
self,
|
| 159 |
+
run_id: str | UUID,
|
| 160 |
+
tokens_before: int,
|
| 161 |
+
tokens_after: int,
|
| 162 |
+
transforms_applied: list[str] | None = None,
|
| 163 |
+
) -> None:
|
| 164 |
+
"""Set Headroom metrics for a run.
|
| 165 |
+
|
| 166 |
+
Call this from HeadroomChatModel after optimization to attach
|
| 167 |
+
metrics to the current run.
|
| 168 |
+
|
| 169 |
+
Args:
|
| 170 |
+
run_id: The LangSmith run ID.
|
| 171 |
+
tokens_before: Token count before optimization.
|
| 172 |
+
tokens_after: Token count after optimization.
|
| 173 |
+
transforms_applied: List of transforms that were applied.
|
| 174 |
+
"""
|
| 175 |
+
run_id_str = str(run_id)
|
| 176 |
+
tokens_saved = tokens_before - tokens_after
|
| 177 |
+
savings_percent = (tokens_saved / tokens_before * 100) if tokens_before > 0 else 0.0
|
| 178 |
+
|
| 179 |
+
metrics = PendingMetrics(
|
| 180 |
+
tokens_before=tokens_before,
|
| 181 |
+
tokens_after=tokens_after,
|
| 182 |
+
tokens_saved=tokens_saved,
|
| 183 |
+
savings_percent=savings_percent,
|
| 184 |
+
transforms_applied=transforms_applied or [],
|
| 185 |
+
)
|
| 186 |
+
|
| 187 |
+
self._pending_metrics[run_id_str] = metrics
|
| 188 |
+
|
| 189 |
+
logger.debug(
|
| 190 |
+
f"Headroom metrics set for run {run_id_str}: "
|
| 191 |
+
f"{tokens_before} -> {tokens_after} tokens ({savings_percent:.1f}% saved)"
|
| 192 |
+
)
|
| 193 |
+
|
| 194 |
+
def on_chat_model_start(
|
| 195 |
+
self,
|
| 196 |
+
serialized: dict[str, Any],
|
| 197 |
+
messages: list[list[BaseMessage]],
|
| 198 |
+
*,
|
| 199 |
+
run_id: UUID,
|
| 200 |
+
**kwargs: Any,
|
| 201 |
+
) -> None:
|
| 202 |
+
"""Called when chat model starts.
|
| 203 |
+
|
| 204 |
+
Records the run ID for later metric attachment.
|
| 205 |
+
"""
|
| 206 |
+
run_id_str = str(run_id)
|
| 207 |
+
# Initialize empty metrics for this run
|
| 208 |
+
self._run_metrics[run_id_str] = {}
|
| 209 |
+
|
| 210 |
+
def on_llm_end(
|
| 211 |
+
self,
|
| 212 |
+
response: LLMResult,
|
| 213 |
+
*,
|
| 214 |
+
run_id: UUID,
|
| 215 |
+
**kwargs: Any,
|
| 216 |
+
) -> None:
|
| 217 |
+
"""Called when LLM completes.
|
| 218 |
+
|
| 219 |
+
Attaches pending Headroom metrics to the LangSmith run.
|
| 220 |
+
"""
|
| 221 |
+
run_id_str = str(run_id)
|
| 222 |
+
|
| 223 |
+
# Check for pending metrics
|
| 224 |
+
if run_id_str in self._pending_metrics:
|
| 225 |
+
metrics = self._pending_metrics.pop(run_id_str)
|
| 226 |
+
self._attach_metrics_to_run(run_id_str, metrics)
|
| 227 |
+
|
| 228 |
+
def _attach_metrics_to_run(self, run_id: str, metrics: PendingMetrics) -> None:
|
| 229 |
+
"""Attach Headroom metrics to a LangSmith run.
|
| 230 |
+
|
| 231 |
+
Args:
|
| 232 |
+
run_id: The run ID.
|
| 233 |
+
metrics: Metrics to attach.
|
| 234 |
+
"""
|
| 235 |
+
metadata = {
|
| 236 |
+
"headroom.tokens_before": metrics.tokens_before,
|
| 237 |
+
"headroom.tokens_after": metrics.tokens_after,
|
| 238 |
+
"headroom.tokens_saved": metrics.tokens_saved,
|
| 239 |
+
"headroom.savings_percent": round(metrics.savings_percent, 2),
|
| 240 |
+
"headroom.transforms_applied": metrics.transforms_applied,
|
| 241 |
+
"headroom.optimization_timestamp": metrics.timestamp.isoformat(),
|
| 242 |
+
}
|
| 243 |
+
|
| 244 |
+
# Store in run metrics
|
| 245 |
+
self._run_metrics[run_id] = metadata
|
| 246 |
+
|
| 247 |
+
# Update LangSmith run if client available
|
| 248 |
+
if self._client and self._auto_update:
|
| 249 |
+
try:
|
| 250 |
+
self._client.update_run(
|
| 251 |
+
run_id=run_id,
|
| 252 |
+
extra={"metadata": metadata},
|
| 253 |
+
)
|
| 254 |
+
logger.debug(f"Updated LangSmith run {run_id} with Headroom metrics")
|
| 255 |
+
except Exception as e:
|
| 256 |
+
logger.debug(f"Could not update LangSmith run: {e}")
|
| 257 |
+
|
| 258 |
+
def get_run_metrics(self, run_id: str | UUID) -> dict[str, Any]:
|
| 259 |
+
"""Get Headroom metrics for a specific run.
|
| 260 |
+
|
| 261 |
+
Args:
|
| 262 |
+
run_id: The run ID.
|
| 263 |
+
|
| 264 |
+
Returns:
|
| 265 |
+
Dictionary of headroom.* metrics for the run.
|
| 266 |
+
"""
|
| 267 |
+
return self._run_metrics.get(str(run_id), {})
|
| 268 |
+
|
| 269 |
+
def get_all_metrics(self) -> dict[str, dict[str, Any]]:
|
| 270 |
+
"""Get all recorded run metrics.
|
| 271 |
+
|
| 272 |
+
Returns:
|
| 273 |
+
Dictionary mapping run IDs to their metrics.
|
| 274 |
+
"""
|
| 275 |
+
return self._run_metrics.copy()
|
| 276 |
+
|
| 277 |
+
def get_summary(self) -> dict[str, Any]:
|
| 278 |
+
"""Get summary statistics across all runs.
|
| 279 |
+
|
| 280 |
+
Returns:
|
| 281 |
+
Summary with total_runs, total_tokens_saved, and average_savings_percent.
|
| 282 |
+
"""
|
| 283 |
+
if not self._run_metrics:
|
| 284 |
+
return {
|
| 285 |
+
"total_runs": 0,
|
| 286 |
+
"total_tokens_saved": 0,
|
| 287 |
+
"average_savings_percent": 0,
|
| 288 |
+
}
|
| 289 |
+
|
| 290 |
+
total_saved = sum(m.get("headroom.tokens_saved", 0) for m in self._run_metrics.values())
|
| 291 |
+
savings_percents = [
|
| 292 |
+
m.get("headroom.savings_percent", 0) for m in self._run_metrics.values()
|
| 293 |
+
]
|
| 294 |
+
|
| 295 |
+
return {
|
| 296 |
+
"total_runs": len(self._run_metrics),
|
| 297 |
+
"total_tokens_saved": total_saved,
|
| 298 |
+
"average_savings_percent": (
|
| 299 |
+
sum(savings_percents) / len(savings_percents) if savings_percents else 0
|
| 300 |
+
),
|
| 301 |
+
}
|
| 302 |
+
|
| 303 |
+
def reset(self) -> None:
|
| 304 |
+
"""Clear all recorded metrics."""
|
| 305 |
+
self._pending_metrics.clear()
|
| 306 |
+
self._run_metrics.clear()
|
| 307 |
+
|
| 308 |
+
|
| 309 |
+
def is_langsmith_available() -> bool:
|
| 310 |
+
"""Check if LangSmith is available and configured.
|
| 311 |
+
|
| 312 |
+
Returns:
|
| 313 |
+
True if LangSmith is installed and LANGCHAIN_API_KEY is set.
|
| 314 |
+
"""
|
| 315 |
+
return LANGSMITH_AVAILABLE and bool(os.environ.get("LANGCHAIN_API_KEY"))
|
| 316 |
+
|
| 317 |
+
|
| 318 |
+
def is_langsmith_tracing_enabled() -> bool:
|
| 319 |
+
"""Check if LangSmith tracing is enabled.
|
| 320 |
+
|
| 321 |
+
Returns:
|
| 322 |
+
True if LANGCHAIN_TRACING_V2 is set to "true".
|
| 323 |
+
"""
|
| 324 |
+
return os.environ.get("LANGCHAIN_TRACING_V2", "").lower() == "true"
|
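The metric arithmetic in set_headroom_metrics and get_summary can be checked in isolation. What follows is a minimal sketch rather than the module's code: `savings_percent` and `summarize` are hypothetical helper names, and the run IDs and token counts are made-up illustration values.

```python
from typing import Any


def savings_percent(tokens_before: int, tokens_after: int) -> float:
    """Percentage of tokens removed; 0.0 when there was nothing to compress."""
    if tokens_before <= 0:
        return 0.0
    return (tokens_before - tokens_after) / tokens_before * 100


def summarize(run_metrics: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Aggregate per-run metadata in the same way get_summary does."""
    if not run_metrics:
        return {"total_runs": 0, "total_tokens_saved": 0, "average_savings_percent": 0}
    percents = [m.get("headroom.savings_percent", 0) for m in run_metrics.values()]
    return {
        "total_runs": len(run_metrics),
        "total_tokens_saved": sum(
            m.get("headroom.tokens_saved", 0) for m in run_metrics.values()
        ),
        "average_savings_percent": sum(percents) / len(percents),
    }


# Hypothetical run data for illustration only.
runs = {
    "run-a": {
        "headroom.tokens_saved": 500,
        "headroom.savings_percent": savings_percent(1000, 500),
    },
    "run-b": {},  # registered by on_chat_model_start but never given metrics
}
summary = summarize(runs)
```

One consequence worth noting: because on_chat_model_start stores an empty dict for every run, runs that never receive metrics still count toward total_runs and pull average_savings_percent down.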
headroom/integrations/langchain/memory.py
ADDED
|
@@ -0,0 +1,319 @@
|
| 1 |
+
"""Memory integration for LangChain with automatic compression.
|
| 2 |
+
|
| 3 |
+
This module provides HeadroomChatMessageHistory, a wrapper for any LangChain
|
| 4 |
+
chat message history that automatically compresses conversation history
|
| 5 |
+
when it exceeds a token threshold.
|
| 6 |
+
|
| 7 |
+
Example:
|
| 8 |
+
from langchain.memory import ConversationBufferMemory
|
| 9 |
+
from langchain_community.chat_message_histories import ChatMessageHistory
|
| 10 |
+
from headroom.integrations import HeadroomChatMessageHistory
|
| 11 |
+
|
| 12 |
+
# Wrap any chat message history
|
| 13 |
+
base_history = ChatMessageHistory()
|
| 14 |
+
compressed_history = HeadroomChatMessageHistory(base_history)
|
| 15 |
+
|
| 16 |
+
# Use with ConversationBufferMemory (zero code changes to chain)
|
| 17 |
+
memory = ConversationBufferMemory(chat_memory=compressed_history)
|
| 18 |
+
"""
|
| 19 |
+
|
| 20 |
+
from __future__ import annotations
|
| 21 |
+
|
| 22 |
+
import logging
|
| 23 |
+
from typing import TYPE_CHECKING, Any
|
| 24 |
+
|
| 25 |
+
if TYPE_CHECKING:
|
| 26 |
+
from headroom.providers.base import Provider
|
| 27 |
+
|
| 28 |
+
# LangChain imports - these are optional dependencies
|
| 29 |
+
try:
|
| 30 |
+
from langchain_core.chat_history import BaseChatMessageHistory
|
| 31 |
+
from langchain_core.messages import (
|
| 32 |
+
AIMessage,
|
| 33 |
+
BaseMessage,
|
| 34 |
+
HumanMessage,
|
| 35 |
+
SystemMessage,
|
| 36 |
+
ToolMessage,
|
| 37 |
+
)
|
| 38 |
+
|
| 39 |
+
LANGCHAIN_AVAILABLE = True
|
| 40 |
+
except ImportError:
|
| 41 |
+
LANGCHAIN_AVAILABLE = False
|
| 42 |
+
BaseChatMessageHistory = object # type: ignore[misc,assignment]
|
| 43 |
+
|
| 44 |
+
from headroom import HeadroomConfig
|
| 45 |
+
from headroom.config import RollingWindowConfig
|
| 46 |
+
from headroom.providers import OpenAIProvider
|
| 47 |
+
from headroom.transforms import TransformPipeline
|
| 48 |
+
|
| 49 |
+
logger = logging.getLogger(__name__)
|
| 50 |
+
|
| 51 |
+
|
| 52 |
+
def _check_langchain_available() -> None:
|
| 53 |
+
"""Raise ImportError if LangChain is not installed."""
|
| 54 |
+
if not LANGCHAIN_AVAILABLE:
|
| 55 |
+
raise ImportError(
|
| 56 |
+
"LangChain is required for this integration. "
|
| 57 |
+
"Install with: pip install headroom[langchain] "
|
| 58 |
+
"or: pip install langchain-core"
|
| 59 |
+
)
|
| 60 |
+
|
| 61 |
+
|
| 62 |
+
class HeadroomChatMessageHistory(BaseChatMessageHistory):
|
| 63 |
+
"""Wraps any LangChain chat message history with automatic compression.
|
| 64 |
+
|
| 65 |
+
When conversation history exceeds the token threshold, automatically
|
| 66 |
+
applies RollingWindow compression to keep recent turns while fitting
|
| 67 |
+
within the limit.
|
| 68 |
+
|
| 69 |
+
This works with ANY memory type because it wraps at the storage layer:
|
| 70 |
+
- ConversationBufferMemory
|
| 71 |
+
- ConversationSummaryMemory
|
| 72 |
+
- ConversationBufferWindowMemory
|
| 73 |
+
- Redis, PostgreSQL, or any custom history
|
| 74 |
+
|
| 75 |
+
Example:
|
| 76 |
+
from langchain.memory import ConversationBufferMemory
|
| 77 |
+
from langchain_community.chat_message_histories import ChatMessageHistory
|
| 78 |
+
from headroom.integrations import HeadroomChatMessageHistory
|
| 79 |
+
|
| 80 |
+
# Wrap base history
|
| 81 |
+
base = ChatMessageHistory()
|
| 82 |
+
compressed = HeadroomChatMessageHistory(
|
| 83 |
+
base,
|
| 84 |
+
compress_threshold_tokens=4000,
|
| 85 |
+
keep_recent_turns=5,
|
| 86 |
+
)
|
| 87 |
+
|
| 88 |
+
# Use with any memory class
|
| 89 |
+
memory = ConversationBufferMemory(chat_memory=compressed)
|
| 90 |
+
|
| 91 |
+
# Messages are compressed automatically when accessed (llm: any chat model; ConversationChain from langchain.chains)
|
| 92 |
+
chain = ConversationChain(llm=llm, memory=memory)
|
| 93 |
+
chain.invoke({"input": "Hello!"})
|
| 94 |
+
|
| 95 |
+
Attributes:
|
| 96 |
+
base_history: The underlying chat message history
|
| 97 |
+
compress_threshold_tokens: Token count that triggers compression
|
| 98 |
+
keep_recent_turns: Minimum recent turns to always preserve
|
| 99 |
+
model: Model name for token counting (default: "gpt-4o")
|
| 100 |
+
"""
|
| 101 |
+
|
| 102 |
+
def __init__(
|
| 103 |
+
self,
|
| 104 |
+
base_history: BaseChatMessageHistory,
|
| 105 |
+
compress_threshold_tokens: int = 4000,
|
| 106 |
+
keep_recent_turns: int = 5,
|
| 107 |
+
model: str = "gpt-4o",
|
| 108 |
+
provider: Provider | None = None,
|
| 109 |
+
):
|
| 110 |
+
"""Initialize HeadroomChatMessageHistory.
|
| 111 |
+
|
| 112 |
+
Args:
|
| 113 |
+
base_history: Any LangChain BaseChatMessageHistory to wrap
|
| 114 |
+
compress_threshold_tokens: Apply compression when history exceeds
|
| 115 |
+
this many tokens. Default 4000.
|
| 116 |
+
keep_recent_turns: Minimum number of recent user/assistant turns
|
| 117 |
+
to always preserve during compression. Default 5.
|
| 118 |
+
model: Model name for token counting. Default "gpt-4o".
|
| 119 |
+
provider: Headroom provider for token counting. Auto-uses
|
| 120 |
+
OpenAIProvider if not specified.
|
| 121 |
+
"""
|
| 122 |
+
_check_langchain_available()
|
| 123 |
+
|
| 124 |
+
self._base = base_history
|
| 125 |
+
self._threshold = compress_threshold_tokens
|
| 126 |
+
self._keep_recent_turns = keep_recent_turns
|
| 127 |
+
self._model = model
|
| 128 |
+
self._provider: Provider = provider or OpenAIProvider()
|
| 129 |
+
|
| 130 |
+
# Track compression stats
|
| 131 |
+
self._compression_count = 0
|
| 132 |
+
self._total_tokens_saved = 0
|
| 133 |
+
|
| 134 |
+
@property
|
| 135 |
+
def messages(self) -> list[BaseMessage]:
|
| 136 |
+
"""Get messages, applying compression if over threshold.
|
| 137 |
+
|
| 138 |
+
Returns:
|
| 139 |
+
List of messages, potentially compressed to fit within threshold.
|
| 140 |
+
"""
|
| 141 |
+
raw_messages = self._base.messages
|
| 142 |
+
|
| 143 |
+
if not raw_messages:
|
| 144 |
+
return []
|
| 145 |
+
|
| 146 |
+
# Count tokens
|
| 147 |
+
token_count = self._count_tokens(raw_messages)
|
| 148 |
+
|
| 149 |
+
if token_count <= self._threshold:
|
| 150 |
+
return list(raw_messages)
|
| 151 |
+
|
| 152 |
+
# Apply compression
|
| 153 |
+
compressed = self._apply_rolling_window(raw_messages)
|
| 154 |
+
tokens_after = self._count_tokens(compressed)
|
| 155 |
+
|
| 156 |
+
self._compression_count += 1
|
| 157 |
+
self._total_tokens_saved += token_count - tokens_after
|
| 158 |
+
|
| 159 |
+
logger.info(
|
| 160 |
+
f"HeadroomChatMessageHistory compressed: {token_count} -> {tokens_after} tokens "
|
| 161 |
+
f"({len(raw_messages)} -> {len(compressed)} messages)"
|
| 162 |
+
)
|
| 163 |
+
|
| 164 |
+
return compressed
|
| 165 |
+
|
| 166 |
+
def add_message(self, message: BaseMessage) -> None:
|
| 167 |
+
"""Add a message to the underlying history.
|
| 168 |
+
|
| 169 |
+
Args:
|
| 170 |
+
message: The message to add.
|
| 171 |
+
"""
|
| 172 |
+
self._base.add_message(message)
|
| 173 |
+
|
| 174 |
+
def add_user_message(self, message: str) -> None:
|
| 175 |
+
"""Add a user message to the history.
|
| 176 |
+
|
| 177 |
+
Args:
|
| 178 |
+
message: The user message content.
|
| 179 |
+
"""
|
| 180 |
+
self._base.add_user_message(message)
|
| 181 |
+
|
| 182 |
+
def add_ai_message(self, message: str) -> None:
|
| 183 |
+
"""Add an AI message to the history.
|
| 184 |
+
|
| 185 |
+
Args:
|
| 186 |
+
message: The AI message content.
|
| 187 |
+
"""
|
| 188 |
+
self._base.add_ai_message(message)
|
| 189 |
+
|
| 190 |
+
def clear(self) -> None:
|
| 191 |
+
"""Clear all messages from history."""
|
| 192 |
+
self._base.clear()
|
| 193 |
+
|
| 194 |
+
def _count_tokens(self, messages: list[BaseMessage]) -> int:
|
| 195 |
+
"""Count tokens in messages using provider's tokenizer.
|
| 196 |
+
|
| 197 |
+
Args:
|
| 198 |
+
messages: List of messages to count.
|
| 199 |
+
|
| 200 |
+
Returns:
|
| 201 |
+
Total token count.
|
| 202 |
+
"""
|
| 203 |
+
token_counter = self._provider.get_token_counter(self._model)
|
| 204 |
+
total = 0
|
| 205 |
+
for msg in messages:
|
| 206 |
+
content = msg.content if isinstance(msg.content, str) else str(msg.content)
|
| 207 |
+
total += token_counter.count_text(content)
|
| 208 |
+
return total
|
| 209 |
+
|
| 210 |
+
def _apply_rolling_window(self, messages: list[BaseMessage]) -> list[BaseMessage]:
|
| 211 |
+
"""Apply RollingWindow compression to messages.
|
| 212 |
+
|
| 213 |
+
Args:
|
| 214 |
+
messages: Messages to compress.
|
| 215 |
+
|
| 216 |
+
Returns:
|
| 217 |
+
Compressed messages fitting within threshold.
|
| 218 |
+
"""
|
| 219 |
+
# Convert to OpenAI format for Headroom transforms
|
| 220 |
+
openai_messages = self._convert_to_openai(messages)
|
| 221 |
+
|
| 222 |
+
# Use TransformPipeline which handles tokenizer setup
|
| 223 |
+
config = HeadroomConfig(
|
| 224 |
+
rolling_window=RollingWindowConfig(keep_last_turns=self._keep_recent_turns),
|
| 225 |
+
)
|
| 226 |
+
pipeline = TransformPipeline(config=config, provider=self._provider)
|
| 227 |
+
|
| 228 |
+
# Apply compression via pipeline
|
| 229 |
+
result = pipeline.apply(
|
| 230 |
+
messages=openai_messages,
|
| 231 |
+
model=self._model,
|
| 232 |
+
model_limit=self._threshold,
|
| 233 |
+
)
|
| 234 |
+
|
| 235 |
+
# Convert back to LangChain format
|
| 236 |
+
return self._convert_from_openai(result.messages)
|
| 237 |
+
|
| 238 |
+
def _convert_to_openai(self, messages: list[BaseMessage]) -> list[dict[str, Any]]:
|
| 239 |
+
"""Convert LangChain messages to OpenAI format.
|
| 240 |
+
|
| 241 |
+
Args:
|
| 242 |
+
messages: LangChain messages.
|
| 243 |
+
|
| 244 |
+
Returns:
|
| 245 |
+
OpenAI format messages.
|
| 246 |
+
"""
|
| 247 |
+
result = []
|
| 248 |
+
for msg in messages:
|
| 249 |
+
content = msg.content if isinstance(msg.content, str) else str(msg.content)
|
| 250 |
+
|
| 251 |
+
if isinstance(msg, SystemMessage):
|
| 252 |
+
result.append({"role": "system", "content": content})
|
| 253 |
+
elif isinstance(msg, HumanMessage):
|
| 254 |
+
result.append({"role": "user", "content": content})
|
| 255 |
+
elif isinstance(msg, AIMessage):
|
| 256 |
+
entry: dict[str, Any] = {"role": "assistant", "content": content}
|
| 257 |
+
if hasattr(msg, "tool_calls") and msg.tool_calls:
|
| 258 |
+
entry["tool_calls"] = msg.tool_calls
|
| 259 |
+
result.append(entry)
|
| 260 |
+
elif isinstance(msg, ToolMessage):
|
| 261 |
+
result.append(
|
| 262 |
+
{
|
| 263 |
+
"role": "tool",
|
| 264 |
+
"tool_call_id": getattr(msg, "tool_call_id", ""),
|
| 265 |
+
"content": content,
|
| 266 |
+
}
|
| 267 |
+
)
|
| 268 |
+
else:
|
| 269 |
+
# Generic fallback
|
| 270 |
+
result.append(
|
| 271 |
+
{
|
| 272 |
+
"role": getattr(msg, "type", "user"),
|
| 273 |
+
"content": content,
|
| 274 |
+
}
|
| 275 |
+
)
|
| 276 |
+
return result
|
| 277 |
+
|
| 278 |
+
def _convert_from_openai(self, messages: list[dict[str, Any]]) -> list[BaseMessage]:
|
| 279 |
+
"""Convert OpenAI format back to LangChain messages.
|
| 280 |
+
|
| 281 |
+
Args:
|
| 282 |
+
messages: OpenAI format messages.
|
| 283 |
+
|
| 284 |
+
Returns:
|
| 285 |
+
LangChain messages.
|
| 286 |
+
"""
|
| 287 |
+
result: list[BaseMessage] = []
|
| 288 |
+
for msg in messages:
|
| 289 |
+
role = msg.get("role", "user")
|
| 290 |
+
content = msg.get("content", "")
|
| 291 |
+
|
| 292 |
+
if role == "system":
|
| 293 |
+
result.append(SystemMessage(content=content))
|
| 294 |
+
elif role == "user":
|
| 295 |
+
result.append(HumanMessage(content=content))
|
| 296 |
+
elif role == "assistant":
|
| 297 |
+
tool_calls = msg.get("tool_calls", [])
|
| 298 |
+
result.append(AIMessage(content=content, tool_calls=tool_calls))
|
| 299 |
+
elif role == "tool":
|
| 300 |
+
result.append(
|
| 301 |
+
ToolMessage(
|
| 302 |
+
content=content,
|
| 303 |
+
tool_call_id=msg.get("tool_call_id", ""),
|
| 304 |
+
)
|
| 305 |
+
)
|
| 306 |
+
return result
|
| 307 |
+
|
| 308 |
+
def get_compression_stats(self) -> dict[str, Any]:
|
| 309 |
+
"""Get statistics about compression operations.
|
| 310 |
+
|
| 311 |
+
Returns:
|
| 312 |
+
Dictionary with compression_count, total_tokens_saved, threshold_tokens, and keep_recent_turns.
|
| 313 |
+
"""
|
| 314 |
+
return {
|
| 315 |
+
"compression_count": self._compression_count,
|
| 316 |
+
"total_tokens_saved": self._total_tokens_saved,
|
| 317 |
+
"threshold_tokens": self._threshold,
|
| 318 |
+
"keep_recent_turns": self._keep_recent_turns,
|
| 319 |
+
}
|
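The messages property compresses only when the token count exceeds the threshold. Below is a minimal sketch of that gate under loudly stated assumptions: token counting is approximated by whitespace splitting, and the trimming loop is a simplified stand-in for Headroom's RollingWindow transform, whose actual behavior is not shown in this diff. `count_tokens` and `rolling_window` are hypothetical names.

```python
def count_tokens(messages: list[dict[str, str]]) -> int:
    """Crude estimate: whitespace-separated words (assumption, not a real tokenizer)."""
    return sum(len(m["content"].split()) for m in messages)


def rolling_window(
    messages: list[dict[str, str]], threshold: int, keep_recent: int
) -> list[dict[str, str]]:
    """Drop the oldest non-system messages until the history fits.

    System messages are moved to the front and never dropped; the last
    keep_recent non-system messages are always kept, even if still over.
    """
    if count_tokens(messages) <= threshold:
        return list(messages)  # under the limit: return an unmodified copy
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while len(rest) > keep_recent and count_tokens(system + rest) > threshold:
        rest.pop(0)  # discard the oldest message first
    return system + rest


# Made-up conversation for illustration.
history = [
    {"role": "system", "content": "be brief"},
    {"role": "user", "content": "one two three four"},
    {"role": "assistant", "content": "five six seven eight"},
    {"role": "user", "content": "nine ten"},
]
trimmed = rolling_window(history, threshold=8, keep_recent=1)
```

As in the class above, the underlying history is never mutated; trimming happens on a copy each time the messages are read.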
headroom/integrations/langchain/providers.py
ADDED
|
@@ -0,0 +1,200 @@
|
| 1 |
+
"""Provider detection for LangChain models.
|
| 2 |
+
|
| 3 |
+
This module provides automatic provider detection from LangChain chat models
|
| 4 |
+
without requiring explicit provider imports. It matches class paths and
|
| 5 |
+
model names to identify the appropriate Headroom provider.
|
| 6 |
+
|
| 7 |
+
Example:
|
| 8 |
+
from langchain_anthropic import ChatAnthropic
|
| 9 |
+
from headroom.integrations.langchain import get_headroom_provider
|
| 10 |
+
|
| 11 |
+
model = ChatAnthropic(model="claude-3-5-sonnet-20241022")
|
| 12 |
+
provider = get_headroom_provider(model) # Returns AnthropicProvider
|
| 13 |
+
"""
|
| 14 |
+
|
| 15 |
+
from __future__ import annotations
|
| 16 |
+
|
| 17 |
+
import logging
|
| 18 |
+
from typing import TYPE_CHECKING, Any
|
| 19 |
+
|
| 20 |
+
if TYPE_CHECKING:
|
| 21 |
+
from headroom.providers.base import Provider
|
| 22 |
+
|
| 23 |
+
logger = logging.getLogger(__name__)
|
| 24 |
+
|
| 25 |
+
# Provider detection patterns
|
| 26 |
+
# Maps provider name to list of class path patterns to match
|
| 27 |
+
PROVIDER_PATTERNS: dict[str, list[str]] = {
|
| 28 |
+
"openai": [
|
| 29 |
+
"langchain_openai.ChatOpenAI",
|
| 30 |
+
"langchain_openai.chat_models.ChatOpenAI",
|
| 31 |
+
"langchain_community.chat_models.ChatOpenAI",
|
| 32 |
+
"langchain.chat_models.ChatOpenAI",
|
| 33 |
+
"ChatOpenAI",
|
| 34 |
+
],
|
| 35 |
+
"anthropic": [
|
| 36 |
+
"langchain_anthropic.ChatAnthropic",
|
| 37 |
+
"langchain_anthropic.chat_models.ChatAnthropic",
|
| 38 |
+
"langchain_community.chat_models.ChatAnthropic",
|
| 39 |
+
"langchain.chat_models.ChatAnthropic",
|
| 40 |
+
"ChatAnthropic",
|
| 41 |
+
],
|
| 42 |
+
"google": [
|
| 43 |
+
"langchain_google_genai.ChatGoogleGenerativeAI",
|
| 44 |
+
"langchain_google_genai.chat_models.ChatGoogleGenerativeAI",
|
| 45 |
+
"langchain_community.chat_models.ChatGoogleGenerativeAI",
|
| 46 |
+
"ChatGoogleGenerativeAI",
|
| 47 |
+
# Also match Vertex AI
|
| 48 |
+
"langchain_google_vertexai.ChatVertexAI",
|
| 49 |
+
"ChatVertexAI",
|
| 50 |
+
],
|
| 51 |
+
"cohere": [
|
| 52 |
+
"langchain_cohere.ChatCohere",
|
| 53 |
+
"langchain_community.chat_models.ChatCohere",
|
| 54 |
+
"ChatCohere",
|
| 55 |
+
],
|
| 56 |
+
"mistral": [
|
| 57 |
+
"langchain_mistralai.ChatMistralAI",
|
| 58 |
+
"langchain_community.chat_models.ChatMistralAI",
|
| 59 |
+
"ChatMistralAI",
|
| 60 |
+
],
|
| 61 |
+
}
|
| 62 |
+
|
| 63 |
+
# Model name patterns for fallback detection
|
| 64 |
+
MODEL_NAME_PATTERNS: dict[str, list[str]] = {
|
| 65 |
+
"anthropic": ["claude", "anthropic"],
|
| 66 |
+
"openai": ["gpt", "o1", "o3", "davinci", "turbo"],
|
| 67 |
+
"google": ["gemini", "palm", "bison"],
|
| 68 |
+
"cohere": ["command", "cohere"],
|
| 69 |
+
"mistral": ["mistral", "mixtral"],
|
| 70 |
+
}
|
| 71 |
+
|
| 72 |
+
|
| 73 |
+
def detect_provider(model: Any) -> str:
|
| 74 |
+
"""Detect provider name from a LangChain model using duck-typing.
|
| 75 |
+
|
| 76 |
+
Detection strategy:
|
| 77 |
+
1. Check class module and name against known patterns
|
| 78 |
+
2. Check model_name attribute against known model patterns
|
| 79 |
+
3. Fall back to "openai" as safe default
|
| 80 |
+
|
| 81 |
+
Args:
|
| 82 |
+
model: Any LangChain chat model instance
|
| 83 |
+
|
| 84 |
+
Returns:
|
| 85 |
+
Provider name string: "openai", "anthropic", "google", "cohere", "mistral"
|
| 86 |
+
|
| 87 |
+
Example:
|
| 88 |
+
>>> from langchain_anthropic import ChatAnthropic
|
| 89 |
+
>>> model = ChatAnthropic(model="claude-3-5-sonnet-20241022")
|
| 90 |
+
>>> detect_provider(model)
|
| 91 |
+
'anthropic'
|
| 92 |
+
"""
|
| 93 |
+
# Strategy 1: Check class path
|
| 94 |
+
class_module = getattr(model.__class__, "__module__", "")
|
| 95 |
+
class_name = model.__class__.__name__
|
| 96 |
+
class_path = f"{class_module}.{class_name}"
|
| 97 |
+
|
| 98 |
+
for provider_name, patterns in PROVIDER_PATTERNS.items():
|
| 99 |
+
for pattern in patterns:
|
| 100 |
+
if pattern in class_path or class_name == pattern.split(".")[-1]:
|
| 101 |
+
logger.debug(f"Detected provider '{provider_name}' from class path: {class_path}")
|
| 102 |
+
return provider_name
|
| 103 |
+
|
| 104 |
+
# Strategy 2: Check model_name attribute
|
| 105 |
+
model_name = _get_model_name(model)
|
| 106 |
+
if model_name:
|
| 107 |
+
model_name_lower = model_name.lower()
|
| 108 |
+
for provider_name, name_patterns in MODEL_NAME_PATTERNS.items():
|
| 109 |
+
for pattern in name_patterns:
|
| 110 |
+
if pattern in model_name_lower:
|
| 111 |
+
logger.debug(
|
| 112 |
+
f"Detected provider '{provider_name}' from model name: {model_name}"
|
| 113 |
+
)
|
| 114 |
+
return provider_name
|
| 115 |
+
|
| 116 |
+
# Strategy 3: Fall back to OpenAI (most common, safe default)
|
| 117 |
+
logger.debug(f"Could not detect provider for {class_path}, falling back to 'openai'")
|
| 118 |
+
return "openai"
|
| 119 |
+
|
| 120 |
+
|
| 121 |
+
def _get_model_name(model: Any) -> str | None:
|
| 122 |
+
"""Extract model name from a LangChain model.
|
| 123 |
+
|
| 124 |
+
Tries common attribute names used by different LangChain models.
|
| 125 |
+
"""
|
| 126 |
+
# Try common attribute names
|
| 127 |
+
for attr in ["model_name", "model", "model_id", "_model_name"]:
|
| 128 |
+
value = getattr(model, attr, None)
|
| 129 |
+
if isinstance(value, str):
|
| 130 |
+
return value
|
| 131 |
+
|
| 132 |
+
return None
|
| 133 |
+
|
| 134 |
+
|
| 135 |
+
def get_headroom_provider(model: Any) -> Provider:
|
| 136 |
+
"""Get appropriate Headroom Provider instance for a LangChain model.
|
| 137 |
+
|
| 138 |
+
This function automatically detects the provider from the model type
|
| 139 |
+
and returns a configured Headroom provider for accurate token counting
|
| 140 |
+
and context limit detection.
|
| 141 |
+
|
| 142 |
+
Args:
|
| 143 |
+
model: Any LangChain chat model instance
|
| 144 |
+
|
| 145 |
+
Returns:
|
| 146 |
+
Configured Headroom Provider instance
|
| 147 |
+
|
| 148 |
+
Example:
|
| 149 |
+
>>> from langchain_anthropic import ChatAnthropic
|
| 150 |
+
>>> model = ChatAnthropic(model="claude-3-5-sonnet-20241022")
|
| 151 |
+
>>> provider = get_headroom_provider(model)
|
| 152 |
+
>>> provider.name
|
| 153 |
+
'anthropic'
|
| 154 |
+
"""
|
| 155 |
+
# Import providers lazily to avoid circular imports
|
| 156 |
+
from headroom.providers import (
|
| 157 |
+
AnthropicProvider,
|
| 158 |
+
GoogleProvider,
|
| 159 |
+
OpenAIProvider,
|
| 160 |
+
)
|
| 161 |
+
|
| 162 |
+
provider_name = detect_provider(model)
|
| 163 |
+
|
| 164 |
+
if provider_name == "anthropic":
|
| 165 |
+
return AnthropicProvider()
|
| 166 |
+
elif provider_name == "google":
|
| 167 |
+
return GoogleProvider()
|
| 168 |
+
# Cohere and Mistral fall back to OpenAI-compatible for now
|
| 169 |
+
# TODO: Add dedicated providers when needed
|
| 170 |
+
|
| 171 |
+
# Default to OpenAI
|
| 172 |
+
return OpenAIProvider()
|
| 173 |
+
|
| 174 |
+
|
| 175 |
+
def get_model_name_from_langchain(model: Any) -> str:
|
| 176 |
+
"""Extract the model name string from a LangChain model.
|
| 177 |
+
|
| 178 |
+
Useful for getting the model identifier for token counting
|
| 179 |
+
and context limit lookup.
|
| 180 |
+
|
| 181 |
+
Args:
|
| 182 |
+
model: Any LangChain chat model instance
|
| 183 |
+
|
| 184 |
+
Returns:
|
| 185 |
+
Model name string (e.g., "gpt-4o", "claude-3-5-sonnet-20241022")
|
| 186 |
+
"""
|
| 187 |
+
name = _get_model_name(model)
|
| 188 |
+
if name:
|
| 189 |
+
return name
|
| 190 |
+
|
| 191 |
+
# Try to infer from class name
|
| 192 |
+
class_name = model.__class__.__name__
|
| 193 |
+
if "GPT" in class_name or "OpenAI" in class_name:
|
| 194 |
+
return "gpt-4o" # Safe default for OpenAI
|
| 195 |
+
elif "Anthropic" in class_name or "Claude" in class_name:
|
| 196 |
+
return "claude-3-5-sonnet-20241022" # Safe default for Anthropic
|
| 197 |
+
elif "Google" in class_name or "Gemini" in class_name:
|
| 198 |
+
return "gemini-1.5-pro" # Safe default for Google
|
| 199 |
+
|
| 200 |
+
return "gpt-4o" # Ultimate fallback
|
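The three-step detection order (class name, then model name, then the "openai" default) can be exercised without installing any LangChain package. This is a reduced sketch, not the module itself: the lookup tables are trimmed copies of the ones in the diff, `detect` is a hypothetical name, and the three classes are stand-ins rather than real LangChain models.

```python
from typing import Any

# Trimmed copies of the tables in the diff above (illustration only).
CLASS_NAMES = {
    "ChatOpenAI": "openai",
    "ChatAnthropic": "anthropic",
    "ChatGoogleGenerativeAI": "google",
}
NAME_HINTS = {"anthropic": ["claude"], "openai": ["gpt"], "google": ["gemini"]}


def detect(model: Any) -> str:
    """Class name first, then model-name substring, then the 'openai' default."""
    by_class = CLASS_NAMES.get(type(model).__name__)
    if by_class:
        return by_class
    name = getattr(model, "model_name", None) or getattr(model, "model", None)
    if isinstance(name, str):
        lowered = name.lower()
        for provider, hints in NAME_HINTS.items():
            if any(hint in lowered for hint in hints):
                return provider
    return "openai"


class ChatAnthropic:  # stand-in, not the real langchain_anthropic class
    model = "claude-3-5-sonnet-20241022"


class CustomWrapper:  # unknown class whose model name still identifies the provider
    model_name = "gemini-1.5-pro"


class Opaque:  # nothing to go on, so the default applies
    pass


results = [detect(ChatAnthropic()), detect(CustomWrapper()), detect(Opaque())]
```

Matching on names keeps provider packages optional, at the cost that an unrecognized model silently falls back to the OpenAI provider, so its token counts are presumably only approximate.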
headroom/integrations/langchain/retriever.py
ADDED
|
@@ -0,0 +1,371 @@
|
| 1 |
+
"""Retriever integration for LangChain with intelligent document compression.
|
| 2 |
+
|
| 3 |
+
This module provides HeadroomDocumentCompressor, a LangChain BaseDocumentCompressor
|
| 4 |
+
that reduces retrieved documents based on relevance scoring while preserving
|
| 5 |
+
the most important information.
|
| 6 |
+
|
| 7 |
+
Example:
|
| 8 |
+
from langchain.retrievers import ContextualCompressionRetriever
|
| 9 |
+
from langchain_community.vectorstores import Chroma
|
| 10 |
+
from headroom.integrations import HeadroomDocumentCompressor
|
| 11 |
+
|
| 12 |
+
# Create vector store retriever
|
| 13 |
+
vectorstore = Chroma.from_documents(documents, embeddings)
|
| 14 |
+
base_retriever = vectorstore.as_retriever(search_kwargs={"k": 50})
|
| 15 |
+
|
| 16 |
+
# Wrap with Headroom compression
|
| 17 |
+
compressor = HeadroomDocumentCompressor(max_documents=10)
|
| 18 |
+
retriever = ContextualCompressionRetriever(
|
| 19 |
+
base_compressor=compressor,
|
| 20 |
+
base_retriever=base_retriever,
|
| 21 |
+
)
|
| 22 |
+
|
| 23 |
+
# Retrieve - automatically keeps most relevant documents
|
| 24 |
+
docs = retriever.invoke("What is the capital of France?")
|
| 25 |
+
"""
|
| 26 |
+
|
| 27 |
+
from __future__ import annotations
|
| 28 |
+
|
| 29 |
+
import logging
|
| 30 |
+
import re
|
| 31 |
+
from collections.abc import Sequence
|
| 32 |
+
from dataclasses import dataclass
|
| 33 |
+
from typing import Any
|
| 34 |
+
|
| 35 |
+
# LangChain imports - these are optional dependencies
|
| 36 |
+
try:
|
| 37 |
+
from langchain_core.callbacks import Callbacks
|
| 38 |
+
from langchain_core.documents import Document
|
| 39 |
+
|
| 40 |
+
# BaseDocumentCompressor location varies by langchain version
|
| 41 |
+
try:
|
| 42 |
+
from langchain.retrievers.document_compressors import BaseDocumentCompressor
|
| 43 |
+
except ImportError:
|
| 44 |
+
try:
|
| 45 |
+
from langchain_core.documents.compressors import BaseDocumentCompressor
|
| 46 |
+
except ImportError:
|
| 47 |
+
# Fallback: create a minimal base class
|
| 48 |
+
class BaseDocumentCompressor: # type: ignore[no-redef]
|
| 49 |
+
"""Minimal base class for document compression."""
|
| 50 |
+
|
| 51 |
+
def compress_documents(
|
| 52 |
+
self, documents: Sequence[Any], query: str, callbacks: Any = None
|
| 53 |
+
) -> Sequence[Any]:
|
| 54 |
+
raise NotImplementedError
|
| 55 |
+
|
| 56 |
+
LANGCHAIN_AVAILABLE = True
|
| 57 |
+
except ImportError:
|
| 58 |
+
LANGCHAIN_AVAILABLE = False
|
| 59 |
+
BaseDocumentCompressor = object # type: ignore[misc,assignment]
|
| 60 |
+
Document = object # type: ignore[misc,assignment]
|
| 61 |
+
Callbacks = None # type: ignore[misc,assignment]
|
| 62 |
+
|
| 63 |
+
logger = logging.getLogger(__name__)
|
| 64 |
+
|
| 65 |
+
|
| 66 |
+
def _check_langchain_available() -> None:
|
| 67 |
+
"""Raise ImportError if LangChain is not installed."""
|
| 68 |
+
if not LANGCHAIN_AVAILABLE:
|
| 69 |
+
raise ImportError(
|
| 70 |
+
"LangChain is required for this integration. "
|
| 71 |
+
"Install with: pip install headroom[langchain] "
|
| 72 |
+
"or: pip install langchain-core"
|
| 73 |
+
)
|
| 74 |
+
|
| 75 |
+
|
| 76 |
+
@dataclass
|
| 77 |
+
class CompressionMetrics:
|
| 78 |
+
"""Metrics from document compression."""
|
| 79 |
+
|
| 80 |
+
documents_before: int
|
| 81 |
+
documents_after: int
|
| 82 |
+
documents_removed: int
|
| 83 |
+
relevance_scores: list[float]
|
| 84 |
+
|
| 85 |
+
|
| 86 |
+
class HeadroomDocumentCompressor(BaseDocumentCompressor):
|
| 87 |
+
"""Compresses retrieved documents based on relevance to query.
|
| 88 |
+
|
| 89 |
+
Uses BM25-style relevance scoring to keep only the most relevant
|
| 90 |
+
documents from a larger retrieval set. This allows you to retrieve
|
| 91 |
+
many documents initially (for recall) and then compress down to
|
| 92 |
+
the most relevant ones (for precision).
|
| 93 |
+
|
| 94 |
+
Works with LangChain's ContextualCompressionRetriever pattern.
|
| 95 |
+
|
| 96 |
+
Example:
|
| 97 |
+
from langchain.retrievers import ContextualCompressionRetriever
|
| 98 |
+
from headroom.integrations import HeadroomDocumentCompressor
|
| 99 |
+
|
| 100 |
+
compressor = HeadroomDocumentCompressor(
|
| 101 |
+
max_documents=10,
|
| 102 |
+
min_relevance=0.3,
|
| 103 |
+
)
|
| 104 |
+
|
| 105 |
+
retriever = ContextualCompressionRetriever(
|
| 106 |
+
base_compressor=compressor,
|
| 107 |
+
base_retriever=base_retriever, # Any retriever
|
| 108 |
+
)
|
| 109 |
+
|
| 110 |
+
# Retrieves top 10 most relevant docs
|
| 111 |
+
docs = retriever.invoke("What is Python?")
|
| 112 |
+
|
| 113 |
+
Attributes:
|
| 114 |
+
max_documents: Maximum documents to return
|
| 115 |
+
min_relevance: Minimum relevance score (0-1) to include
|
| 116 |
+
prefer_diverse: Whether to prefer diverse results
|
| 117 |
+
"""
|
| 118 |
+
|
| 119 |
+
max_documents: int = 10
|
| 120 |
+
min_relevance: float = 0.0
|
| 121 |
+
prefer_diverse: bool = False
|
| 122 |
+
|
| 123 |
+
def __init__(
|
| 124 |
+
self,
|
| 125 |
+
max_documents: int = 10,
|
| 126 |
+
min_relevance: float = 0.0,
|
| 127 |
+
prefer_diverse: bool = False,
|
| 128 |
+
**kwargs: Any,
|
| 129 |
+
):
|
| 130 |
+
"""Initialize HeadroomDocumentCompressor.
|
| 131 |
+
|
| 132 |
+
Args:
|
| 133 |
+
max_documents: Maximum number of documents to return. Default 10.
|
| 134 |
+
min_relevance: Minimum relevance score (0-1) for a document to
|
| 135 |
+
be included. Default 0.0 (no minimum).
|
| 136 |
+
prefer_diverse: If True, use MMR-style selection to prefer
|
| 137 |
+
diverse results over pure relevance. Default False.
|
| 138 |
+
**kwargs: Additional arguments for BaseDocumentCompressor.
|
| 139 |
+
"""
|
| 140 |
+
_check_langchain_available()
|
| 141 |
+
|
| 142 |
+
super().__init__(**kwargs)
|
| 143 |
+
self.max_documents = max_documents
|
| 144 |
+
self.min_relevance = min_relevance
|
| 145 |
+
self.prefer_diverse = prefer_diverse
|
| 146 |
+
self._last_metrics: CompressionMetrics | None = None
|
| 147 |
+
|
| 148 |
+
def compress_documents(
|
| 149 |
+
self,
|
| 150 |
+
documents: Sequence[Document],
|
| 151 |
+
query: str,
|
| 152 |
+
callbacks: Callbacks = None,
|
| 153 |
+
) -> Sequence[Document]:
|
| 154 |
+
"""Compress documents based on relevance to query.
|
| 155 |
+
|
| 156 |
+
Args:
|
| 157 |
+
documents: Documents to compress.
|
| 158 |
+
query: Query to score relevance against.
|
| 159 |
+
callbacks: LangChain callbacks (unused).
|
| 160 |
+
|
| 161 |
+
Returns:
|
| 162 |
+
Compressed list of most relevant documents.
|
| 163 |
+
"""
|
| 164 |
+
if not documents:
|
| 165 |
+
self._last_metrics = CompressionMetrics(
|
| 166 |
+
documents_before=0,
|
| 167 |
+
documents_after=0,
|
| 168 |
+
documents_removed=0,
|
| 169 |
+
relevance_scores=[],
|
| 170 |
+
)
|
| 171 |
+
return []
|
| 172 |
+
|
| 173 |
+
if len(documents) <= self.max_documents:
|
| 174 |
+
# No compression needed
|
| 175 |
+
scores = [self._score_document(doc, query) for doc in documents]
|
| 176 |
+
self._last_metrics = CompressionMetrics(
|
| 177 |
+
documents_before=len(documents),
|
| 178 |
+
documents_after=len(documents),
|
| 179 |
+
documents_removed=0,
|
| 180 |
+
relevance_scores=scores,
|
| 181 |
+
)
|
| 182 |
+
return list(documents)
|
| 183 |
+
|
| 184 |
+
# Score all documents
|
| 185 |
+
scored = [(doc, self._score_document(doc, query)) for doc in documents]
|
| 186 |
+
|
| 187 |
+
if self.prefer_diverse:
|
| 188 |
+
# Use MMR-style selection for diversity
|
| 189 |
+
selected = self._select_diverse(scored, query)
|
| 190 |
+
else:
|
| 191 |
+
# Sort by relevance score
|
| 192 |
+
scored.sort(key=lambda x: x[1], reverse=True)
|
| 193 |
+
selected = scored[: self.max_documents]
|
| 194 |
+
|
| 195 |
+
# Filter by minimum relevance
|
| 196 |
+
if self.min_relevance > 0:
|
| 197 |
+
selected = [(doc, score) for doc, score in selected if score >= self.min_relevance]
|
| 198 |
+
|
| 199 |
+
# Track metrics
|
| 200 |
+
final_docs = [doc for doc, _ in selected]
|
| 201 |
+
final_scores = [score for _, score in selected]
|
| 202 |
+
|
| 203 |
+
self._last_metrics = CompressionMetrics(
|
| 204 |
+
documents_before=len(documents),
|
| 205 |
+
documents_after=len(final_docs),
|
| 206 |
+
documents_removed=len(documents) - len(final_docs),
|
| 207 |
+
relevance_scores=final_scores,
|
| 208 |
+
)
|
| 209 |
+
|
| 210 |
+
logger.info(
|
| 211 |
+
f"HeadroomDocumentCompressor: {len(documents)} -> {len(final_docs)} documents "
|
| 212 |
+
f"(avg relevance: {sum(final_scores) / len(final_scores) if final_scores else 0:.2f})"
|
| 213 |
+
)
|
| 214 |
+
|
| 215 |
+
return final_docs
|
| 216 |
+
|
| 217 |
+
def _score_document(self, doc: Document, query: str) -> float:
|
| 218 |
+
"""Score a document's relevance to the query using BM25-style scoring.
|
| 219 |
+
|
| 220 |
+
Args:
|
| 221 |
+
doc: Document to score.
|
| 222 |
+
query: Query to compare against.
|
| 223 |
+
|
| 224 |
+
Returns:
|
| 225 |
+
Relevance score between 0 and 1.
|
| 226 |
+
"""
|
| 227 |
+
content = doc.page_content.lower()
|
| 228 |
+
query_lower = query.lower()
|
| 229 |
+
|
| 230 |
+
# Tokenize
|
| 231 |
+
query_terms = self._tokenize(query_lower)
|
| 232 |
+
doc_terms = self._tokenize(content)
|
| 233 |
+
|
| 234 |
+
if not query_terms or not doc_terms:
|
| 235 |
+
return 0.0
|
| 236 |
+
|
| 237 |
+
# BM25-style scoring
|
| 238 |
+
k1 = 1.5
|
| 239 |
+
b = 0.75
|
| 240 |
+
avg_dl = 100 # Assume average document length
|
| 241 |
+
|
| 242 |
+
doc_len = len(doc_terms)
|
| 243 |
+
term_freqs: dict[str, int] = {}
|
| 244 |
+
for term in doc_terms:
|
| 245 |
+
term_freqs[term] = term_freqs.get(term, 0) + 1
|
| 246 |
+
|
| 247 |
+
score = 0.0
|
| 248 |
+
for term in query_terms:
|
| 249 |
+
if term in term_freqs:
|
| 250 |
+
tf = term_freqs[term]
|
| 251 |
+
# Simplified BM25 (without IDF since we don't have corpus stats)
|
| 252 |
+
numerator = tf * (k1 + 1)
|
| 253 |
+
denominator = tf + k1 * (1 - b + b * (doc_len / avg_dl))
|
| 254 |
+
score += numerator / denominator
|
| 255 |
+
|
| 256 |
+
# Normalize to 0-1 range
|
| 257 |
+
max_possible = len(query_terms) * (k1 + 1)
|
| 258 |
+
normalized = score / max_possible if max_possible > 0 else 0.0
|
| 259 |
+
|
| 260 |
+
# Boost for exact phrase matches
|
| 261 |
+
if query_lower in content:
|
| 262 |
+
normalized = min(1.0, normalized + 0.3)
|
| 263 |
+
|
| 264 |
+
return min(1.0, normalized)
|
| 265 |
+
|
| 266 |
+
def _tokenize(self, text: str) -> list[str]:
|
| 267 |
+
"""Tokenize text into terms.
|
| 268 |
+
|
| 269 |
+
Args:
|
| 270 |
+
text: Text to tokenize.
|
| 271 |
+
|
| 272 |
+
Returns:
|
| 273 |
+
List of tokens.
|
| 274 |
+
"""
|
| 275 |
+
# Simple tokenization: split on non-alphanumeric, filter short terms
|
| 276 |
+
tokens = re.findall(r"\b\w+\b", text)
|
| 277 |
+
return [t for t in tokens if len(t) > 1]
|
| 278 |
+
|
| 279 |
+
def _select_diverse(
|
| 280 |
+
self, scored_docs: list[tuple[Document, float]], query: str
|
| 281 |
+
) -> list[tuple[Document, float]]:
|
| 282 |
+
"""Select diverse documents using MMR-style approach.
|
| 283 |
+
|
| 284 |
+
Balances relevance with diversity to avoid redundant results.
|
| 285 |
+
|
| 286 |
+
Args:
|
| 287 |
+
scored_docs: List of (document, relevance_score) tuples.
|
| 288 |
+
query: Original query.
|
| 289 |
+
|
| 290 |
+
Returns:
|
| 291 |
+
Selected documents with diversity considered.
|
| 292 |
+
"""
|
| 293 |
+
if not scored_docs:
|
| 294 |
+
return []
|
| 295 |
+
|
| 296 |
+
# Sort by initial relevance
|
| 297 |
+
scored_docs = sorted(scored_docs, key=lambda x: x[1], reverse=True)
|
| 298 |
+
|
| 299 |
+
# Start with most relevant
|
| 300 |
+
selected = [scored_docs[0]]
|
| 301 |
+
remaining = scored_docs[1:]
|
| 302 |
+
|
| 303 |
+
lambda_param = 0.5 # Balance between relevance and diversity
|
| 304 |
+
|
| 305 |
+
while len(selected) < self.max_documents and remaining:
|
| 306 |
+
best_score = -1.0
|
| 307 |
+
best_idx = 0
|
| 308 |
+
|
| 309 |
+
for i, (doc, rel_score) in enumerate(remaining):
|
| 310 |
+
# Calculate max similarity to already selected docs
|
| 311 |
+
max_sim = max(self._document_similarity(doc, sel_doc) for sel_doc, _ in selected)
|
| 312 |
+
|
| 313 |
+
# MMR score: lambda * relevance - (1-lambda) * max_similarity
|
| 314 |
+
mmr_score = lambda_param * rel_score - (1 - lambda_param) * max_sim
|
| 315 |
+
|
| 316 |
+
if mmr_score > best_score:
|
| 317 |
+
best_score = mmr_score
|
| 318 |
+
best_idx = i
|
| 319 |
+
|
| 320 |
+
selected.append(remaining[best_idx])
|
| 321 |
+
remaining.pop(best_idx)
|
| 322 |
+
|
| 323 |
+
return selected
|
| 324 |
+
|
| 325 |
+
def _document_similarity(self, doc1: Document, doc2: Document) -> float:
|
| 326 |
+
"""Calculate similarity between two documents.
|
| 327 |
+
|
| 328 |
+
Uses Jaccard similarity on terms for simplicity.
|
| 329 |
+
|
| 330 |
+
Args:
|
| 331 |
+
doc1: First document.
|
| 332 |
+
doc2: Second document.
|
| 333 |
+
|
| 334 |
+
Returns:
|
| 335 |
+
Similarity score between 0 and 1.
|
| 336 |
+
"""
|
| 337 |
+
terms1 = set(self._tokenize(doc1.page_content.lower()))
|
| 338 |
+
terms2 = set(self._tokenize(doc2.page_content.lower()))
|
| 339 |
+
|
| 340 |
+
if not terms1 or not terms2:
|
| 341 |
+
return 0.0
|
| 342 |
+
|
| 343 |
+
intersection = len(terms1 & terms2)
|
| 344 |
+
union = len(terms1 | terms2)
|
| 345 |
+
|
| 346 |
+
return intersection / union if union > 0 else 0.0
|
| 347 |
+
|
| 348 |
+
@property
|
| 349 |
+
def last_metrics(self) -> CompressionMetrics | None:
|
| 350 |
+
"""Get metrics from the last compression operation."""
|
| 351 |
+
return self._last_metrics
|
| 352 |
+
|
| 353 |
+
def get_compression_stats(self) -> dict[str, Any]:
|
| 354 |
+
"""Get statistics from the last compression.
|
| 355 |
+
|
| 356 |
+
Returns:
|
| 357 |
+
Dictionary with compression metrics, or empty if no compression yet.
|
| 358 |
+
"""
|
| 359 |
+
if self._last_metrics is None:
|
| 360 |
+
return {}
|
| 361 |
+
|
| 362 |
+
return {
|
| 363 |
+
"documents_before": self._last_metrics.documents_before,
|
| 364 |
+
"documents_after": self._last_metrics.documents_after,
|
| 365 |
+
"documents_removed": self._last_metrics.documents_removed,
|
| 366 |
+
"average_relevance": (
|
| 367 |
+
sum(self._last_metrics.relevance_scores) / len(self._last_metrics.relevance_scores)
|
| 368 |
+
if self._last_metrics.relevance_scores
|
| 369 |
+
else 0.0
|
| 370 |
+
),
|
| 371 |
+
}
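The scoring used by `_score_document` above can be tried outside LangChain. The following is a minimal sketch that restates the same simplified BM25 formula (no IDF term, a fixed assumed average length of 100 tokens, and a 0.3 boost for an exact phrase match) on plain strings. The function names `tokenize` and `bm25_style_score` are illustrative only and are not part of the Headroom API.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens, dropping single-character terms."""
    return [t for t in re.findall(r"\b\w+\b", text.lower()) if len(t) > 1]


def bm25_style_score(
    content: str,
    query: str,
    k1: float = 1.5,
    b: float = 0.75,
    avg_dl: int = 100,  # assumed average document length, as in the diff
) -> float:
    """Score `content` against `query` in the 0-1 range (no IDF term)."""
    query_terms = tokenize(query)
    doc_terms = tokenize(content)
    if not query_terms or not doc_terms:
        return 0.0

    # Term frequencies for this single document
    term_freqs: dict[str, int] = {}
    for term in doc_terms:
        term_freqs[term] = term_freqs.get(term, 0) + 1

    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        tf = term_freqs.get(term, 0)
        if tf:
            # Saturating term-frequency component with length normalization
            score += tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_dl))

    # Each query term contributes strictly less than (k1 + 1)
    normalized = score / (len(query_terms) * (k1 + 1))

    # Boost when the whole query appears verbatim in the document
    if query.lower() in content.lower():
        normalized += 0.3

    return min(1.0, normalized)


if __name__ == "__main__":
    docs = [
        "Paris is the capital of France.",
        "Python is a programming language.",
    ]
    for doc in docs:
        print(f"{bm25_style_score(doc, 'capital of France'):.3f}  {doc}")
```

Because there are no corpus statistics, every matching query term is weighted equally; common words such as "of" count as much as rare ones, which is the trade-off the comment in `_score_document` points out.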
|
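The MMR-style selection in `_select_diverse` can be sketched the same way. This is a simplified illustration on `(text, relevance)` pairs, assuming Jaccard similarity over term sets and a fixed trade-off of 0.5 as in the diff; the names `term_set`, `jaccard`, and `select_diverse` are hypothetical, and the `k <= 0` guard is an addition that the original method does not have.

```python
import re


def term_set(text: str) -> set[str]:
    """Set of lowercase word tokens longer than one character."""
    return {t for t in re.findall(r"\b\w+\b", text.lower()) if len(t) > 1}


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the term sets of two texts."""
    ta, tb = term_set(a), term_set(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_diverse(
    scored: list[tuple[str, float]], k: int, lam: float = 0.5
) -> list[tuple[str, float]]:
    """Greedy MMR: pick k items balancing relevance against redundancy."""
    if not scored or k <= 0:
        return []

    # Seed the selection with the most relevant item
    remaining = sorted(scored, key=lambda pair: pair[1], reverse=True)
    selected = [remaining.pop(0)]

    while remaining and len(selected) < k:

        def mmr(pair: tuple[str, float]) -> float:
            text, relevance = pair
            # Penalize similarity to the closest already-selected item
            max_sim = max(jaccard(text, chosen) for chosen, _ in selected)
            return lam * relevance - (1 - lam) * max_sim

        best = max(remaining, key=mmr)
        remaining.remove(best)
        selected.append(best)

    return selected


if __name__ == "__main__":
    candidates = [
        ("paris is the capital of france", 0.9),
        ("paris is the capital of france today", 0.85),
        ("berlin hosts many museums", 0.6),
    ]
    for text, relevance in select_diverse(candidates, k=2):
        print(relevance, text)
```

With `k=2`, the near-duplicate second candidate loses to the less relevant but dissimilar third one, which is the behavior `prefer_diverse=True` is meant to provide.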
headroom/integrations/langchain/streaming.py
ADDED
|
@@ -0,0 +1,341 @@
|
| 1 |
+
"""Streaming metrics tracking for LangChain.
|
| 2 |
+
|
| 3 |
+
This module provides StreamingMetricsTracker for tracking output tokens
|
| 4 |
+
during streaming responses from LangChain models.
|
| 5 |
+
|
| 6 |
+
Example:
|
| 7 |
+
from langchain_openai import ChatOpenAI
|
| 8 |
+
from headroom.integrations import HeadroomChatModel, StreamingMetricsTracker
|
| 9 |
+
|
| 10 |
+
llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))
|
| 11 |
+
tracker = StreamingMetricsTracker(model="gpt-4o")
|
| 12 |
+
|
| 13 |
+
for chunk in llm.stream("Tell me a story"):
|
| 14 |
+
tracker.add_chunk(chunk)
|
| 15 |
+
print(chunk.content, end="", flush=True)
|
| 16 |
+
|
| 17 |
+
print(f"\\nOutput tokens: {tracker.output_tokens}")
|
| 18 |
+
"""
|
| 19 |
+
|
| 20 |
+
from __future__ import annotations
|
| 21 |
+
|
| 22 |
+
import logging
|
| 23 |
+
from dataclasses import dataclass
|
| 24 |
+
from datetime import datetime
|
| 25 |
+
from typing import Any
|
| 26 |
+
|
| 27 |
+
# LangChain imports - these are optional dependencies
|
| 28 |
+
try:
|
| 29 |
+
from langchain_core.messages import AIMessageChunk
|
| 30 |
+
from langchain_core.outputs import ChatGenerationChunk
|
| 31 |
+
|
| 32 |
+
LANGCHAIN_AVAILABLE = True
|
| 33 |
+
except ImportError:
|
| 34 |
+
LANGCHAIN_AVAILABLE = False
|
| 35 |
+
AIMessageChunk = object # type: ignore[misc,assignment]
|
| 36 |
+
ChatGenerationChunk = object # type: ignore[misc,assignment]
|
| 37 |
+
|
| 38 |
+
from headroom.providers import OpenAIProvider
|
| 39 |
+
|
| 40 |
+
logger = logging.getLogger(__name__)
|
| 41 |
+
|
| 42 |
+
|
| 43 |
+
def _check_langchain_available() -> None:
|
| 44 |
+
"""Raise ImportError if LangChain is not installed."""
|
| 45 |
+
if not LANGCHAIN_AVAILABLE:
|
| 46 |
+
raise ImportError(
|
| 47 |
+
"LangChain is required for this integration. "
|
| 48 |
+
"Install with: pip install headroom[langchain] "
|
| 49 |
+
"or: pip install langchain-core"
|
| 50 |
+
)
|
| 51 |
+
|
| 52 |
+
|
| 53 |
+
@dataclass
|
| 54 |
+
class StreamingMetrics:
|
| 55 |
+
"""Metrics from a streaming response."""
|
| 56 |
+
|
| 57 |
+
output_tokens: int
|
| 58 |
+
chunk_count: int
|
| 59 |
+
content_length: int
|
| 60 |
+
start_time: datetime
|
| 61 |
+
end_time: datetime | None
|
| 62 |
+
duration_ms: float | None
|
| 63 |
+
|
| 64 |
+
def to_dict(self) -> dict[str, Any]:
|
| 65 |
+
"""Convert to dictionary."""
|
| 66 |
+
return {
|
| 67 |
+
"output_tokens": self.output_tokens,
|
| 68 |
+
"chunk_count": self.chunk_count,
|
| 69 |
+
"content_length": self.content_length,
|
| 70 |
+
"start_time": self.start_time.isoformat(),
|
| 71 |
+
"end_time": self.end_time.isoformat() if self.end_time else None,
|
| 72 |
+
"duration_ms": self.duration_ms,
|
| 73 |
+
}
|
| 74 |
+
|
| 75 |
+
|
| 76 |
+
class StreamingMetricsTracker:
|
| 77 |
+
"""Tracks output tokens and metrics during streaming.
|
| 78 |
+
|
| 79 |
+
Accumulates content from streaming chunks and provides accurate
|
| 80 |
+
token counting for the streamed output.
|
| 81 |
+
|
| 82 |
+
Example:
|
| 83 |
+
tracker = StreamingMetricsTracker(model="gpt-4o")
|
| 84 |
+
|
| 85 |
+
async for chunk in llm.astream(messages):
|
| 86 |
+
tracker.add_chunk(chunk)
|
| 87 |
+
print(chunk.content, end="")
|
| 88 |
+
|
| 89 |
+
print(f"\\nTokens: {tracker.output_tokens}")
|
| 90 |
+
print(f"Duration: {tracker.duration_ms}ms")
|
| 91 |
+
|
| 92 |
+
Attributes:
|
| 93 |
+
model: Model name for token counting
|
| 94 |
+
content: Accumulated content from all chunks
|
| 95 |
+
output_tokens: Estimated token count for output
|
| 96 |
+
chunk_count: Number of chunks received
|
| 97 |
+
"""
|
| 98 |
+
|
| 99 |
+
def __init__(
|
| 100 |
+
self,
|
| 101 |
+
model: str = "gpt-4o",
|
| 102 |
+
provider: Any = None,
|
| 103 |
+
):
|
| 104 |
+
"""Initialize StreamingMetricsTracker.
|
| 105 |
+
|
| 106 |
+
Args:
|
| 107 |
+
model: Model name for token counting. Default "gpt-4o".
|
| 108 |
+
provider: Headroom provider for token counting. Uses
|
| 109 |
+
OpenAIProvider if not specified.
|
| 110 |
+
"""
|
| 111 |
+
_check_langchain_available()
|
| 112 |
+
|
| 113 |
+
self._model = model
|
| 114 |
+
self._provider = provider or OpenAIProvider()
|
| 115 |
+
self._content = ""
|
| 116 |
+
self._chunk_count = 0
|
| 117 |
+
self._start_time: datetime | None = None
|
| 118 |
+
self._end_time: datetime | None = None
|
| 119 |
+
|
| 120 |
+
def add_chunk(self, chunk: Any) -> None:
|
| 121 |
+
"""Add a streaming chunk to the tracker.
|
| 122 |
+
|
| 123 |
+
Extracts content from various chunk types:
|
| 124 |
+
- AIMessageChunk
|
| 125 |
+
- ChatGenerationChunk
|
| 126 |
+
- dict with 'content' key
|
| 127 |
+
- string
|
| 128 |
+
|
| 129 |
+
Args:
|
| 130 |
+
chunk: Streaming chunk from LangChain.
|
| 131 |
+
"""
|
| 132 |
+
if self._start_time is None:
|
| 133 |
+
self._start_time = datetime.now()
|
| 134 |
+
|
| 135 |
+
self._chunk_count += 1
|
| 136 |
+
|
| 137 |
+
# Extract content from various chunk types
|
| 138 |
+
content = self._extract_content(chunk)
|
| 139 |
+
if content:
|
| 140 |
+
self._content += content
|
| 141 |
+
|
| 142 |
+
def _extract_content(self, chunk: Any) -> str:
|
| 143 |
+
"""Extract string content from a chunk.
|
| 144 |
+
|
| 145 |
+
Args:
|
| 146 |
+
chunk: Streaming chunk of various types.
|
| 147 |
+
|
| 148 |
+
Returns:
|
| 149 |
+
Extracted content string.
|
| 150 |
+
"""
|
| 151 |
+
# AIMessageChunk
|
| 152 |
+
if hasattr(chunk, "content"):
|
| 153 |
+
content = chunk.content
|
| 154 |
+
if isinstance(content, str):
|
| 155 |
+
return content
|
| 156 |
+
return str(content) if content else ""
|
| 157 |
+
|
| 158 |
+
# ChatGenerationChunk
|
| 159 |
+
if hasattr(chunk, "message") and hasattr(chunk.message, "content"):
|
| 160 |
+
content = chunk.message.content
|
| 161 |
+
if isinstance(content, str):
|
| 162 |
+
return content
|
| 163 |
+
return str(content) if content else ""
|
| 164 |
+
|
| 165 |
+
# dict
|
| 166 |
+
if isinstance(chunk, dict):
|
| 167 |
+
return str(chunk.get("content", ""))
|
| 168 |
+
|
| 169 |
+
# string
|
| 170 |
+
if isinstance(chunk, str):
|
| 171 |
+
return chunk
|
| 172 |
+
|
| 173 |
+
return ""
|
| 174 |
+
|
| 175 |
+
def finish(self) -> StreamingMetrics:
|
| 176 |
+
"""Mark streaming as complete and return final metrics.
|
| 177 |
+
|
| 178 |
+
Returns:
|
| 179 |
+
StreamingMetrics with final values.
|
| 180 |
+
"""
|
| 181 |
+
self._end_time = datetime.now()
|
| 182 |
+
|
| 183 |
+
duration_ms = None
|
| 184 |
+
if self._start_time:
|
| 185 |
+
duration_ms = (self._end_time - self._start_time).total_seconds() * 1000
|
| 186 |
+
|
| 187 |
+
return StreamingMetrics(
|
| 188 |
+
output_tokens=self.output_tokens,
|
| 189 |
+
chunk_count=self._chunk_count,
|
| 190 |
+
content_length=len(self._content),
|
| 191 |
+
start_time=self._start_time or self._end_time,
|
| 192 |
+
end_time=self._end_time,
|
| 193 |
+
duration_ms=duration_ms,
|
| 194 |
+
)
|
| 195 |
+
|
| 196 |
+
@property
|
| 197 |
+
def content(self) -> str:
|
| 198 |
+
"""Get accumulated content."""
|
| 199 |
+
return self._content
|
| 200 |
+
|
| 201 |
+
@property
|
| 202 |
+
def output_tokens(self) -> int:
|
| 203 |
+
"""Get estimated output token count."""
|
| 204 |
+
if not self._content:
|
| 205 |
+
return 0
|
| 206 |
+
token_counter = self._provider.get_token_counter(self._model)
|
| 207 |
+
return token_counter.count_text(self._content)
|
| 208 |
+
|
| 209 |
+
@property
|
| 210 |
+
def chunk_count(self) -> int:
|
| 211 |
+
"""Get number of chunks received."""
|
| 212 |
+
return self._chunk_count
|
| 213 |
+
|
| 214 |
+
@property
|
| 215 |
+
def duration_ms(self) -> float | None:
|
| 216 |
+
"""Get duration in milliseconds (after finish())."""
|
| 217 |
+
if self._start_time is None or self._end_time is None:
|
| 218 |
+
return None
|
| 219 |
+
return (self._end_time - self._start_time).total_seconds() * 1000
|
| 220 |
+
|
| 221 |
+
def reset(self) -> None:
|
| 222 |
+
"""Reset tracker for reuse."""
|
| 223 |
+
self._content = ""
|
| 224 |
+
self._chunk_count = 0
|
| 225 |
+
self._start_time = None
|
| 226 |
+
self._end_time = None
|
| 227 |
+
|
| 228 |
+
|
| 229 |
+
class StreamingMetricsCallback:
|
| 230 |
+
"""Context manager for tracking streaming metrics.
|
| 231 |
+
|
| 232 |
+
Provides a clean interface for tracking a complete streaming
|
| 233 |
+
response with automatic timing.
|
| 234 |
+
|
| 235 |
+
Example:
|
| 236 |
+
with StreamingMetricsCallback(model="gpt-4o") as tracker:
|
| 237 |
+
for chunk in llm.stream(messages):
|
| 238 |
+
tracker.add_chunk(chunk)
|
| 239 |
+
print(chunk.content, end="")
|
| 240 |
+
|
| 241 |
+
print(f"\\nMetrics: {tracker.metrics}")
|
| 242 |
+
|
| 243 |
+
Attributes:
|
| 244 |
+
tracker: The underlying StreamingMetricsTracker
|
| 245 |
+
metrics: Final metrics after context exit
|
| 246 |
+
"""
|
| 247 |
+
|
| 248 |
+
def __init__(self, model: str = "gpt-4o", provider: Any = None):
|
| 249 |
+
"""Initialize StreamingMetricsCallback.
|
| 250 |
+
|
| 251 |
+
Args:
|
| 252 |
+
model: Model name for token counting.
|
| 253 |
+
provider: Headroom provider for token counting.
|
| 254 |
+
"""
|
| 255 |
+
self._tracker = StreamingMetricsTracker(model=model, provider=provider)
|
| 256 |
+
self._metrics: StreamingMetrics | None = None
|
| 257 |
+
|
| 258 |
+
def __enter__(self) -> StreamingMetricsTracker:
|
| 259 |
+
"""Enter context, return tracker."""
|
| 260 |
+
return self._tracker
|
| 261 |
+
|
| 262 |
+
def __exit__(self, exc_type: Any, exc_val: Any, exc_tb: Any) -> None:
|
| 263 |
+
"""Exit context, finalize metrics."""
|
| 264 |
+
self._metrics = self._tracker.finish()
|
| 265 |
+
|
| 266 |
+
@property
|
| 267 |
+
def tracker(self) -> StreamingMetricsTracker:
|
| 268 |
+
"""Get the tracker."""
|
| 269 |
+
return self._tracker
|
| 270 |
+
|
| 271 |
+
@property
|
| 272 |
+
def metrics(self) -> StreamingMetrics | None:
|
| 273 |
+
"""Get final metrics (after context exit)."""
|
| 274 |
+
return self._metrics
|
| 275 |
+
|
| 276 |
+
|
| 277 |
+
def track_streaming_response(
|
| 278 |
+
stream: Any,
|
| 279 |
+
model: str = "gpt-4o",
|
| 280 |
+
provider: Any = None,
|
| 281 |
+
) -> tuple[str, StreamingMetrics]:
|
| 282 |
+
"""Track a complete streaming response.
|
| 283 |
+
|
| 284 |
+
Convenience function that consumes a stream and returns the
|
| 285 |
+
accumulated content and metrics.
|
| 286 |
+
|
| 287 |
+
Args:
|
| 288 |
+
stream: Iterable of streaming chunks.
|
| 289 |
+
model: Model name for token counting.
|
| 290 |
+
provider: Headroom provider for token counting.
|
| 291 |
+
|
| 292 |
+
Returns:
|
| 293 |
+
Tuple of (accumulated_content, metrics).
|
| 294 |
+
|
| 295 |
+
Example:
|
| 296 |
+
content, metrics = track_streaming_response(
|
| 297 |
+
llm.stream(messages),
|
| 298 |
+
model="gpt-4o"
|
| 299 |
+
)
|
| 300 |
+
print(f"Content: {content}")
|
| 301 |
+
print(f"Tokens: {metrics.output_tokens}")
|
| 302 |
+
"""
|
| 303 |
+
tracker = StreamingMetricsTracker(model=model, provider=provider)
|
| 304 |
+
|
| 305 |
+
for chunk in stream:
|
| 306 |
+
tracker.add_chunk(chunk)
|
| 307 |
+
|
| 308 |
+
metrics = tracker.finish()
|
| 309 |
+
return tracker.content, metrics
|
| 310 |
+
|
| 311 |
+
|
| 312 |
+
async def track_async_streaming_response(
|
| 313 |
+
stream: Any,
|
| 314 |
+
model: str = "gpt-4o",
|
| 315 |
+
provider: Any = None,
|
| 316 |
+
) -> tuple[str, StreamingMetrics]:
|
| 317 |
+
"""Track a complete async streaming response.
|
| 318 |
+
|
| 319 |
+
Async version of track_streaming_response.
|
| 320 |
+
|
| 321 |
+
Args:
|
| 322 |
+
stream: Async iterable of streaming chunks.
|
| 323 |
+
model: Model name for token counting.
|
| 324 |
+
provider: Headroom provider for token counting.
|
| 325 |
+
|
| 326 |
+
Returns:
|
| 327 |
+
Tuple of (accumulated_content, metrics).
|
| 328 |
+
|
| 329 |
+
Example:
|
| 330 |
+
content, metrics = await track_async_streaming_response(
|
| 331 |
+
llm.astream(messages),
|
| 332 |
+
model="gpt-4o"
|
| 333 |
+
)
|
| 334 |
+
"""
|
| 335 |
+
tracker = StreamingMetricsTracker(model=model, provider=provider)
|
| 336 |
+
|
| 337 |
+
async for chunk in stream:
|
| 338 |
+
tracker.add_chunk(chunk)
|
| 339 |
+
|
| 340 |
+
metrics = tracker.finish()
|
| 341 |
+
return tracker.content, metrics
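The chunk handling in `StreamingMetricsTracker` can be exercised without LangChain or a Headroom provider. The sketch below is an assumption-laden stand-in: `Chunk` is a hypothetical substitute for `AIMessageChunk`, `extract_content` and `track` are illustrative names, and token counting is left out because it depends on the provider's token counter. As in the diff, timing starts when the first chunk arrives.

```python
import time
from dataclasses import dataclass
from typing import Any, Iterable


@dataclass
class Chunk:
    """Hypothetical stand-in for a LangChain message chunk."""

    content: str


def extract_content(chunk: Any) -> str:
    """Pull text out of a string, a dict, or an object with .content."""
    if isinstance(chunk, str):
        return chunk
    if isinstance(chunk, dict):
        return str(chunk.get("content", ""))

    content = getattr(chunk, "content", None)
    if content is None:
        # ChatGenerationChunk-style objects wrap the message
        message = getattr(chunk, "message", None)
        content = getattr(message, "content", None)

    if isinstance(content, str):
        return content
    return str(content) if content else ""


def track(stream: Iterable[Any]) -> tuple[str, dict[str, Any]]:
    """Consume a stream and return the accumulated text plus basic metrics."""
    start: float | None = None
    parts: list[str] = []
    chunk_count = 0

    for chunk in stream:
        if start is None:
            start = time.perf_counter()  # clock starts at the first chunk
        chunk_count += 1
        parts.append(extract_content(chunk))

    duration_ms = (time.perf_counter() - start) * 1000 if start is not None else None
    text = "".join(parts)
    metrics = {
        "chunk_count": chunk_count,
        "content_length": len(text),
        "duration_ms": duration_ms,
    }
    return text, metrics


if __name__ == "__main__":
    text, metrics = track(["Hel", {"content": "lo"}, Chunk(", wor"), Chunk("ld")])
    print(text)
    print(metrics)
```

One detail worth keeping in mind when using the real tracker: `duration_ms` is only available after `finish()` has been called, since that is where the end time is recorded.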
|
headroom/integrations/mcp/__init__.py
ADDED
|
@@ -0,0 +1,37 @@
|
| 1 |
+
"""MCP (Model Context Protocol) integration for Headroom.
|
| 2 |
+
|
| 3 |
+
This package provides compression utilities for MCP tool results,
|
| 4 |
+
helping reduce context usage when tools return large outputs.
|
| 5 |
+
|
| 6 |
+
Example:
|
| 7 |
+
from headroom.integrations.mcp import compress_tool_result
|
| 8 |
+
|
| 9 |
+
# Compress large tool output
|
| 10 |
+
result = compress_tool_result(
|
| 11 |
+
tool_name="search",
|
| 12 |
+
result=large_json_result,
|
| 13 |
+
max_chars=5000,
|
| 14 |
+
)
|
| 15 |
+
"""
|
| 16 |
+
|
| 17 |
+
from .server import (
|
| 18 |
+
DEFAULT_MCP_PROFILES,
|
| 19 |
+
HeadroomMCPClientWrapper,
|
| 20 |
+
HeadroomMCPCompressor,
|
| 21 |
+
MCPCompressionResult,
|
| 22 |
+
MCPToolProfile,
|
| 23 |
+
compress_tool_result,
|
| 24 |
+
compress_tool_result_with_metrics,
|
| 25 |
+
create_headroom_mcp_proxy,
|
| 26 |
+
)
|
| 27 |
+
|
| 28 |
+
__all__ = [
|
| 29 |
+
"HeadroomMCPCompressor",
|
| 30 |
+
"HeadroomMCPClientWrapper",
|
| 31 |
+
"MCPCompressionResult",
|
| 32 |
+
"MCPToolProfile",
|
| 33 |
+
"compress_tool_result",
|
| 34 |
+
"compress_tool_result_with_metrics",
|
| 35 |
+
"create_headroom_mcp_proxy",
|
| 36 |
+
"DEFAULT_MCP_PROFILES",
|
| 37 |
+
]
|
headroom/integrations/{mcp.py → mcp/server.py}
RENAMED
|
File without changes
|
headroom/transforms/llmlingua_compressor.py
CHANGED
|
@@ -88,7 +88,8 @@ def _get_llmlingua_compressor(model_name: str, device: str) -> Any:
|
|
| 88 |
from llmlingua import PromptCompressor
|
| 89 |
|
| 90 |
logger.info(
|
| 91 |
-
"Loading LLMLingua-2 model: %s on device: %s
|
|
|
|
| 92 |
model_name,
|
| 93 |
device,
|
| 94 |
)
|
|
|
|
| 88 |
from llmlingua import PromptCompressor
|
| 89 |
|
| 90 |
logger.info(
|
| 91 |
+
"Loading LLMLingua-2 model: %s on device: %s "
|
| 92 |
+
"(this may take 10-30s on first run)",
|
| 93 |
model_name,
|
| 94 |
device,
|
| 95 |
)
|
pyproject.toml
CHANGED
|
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
|
|
| 4 |
|
| 5 |
[project]
|
| 6 |
name = "headroom-ai"
|
| 7 |
-
version = "0.2.
|
| 8 |
description = "The Context Optimization Layer for LLM Applications - Cut costs by 50-90%"
|
| 9 |
readme = "README.md"
|
| 10 |
license = "Apache-2.0"
|
|
|
|
| 4 |
|
| 5 |
[project]
|
| 6 |
name = "headroom-ai"
|
| 7 |
+
version = "0.2.3"
|
| 8 |
description = "The Context Optimization Layer for LLM Applications - Cut costs by 50-90%"
|
| 9 |
readme = "README.md"
|
| 10 |
license = "Apache-2.0"
|
tests/test_integrations/langchain/__init__.py
ADDED
|
File without changes
|
tests/test_integrations/{test_langchain.py → langchain/test_chat_model.py}
RENAMED
|
@@ -488,7 +488,7 @@ class TestOptimizeMessages:
|
|
| 488 |
"""Basic message optimization."""
|
| 489 |
from headroom.integrations import optimize_messages
|
| 490 |
|
| 491 |
-
with patch("headroom.integrations.langchain.TransformPipeline") as MockPipeline:
|
| 492 |
mock_instance = MagicMock()
|
| 493 |
mock_result = MagicMock()
|
| 494 |
mock_result.messages = [
|
|
@@ -513,7 +513,7 @@ class TestOptimizeMessages:
|
|
| 513 |
|
| 514 |
config = HeadroomConfig(default_mode=HeadroomMode.AUDIT)
|
| 515 |
|
| 516 |
-
with patch("headroom.integrations.langchain.TransformPipeline") as MockPipeline:
|
| 517 |
mock_instance = MagicMock()
|
| 518 |
mock_result = MagicMock()
|
| 519 |
mock_result.messages = []
|
|
@@ -547,7 +547,7 @@ class TestOptimizeMessages:
|
|
| 547 |
ToolMessage(content="Sunny", tool_call_id="1"),
|
| 548 |
]
|
| 549 |
|
| 550 |
-
with patch("headroom.integrations.langchain.TransformPipeline") as MockPipeline:
|
| 551 |
mock_instance = MagicMock()
|
| 552 |
mock_result = MagicMock()
|
| 553 |
mock_result.messages = [
|
|
|
|
| 488 |
"""Basic message optimization."""
|
| 489 |
from headroom.integrations import optimize_messages
|
| 490 |
|
| 491 |
+
with patch("headroom.integrations.langchain.chat_model.TransformPipeline") as MockPipeline:
|
| 492 |
mock_instance = MagicMock()
|
| 493 |
mock_result = MagicMock()
|
| 494 |
mock_result.messages = [
|
|
|
|
| 513 |
|
| 514 |
config = HeadroomConfig(default_mode=HeadroomMode.AUDIT)
|
| 515 |
|
| 516 |
+
with patch("headroom.integrations.langchain.chat_model.TransformPipeline") as MockPipeline:
|
| 517 |
mock_instance = MagicMock()
|
| 518 |
mock_result = MagicMock()
|
| 519 |
mock_result.messages = []
|
|
|
|
| 547 |
ToolMessage(content="Sunny", tool_call_id="1"),
|
| 548 |
]
|
| 549 |
|
| 550 |
+
with patch("headroom.integrations.langchain.chat_model.TransformPipeline") as MockPipeline:
|
| 551 |
mock_instance = MagicMock()
|
| 552 |
mock_result = MagicMock()
|
| 553 |
mock_result.messages = [
|
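The three hunks above change only the `patch()` target, from `headroom.integrations.langchain.TransformPipeline` to `headroom.integrations.langchain.chat_model.TransformPipeline`. This matches the usual `unittest.mock` rule that a name has to be patched in the namespace where it is looked up, which after the restructure is presumably the new `chat_model` module. The sketch below illustrates that rule in isolation. `demo_core`, `demo_chat_model`, and `optimize` are hypothetical in-memory stand-ins and are not part of headroom.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical module that defines the real class (stand-in for the library core).
core = types.ModuleType("demo_core")
exec(
    "class TransformPipeline:\n"
    "    def apply(self):\n"
    "        return 'real'\n",
    core.__dict__,
)
sys.modules["demo_core"] = core

# Hypothetical module that imports the class by name and uses it
# (stand-in for a module such as chat_model.py).
chat_model = types.ModuleType("demo_chat_model")
sys.modules["demo_chat_model"] = chat_model
exec(
    "from demo_core import TransformPipeline\n"
    "\n"
    "def optimize():\n"
    "    return TransformPipeline().apply()\n",
    chat_model.__dict__,
)

# Patching the defining module leaves the name already bound in demo_chat_model alone.
with patch("demo_core.TransformPipeline") as mock_cls:
    mock_cls.return_value.apply.return_value = "mocked"
    result_wrong_target = chat_model.optimize()

# Patching the name where it is looked up replaces what optimize() sees.
with patch("demo_chat_model.TransformPipeline") as mock_cls:
    mock_cls.return_value.apply.return_value = "mocked"
    result_right_target = chat_model.optimize()

print(result_wrong_target, result_right_target)  # prints: real mocked
```

If a test keeps patching the old path after a module move, it tends to fail in one of two ways: the patch raises `AttributeError` because the name no longer exists there, or it silently patches a re-export while the code under test still calls the real class.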
tests/test_integrations/{test_langchain_evals.py → langchain/test_evals.py}
RENAMED
|
File without changes
|
tests/test_integrations/langchain/test_extended.py
ADDED
|
@@ -0,0 +1,646 @@
|
| 1 |
+
"""Tests for extended LangChain integration modules.
|
| 2 |
+
|
| 3 |
+
Tests cover:
|
| 4 |
+
1. langchain.providers - Provider auto-detection
|
| 5 |
+
2. langchain.memory - HeadroomChatMessageHistory
|
| 6 |
+
3. langchain.retriever - HeadroomDocumentCompressor
|
| 7 |
+
4. langchain.agents - HeadroomToolWrapper
|
| 8 |
+
5. langchain.langsmith - LangSmith integration
|
| 9 |
+
6. langchain.streaming - Streaming metrics
|
| 10 |
+
"""
|
| 11 |
+
|
| 12 |
+
import json
|
| 13 |
+
from unittest.mock import MagicMock
|
| 14 |
+
|
| 15 |
+
import pytest
|
| 16 |
+
|
| 17 |
+
# Check if LangChain is available
|
| 18 |
+
try:
|
| 19 |
+
from langchain_core.documents import Document
|
| 20 |
+
from langchain_core.messages import AIMessage, HumanMessage
|
| 21 |
+
from langchain_core.tools import StructuredTool
|
| 22 |
+
|
| 23 |
+
LANGCHAIN_AVAILABLE = True
|
| 24 |
+
except ImportError:
|
| 25 |
+
LANGCHAIN_AVAILABLE = False
|
| 26 |
+
|
| 27 |
+
# Skip all tests if LangChain not installed
|
| 28 |
+
pytestmark = pytest.mark.skipif(not LANGCHAIN_AVAILABLE, reason="LangChain not installed")
|
| 29 |
+
|
| 30 |
+
|
| 31 |
+
class TestProviderDetection:
|
| 32 |
+
"""Tests for the langchain.providers module."""
|
| 33 |
+
|
| 34 |
+
def test_detect_openai_provider(self):
|
| 35 |
+
"""Detect OpenAI from ChatOpenAI class."""
|
| 36 |
+
from headroom.integrations.langchain.providers import detect_provider
|
| 37 |
+
|
| 38 |
+
mock_model = MagicMock()
|
| 39 |
+
mock_model.__class__.__name__ = "ChatOpenAI"
|
| 40 |
+
mock_model.__class__.__module__ = "langchain_openai.chat_models"
|
| 41 |
+
|
| 42 |
+
provider = detect_provider(mock_model)
|
| 43 |
+
assert provider == "openai"
|
| 44 |
+
|
| 45 |
+
def test_detect_anthropic_provider(self):
|
| 46 |
+
"""Detect Anthropic from ChatAnthropic class."""
|
| 47 |
+
from headroom.integrations.langchain.providers import detect_provider
|
| 48 |
+
|
| 49 |
+
mock_model = MagicMock()
|
| 50 |
+
mock_model.__class__.__name__ = "ChatAnthropic"
|
| 51 |
+
mock_model.__class__.__module__ = "langchain_anthropic.chat_models"
|
| 52 |
+
|
| 53 |
+
provider = detect_provider(mock_model)
|
| 54 |
+
assert provider == "anthropic"
|
| 55 |
+
|
| 56 |
+
def test_detect_google_provider(self):
|
| 57 |
+
"""Detect Google from ChatGoogleGenerativeAI class."""
|
| 58 |
+
from headroom.integrations.langchain.providers import detect_provider
|
| 59 |
+
|
| 60 |
+
mock_model = MagicMock()
|
| 61 |
+
mock_model.__class__.__name__ = "ChatGoogleGenerativeAI"
|
| 62 |
+
mock_model.__class__.__module__ = "langchain_google_genai"
|
| 63 |
+
|
| 64 |
+
provider = detect_provider(mock_model)
|
| 65 |
+
assert provider == "google"
|
| 66 |
+
|
| 67 |
+
def test_detect_fallback_to_openai(self):
|
| 68 |
+
"""Fall back to OpenAI for unknown models."""
|
| 69 |
+
from headroom.integrations.langchain.providers import detect_provider
|
| 70 |
+
|
| 71 |
+
mock_model = MagicMock()
|
| 72 |
+
mock_model.__class__.__name__ = "CustomChatModel"
|
| 73 |
+
mock_model.__class__.__module__ = "my_custom_module"
|
| 74 |
+
|
| 75 |
+
provider = detect_provider(mock_model)
|
| 76 |
+
assert provider == "openai"
|
| 77 |
+
|
| 78 |
+
def test_detect_from_model_name_claude(self):
|
| 79 |
+
"""Detect Anthropic from model name containing 'claude'."""
|
| 80 |
+
from headroom.integrations.langchain.providers import detect_provider
|
| 81 |
+
|
| 82 |
+
mock_model = MagicMock()
|
| 83 |
+
mock_model.__class__.__name__ = "CustomModel"
|
| 84 |
+
mock_model.__class__.__module__ = "custom"
|
| 85 |
+
mock_model.model_name = "claude-3-5-sonnet-20241022"
|
| 86 |
+
|
| 87 |
+
provider = detect_provider(mock_model)
|
| 88 |
+
assert provider == "anthropic"
|
| 89 |
+
|
| 90 |
+
def test_get_headroom_provider_openai(self):
|
| 91 |
+
"""Get OpenAIProvider for OpenAI model."""
|
| 92 |
+
from headroom.integrations.langchain.providers import get_headroom_provider
|
| 93 |
+
from headroom.providers import OpenAIProvider
|
| 94 |
+
|
| 95 |
+
mock_model = MagicMock()
|
| 96 |
+
mock_model.__class__.__name__ = "ChatOpenAI"
|
| 97 |
+
mock_model.__class__.__module__ = "langchain_openai"
|
| 98 |
+
|
| 99 |
+
provider = get_headroom_provider(mock_model)
|
| 100 |
+
assert isinstance(provider, OpenAIProvider)
|
| 101 |
+
|
| 102 |
+
def test_get_headroom_provider_anthropic(self):
|
| 103 |
+
"""Get AnthropicProvider for Anthropic model."""
|
| 104 |
+
from headroom.integrations.langchain.providers import get_headroom_provider
|
| 105 |
+
from headroom.providers import AnthropicProvider
|
| 106 |
+
|
| 107 |
+
mock_model = MagicMock()
|
| 108 |
+
mock_model.__class__.__name__ = "ChatAnthropic"
|
| 109 |
+
mock_model.__class__.__module__ = "langchain_anthropic"
|
| 110 |
+
|
| 111 |
+
provider = get_headroom_provider(mock_model)
|
| 112 |
+
assert isinstance(provider, AnthropicProvider)
|
| 113 |
+
|
| 114 |
+
def test_get_model_name_from_langchain(self):
|
| 115 |
+
"""Extract model name from LangChain model."""
|
| 116 |
+
from headroom.integrations.langchain.providers import get_model_name_from_langchain
|
| 117 |
+
|
| 118 |
+
mock_model = MagicMock()
|
| 119 |
+
mock_model.model_name = "gpt-4o"
|
| 120 |
+
|
| 121 |
+
name = get_model_name_from_langchain(mock_model)
|
| 122 |
+
assert name == "gpt-4o"
|
| 123 |
+
|
| 124 |
+
def test_get_model_name_fallback(self):
|
| 125 |
+
"""Fall back when model name not available."""
|
| 126 |
+
from headroom.integrations.langchain.providers import get_model_name_from_langchain
|
| 127 |
+
|
| 128 |
+
mock_model = MagicMock(spec=[])
|
| 129 |
+
mock_model.__class__.__name__ = "ChatOpenAI"
|
| 130 |
+
|
| 131 |
+
name = get_model_name_from_langchain(mock_model)
|
| 132 |
+
assert name == "gpt-4o" # Default for OpenAI
|
| 133 |
+
|
| 134 |
+
|
| 135 |
+
class TestHeadroomChatMessageHistory:
|
| 136 |
+
"""Tests for HeadroomChatMessageHistory memory wrapper."""
|
| 137 |
+
|
| 138 |
+
def test_init(self):
|
| 139 |
+
"""Initialize with base history."""
|
| 140 |
+
from headroom.integrations.langchain.memory import HeadroomChatMessageHistory
|
| 141 |
+
|
| 142 |
+
mock_history = MagicMock()
|
| 143 |
+
mock_history.messages = []
|
| 144 |
+
|
| 145 |
+
wrapper = HeadroomChatMessageHistory(
|
| 146 |
+
mock_history,
|
| 147 |
+
compress_threshold_tokens=4000,
|
| 148 |
+
keep_recent_turns=5,
|
| 149 |
+
)
|
| 150 |
+
|
| 151 |
+
assert wrapper._base is mock_history
|
| 152 |
+
assert wrapper._threshold == 4000
|
| 153 |
+
assert wrapper._keep_recent_turns == 5
|
| 154 |
+
|
| 155 |
+
def test_messages_passthrough_under_threshold(self):
|
| 156 |
+
"""Messages pass through when under threshold."""
|
| 157 |
+
from headroom.integrations.langchain.memory import HeadroomChatMessageHistory
|
| 158 |
+
|
| 159 |
+
mock_history = MagicMock()
|
| 160 |
+
mock_history.messages = [
|
| 161 |
+
HumanMessage(content="Hello"),
|
| 162 |
+
AIMessage(content="Hi there!"),
|
| 163 |
+
]
|
| 164 |
+
|
| 165 |
+
wrapper = HeadroomChatMessageHistory(
|
| 166 |
+
mock_history,
|
| 167 |
+
compress_threshold_tokens=10000, # High threshold
|
| 168 |
+
)
|
| 169 |
+
|
| 170 |
+
messages = wrapper.messages
|
| 171 |
+
assert len(messages) == 2
|
| 172 |
+
assert messages[0].content == "Hello"
|
| 173 |
+
|
| 174 |
+
def test_add_message_delegates(self):
|
| 175 |
+
"""add_message delegates to base history."""
|
| 176 |
+
from headroom.integrations.langchain.memory import HeadroomChatMessageHistory
|
| 177 |
+
|
| 178 |
+
mock_history = MagicMock()
|
| 179 |
+
mock_history.messages = []
|
| 180 |
+
|
| 181 |
+
wrapper = HeadroomChatMessageHistory(mock_history)
|
| 182 |
+
message = HumanMessage(content="Test")
|
| 183 |
+
wrapper.add_message(message)
|
| 184 |
+
|
| 185 |
+
mock_history.add_message.assert_called_once_with(message)
|
| 186 |
+
|
| 187 |
+
def test_clear_delegates(self):
|
| 188 |
+
"""clear delegates to base history."""
|
| 189 |
+
from headroom.integrations.langchain.memory import HeadroomChatMessageHistory
|
| 190 |
+
|
| 191 |
+
mock_history = MagicMock()
|
| 192 |
+
mock_history.messages = []
|
| 193 |
+
|
| 194 |
+
wrapper = HeadroomChatMessageHistory(mock_history)
|
| 195 |
+
wrapper.clear()
|
| 196 |
+
|
| 197 |
+
mock_history.clear.assert_called_once()
|
| 198 |
+
|
| 199 |
+
def test_get_compression_stats(self):
|
| 200 |
+
"""Get compression statistics."""
|
| 201 |
+
from headroom.integrations.langchain.memory import HeadroomChatMessageHistory
|
| 202 |
+
|
| 203 |
+
mock_history = MagicMock()
|
| 204 |
+
mock_history.messages = []
|
| 205 |
+
|
| 206 |
+
wrapper = HeadroomChatMessageHistory(mock_history)
|
| 207 |
+
stats = wrapper.get_compression_stats()
|
| 208 |
+
|
| 209 |
+
assert "compression_count" in stats
|
| 210 |
+
assert "total_tokens_saved" in stats
|
| 211 |
+
assert stats["compression_count"] == 0
|
| 212 |
+
|
| 213 |
+
|
| 214 |
+
class TestHeadroomDocumentCompressor:
|
| 215 |
+
"""Tests for HeadroomDocumentCompressor retriever integration."""
|
| 216 |
+
|
| 217 |
+
def test_init(self):
|
| 218 |
+
"""Initialize with defaults."""
|
| 219 |
+
from headroom.integrations.langchain.retriever import HeadroomDocumentCompressor
|
| 220 |
+
|
| 221 |
+
compressor = HeadroomDocumentCompressor()
|
| 222 |
+
|
| 223 |
+
assert compressor.max_documents == 10
|
| 224 |
+
assert compressor.min_relevance == 0.0
|
| 225 |
+
assert compressor.prefer_diverse is False
|
| 226 |
+
|
| 227 |
+
def test_init_custom(self):
|
| 228 |
+
"""Initialize with custom settings."""
|
| 229 |
+
from headroom.integrations.langchain.retriever import HeadroomDocumentCompressor
|
| 230 |
+
|
| 231 |
+
compressor = HeadroomDocumentCompressor(
|
| 232 |
+
max_documents=5,
|
| 233 |
+
min_relevance=0.5,
|
| 234 |
+
prefer_diverse=True,
|
| 235 |
+
)
|
| 236 |
+
|
| 237 |
+
assert compressor.max_documents == 5
|
| 238 |
+
assert compressor.min_relevance == 0.5
|
| 239 |
+
assert compressor.prefer_diverse is True
|
| 240 |
+
|
| 241 |
+
def test_compress_passthrough_under_limit(self):
|
| 242 |
+
"""Pass through when under max_documents."""
|
| 243 |
+
from headroom.integrations.langchain.retriever import HeadroomDocumentCompressor
|
| 244 |
+
|
| 245 |
+
compressor = HeadroomDocumentCompressor(max_documents=10)
|
| 246 |
+
|
| 247 |
+
docs = [
|
| 248 |
+
Document(page_content="Python is a programming language."),
|
| 249 |
+
Document(page_content="JavaScript runs in browsers."),
|
| 250 |
+
]
|
| 251 |
+
|
| 252 |
+
result = compressor.compress_documents(docs, "What is Python?")
|
| 253 |
+
|
| 254 |
+
assert len(result) == 2
|
| 255 |
+
|
| 256 |
+
def test_compress_reduces_to_max(self):
|
| 257 |
+
"""Compress when over max_documents."""
|
| 258 |
+
from headroom.integrations.langchain.retriever import HeadroomDocumentCompressor
|
| 259 |
+
|
| 260 |
+
compressor = HeadroomDocumentCompressor(max_documents=2)
|
| 261 |
+
|
| 262 |
+
docs = [
|
| 263 |
+
Document(page_content="Python is a programming language."),
|
| 264 |
+
Document(page_content="Java is also a language."),
|
| 265 |
+
Document(page_content="Weather today is sunny."),
|
| 266 |
+
Document(page_content="Cats are cute animals."),
|
| 267 |
+
]
|
| 268 |
+
|
| 269 |
+
result = compressor.compress_documents(docs, "programming language")
|
| 270 |
+
|
| 271 |
+
assert len(result) == 2
|
| 272 |
+
|
| 273 |
+
def test_compress_prefers_relevant(self):
|
| 274 |
+
"""Keep most relevant documents."""
|
| 275 |
+
from headroom.integrations.langchain.retriever import HeadroomDocumentCompressor
|
| 276 |
+
|
| 277 |
+
compressor = HeadroomDocumentCompressor(max_documents=1)
|
| 278 |
+
|
| 279 |
+
docs = [
|
| 280 |
+
Document(page_content="Weather today is sunny."),
|
| 281 |
+
Document(page_content="Python programming tutorial basics."),
|
| 282 |
+
Document(page_content="Cats are cute animals."),
|
| 283 |
+
]
|
| 284 |
+
|
| 285 |
+
result = compressor.compress_documents(docs, "Python tutorial")
|
| 286 |
+
|
| 287 |
+
assert len(result) == 1
|
| 288 |
+
assert "Python" in result[0].page_content
|
| 289 |
+
|
| 290 |
+
def test_metrics_tracked(self):
|
| 291 |
+
"""Compression metrics are tracked."""
|
| 292 |
+
from headroom.integrations.langchain.retriever import HeadroomDocumentCompressor
|
| 293 |
+
|
| 294 |
+
compressor = HeadroomDocumentCompressor(max_documents=2)
|
| 295 |
+
|
| 296 |
+
docs = [
|
| 297 |
+
Document(page_content="Doc 1"),
|
| 298 |
+
Document(page_content="Doc 2"),
|
| 299 |
+
Document(page_content="Doc 3"),
|
| 300 |
+
]
|
| 301 |
+
|
| 302 |
+
compressor.compress_documents(docs, "query")
|
| 303 |
+
|
| 304 |
+
metrics = compressor.last_metrics
|
| 305 |
+
assert metrics is not None
|
| 306 |
+
assert metrics.documents_before == 3
|
| 307 |
+
assert metrics.documents_after == 2
|
| 308 |
+
assert metrics.documents_removed == 1
|
| 309 |
+
|
| 310 |
+
def test_get_compression_stats(self):
|
| 311 |
+
"""Get compression statistics."""
|
| 312 |
+
from headroom.integrations.langchain.retriever import HeadroomDocumentCompressor
|
| 313 |
+
|
| 314 |
+
compressor = HeadroomDocumentCompressor(max_documents=1)
|
| 315 |
+
docs = [Document(page_content="A"), Document(page_content="B")]
|
| 316 |
+
|
| 317 |
+
compressor.compress_documents(docs, "A")
|
| 318 |
+
stats = compressor.get_compression_stats()
|
| 319 |
+
|
| 320 |
+
assert "documents_before" in stats
|
| 321 |
+
assert "documents_after" in stats
|
| 322 |
+
assert "average_relevance" in stats
|
| 323 |
+
|
| 324 |
+
|
| 325 |
+
class TestHeadroomToolWrapper:
|
| 326 |
+
"""Tests for HeadroomToolWrapper agent integration."""
|
| 327 |
+
|
| 328 |
+
def test_init(self):
|
| 329 |
+
"""Initialize wrapper."""
|
| 330 |
+
from headroom.integrations.langchain.agents import HeadroomToolWrapper
|
| 331 |
+
|
| 332 |
+
mock_tool = MagicMock()
|
| 333 |
+
mock_tool.name = "test_tool"
|
| 334 |
+
mock_tool.description = "A test tool"
|
| 335 |
+
|
| 336 |
+
wrapper = HeadroomToolWrapper(mock_tool)
|
| 337 |
+
|
| 338 |
+
assert wrapper.name == "test_tool"
|
| 339 |
+
assert wrapper.description == "A test tool"
|
| 340 |
+
|
| 341 |
+
def test_call_passthrough_small_output(self):
|
| 342 |
+
"""Small outputs pass through without compression."""
|
| 343 |
+
from headroom.integrations.langchain.agents import HeadroomToolWrapper
|
| 344 |
+
|
| 345 |
+
mock_tool = MagicMock()
|
| 346 |
+
mock_tool.name = "test"
|
| 347 |
+
mock_tool.description = "test"
|
| 348 |
+
mock_tool.invoke.return_value = "small result"
|
| 349 |
+
|
| 350 |
+
wrapper = HeadroomToolWrapper(mock_tool, min_chars_to_compress=1000)
|
| 351 |
+
result = wrapper("query")
|
| 352 |
+
|
| 353 |
+
assert result == "small result"
|
| 354 |
+
|
| 355 |
+
def test_call_compresses_large_json(self):
|
| 356 |
+
"""Large JSON outputs get compressed."""
|
| 357 |
+
from headroom.integrations.langchain.agents import HeadroomToolWrapper
|
| 358 |
+
|
| 359 |
+
mock_tool = MagicMock()
|
| 360 |
+
mock_tool.name = "search"
|
| 361 |
+
mock_tool.description = "search"
|
| 362 |
+
|
| 363 |
+
# Large JSON output
|
| 364 |
+
large_output = json.dumps([{"id": i, "data": "x" * 100} for i in range(50)])
|
| 365 |
+
mock_tool.invoke.return_value = large_output
|
| 366 |
+
|
| 367 |
+
wrapper = HeadroomToolWrapper(mock_tool, min_chars_to_compress=100)
|
| 368 |
+
result = wrapper("query")
|
| 369 |
+
|
| 370 |
+
# Should be no larger after compression (equal only if nothing was compressed)
|
| 371 |
+
assert len(result) <= len(large_output)
|
| 372 |
+
|
| 373 |
+
def test_as_langchain_tool(self):
|
| 374 |
+
"""Convert to LangChain tool."""
|
| 375 |
+
from headroom.integrations.langchain.agents import HeadroomToolWrapper
|
| 376 |
+
|
| 377 |
+
mock_tool = MagicMock()
|
| 378 |
+
mock_tool.name = "test"
|
| 379 |
+
mock_tool.description = "test tool"
|
| 380 |
+
mock_tool.invoke.return_value = "result"
|
| 381 |
+
|
| 382 |
+
wrapper = HeadroomToolWrapper(mock_tool)
|
| 383 |
+
lc_tool = wrapper.as_langchain_tool()
|
| 384 |
+
|
| 385 |
+
assert isinstance(lc_tool, StructuredTool)
|
| 386 |
+
assert lc_tool.name == "test"
|
| 387 |
+
|
| 388 |
+
def test_wrap_tools_with_headroom(self):
|
| 389 |
+
"""Wrap multiple tools at once."""
|
| 390 |
+
from headroom.integrations.langchain.agents import wrap_tools_with_headroom
|
| 391 |
+
|
| 392 |
+
tools = []
|
| 393 |
+
for i in range(3):
|
| 394 |
+
mock = MagicMock()
|
| 395 |
+
mock.name = f"tool_{i}"
|
| 396 |
+
mock.description = f"Tool {i}"
|
| 397 |
+
mock.invoke.return_value = "result"
|
| 398 |
+
tools.append(mock)
|
| 399 |
+
|
| 400 |
+
wrapped = wrap_tools_with_headroom(tools)
|
| 401 |
+
|
| 402 |
+
assert len(wrapped) == 3
|
| 403 |
+
assert all(isinstance(t, StructuredTool) for t in wrapped)
|
| 404 |
+
|
| 405 |
+
def test_metrics_collector(self):
|
| 406 |
+
"""Tool metrics are collected."""
|
| 407 |
+
from headroom.integrations.langchain.agents import (
|
| 408 |
+
HeadroomToolWrapper,
|
| 409 |
+
ToolMetricsCollector,
|
| 410 |
+
)
|
| 411 |
+
|
| 412 |
+
collector = ToolMetricsCollector()
|
| 413 |
+
|
| 414 |
+
mock_tool = MagicMock()
|
| 415 |
+
mock_tool.name = "test"
|
| 416 |
+
mock_tool.description = "test"
|
| 417 |
+
mock_tool.invoke.return_value = "result"
|
| 418 |
+
|
| 419 |
+
wrapper = HeadroomToolWrapper(mock_tool, metrics_collector=collector)
|
| 420 |
+
wrapper("query")
|
| 421 |
+
|
| 422 |
+
assert len(collector.metrics) == 1
|
| 423 |
+
assert collector.metrics[0].tool_name == "test"
|
| 424 |
+
|
| 425 |
+
|
| 426 |
+
class TestHeadroomLangSmithCallbackHandler:
|
| 427 |
+
"""Tests for LangSmith integration."""
|
| 428 |
+
|
| 429 |
+
def test_init(self):
|
| 430 |
+
"""Initialize handler."""
|
| 431 |
+
from headroom.integrations.langchain.langsmith import (
|
| 432 |
+
HeadroomLangSmithCallbackHandler,
|
| 433 |
+
)
|
| 434 |
+
|
| 435 |
+
handler = HeadroomLangSmithCallbackHandler(auto_update_runs=False)
|
| 436 |
+
|
| 437 |
+
assert handler._auto_update is False
|
| 438 |
+
assert handler._pending_metrics == {}
|
| 439 |
+
|
| 440 |
+
def test_set_headroom_metrics(self):
|
| 441 |
+
"""Set metrics for a run."""
|
| 442 |
+
from headroom.integrations.langchain.langsmith import (
|
| 443 |
+
HeadroomLangSmithCallbackHandler,
|
| 444 |
+
)
|
| 445 |
+
|
| 446 |
+
handler = HeadroomLangSmithCallbackHandler(auto_update_runs=False)
|
| 447 |
+
|
| 448 |
+
handler.set_headroom_metrics(
|
| 449 |
+
run_id="test-run-123",
|
| 450 |
+
tokens_before=1000,
|
| 451 |
+
tokens_after=800,
|
| 452 |
+
transforms_applied=["smart_crusher"],
|
| 453 |
+
)
|
| 454 |
+
|
| 455 |
+
assert "test-run-123" in handler._pending_metrics
|
| 456 |
+
metrics = handler._pending_metrics["test-run-123"]
|
| 457 |
+
assert metrics.tokens_before == 1000
|
| 458 |
+
assert metrics.tokens_after == 800
|
| 459 |
+
assert metrics.tokens_saved == 200
|
| 460 |
+
assert metrics.savings_percent == 20.0
|
| 461 |
+
|
| 462 |
+
def test_get_run_metrics(self):
|
| 463 |
+
"""Get metrics for a specific run."""
|
| 464 |
+
from headroom.integrations.langchain.langsmith import (
|
| 465 |
+
HeadroomLangSmithCallbackHandler,
|
| 466 |
+
)
|
| 467 |
+
|
| 468 |
+
handler = HeadroomLangSmithCallbackHandler(auto_update_runs=False)
|
| 469 |
+
handler._run_metrics["run-1"] = {"headroom.tokens_saved": 100}
|
| 470 |
+
|
| 471 |
+
metrics = handler.get_run_metrics("run-1")
|
| 472 |
+
assert metrics["headroom.tokens_saved"] == 100
|
| 473 |
+
|
| 474 |
+
def test_get_summary(self):
|
| 475 |
+
"""Get summary statistics."""
|
| 476 |
+
from headroom.integrations.langchain.langsmith import (
|
| 477 |
+
HeadroomLangSmithCallbackHandler,
|
| 478 |
+
)
|
| 479 |
+
|
| 480 |
+
handler = HeadroomLangSmithCallbackHandler(auto_update_runs=False)
|
| 481 |
+
handler._run_metrics = {
|
| 482 |
+
"run-1": {"headroom.tokens_saved": 100, "headroom.savings_percent": 20},
|
| 483 |
+
"run-2": {"headroom.tokens_saved": 200, "headroom.savings_percent": 30},
|
| 484 |
+
}
|
| 485 |
+
|
| 486 |
+
summary = handler.get_summary()
|
| 487 |
+
assert summary["total_runs"] == 2
|
| 488 |
+
assert summary["total_tokens_saved"] == 300
|
| 489 |
+
assert summary["average_savings_percent"] == 25.0
|
| 490 |
+
|
| 491 |
+
def test_reset(self):
|
| 492 |
+
"""Reset clears all metrics."""
|
| 493 |
+
from headroom.integrations.langchain.langsmith import (
|
| 494 |
+
HeadroomLangSmithCallbackHandler,
|
| 495 |
+
)
|
| 496 |
+
|
| 497 |
+
handler = HeadroomLangSmithCallbackHandler(auto_update_runs=False)
|
| 498 |
+
handler._run_metrics = {"run-1": {}}
|
| 499 |
+
handler._pending_metrics = {"run-2": MagicMock()}
|
| 500 |
+
|
| 501 |
+
handler.reset()
|
| 502 |
+
|
| 503 |
+
assert handler._run_metrics == {}
|
| 504 |
+
assert handler._pending_metrics == {}
|
| 505 |
+
|
| 506 |
+
|
| 507 |
+
class TestStreamingMetricsTracker:
|
| 508 |
+
"""Tests for streaming metrics tracking."""
|
| 509 |
+
|
| 510 |
+
def test_init(self):
|
| 511 |
+
"""Initialize tracker."""
|
| 512 |
+
from headroom.integrations.langchain.streaming import StreamingMetricsTracker
|
| 513 |
+
|
| 514 |
+
tracker = StreamingMetricsTracker(model="gpt-4o")
|
| 515 |
+
|
| 516 |
+
assert tracker._model == "gpt-4o"
|
| 517 |
+
assert tracker._content == ""
|
| 518 |
+
assert tracker._chunk_count == 0
|
| 519 |
+
|
| 520 |
+
def test_add_chunk_string(self):
|
| 521 |
+
"""Add string chunks."""
|
| 522 |
+
from headroom.integrations.langchain.streaming import StreamingMetricsTracker
|
| 523 |
+
|
| 524 |
+
tracker = StreamingMetricsTracker()
|
| 525 |
+
tracker.add_chunk("Hello ")
|
| 526 |
+
tracker.add_chunk("world!")
|
| 527 |
+
|
| 528 |
+
assert tracker.content == "Hello world!"
|
| 529 |
+
assert tracker.chunk_count == 2
|
| 530 |
+
|
| 531 |
+
def test_add_chunk_with_content_attr(self):
|
| 532 |
+
"""Add chunks with content attribute."""
|
| 533 |
+
from headroom.integrations.langchain.streaming import StreamingMetricsTracker
|
| 534 |
+
|
| 535 |
+
tracker = StreamingMetricsTracker()
|
| 536 |
+
|
| 537 |
+
chunk1 = MagicMock()
|
| 538 |
+
chunk1.content = "Hello "
|
| 539 |
+
chunk2 = MagicMock()
|
| 540 |
+
chunk2.content = "world!"
|
| 541 |
+
|
| 542 |
+
tracker.add_chunk(chunk1)
|
| 543 |
+
tracker.add_chunk(chunk2)
|
| 544 |
+
|
| 545 |
+
assert tracker.content == "Hello world!"
|
| 546 |
+
|
| 547 |
+
def test_output_tokens(self):
|
| 548 |
+
"""Count output tokens."""
|
| 549 |
+
from headroom.integrations.langchain.streaming import StreamingMetricsTracker
|
| 550 |
+
|
| 551 |
+
tracker = StreamingMetricsTracker(model="gpt-4o")
|
| 552 |
+
tracker.add_chunk("Hello world, this is a test message.")
|
| 553 |
+
|
| 554 |
+
tokens = tracker.output_tokens
|
| 555 |
+
assert tokens > 0
|
| 556 |
+
|
| 557 |
+
def test_finish(self):
|
| 558 |
+
"""Finish tracking and get metrics."""
|
| 559 |
+
from headroom.integrations.langchain.streaming import StreamingMetricsTracker
|
| 560 |
+
|
| 561 |
+
tracker = StreamingMetricsTracker()
|
| 562 |
+
tracker.add_chunk("Test content")
|
| 563 |
+
metrics = tracker.finish()
|
| 564 |
+
|
| 565 |
+
assert metrics.chunk_count == 1
|
| 566 |
+
assert metrics.content_length == len("Test content")
|
| 567 |
+
assert metrics.duration_ms is not None
|
| 568 |
+
assert metrics.end_time is not None
|
| 569 |
+
|
| 570 |
+
def test_reset(self):
|
| 571 |
+
"""Reset tracker for reuse."""
|
| 572 |
+
from headroom.integrations.langchain.streaming import StreamingMetricsTracker
|
| 573 |
+
|
| 574 |
+
tracker = StreamingMetricsTracker()
|
| 575 |
+
tracker.add_chunk("Content")
|
| 576 |
+
tracker.finish()
|
| 577 |
+
|
| 578 |
+
tracker.reset()
|
| 579 |
+
|
| 580 |
+
assert tracker.content == ""
|
| 581 |
+
assert tracker.chunk_count == 0
|
| 582 |
+
|
| 583 |
+
def test_streaming_metrics_callback(self):
|
| 584 |
+
"""Test context manager interface."""
|
| 585 |
+
from headroom.integrations.langchain.streaming import StreamingMetricsCallback
|
| 586 |
+
|
| 587 |
+
with StreamingMetricsCallback(model="gpt-4o") as tracker:
|
| 588 |
+
tracker.add_chunk("Hello")
|
| 589 |
+
tracker.add_chunk(" world")
|
| 590 |
+
|
| 591 |
+
# After context exit, metrics should be available
|
| 592 |
+
# (accessed via the callback object, not the tracker)
|
| 593 |
+
|
| 594 |
+
def test_track_streaming_response(self):
|
| 595 |
+
"""Track a complete streaming response."""
|
| 596 |
+
from headroom.integrations.langchain.streaming import track_streaming_response
|
| 597 |
+
|
| 598 |
+
chunks = ["Hello ", "world", "!"]
|
| 599 |
+
content, metrics = track_streaming_response(iter(chunks), model="gpt-4o")
|
| 600 |
+
|
| 601 |
+
assert content == "Hello world!"
|
| 602 |
+
assert metrics.chunk_count == 3
|
| 603 |
+
|
| 604 |
+
|
| 605 |
+
class TestAutoDetectProviderInChatModel:
|
| 606 |
+
"""Tests for auto_detect_provider in HeadroomChatModel."""
|
| 607 |
+
|
| 608 |
+
def test_auto_detect_enabled_by_default(self):
|
| 609 |
+
"""auto_detect_provider is True by default."""
|
| 610 |
+
from headroom.integrations import HeadroomChatModel
|
| 611 |
+
|
| 612 |
+
mock_model = MagicMock()
|
| 613 |
+
mock_model._llm_type = "test"
|
| 614 |
+
mock_model._identifying_params = {}
|
| 615 |
+
mock_model.__class__.__name__ = "ChatOpenAI"
|
| 616 |
+
mock_model.__class__.__module__ = "langchain_openai"
|
| 617 |
+
|
| 618 |
+
model = HeadroomChatModel(mock_model)
|
| 619 |
+
assert model.auto_detect_provider is True
|
| 620 |
+
|
| 621 |
+
def test_auto_detect_can_be_disabled(self):
|
| 622 |
+
"""auto_detect_provider can be set to False."""
|
| 623 |
+
from headroom.integrations import HeadroomChatModel
|
| 624 |
+
|
| 625 |
+
mock_model = MagicMock()
|
| 626 |
+
mock_model._llm_type = "test"
|
| 627 |
+
mock_model._identifying_params = {}
|
| 628 |
+
|
| 629 |
+
model = HeadroomChatModel(mock_model, auto_detect_provider=False)
|
| 630 |
+
assert model.auto_detect_provider is False
|
| 631 |
+
|
| 632 |
+
def test_pipeline_uses_detected_provider(self):
|
| 633 |
+
"""Pipeline uses auto-detected provider."""
|
| 634 |
+
from headroom.integrations import HeadroomChatModel
|
| 635 |
+
from headroom.providers import AnthropicProvider
|
| 636 |
+
|
| 637 |
+
mock_model = MagicMock()
|
| 638 |
+
mock_model._llm_type = "test"
|
| 639 |
+
mock_model._identifying_params = {}
|
| 640 |
+
mock_model.__class__.__name__ = "ChatAnthropic"
|
| 641 |
+
mock_model.__class__.__module__ = "langchain_anthropic"
|
| 642 |
+
|
| 643 |
+
model = HeadroomChatModel(mock_model)
|
| 644 |
+
_ = model.pipeline # Force lazy init
|
| 645 |
+
|
| 646 |
+
assert isinstance(model._provider, AnthropicProvider)
|
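The `TestProviderDetection` cases above pin down the observable behaviour of provider detection: the class's module selects the provider, a model name containing `claude` maps to Anthropic, and anything unrecognised falls back to OpenAI. The sketch below reproduces that behaviour in a self-contained form. It is not headroom's `detect_provider`; the function name `detect_provider_sketch`, the lookup tables, and the `fake_model` helper are assumptions made for illustration only.

```python
from typing import Optional
from unittest.mock import MagicMock

# Hypothetical lookup tables for this sketch; headroom's real rules may differ.
MODULE_PREFIXES = {
    "langchain_openai": "openai",
    "langchain_anthropic": "anthropic",
    "langchain_google_genai": "google",
}
MODEL_NAME_HINTS = {"claude": "anthropic", "gemini": "google"}


def detect_provider_sketch(model: object, default: str = "openai") -> str:
    """Guess a provider from the model's class module, then from its model name."""
    module = type(model).__module__
    for prefix, provider in MODULE_PREFIXES.items():
        if module.startswith(prefix):
            return provider

    # Only trust model_name when it is a real string: a bare MagicMock
    # auto-creates attributes that are themselves mocks.
    model_name = getattr(model, "model_name", None)
    if isinstance(model_name, str):
        for hint, provider in MODEL_NAME_HINTS.items():
            if hint in model_name.lower():
                return provider
    return default


def fake_model(module: str, model_name: Optional[str] = None) -> MagicMock:
    """Build a mock whose class reports the given module, as the tests above do."""
    mock = MagicMock()
    # Each MagicMock instance gets its own class, so this does not leak to other mocks.
    mock.__class__.__module__ = module
    if model_name is not None:
        mock.model_name = model_name
    return mock


by_module = detect_provider_sketch(fake_model("langchain_anthropic.chat_models"))
by_google_module = detect_provider_sketch(fake_model("langchain_google_genai"))
by_name = detect_provider_sketch(fake_model("custom", "claude-3-5-sonnet-20241022"))
fallback = detect_provider_sketch(fake_model("my_custom_module"))
```

The `isinstance(model_name, str)` guard matters when testing with `MagicMock`: without it, the auto-created `model_name` attribute on a bare mock would be treated as a name and the fallback case could not be exercised.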
tests/test_integrations/mcp/__init__.py
ADDED
|
File without changes
|
tests/test_integrations/{test_mcp.py → mcp/test_server.py}
RENAMED
|
File without changes
|
uv.lock
CHANGED
|
@@ -6,6 +6,25 @@ resolution-markers = [
|
|
| 6 |
"python_full_version < '3.11'",
|
| 7 |
]
|
| 8 |
|
| 9 |
[[package]]
|
| 10 |
name = "annotated-doc"
|
| 11 |
version = "0.0.4"
|
|
@@ -362,8 +381,8 @@ wheels = [
|
|
| 362 |
]
|
| 363 |
|
| 364 |
[[package]]
|
| 365 |
-
name = "headroom"
|
| 366 |
-
version = "0.2.
|
| 367 |
source = { editable = "." }
|
| 368 |
dependencies = [
|
| 369 |
{ name = "pydantic" },
|
|
@@ -375,11 +394,18 @@ all = [
|
|
| 375 |
{ name = "fastapi" },
|
| 376 |
{ name = "httpx" },
|
| 377 |
{ name = "jinja2" },
|
| 378 |
{ name = "numpy", version = "2.2.6", source = { registry = "https://pypi.netflix.net/simple" }, marker = "python_full_version < '3.11'" },
|
| 379 |
{ name = "numpy", version = "2.4.0", source = { registry = "https://pypi.netflix.net/simple" }, marker = "python_full_version >= '3.11'" },
|
| 380 |
{ name = "sentence-transformers" },
|
| 381 |
{ name = "uvicorn" },
|
| 382 |
]
|
| 383 |
dev = [
|
| 384 |
{ name = "anthropic" },
|
| 385 |
{ name = "mypy" },
|
|
@@ -389,6 +415,11 @@ dev = [
|
|
| 389 |
{ name = "pytest-cov" },
|
| 390 |
{ name = "ruff" },
|
| 391 |
]
|
| 392 |
proxy = [
|
| 393 |
{ name = "fastapi" },
|
| 394 |
{ name = "httpx" },
|
|
@@ -407,9 +438,10 @@ reports = [
|
|
| 407 |
requires-dist = [
|
| 408 |
{ name = "anthropic", marker = "extra == 'dev'", specifier = ">=0.18.0" },
|
| 409 |
{ name = "fastapi", marker = "extra == 'proxy'", specifier = ">=0.100.0" },
|
| 410 |
-
{ name = "headroom", extras = ["relevance", "proxy", "reports"], marker = "extra == 'all'" },
|
| 411 |
{ name = "httpx", marker = "extra == 'proxy'", specifier = ">=0.24.0" },
|
| 412 |
{ name = "jinja2", marker = "extra == 'reports'", specifier = ">=3.0.0" },
|
| 413 |
{ name = "mypy", marker = "extra == 'dev'", specifier = ">=1.0.0" },
|
| 414 |
{ name = "numpy", marker = "extra == 'relevance'", specifier = ">=1.24.0" },
|
| 415 |
{ name = "openai", marker = "extra == 'dev'", specifier = ">=1.0.0" },
|
|
@@ -420,6 +452,9 @@ requires-dist = [
|
|
| 420 |
{ name = "ruff", marker = "extra == 'dev'", specifier = ">=0.1.0" },
|
| 421 |
{ name = "sentence-transformers", marker = "extra == 'relevance'", specifier = ">=2.2.0" },
|
| 422 |
{ name = "tiktoken", specifier = ">=0.5.0" },
|
| 423 |
{ name = "uvicorn", marker = "extra == 'proxy'", specifier = ">=0.23.0" },
|
| 424 |
]
|
| 425 |
|
|
@@ -708,6 +743,24 @@ wheels = [
|
|
| 708 |
{ url = "https://pypi.netflix.net/packages/19544946795/librt-0.7.7-cp314-cp314t-win_arm64.whl", hash = "sha256:142c2cd91794b79fd0ce113bd658993b7ede0fe93057668c2f98a45ca00b7e91", size = 39724 },
|
| 709 |
]
|
| 710 |
|
| 711 |
[[package]]
|
| 712 |
name = "markupsafe"
|
| 713 |
version = "3.0.3"
|
|
@@ -882,6 +935,21 @@ wheels = [
|
|
| 882 |
{ url = "https://pypi.netflix.net/packages/19441125158/networkx-3.6.1-py3-none-any.whl", hash = "sha256:d47fbf302e7d9cbbb9e2555a0d267983d2aa476bac30e90dfbe5669bd57f3762", size = 2068504 },
|
| 883 |
]
|
| 884 |
|
| 885 |
[[package]]
|
| 886 |
name = "numpy"
|
| 887 |
version = "2.2.6"
|
|
@@ -1225,6 +1293,34 @@ wheels = [
|
|
| 1225 |
{ url = "https://pypi.netflix.net/packages/18687957486/pluggy-1.6.0-py3-none-any.whl", hash = "sha256:e920276dd6813095e9377c0bc5566d94c932c33b27a3e3945d8389c374dd4746", size = 20538 },
|
| 1226 |
]
|
| 1227 |
|
| 1228 |
[[package]]
|
| 1229 |
name = "pydantic"
|
| 1230 |
version = "2.12.5"
|
|
@@ -2193,6 +2289,115 @@ wheels = [
|
|
| 2193 |
{ url = "https://pypi.netflix.net/packages/19387983499/transformers-4.57.3-py3-none-any.whl", hash = "sha256:c77d353a4851b1880191603d36acb313411d3577f6e2897814f333841f7003f4", size = 11993463 },
|
| 2194 |
]
|
| 2195 |
|
| 2196 |
[[package]]
|
| 2197 |
name = "triton"
|
| 2198 |
version = "3.5.1"
|
|
|
|
| 6 |
"python_full_version < '3.11'",
|
| 7 |
]
|
| 8 |
|
| 9 |
+
[[package]]
|
| 10 |
+
name = "accelerate"
|
| 11 |
+
version = "1.12.0"
|
| 12 |
+
source = { registry = "https://pypi.netflix.net/simple" }
|
| 13 |
+
dependencies = [
|
| 14 |
+
{ name = "huggingface-hub" },
|
| 15 |
+
{ name = "numpy", version = "2.2.6", source = { registry = "https://pypi.netflix.net/simple" }, marker = "python_full_version < '3.11'" },
|
| 16 |
+
{ name = "numpy", version = "2.4.0", source = { registry = "https://pypi.netflix.net/simple" }, marker = "python_full_version >= '3.11'" },
|
| 17 |
+
{ name = "packaging" },
|
| 18 |
+
{ name = "psutil" },
|
| 19 |
+
{ name = "pyyaml" },
|
| 20 |
+
{ name = "safetensors" },
|
| 21 |
+
{ name = "torch" },
|
| 22 |
+
]
|
| 23 |
+
sdist = { url = "https://pypi.netflix.net/packages/19372078203/accelerate-1.12.0.tar.gz", hash = "sha256:70988c352feb481887077d2ab845125024b2a137a5090d6d7a32b57d03a45df6", size = 398399 }
|
| 24 |
+
wheels = [
|
| 25 |
+
{ url = "https://pypi.netflix.net/packages/19372078202/accelerate-1.12.0-py3-none-any.whl", hash = "sha256:3e2091cd341423207e2f084a6654b1efcd250dc326f2a37d6dde446e07cabb11", size = 380935 },
|
| 26 |
+
]
|
| 27 |
+
|
| 28 |
[[package]]
|
| 29 |
name = "annotated-doc"
|
| 30 |
version = "0.0.4"
|
|
|
|
| 381 |
]
|
| 382 |
|
| 383 |
[[package]]
|
| 384 |
+
name = "headroom-ai"
|
| 385 |
+
version = "0.2.3"
|
| 386 |
source = { editable = "." }
|
| 387 |
dependencies = [
|
| 388 |
{ name = "pydantic" },
|
|
|
|
| 394 |
{ name = "fastapi" },
|
| 395 |
{ name = "httpx" },
|
| 396 |
{ name = "jinja2" },
|
| 397 |
+
{ name = "llmlingua" },
|
| 398 |
{ name = "numpy", version = "2.2.6", source = { registry = "https://pypi.netflix.net/simple" }, marker = "python_full_version < '3.11'" },
|
| 399 |
{ name = "numpy", version = "2.4.0", source = { registry = "https://pypi.netflix.net/simple" }, marker = "python_full_version >= '3.11'" },
|
| 400 |
{ name = "sentence-transformers" },
|
| 401 |
+
{ name = "torch" },
|
| 402 |
+
{ name = "transformers" },
|
| 403 |
+
{ name = "tree-sitter-language-pack" },
|
| 404 |
{ name = "uvicorn" },
|
| 405 |
]
|
| 406 |
+
code = [
|
| 407 |
+
{ name = "tree-sitter-language-pack" },
|
| 408 |
+
]
|
| 409 |
dev = [
|
| 410 |
{ name = "anthropic" },
|
| 411 |
{ name = "mypy" },
|
|
|
|
| 415 |
{ name = "pytest-cov" },
|
| 416 |
{ name = "ruff" },
|
| 417 |
]
|
| 418 |
+
llmlingua = [
|
| 419 |
+
{ name = "llmlingua" },
|
| 420 |
+
{ name = "torch" },
|
| 421 |
+
{ name = "transformers" },
|
| 422 |
+
]
|
| 423 |
proxy = [
|
| 424 |
{ name = "fastapi" },
|
| 425 |
{ name = "httpx" },
|
|
|
|
| 438 |
requires-dist = [
|
| 439 |
{ name = "anthropic", marker = "extra == 'dev'", specifier = ">=0.18.0" },
|
| 440 |
{ name = "fastapi", marker = "extra == 'proxy'", specifier = ">=0.100.0" },
|
| 441 |
+
{ name = "headroom-ai", extras = ["relevance", "proxy", "reports", "llmlingua", "code"], marker = "extra == 'all'" },
|
| 442 |
{ name = "httpx", marker = "extra == 'proxy'", specifier = ">=0.24.0" },
|
| 443 |
{ name = "jinja2", marker = "extra == 'reports'", specifier = ">=3.0.0" },
|
| 444 |
+
{ name = "llmlingua", marker = "extra == 'llmlingua'", specifier = ">=0.2.0" },
|
| 445 |
{ name = "mypy", marker = "extra == 'dev'", specifier = ">=1.0.0" },
|
| 446 |
{ name = "numpy", marker = "extra == 'relevance'", specifier = ">=1.24.0" },
|
| 447 |
{ name = "openai", marker = "extra == 'dev'", specifier = ">=1.0.0" },
|
|
|
|
| 452 |
{ name = "ruff", marker = "extra == 'dev'", specifier = ">=0.1.0" },
|
| 453 |
{ name = "sentence-transformers", marker = "extra == 'relevance'", specifier = ">=2.2.0" },
|
| 454 |
{ name = "tiktoken", specifier = ">=0.5.0" },
|
| 455 |
+
{ name = "torch", marker = "extra == 'llmlingua'", specifier = ">=2.0.0" },
|
| 456 |
+
{ name = "transformers", marker = "extra == 'llmlingua'", specifier = ">=4.30.0" },
|
| 457 |
+
{ name = "tree-sitter-language-pack", marker = "extra == 'code'", specifier = ">=0.10.0" },
|
| 458 |
{ name = "uvicorn", marker = "extra == 'proxy'", specifier = ">=0.23.0" },
|
| 459 |
]
|
| 460 |
|
|
|
|
| 743 |
{ url = "https://pypi.netflix.net/packages/19544946795/librt-0.7.7-cp314-cp314t-win_arm64.whl", hash = "sha256:142c2cd91794b79fd0ce113bd658993b7ede0fe93057668c2f98a45ca00b7e91", size = 39724 },
|
| 744 |
]
|
| 745 |
|
| 746 |
+
[[package]]
|
| 747 |
+
name = "llmlingua"
|
| 748 |
+
version = "0.2.2"
|
| 749 |
+
source = { registry = "https://pypi.netflix.net/simple" }
|
| 750 |
+
dependencies = [
|
| 751 |
+
{ name = "accelerate" },
|
| 752 |
+
{ name = "nltk" },
|
| 753 |
+
{ name = "numpy", version = "2.2.6", source = { registry = "https://pypi.netflix.net/simple" }, marker = "python_full_version < '3.11'" },
|
| 754 |
+
{ name = "numpy", version = "2.4.0", source = { registry = "https://pypi.netflix.net/simple" }, marker = "python_full_version >= '3.11'" },
|
| 755 |
+
{ name = "tiktoken" },
|
| 756 |
+
{ name = "torch" },
|
| 757 |
+
{ name = "transformers" },
|
| 758 |
+
]
|
| 759 |
+
sdist = { url = "https://pypi.netflix.net/packages/19606733170/llmlingua-0.2.2.tar.gz", hash = "sha256:1a0caedd8d5a65512a85dadb6bfda6f5b3c4b45e5cb9e7b1c6009573f9058572", size = 59753 }
|
| 760 |
+
wheels = [
|
| 761 |
+
{ url = "https://pypi.netflix.net/packages/19606733169/llmlingua-0.2.2-py3-none-any.whl", hash = "sha256:da55137efe0db78063b3395396efe8a0dcfe4ae5a09aea0d503c34b7bf1d800c", size = 30536 },
|
| 762 |
+
]
|
| 763 |
+
|
| 764 |
[[package]]
|
| 765 |
name = "markupsafe"
|
| 766 |
version = "3.0.3"
|
|
|
|
| 935 |
{ url = "https://pypi.netflix.net/packages/19441125158/networkx-3.6.1-py3-none-any.whl", hash = "sha256:d47fbf302e7d9cbbb9e2555a0d267983d2aa476bac30e90dfbe5669bd57f3762", size = 2068504 },
|
| 936 |
]
|
| 937 |
|
| 938 |
+
[[package]]
|
| 939 |
+
name = "nltk"
|
| 940 |
+
version = "3.9.2"
|
| 941 |
+
source = { registry = "https://pypi.netflix.net/simple" }
|
| 942 |
+
dependencies = [
|
| 943 |
+
{ name = "click" },
|
| 944 |
+
{ name = "joblib" },
|
| 945 |
+
{ name = "regex" },
|
| 946 |
+
{ name = "tqdm" },
|
| 947 |
+
]
|
| 948 |
+
sdist = { url = "https://pypi.netflix.net/packages/19152095449/nltk-3.9.2.tar.gz", hash = "sha256:0f409e9b069ca4177c1903c3e843eef90c7e92992fa4931ae607da6de49e1419", size = 2887629 }
|
| 949 |
+
wheels = [
|
| 950 |
+
{ url = "https://pypi.netflix.net/packages/19152095448/nltk-3.9.2-py3-none-any.whl", hash = "sha256:1e209d2b3009110635ed9709a67a1a3e33a10f799490fa71cf4bec218c11c88a", size = 1513404 },
|
| 951 |
+
]
|
| 952 |
+
|
| 953 |
[[package]]
|
| 954 |
name = "numpy"
|
| 955 |
version = "2.2.6"
|
|
|
|
| 1293 |
{ url = "https://pypi.netflix.net/packages/18687957486/pluggy-1.6.0-py3-none-any.whl", hash = "sha256:e920276dd6813095e9377c0bc5566d94c932c33b27a3e3945d8389c374dd4746", size = 20538 },
|
| 1294 |
]
|
| 1295 |
|
| 1296 |
+
[[package]]
|
| 1297 |
+
name = "psutil"
|
| 1298 |
+
version = "7.2.1"
|
| 1299 |
+
source = { registry = "https://pypi.netflix.net/simple" }
|
| 1300 |
+
sdist = { url = "https://pypi.netflix.net/packages/19533562506/psutil-7.2.1.tar.gz", hash = "sha256:f7583aec590485b43ca601dd9cea0dcd65bd7bb21d30ef4ddbf4ea6b5ed1bdd3", size = 490253 }
|
| 1301 |
+
wheels = [
|
| 1302 |
+
{ url = "https://pypi.netflix.net/packages/19533562496/psutil-7.2.1-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:ba9f33bb525b14c3ea563b2fd521a84d2fa214ec59e3e6a2858f78d0844dd60d", size = 129624 },
|
| 1303 |
+
{ url = "https://pypi.netflix.net/packages/19533562497/psutil-7.2.1-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:81442dac7abfc2f4f4385ea9e12ddf5a796721c0f6133260687fec5c3780fa49", size = 130132 },
|
| 1304 |
+
{ url = "https://pypi.netflix.net/packages/19533562498/psutil-7.2.1-cp313-cp313t-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ea46c0d060491051d39f0d2cff4f98d5c72b288289f57a21556cc7d504db37fc", size = 180612 },
|
| 1305 |
+
{ url = "https://pypi.netflix.net/packages/19533562499/psutil-7.2.1-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:35630d5af80d5d0d49cfc4d64c1c13838baf6717a13effb35869a5919b854cdf", size = 183201 },
|
| 1306 |
+
{ url = "https://pypi.netflix.net/packages/19533562500/psutil-7.2.1-cp313-cp313t-win_amd64.whl", hash = "sha256:923f8653416604e356073e6e0bccbe7c09990acef442def2f5640dd0faa9689f", size = 139081 },
|
| 1307 |
+
{ url = "https://pypi.netflix.net/packages/19533562501/psutil-7.2.1-cp313-cp313t-win_arm64.whl", hash = "sha256:cfbe6b40ca48019a51827f20d830887b3107a74a79b01ceb8cc8de4ccb17b672", size = 134767 },
|
| 1308 |
+
{ url = "https://pypi.netflix.net/packages/19533562502/psutil-7.2.1-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:494c513ccc53225ae23eec7fe6e1482f1b8a44674241b54561f755a898650679", size = 129716 },
|
| 1309 |
+
{ url = "https://pypi.netflix.net/packages/19533562503/psutil-7.2.1-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:3fce5f92c22b00cdefd1645aa58ab4877a01679e901555067b1bd77039aa589f", size = 130133 },
|
| 1310 |
+
{ url = "https://pypi.netflix.net/packages/19533562504/psutil-7.2.1-cp314-cp314t-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:93f3f7b0bb07711b49626e7940d6fe52aa9940ad86e8f7e74842e73189712129", size = 181518 },
|
| 1311 |
+
{ url = "https://pypi.netflix.net/packages/19533562505/psutil-7.2.1-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d34d2ca888208eea2b5c68186841336a7f5e0b990edec929be909353a202768a", size = 184348 },
|
| 1312 |
+
{ url = "https://pypi.netflix.net/packages/19533563921/psutil-7.2.1-cp314-cp314t-win_amd64.whl", hash = "sha256:2ceae842a78d1603753561132d5ad1b2f8a7979cb0c283f5b52fb4e6e14b1a79", size = 140400 },
|
| 1313 |
+
{ url = "https://pypi.netflix.net/packages/19533563922/psutil-7.2.1-cp314-cp314t-win_arm64.whl", hash = "sha256:08a2f175e48a898c8eb8eace45ce01777f4785bc744c90aa2cc7f2fa5462a266", size = 135430 },
|
| 1314 |
+
{ url = "https://pypi.netflix.net/packages/19533563923/psutil-7.2.1-cp36-abi3-macosx_10_9_x86_64.whl", hash = "sha256:b2e953fcfaedcfbc952b44744f22d16575d3aa78eb4f51ae74165b4e96e55f42", size = 128137 },
|
| 1315 |
+
{ url = "https://pypi.netflix.net/packages/19533563924/psutil-7.2.1-cp36-abi3-macosx_11_0_arm64.whl", hash = "sha256:05cc68dbb8c174828624062e73078e7e35406f4ca2d0866c272c2410d8ef06d1", size = 128947 },
|
| 1316 |
+
{ url = "https://pypi.netflix.net/packages/19533563925/psutil-7.2.1-cp36-abi3-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5e38404ca2bb30ed7267a46c02f06ff842e92da3bb8c5bfdadbd35a5722314d8", size = 154694 },
|
| 1317 |
+
{ url = "https://pypi.netflix.net/packages/19533563926/psutil-7.2.1-cp36-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ab2b98c9fc19f13f59628d94df5cc4cc4844bc572467d113a8b517d634e362c6", size = 156136 },
|
| 1318 |
+
{ url = "https://pypi.netflix.net/packages/19533563927/psutil-7.2.1-cp36-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:f78baafb38436d5a128f837fab2d92c276dfb48af01a240b861ae02b2413ada8", size = 148108 },
|
| 1319 |
+
{ url = "https://pypi.netflix.net/packages/19533565348/psutil-7.2.1-cp36-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:99a4cd17a5fdd1f3d014396502daa70b5ec21bf4ffe38393e152f8e449757d67", size = 147402 },
|
| 1320 |
+
{ url = "https://pypi.netflix.net/packages/19533565349/psutil-7.2.1-cp37-abi3-win_amd64.whl", hash = "sha256:b1b0671619343aa71c20ff9767eced0483e4fc9e1f489d50923738caf6a03c17", size = 136938 },
|
| 1321 |
+
{ url = "https://pypi.netflix.net/packages/19533565350/psutil-7.2.1-cp37-abi3-win_arm64.whl", hash = "sha256:0d67c1822c355aa6f7314d92018fb4268a76668a536f133599b91edd48759442", size = 133836 },
|
| 1322 |
+
]
|
| 1323 |
+
|
| 1324 |
[[package]]
|
| 1325 |
name = "pydantic"
|
| 1326 |
version = "2.12.5"
|
|
|
|
| 2289 |
{ url = "https://pypi.netflix.net/packages/19387983499/transformers-4.57.3-py3-none-any.whl", hash = "sha256:c77d353a4851b1880191603d36acb313411d3577f6e2897814f333841f7003f4", size = 11993463 },
|
| 2290 |
]
|
| 2291 |
|
| 2292 |
+
[[package]]
|
| 2293 |
+
name = "tree-sitter"
|
| 2294 |
+
version = "0.25.2"
|
| 2295 |
+
source = { registry = "https://pypi.netflix.net/simple" }
|
| 2296 |
+
sdist = { url = "https://pypi.netflix.net/packages/19129803294/tree-sitter-0.25.2.tar.gz", hash = "sha256:fe43c158555da46723b28b52e058ad444195afd1db3ca7720c59a254544e9c20", size = 177961 }
|
| 2297 |
+
wheels = [
|
| 2298 |
+
{ url = "https://pypi.netflix.net/packages/19129800490/tree_sitter-0.25.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:72a510931c3c25f134aac2daf4eb4feca99ffe37a35896d7150e50ac3eee06c7", size = 146749 },
|
| 2299 |
+
{ url = "https://pypi.netflix.net/packages/19129800491/tree_sitter-0.25.2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:44488e0e78146f87baaa009736886516779253d6d6bac3ef636ede72bc6a8234", size = 137766 },
|
| 2300 |
+
{ url = "https://pypi.netflix.net/packages/19129800492/tree_sitter-0.25.2-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c2f8e7d6b2f8489d4a9885e3adcaef4bc5ff0a275acd990f120e29c4ab3395c5", size = 599809 },
|
| 2301 |
+
{ url = "https://pypi.netflix.net/packages/19129800493/tree_sitter-0.25.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:20b570690f87f1da424cd690e51cc56728d21d63f4abd4b326d382a30353acc7", size = 627676 },
|
| 2302 |
+
{ url = "https://pypi.netflix.net/packages/19129800494/tree_sitter-0.25.2-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:a0ec41b895da717bc218a42a3a7a0bfcfe9a213d7afaa4255353901e0e21f696", size = 624281 },
|
| 2303 |
+
{ url = "https://pypi.netflix.net/packages/19129800495/tree_sitter-0.25.2-cp310-cp310-win_amd64.whl", hash = "sha256:7712335855b2307a21ae86efe949c76be36c6068d76df34faa27ce9ee40ff444", size = 127295 },
|
| 2304 |
+
{ url = "https://pypi.netflix.net/packages/19129800496/tree_sitter-0.25.2-cp310-cp310-win_arm64.whl", hash = "sha256:a925364eb7fbb9cdce55a9868f7525a1905af512a559303bd54ef468fd88cb37", size = 113991 },
|
| 2305 |
+
{ url = "https://pypi.netflix.net/packages/19129800497/tree_sitter-0.25.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:b8ca72d841215b6573ed0655b3a5cd1133f9b69a6fa561aecad40dca9029d75b", size = 146752 },
|
| 2306 |
+
{ url = "https://pypi.netflix.net/packages/19129800498/tree_sitter-0.25.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:cc0351cfe5022cec5a77645f647f92a936b38850346ed3f6d6babfbeeeca4d26", size = 137765 },
|
| 2307 |
+
{ url = "https://pypi.netflix.net/packages/19129800499/tree_sitter-0.25.2-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1799609636c0193e16c38f366bda5af15b1ce476df79ddaae7dd274df9e44266", size = 604643 },
|
| 2308 |
+
{ url = "https://pypi.netflix.net/packages/19129800500/tree_sitter-0.25.2-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:3e65ae456ad0d210ee71a89ee112ac7e72e6c2e5aac1b95846ecc7afa68a194c", size = 632229 },
|
| 2309 |
+
{ url = "https://pypi.netflix.net/packages/19129800501/tree_sitter-0.25.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:49ee3c348caa459244ec437ccc7ff3831f35977d143f65311572b8ba0a5f265f", size = 629861 },
|
| 2310 |
+
{ url = "https://pypi.netflix.net/packages/19129800502/tree_sitter-0.25.2-cp311-cp311-win_amd64.whl", hash = "sha256:56ac6602c7d09c2c507c55e58dc7026b8988e0475bd0002f8a386cce5e8e8adc", size = 127304 },
|
| 2311 |
+
{ url = "https://pypi.netflix.net/packages/19129800503/tree_sitter-0.25.2-cp311-cp311-win_arm64.whl", hash = "sha256:b3d11a3a3ac89bb8a2543d75597f905a9926f9c806f40fcca8242922d1cc6ad5", size = 113990 },
|
| 2312 |
+
{ url = "https://pypi.netflix.net/packages/19129801135/tree_sitter-0.25.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:ddabfff809ffc983fc9963455ba1cecc90295803e06e140a4c83e94c1fa3d960", size = 146941 },
|
| 2313 |
+
{ url = "https://pypi.netflix.net/packages/19129801136/tree_sitter-0.25.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:c0c0ab5f94938a23fe81928a21cc0fac44143133ccc4eb7eeb1b92f84748331c", size = 137699 },
|
| 2314 |
+
{ url = "https://pypi.netflix.net/packages/19129801137/tree_sitter-0.25.2-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:dd12d80d91d4114ca097626eb82714618dcdfacd6a5e0955216c6485c350ef99", size = 607125 },
|
| 2315 |
+
{ url = "https://pypi.netflix.net/packages/19129801138/tree_sitter-0.25.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b43a9e4c89d4d0839de27cd4d6902d33396de700e9ff4c5ab7631f277a85ead9", size = 635418 },
|
| 2316 |
+
{ url = "https://pypi.netflix.net/packages/19129801139/tree_sitter-0.25.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:fbb1706407c0e451c4f8cc016fec27d72d4b211fdd3173320b1ada7a6c74c3ac", size = 631250 },
|
| 2317 |
+
{ url = "https://pypi.netflix.net/packages/19129801140/tree_sitter-0.25.2-cp312-cp312-win_amd64.whl", hash = "sha256:6d0302550bbe4620a5dc7649517c4409d74ef18558276ce758419cf09e578897", size = 127156 },
|
| 2318 |
+
{ url = "https://pypi.netflix.net/packages/19129801141/tree_sitter-0.25.2-cp312-cp312-win_arm64.whl", hash = "sha256:0c8b6682cac77e37cfe5cf7ec388844957f48b7bd8d6321d0ca2d852994e10d5", size = 113984 },
|
| 2319 |
+
{ url = "https://pypi.netflix.net/packages/19129801142/tree_sitter-0.25.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:0628671f0de69bb279558ef6b640bcfc97864fe0026d840f872728a86cd6b6cd", size = 146926 },
|
| 2320 |
+
{ url = "https://pypi.netflix.net/packages/19129801143/tree_sitter-0.25.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:f5ddcd3e291a749b62521f71fc953f66f5fd9743973fd6dd962b092773569601", size = 137712 },
|
| 2321 |
+
{ url = "https://pypi.netflix.net/packages/19129801144/tree_sitter-0.25.2-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:bd88fbb0f6c3a0f28f0a68d72df88e9755cf5215bae146f5a1bdc8362b772053", size = 607873 },
|
| 2322 |
+
{ url = "https://pypi.netflix.net/packages/19129801145/tree_sitter-0.25.2-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b878e296e63661c8e124177cc3084b041ba3f5936b43076d57c487822426f614", size = 636313 },
|
| 2323 |
+
{ url = "https://pypi.netflix.net/packages/19129801146/tree_sitter-0.25.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:d77605e0d353ba3fe5627e5490f0fbfe44141bafa4478d88ef7954a61a848dae", size = 631370 },
|
| 2324 |
+
{ url = "https://pypi.netflix.net/packages/19129803321/tree_sitter-0.25.2-cp313-cp313-win_amd64.whl", hash = "sha256:463c032bd02052d934daa5f45d183e0521ceb783c2548501cf034b0beba92c9b", size = 127157 },
|
| 2325 |
+
{ url = "https://pypi.netflix.net/packages/19129803322/tree_sitter-0.25.2-cp313-cp313-win_arm64.whl", hash = "sha256:b3f63a1796886249bd22c559a5944d64d05d43f2be72961624278eff0dcc5cb8", size = 113975 },
|
| 2326 |
+
{ url = "https://pypi.netflix.net/packages/19129803323/tree_sitter-0.25.2-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:65d3c931013ea798b502782acab986bbf47ba2c452610ab0776cf4a8ef150fc0", size = 146776 },
|
| 2327 |
+
{ url = "https://pypi.netflix.net/packages/19129803324/tree_sitter-0.25.2-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:bda059af9d621918efb813b22fb06b3fe00c3e94079c6143fcb2c565eb44cb87", size = 137732 },
|
| 2328 |
+
{ url = "https://pypi.netflix.net/packages/19129803325/tree_sitter-0.25.2-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:eac4e8e4c7060c75f395feec46421eb61212cb73998dbe004b7384724f3682ab", size = 609456 },
|
| 2329 |
+
{ url = "https://pypi.netflix.net/packages/19129803326/tree_sitter-0.25.2-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:260586381b23be33b6191a07cea3d44ecbd6c01aa4c6b027a0439145fcbc3358", size = 636772 },
|
| 2330 |
+
{ url = "https://pypi.netflix.net/packages/19129803327/tree_sitter-0.25.2-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:7d2ee1acbacebe50ba0f85fff1bc05e65d877958f00880f49f9b2af38dce1af0", size = 631522 },
|
| 2331 |
+
{ url = "https://pypi.netflix.net/packages/19129803328/tree_sitter-0.25.2-cp314-cp314-win_amd64.whl", hash = "sha256:4973b718fcadfb04e59e746abfbb0288694159c6aeecd2add59320c03368c721", size = 130864 },
|
| 2332 |
+
{ url = "https://pypi.netflix.net/packages/19129803329/tree_sitter-0.25.2-cp314-cp314-win_arm64.whl", hash = "sha256:b8d4429954a3beb3e844e2872610d2a4800ba4eb42bb1990c6a4b1949b18459f", size = 117470 },
|
| 2333 |
+
]
|
| 2334 |
+
|
| 2335 |
+
[[package]]
|
| 2336 |
+
name = "tree-sitter-c-sharp"
|
| 2337 |
+
version = "0.23.1"
|
| 2338 |
+
source = { registry = "https://pypi.netflix.net/simple" }
|
| 2339 |
+
sdist = { url = "https://pypi.netflix.net/packages/18519163555/tree_sitter_c_sharp-0.23.1.tar.gz", hash = "sha256:322e2cfd3a547a840375276b2aea3335fa6458aeac082f6c60fec3f745c967eb", size = 1317728 }
|
| 2340 |
+
wheels = [
|
| 2341 |
+
{ url = "https://pypi.netflix.net/packages/18519163548/tree_sitter_c_sharp-0.23.1-cp39-abi3-macosx_10_9_x86_64.whl", hash = "sha256:2b612a6e5bd17bb7fa2aab4bb6fc1fba45c94f09cb034ab332e45603b86e32fd", size = 372235 },
|
| 2342 |
+
{ url = "https://pypi.netflix.net/packages/18519163549/tree_sitter_c_sharp-0.23.1-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:1a8b98f62bc53efcd4d971151950c9b9cd5cbe3bacdb0cd69fdccac63350d83e", size = 419046 },
|
| 2343 |
+
{ url = "https://pypi.netflix.net/packages/18519163550/tree_sitter_c_sharp-0.23.1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:986e93d845a438ec3c4416401aa98e6a6f6631d644bbbc2e43fcb915c51d255d", size = 415999 },
|
| 2344 |
+
{ url = "https://pypi.netflix.net/packages/18519163551/tree_sitter_c_sharp-0.23.1-cp39-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a8024e466b2f5611c6dc90321f232d8584893c7fb88b75e4a831992f877616d2", size = 402830 },
|
| 2345 |
+
{ url = "https://pypi.netflix.net/packages/18519163552/tree_sitter_c_sharp-0.23.1-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:7f9bf876866835492281d336b9e1f9626ab668737f74e914c31d285261507da7", size = 397880 },
|
| 2346 |
+
{ url = "https://pypi.netflix.net/packages/18519163553/tree_sitter_c_sharp-0.23.1-cp39-abi3-win_amd64.whl", hash = "sha256:ae9a9e859e8f44e2b07578d44f9a220d3fa25b688966708af6aa55d42abeebb3", size = 377562 },
|
| 2347 |
+
{ url = "https://pypi.netflix.net/packages/18519163554/tree_sitter_c_sharp-0.23.1-cp39-abi3-win_arm64.whl", hash = "sha256:c81548347a93347be4f48cb63ec7d60ef4b0efa91313330e69641e49aa5a08c5", size = 375157 },
|
| 2348 |
+
]
|
| 2349 |
+
|
| 2350 |
+
[[package]]
|
| 2351 |
+
name = "tree-sitter-embedded-template"
|
| 2352 |
+
version = "0.25.0"
|
| 2353 |
+
source = { registry = "https://pypi.netflix.net/simple" }
|
| 2354 |
+
sdist = { url = "https://pypi.netflix.net/packages/19023467751/tree_sitter_embedded_template-0.25.0.tar.gz", hash = "sha256:7d72d5e8a1d1d501a7c90e841b51f1449a90cc240be050e4fb85c22dab991d50", size = 14114 }
|
| 2355 |
+
wheels = [
|
| 2356 |
+
{ url = "https://pypi.netflix.net/packages/19023467743/tree_sitter_embedded_template-0.25.0-cp310-abi3-macosx_10_9_x86_64.whl", hash = "sha256:fa0d06467199aeb33fb3d6fa0665bf9b7d5a32621ffdaf37fd8249f8a8050649", size = 10266 },
|
| 2357 |
+
{ url = "https://pypi.netflix.net/packages/19023467744/tree_sitter_embedded_template-0.25.0-cp310-abi3-macosx_11_0_arm64.whl", hash = "sha256:fc7aacbc2985a5d7e7fe7334f44dffe24c38fb0a8295c4188a04cf21a3d64a73", size = 10650 },
|
| 2358 |
+
{ url = "https://pypi.netflix.net/packages/19023467745/tree_sitter_embedded_template-0.25.0-cp310-abi3-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:a7c88c3dd8b94b3c9efe8ae071ff6b1b936a27ac5f6e651845c3b9631fa4c1c2", size = 18268 },
|
| 2359 |
+
{ url = "https://pypi.netflix.net/packages/19023467746/tree_sitter_embedded_template-0.25.0-cp310-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:025f7ca84218dcd8455efc901bdbcc2689fb694f3a636c0448e322a23d4bc96b", size = 19068 },
|
| 2360 |
+
{ url = "https://pypi.netflix.net/packages/19023467747/tree_sitter_embedded_template-0.25.0-cp310-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:b5dc1aef6ffa3fae621fe037d85dd98948b597afba20df29d779c426be813ee5", size = 18518 },
|
| 2361 |
+
{ url = "https://pypi.netflix.net/packages/19023467748/tree_sitter_embedded_template-0.25.0-cp310-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:d0a35cfe634c44981a516243bc039874580e02a2990669313730187ce83a5bc6", size = 18267 },
|
| 2362 |
+
{ url = "https://pypi.netflix.net/packages/19023467749/tree_sitter_embedded_template-0.25.0-cp310-abi3-win_amd64.whl", hash = "sha256:3e05a4ac013d54505e75ae48e1a0e9db9aab19949fe15d9f4c7345b11a84a069", size = 13049 },
|
| 2363 |
+
{ url = "https://pypi.netflix.net/packages/19023467750/tree_sitter_embedded_template-0.25.0-cp310-abi3-win_arm64.whl", hash = "sha256:2751d402179ac0e83f2065b249d8fe6df0718153f1636bcb6a02bde3e5730db9", size = 11978 },
|
| 2364 |
+
]
|
| 2365 |
+
|
| 2366 |
+
[[package]]
|
| 2367 |
+
name = "tree-sitter-language-pack"
|
| 2368 |
+
version = "0.13.0"
|
| 2369 |
+
source = { registry = "https://pypi.netflix.net/simple" }
|
| 2370 |
+
dependencies = [
|
| 2371 |
+
{ name = "tree-sitter" },
|
| 2372 |
+
{ name = "tree-sitter-c-sharp" },
|
| 2373 |
+
{ name = "tree-sitter-embedded-template" },
|
| 2374 |
+
{ name = "tree-sitter-yaml" },
|
| 2375 |
+
]
|
| 2376 |
+
sdist = { url = "https://pypi.netflix.net/packages/19391792931/tree_sitter_language_pack-0.13.0.tar.gz", hash = "sha256:032034c5e27b1f6e00730b9e7c2dbc8203b4700d0c681fd019d6defcf61183ec", size = 51353370 }
|
| 2377 |
+
wheels = [
|
| 2378 |
+
{ url = "https://pypi.netflix.net/packages/19391792760/tree_sitter_language_pack-0.13.0-cp310-abi3-macosx_10_15_universal2.whl", hash = "sha256:0e7eae812b40a2dc8a12eb2f5c55e130eb892706a0bee06215dd76affeb00d07", size = 32991857 },
|
| 2379 |
+
{ url = "https://pypi.netflix.net/packages/19391792761/tree_sitter_language_pack-0.13.0-cp310-abi3-manylinux2014_aarch64.whl", hash = "sha256:7fdacf383418a845b20772118fcb53ad245f9c5d409bd07dae16acec65151756", size = 20092989 },
|
| 2380 |
+
{ url = "https://pypi.netflix.net/packages/19391792762/tree_sitter_language_pack-0.13.0-cp310-abi3-manylinux2014_x86_64.whl", hash = "sha256:0d4f261fce387ae040dae7e4d1c1aca63d84c88320afcc0961c123bec0be8377", size = 19952029 },
|
| 2381 |
+
{ url = "https://pypi.netflix.net/packages/19391792845/tree_sitter_language_pack-0.13.0-cp310-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:78f369dc4d456c5b08d659939e662c2f9b9fba8c0ec5538a1f973e01edfcf04d", size = 19944614 },
|
| 2382 |
+
{ url = "https://pypi.netflix.net/packages/19391792846/tree_sitter_language_pack-0.13.0-cp310-abi3-win_amd64.whl", hash = "sha256:1cdbc88a03dacd47bec69e56cc20c48eace1fbb6f01371e89c3ee6a2e8f34db1", size = 16896852 },
|
| 2383 |
+
]
|
| 2384 |
+
|
| 2385 |
+
[[package]]
|
| 2386 |
+
name = "tree-sitter-yaml"
|
| 2387 |
+
version = "0.7.2"
|
| 2388 |
+
source = { registry = "https://pypi.netflix.net/simple" }
|
| 2389 |
+
sdist = { url = "https://pypi.netflix.net/packages/19176087043/tree_sitter_yaml-0.7.2.tar.gz", hash = "sha256:756db4c09c9d9e97c81699e8f941cb8ce4e51104927f6090eefe638ee567d32c", size = 84882 }
|
| 2390 |
+
wheels = [
|
| 2391 |
+
{ url = "https://pypi.netflix.net/packages/19176087035/tree_sitter_yaml-0.7.2-cp310-abi3-macosx_10_9_x86_64.whl", hash = "sha256:7e269ddcfcab8edb14fbb1f1d34eed1e1e26888f78f94eedfe7cc98c60f8bc9f", size = 43898 },
|
| 2392 |
+
{ url = "https://pypi.netflix.net/packages/19176087036/tree_sitter_yaml-0.7.2-cp310-abi3-macosx_11_0_arm64.whl", hash = "sha256:0807b7966e23ddf7dddc4545216e28b5a58cdadedcecca86b8d8c74271a07870", size = 44691 },
|
| 2393 |
+
{ url = "https://pypi.netflix.net/packages/19176087037/tree_sitter_yaml-0.7.2-cp310-abi3-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:f1a5c60c98b6c4c037aae023569f020d0c489fad8dc26fdfd5510363c9c29a41", size = 91430 },
|
| 2394 |
+
{ url = "https://pypi.netflix.net/packages/19176087038/tree_sitter_yaml-0.7.2-cp310-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:88636d19d0654fd24f4f242eaaafa90f6f5ebdba8a62e4b32d251ed156c51a2a", size = 92428 },
|
| 2395 |
+
{ url = "https://pypi.netflix.net/packages/19176087039/tree_sitter_yaml-0.7.2-cp310-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:1d2e8f0bb14aa4537320952d0f9607eef3021d5aada8383c34ebeece17db1e06", size = 90580 },
|
| 2396 |
+
{ url = "https://pypi.netflix.net/packages/19176087040/tree_sitter_yaml-0.7.2-cp310-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:74ca712c50fc9d7dbc68cb36b4a7811d6e67a5466b5a789f19bf8dd6084ef752", size = 90455 },
|
| 2397 |
+
{ url = "https://pypi.netflix.net/packages/19176087041/tree_sitter_yaml-0.7.2-cp310-abi3-win_amd64.whl", hash = "sha256:7587b5ca00fc4f9a548eff649697a3b395370b2304b399ceefa2087d8a6c9186", size = 45514 },
|
| 2398 |
+
{ url = "https://pypi.netflix.net/packages/19176087042/tree_sitter_yaml-0.7.2-cp310-abi3-win_arm64.whl", hash = "sha256:f63c227b18e7ce7587bce124578f0bbf1f890ac63d3e3cd027417574273642c4", size = 44065 },
|
| 2399 |
+
]
|
| 2400 |
+
|
| 2401 |
[[package]]
|
| 2402 |
name = "triton"
|
| 2403 |
version = "3.5.1"
|