Headroom Documentation
Welcome to the Headroom documentation.
Getting Started
| Guide | Description |
|---|---|
| Quickstart | 5-minute setup |
| SDK Guide | Python SDK usage |
| Proxy Guide | Proxy server deployment |
Framework Integrations
| Framework | Description |
|---|---|
| LangChain | Chat models, memory, retrievers, agents, streaming |
| Agno | Model wrapper, hooks, multi-provider support |
| MCP | See CCR Guide for tool compression |
Core Concepts
| Topic | Description |
|---|---|
| Universal Compression | ML-based content detection + structure preservation |
| Transforms | How compression works |
| CCR | Reversible compression architecture |
| Configuration | All configuration options |
Advanced
| Topic | Description |
|---|---|
| Text Compression | Opt-in utilities for search/logs |
| LLMLingua | ML-based compression |
| Metrics | Monitoring and observability |
| Errors | Error handling |
Reference
| Topic | Description |
|---|---|
| API Reference | Complete API docs |
| Architecture | Internal design |
| Troubleshooting | Common issues |
Overview
Headroom is the Context Optimization Layer for LLM applications. It reduces your LLM costs by 50-90% through intelligent context compression.
How It Works
- Universal Compression — ML-based content detection with structure-preserving compression
- SmartCrusher — Compresses JSON tool outputs, keeping errors, anomalies, and relevant items
- CacheAligner — Stabilizes message prefixes so provider caching works
- RollingWindow — Manages context limits without breaking tool call pairs
- CCR — Caches original data so compression is reversible
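The ideas behind SmartCrusher and CCR can be illustrated with a small standalone sketch. This is not Headroom's API: `crush_tool_output`, `retrieve_original`, the `keep` parameter, the `"error"` field convention, and the in-memory `store` dict are all hypothetical names chosen for illustration. The sketch shrinks a JSON array while keeping error records, and caches the original under a content hash so the step can be reversed.

```python
import hashlib
import json


def crush_tool_output(raw: str, store: dict, keep: int = 2) -> str:
    """Hypothetical helper: shrink a JSON array of records, keeping error items."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return raw  # not valid JSON: return the input untouched
    if not isinstance(items, list) or len(items) <= keep:
        return raw  # nothing worth compressing

    # Keep the first `keep` records plus any record carrying a truthy "error" field
    # (an assumed convention for this sketch).
    kept = [
        item
        for index, item in enumerate(items)
        if index < keep or (isinstance(item, dict) and item.get("error"))
    ]

    # Cache the original under a short content hash so it can be fetched later.
    ref = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
    store[ref] = raw
    return json.dumps({"items": kept, "omitted": len(items) - len(kept), "ref": ref})


def retrieve_original(ref: str, store: dict) -> str:
    """Return the uncompressed payload that was cached under `ref`."""
    return store[ref]
```

In this sketch the `ref` value travels with the compressed payload, so a caller (or a tool exposed to the model) could pass it back to `retrieve_original` when the omitted records turn out to matter.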
Safety Guarantees
- Never removes human content
- Never breaks tool call ordering
- Content that fails to parse passes through unchanged
- The LLM can always retrieve the original data
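The first two guarantees can be pictured with a minimal sketch of window trimming. It is an illustration under stated assumptions, not Headroom's implementation: `trim_history` is a hypothetical function, and the message shape (dicts with `role`, `content`, and an optional `tool_call_id` shared by a tool call and its result) is assumed for the example. The oldest tool call/result pairs are dropped together, and messages without a `tool_call_id`, including all user messages, are never removed.

```python
def trim_history(messages: list[dict], max_messages: int) -> list[dict]:
    """Hypothetical helper: drop the oldest tool call/result pairs until the history fits."""
    # Collect tool_call_ids in order of first appearance (oldest first).
    pair_ids = []
    for message in messages:
        call_id = message.get("tool_call_id")
        if call_id is not None and call_id not in pair_ids:
            pair_ids.append(call_id)

    dropped = set()
    kept = list(messages)
    for call_id in pair_ids:
        if len(kept) <= max_messages:
            break
        # Remove the call and its result together so no pair is left half-present.
        dropped.add(call_id)
        kept = [m for m in messages if m.get("tool_call_id") not in dropped]

    # If the history is still too long, it is returned as-is rather than
    # removing messages that carry no tool_call_id (such as user content).
    return kept
```

Because filtering preserves the original order, the surviving messages keep their relative positions, and a tool result never appears without the call that produced it.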
Getting Help
- GitHub Issues — Bug reports
- GitHub Discussions — Questions