
Headroom Documentation

Welcome to the Headroom documentation.

Getting Started

| Guide | Description |
|-------|-------------|
| Quickstart | 5-minute setup |
| SDK Guide | Python SDK usage |
| Proxy Guide | Proxy server deployment |

Framework Integrations

| Framework | Description |
|-----------|-------------|
| LangChain | Chat models, memory, retrievers, agents, streaming |
| Agno | Model wrapper, hooks, multi-provider support |
| MCP | See CCR Guide for tool compression |

Core Concepts

| Topic | Description |
|-------|-------------|
| Universal Compression | ML-based content detection + structure preservation |
| Transforms | How compression works |
| CCR | Reversible compression architecture |
| Configuration | All configuration options |

Advanced

| Topic | Description |
|-------|-------------|
| Text Compression | Opt-in utilities for search/logs |
| LLMLingua | ML-based compression |
| Metrics | Monitoring and observability |
| Errors | Error handling |

Reference

| Topic | Description |
|-------|-------------|
| API Reference | Complete API docs |
| Architecture | Internal design |
| Troubleshooting | Common issues |

Overview

Headroom is the Context Optimization Layer for LLM applications. It reduces your LLM costs by 50-90% through intelligent context compression.

How It Works

  1. Universal Compression — ML-based content detection with structure-preserving compression
  2. SmartCrusher — Compresses JSON tool outputs, keeping errors, anomalies, and relevant items
  3. CacheAligner — Stabilizes message prefixes so provider caching works
  4. RollingWindow — Manages context limits without breaking tool call pairs
  5. CCR — Caches original data so compression is reversible
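To make steps 2 and 5 concrete, here is a minimal sketch of the general idea: shrink a JSON tool output while keeping error items, and cache the original so the step can be undone. This is not Headroom's API. `crush_tool_output`, `retrieve_original`, the `status` field, and the in-memory `_ORIGINALS` store are all hypothetical names chosen for illustration.

```python
import hashlib
import json
from typing import Dict, Optional

# Hypothetical in-memory store for originals; not Headroom's actual cache.
_ORIGINALS: Dict[str, str] = {}


def _is_error(item: object) -> bool:
    # Assumption: error records are dicts carrying status == "error".
    return isinstance(item, dict) and item.get("status") == "error"


def crush_tool_output(raw: str, keep: int = 2) -> str:
    """Shrink a JSON array: keep every error item plus the first `keep` others."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return raw  # parse failure: pass the text through unchanged
    if not isinstance(items, list) or len(items) <= keep:
        return raw  # not an array, or too short to be worth compressing

    ref = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]
    _ORIGINALS[ref] = raw  # cache the original so compression is reversible

    errors = [it for it in items if _is_error(it)]
    others = [it for it in items if not _is_error(it)][:keep]
    omitted = len(items) - len(errors) - len(others)
    return json.dumps({"kept": errors + others, "omitted": omitted, "ref": ref})


def retrieve_original(ref: str) -> Optional[str]:
    """Return the uncompressed text for a reference, if it is still cached."""
    return _ORIGINALS.get(ref)
```

The `ref` field in the compressed payload is what would let a model ask for the full data later. A real system would also need eviction and a persistent store, both of which this sketch leaves out.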

Safety Guarantees

  • Never removes human content
  • Never breaks tool call ordering
  • Parse failures pass through unchanged
  • LLM can always retrieve original data

Getting Help