# Getting Started with Headroom
This guide will help you get up and running with Headroom in under 5 minutes.
## Installation
```bash
# Core package (minimal dependencies)
pip install headroom

# With proxy server (quotes keep shells such as zsh from globbing the brackets)
pip install "headroom[proxy]"

# With semantic relevance (for smarter compression)
pip install "headroom[relevance]"

# Everything
pip install "headroom[all]"
```
## Quick Start: Proxy Mode (Recommended)
The easiest way to use Headroom is as a proxy server:
```bash
# Start the proxy
headroom proxy --port 8787
```
Then point your LLM client at it:
```bash
# Claude Code
ANTHROPIC_BASE_URL=http://localhost:8787 claude
# OpenAI-compatible clients
OPENAI_BASE_URL=http://localhost:8787/v1 your-app
```
That's it! All your requests now go through Headroom and get optimized automatically.
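For clients configured in code rather than through environment variables, the same redirection comes down to changing the base URL. The sketch below uses only the standard library to build (not send) an OpenAI-style chat request aimed at the proxy. It assumes the proxy forwards the standard `/chat/completions` path under `/v1`; `build_chat_request` is a hypothetical helper for illustration, not part of Headroom.

```python
import json
import urllib.request

# Assumption: the proxy started above is listening on localhost:8787.
PROXY_BASE_URL = "http://localhost:8787/v1"


def build_chat_request(base_url, api_key, model, messages):
    """Build (but do not send) an OpenAI-style chat completion request."""
    url = base_url.rstrip("/") + "/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_chat_request(
    PROXY_BASE_URL,
    api_key="sk-placeholder",  # placeholder, not a real key
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it, which only succeeds while the proxy is running and a valid API key is supplied.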
## Quick Start: Python SDK
If you want programmatic control:
```python
from headroom import HeadroomClient
from openai import OpenAI

# Create a wrapped client
client = HeadroomClient(
    original_client=OpenAI(),
    default_mode="optimize",
)

# Use exactly like the original
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
```
## Modes
### Audit Mode
Observe without modifying:
```python
client = HeadroomClient(
    original_client=OpenAI(),
    default_mode="audit",
)
# Logs metrics but doesn't change requests
```
### Optimize Mode
Apply transforms to reduce tokens:
```python
client = HeadroomClient(
    original_client=OpenAI(),
    default_mode="optimize",
)
# Compresses tool outputs, aligns cache prefixes, etc.
```
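One way to move from audit to optimize without touching code is to read the mode from the environment. This is a minimal sketch under stated assumptions: the `HEADROOM_MODE` variable and the `resolve_mode` helper are hypothetical names introduced here, not something Headroom is known to provide.

```python
import os

# The two values this guide shows for default_mode.
KNOWN_MODES = ("audit", "optimize")


def resolve_mode(environ=None, default="audit"):
    """Return the mode named by HEADROOM_MODE (hypothetical variable), or the default."""
    env = os.environ if environ is None else environ
    mode = env.get("HEADROOM_MODE", default).strip().lower()
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {KNOWN_MODES}")
    return mode


print(resolve_mode({}))
print(resolve_mode({"HEADROOM_MODE": "Optimize"}))
```

The result can then be passed as `default_mode=resolve_mode()` when constructing `HeadroomClient`. Defaulting to `"audit"` means a missing variable never modifies requests.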
### Simulate Mode
Preview what optimizations would do:
```python
# Reuses the HeadroomClient instance (`client`) created earlier
plan = client.chat.completions.simulate(
    model="gpt-4o",
    messages=[...],
)
print(f"Would save {plan.tokens_saved} tokens")
print(f"Transforms: {plan.transforms_applied}")
```
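A simulated plan can also drive a simple go/no-go check before switching modes. The sketch below is illustrative only: `worth_optimizing` and the 500-token threshold are assumptions, and a `SimpleNamespace` with made-up numbers stands in for a real plan object. It relies only on the `tokens_saved` and `transforms_applied` attributes shown above.

```python
from types import SimpleNamespace


def worth_optimizing(plan, min_tokens_saved=500):
    """Return True when the simulated saving meets the (arbitrary) threshold."""
    return plan.tokens_saved >= min_tokens_saved


# Stand-in for a simulate() result; the values are invented for illustration.
example_plan = SimpleNamespace(
    tokens_saved=1200,
    transforms_applied=["example_transform"],
)

if worth_optimizing(example_plan):
    print(f"Enable optimize mode: saves {example_plan.tokens_saved} tokens")
else:
    print("Saving too small; stay in audit mode")
```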
## Next Steps
- [Proxy Server Documentation](proxy.md) - Configure the proxy
- [Transforms Reference](transforms.md) - Understand each transform
- [API Reference](api.md) - Full API documentation