chopratejas committed on Apr 20

Commit

7296d66

1 Parent(s): 3ec47c3

fix: bundle ast-grep/difftastic/scc + generic tool_result interceptor framework

What this does, in plain terms:

Headroom's proxy now ships with three CLI tools (ast-grep, difftastic,
scc) that it can use to shrink tool_result payloads before they reach
the model. The goal is simple: when Claude Code (or Codex, Aider, etc.)
asks the model to reason about a big file or diff, we swap the verbose
output for a compact, same-meaning version. Fewer tokens per turn, same
answers, lower bill.

Today a single interceptor is wired: ast-grep on Read. When an agent
reads a large code file, the proxy replaces the file body with an
outline of its top-level functions/classes plus docstrings. In live
tests this cut prompt tokens by 74–76% on both OpenAI and Anthropic,
with the same answers either way.

How it works:
- `pip install headroom-ai` now installs ast-grep via a PyPI wheel
(core dep). difftastic and scc are fetched once at proxy startup
from pinned upstream GitHub releases and cached per-user.
- A generic registry (`headroom/proxy/interceptors/`) lets us add more
tool-aware rewrites in one file each: declare `matches()` and
`transform()`, call `register()`, done. No proxy or metrics plumbing
per tool.
- Safety rails built in: pass-through when a Read specifies a line
range; second Read of the same file in a conversation returns full
content (progressive disclosure); any failing interceptor logs and
skips, never crashes a request.
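
The registry pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in `headroom/proxy/interceptors/base.py` (which is not shown in this part of the diff): the `Interceptor` dataclass, the `apply()` helper, the toy outline function, and the use of character count as a stand-in for token count are all assumptions made for the example. Progressive disclosure on a second Read is not modeled.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("interceptor_sketch")


@dataclass(frozen=True)
class Interceptor:
    """Hypothetical shape: a name plus the two callables the commit mentions."""

    name: str
    matches: Callable[[str, dict, str], bool]
    transform: Callable[[str, dict, str], str]


REGISTRY: list[Interceptor] = []


def register(interceptor: Interceptor) -> None:
    """Add an interceptor to the module-level registry."""
    REGISTRY.append(interceptor)


def apply(tool_name: str, tool_input: dict, text: str) -> tuple[str, list[str]]:
    """Return (possibly rewritten text, labels of interceptors that fired)."""
    applied: list[str] = []
    for it in REGISTRY:
        try:
            if not it.matches(tool_name, tool_input, text):
                continue
            rewritten = it.transform(tool_name, tool_input, text)
        except Exception:
            # A failing interceptor is logged and skipped, never fatal.
            logger.exception("interceptor %s failed; passing through", it.name)
            continue
        # Keep the rewrite only if it is actually smaller. Character count
        # stands in for a real token count in this sketch.
        if len(rewritten) < len(text):
            text = rewritten
            applied.append(f"interceptor:{it.name}")
    return text, applied


def _is_big_unranged_read(tool_name: str, tool_input: dict, text: str) -> bool:
    # Pass through when the Read targets a line range (assumed key names).
    ranged = any(key in tool_input for key in ("offset", "limit"))
    return tool_name == "Read" and not ranged and len(text) >= 500


def _keep_def_lines(tool_name: str, tool_input: dict, text: str) -> str:
    # Toy outline: keep only top-level def/class lines (no ast-grep involved).
    kept = [ln for ln in text.splitlines() if ln.startswith(("def ", "class "))]
    return "\n".join(kept)


register(Interceptor("toy-outline", _is_big_unranged_read, _keep_def_lines))
```

Under these assumptions, adding another rewrite means writing one more pair of functions and one more `register()` call; the loop in `apply()` does not change.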

Opt-in for now:
- Off by default while this ships. Turn on with
`headroom proxy --intercept-tool-results` or
`HEADROOM_INTERCEPT_ENABLED=1`, so we can measure before flipping
defaults.

What users see after turning it on:
- First `headroom wrap claude` boot is ~5s longer (binaries fetched).
Every subsequent run is cache-only.
- Existing `transforms_applied` field in metrics gets entries like
`interceptor:ast-grep`, so savings show up in current dashboards
and HTML reports with no UI change.

Other housekeeping in this PR:
- uv.lock moved to .gitignore — regenerated locally per environment.
- 35 unit + integration tests, ruff + mypy clean.
- Dead-code audit done: removed `binaries.run()`, `needs_filesystem`
plumbing, unused `_kind` tuple elements, unused `tool_output`
parameter, and the never-set HEADROOM_SKIP_TOOLS_BOOTSTRAP env.

Files changed (14)

.gitignore +3 -0
headroom/binaries.py +494 -0
headroom/cli/main.py +1 -0
headroom/cli/proxy.py +21 -0
headroom/cli/tools.py +226 -0
headroom/proxy/interceptors/__init__.py +32 -0
headroom/proxy/interceptors/astgrep.py +246 -0
headroom/proxy/interceptors/base.py +261 -0
headroom/tools.json +89 -0
headroom/transforms/pipeline.py +12 -0
pyproject.toml +1 -0
tests/test_binaries.py +281 -0
tests/test_bundled_tools_savings.py +367 -0
tests/test_tool_result_interceptors.py +400 -0

.gitignore CHANGED

@@ -213,3 +213,6 @@ headroom-managed/
 # Release metadata artifact
 .releaseetadata

+# uv lockfile: regenerated locally; not committed
+uv.lock

headroom/binaries.py ADDED

	@@ -0,0 +1,494 @@

+"""Fetcher for bundled CLI tool binaries.
+`pip install headroom-ai` pulls `ast-grep-cli` as a proper PyPI binary wheel
+(core dependency), so ast-grep is always on PATH. The other two high-value
+tools — `difft` (difftastic) and `scc` — are fetched from pinned upstream
+GitHub releases at proxy startup, verified, cached per-user, and exec'd.
+Supported platforms: linux (glibc + musl) x86_64/aarch64, macOS x86_64/arm64,
+Windows x86_64. Unsupported platforms raise PlatformNotSupported; callers in
+the compression pipeline should fall back to their non-accelerated path.
+Env vars:
+    HEADROOM_BINARIES_MIRROR   base URL that replaces https://github.com
+    HEADROOM_BINARIES_CACHE    override cache dir
+    HEADROOM_BINARIES_OFFLINE  if set, never reach the network
+"""
+from __future__ import annotations
+import functools
+import hashlib
+import json
+import os
+import platform
+import shutil
+import subprocess
+import sys
+import tarfile
+import tempfile
+import urllib.error
+import urllib.request
+import zipfile
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+__all__ = [
+    "BinaryError",
+    "BinaryFetchError",
+    "PlatformNotSupported",
+    "Sha256Mismatch",
+    "OfflineError",
+    "PlatformKey",
+    "detect_platform",
+    "cache_dir",
+    "resolve",
+    "which",
+    "status",
+    "ensure_tools",
+]
+# ---------- Exceptions ---------------------------------------------------- #
+class BinaryError(Exception):
+    """Base exception for the binaries module."""
+class BinaryFetchError(BinaryError):
+    """Raised when a download fails or an archive cannot be extracted."""
+class PlatformNotSupported(BinaryError):
+    """Raised when the current OS/arch is not covered by a tool's registry."""
+class Sha256Mismatch(BinaryError):
+    """Raised when a downloaded asset's SHA256 does not match the pin."""
+class OfflineError(BinaryError):
+    """Raised when a network fetch is required but HEADROOM_BINARIES_OFFLINE is set."""
+# ---------- Platform detection -------------------------------------------- #
+@dataclass(frozen=True)
+class PlatformKey:
+    os: str  # "linux" | "darwin" | "windows"
+    arch: str  # "x86_64" | "aarch64"
+    libc: str  # "gnu" | "musl" | "n/a"
+    def key(self) -> str:
+        # Compact form used as registry lookup key and cache subdirectory.
+        if self.os == "linux":
+            return f"{self.os}-{self.arch}-{self.libc}"
+        return f"{self.os}-{self.arch}"
+def _machine_to_arch(machine: str) -> str:
+    m = machine.lower()
+    if m in ("x86_64", "amd64"):
+        return "x86_64"
+    if m in ("aarch64", "arm64"):
+        return "aarch64"
+    return m  # return as-is; lookup will fail cleanly with PlatformNotSupported
+def _is_musl() -> bool:
+    # Best-effort musl detection on Linux. Never raises.
+    try:
+        out = subprocess.run(
+            ["ldd", "--version"],
+            capture_output=True,
+            text=True,
+            timeout=2,
+            check=False,
+        )
+        return "musl" in (out.stdout + out.stderr).lower()
+    except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
+        return False
+@functools.lru_cache(maxsize=1)
+def detect_platform() -> PlatformKey:
+    arch = _machine_to_arch(platform.machine())
+    if sys.platform.startswith("linux"):
+        return PlatformKey("linux", arch, "musl" if _is_musl() else "gnu")
+    if sys.platform == "darwin":
+        return PlatformKey("darwin", arch, "n/a")
+    if sys.platform.startswith("win"):
+        return PlatformKey("windows", arch, "n/a")
+    return PlatformKey(sys.platform, arch, "n/a")
+# ---------- Cache dir ----------------------------------------------------- #
+def cache_dir() -> Path:
+    override = os.environ.get("HEADROOM_BINARIES_CACHE")
+    if override:
+        return Path(override).expanduser().resolve()
+    if sys.platform.startswith("win"):
+        base = os.environ.get("LOCALAPPDATA") or str(Path.home() / "AppData" / "Local")
+        return Path(base) / "headroom" / "bin"
+    if sys.platform == "darwin":
+        return Path.home() / "Library" / "Caches" / "headroom" / "bin"
+    xdg = os.environ.get("XDG_CACHE_HOME")
+    base = Path(xdg) if xdg else Path.home() / ".cache"
+    return base / "headroom" / "bin"
+# ---------- Registry ------------------------------------------------------ #
+_REGISTRY_PATH = Path(__file__).parent / "tools.json"
+@functools.lru_cache(maxsize=1)
+def _registry() -> dict[str, Any]:
+    with _REGISTRY_PATH.open("r", encoding="utf-8") as f:
+        data: dict[str, Any] = json.load(f)
+    return data
+def _tool_entry(tool: str) -> dict[str, Any]:
+    reg = _registry()
+    tools: dict[str, Any] = reg.get("tools", {})
+    if tool not in tools:
+        raise KeyError(f"unknown tool {tool!r}; known: {sorted(tools)}")
+    entry: dict[str, Any] = tools[tool]
+    return entry
+def _is_pypi_tool(tool: str) -> bool:
+    entry = _tool_entry(tool)
+    return entry.get("version") == "pypi" or not entry.get("assets")
+def _asset_for_platform(tool: str, plat: PlatformKey) -> dict[str, Any]:
+    entry = _tool_entry(tool)
+    if _is_pypi_tool(tool):
+        raise PlatformNotSupported(
+            f"{tool}: distributed via PyPI only; `pip install headroom-ai` "
+            f"should have placed `{entry.get('binary', tool)}` on PATH."
+        )
+    assets: dict[str, Any] = entry.get("assets", {})
+    asset: dict[str, Any] | None = assets.get(plat.key())
+    if asset is None:
+        supported = sorted(assets.keys())
+        raise PlatformNotSupported(
+            f"{tool}: no prebuilt binary for {plat.key()}; supported: {supported}"
+        )
+    return asset
+def _mirror_url(url: str) -> str:
+    mirror = os.environ.get("HEADROOM_BINARIES_MIRROR")
+    if not mirror:
+        return url
+    # Only substitute the github.com host so that paths remain intact.
+    for prefix in ("https://github.com", "https://objects.githubusercontent.com"):
+        if url.startswith(prefix):
+            return mirror.rstrip("/") + url[len(prefix) :]
+    return url
+# ---------- Download + verify --------------------------------------------- #
+def _download(url: str, dest: Path, *, progress: bool = True) -> None:
+    if os.environ.get("HEADROOM_BINARIES_OFFLINE"):
+        raise OfflineError(f"offline mode (HEADROOM_BINARIES_OFFLINE=1) but fetch required: {url}")
+    dest.parent.mkdir(parents=True, exist_ok=True)
+    final_url = _mirror_url(url)
+    req = urllib.request.Request(final_url, headers={"User-Agent": "headroom-binaries/1"})
+    try:
+        with urllib.request.urlopen(req, timeout=60) as resp:  # noqa: S310 (https)
+            total = int(resp.headers.get("Content-Length") or 0)
+            _stream_to(resp, dest, total, label=dest.name, show_progress=progress)
+    except urllib.error.URLError as e:
+        raise BinaryFetchError(f"failed to download {final_url}: {e}") from e
+def _stream_to(src: Any, dest: Path, total: int, *, label: str, show_progress: bool) -> None:
+    # Rich progress if available and stderr is a tty; otherwise silent chunked copy.
+    try:
+        if show_progress and sys.stderr.isatty():
+            from rich.progress import (
+                BarColumn,
+                DownloadColumn,
+                Progress,
+                TextColumn,
+                TimeRemainingColumn,
+                TransferSpeedColumn,
+            )
+            with Progress(
+                TextColumn("[bold blue]{task.description}"),
+                BarColumn(),
+                DownloadColumn(),
+                TransferSpeedColumn(),
+                TimeRemainingColumn(),
+            ) as prog:
+                task = prog.add_task(label, total=total or None)
+                with dest.open("wb") as out:
+                    while chunk := src.read(1024 * 64):
+                        out.write(chunk)
+                        prog.update(task, advance=len(chunk))
+            return
+    except ImportError:
+        pass
+    with dest.open("wb") as out:
+        shutil.copyfileobj(src, out)
+def _sha256_file(path: Path) -> str:
+    h = hashlib.sha256()
+    with path.open("rb") as f:
+        for chunk in iter(lambda: f.read(1024 * 64), b""):
+            h.update(chunk)
+    return h.hexdigest()
+def _verify_sha256(path: Path, expected: str | None) -> None:
+    if not expected:
+        # Upstream release not SHA-pinned in registry. We trusted HTTPS + the
+        # GitHub CDN for the download; log nothing here — `doctor` surfaces.
+        return
+    got = _sha256_file(path)
+    if got.lower() != expected.lower():
+        path.unlink(missing_ok=True)
+        raise Sha256Mismatch(f"sha256 mismatch for {path.name}: expected {expected}, got {got}")
+# ---------- Archive extraction ------------------------------------------- #
+def _extract(archive: Path, member: str, dest: Path) -> None:
+    """Extract `member` from archive into `dest` (single-file binary)."""
+    dest.parent.mkdir(parents=True, exist_ok=True)
+    name = archive.name.lower()
+    try:
+        if name.endswith(".tar.gz") or name.endswith(".tgz"):
+            with tarfile.open(archive, "r:gz") as tf:
+                _extract_member_from_tar(tf, member, dest)
+        elif name.endswith(".zip"):
+            with zipfile.ZipFile(archive) as zf:
+                _extract_member_from_zip(zf, member, dest)
+        elif name.endswith(".gz") and "." not in name[:-3]:
+            # bare .gz of a single binary
+            import gzip
+            with gzip.open(archive, "rb") as gz, dest.open("wb") as out:
+                shutil.copyfileobj(gz, out)
+        else:
+            # Not an archive — treat the downloaded file itself as the binary.
+            shutil.copy2(archive, dest)
+    except (tarfile.TarError, zipfile.BadZipFile, OSError) as e:
+        raise BinaryFetchError(f"failed to extract {archive.name}: {e}") from e
+def _extract_member_from_tar(tf: tarfile.TarFile, member: str, dest: Path) -> None:
+    # Match by basename so that registries can specify "difft" even though the
+    # upstream tar may include a leading directory like "difft-0.64.0/difft".
+    wanted = member.lower()
+    for m in tf.getmembers():
+        base = m.name.rsplit("/", 1)[-1].lower()
+        if base == wanted and m.isfile():
+            extracted = tf.extractfile(m)
+            if extracted is None:
+                continue
+            with dest.open("wb") as out:
+                shutil.copyfileobj(extracted, out)
+            return
+    raise BinaryFetchError(f"archive did not contain expected member {member!r}")
+def _extract_member_from_zip(zf: zipfile.ZipFile, member: str, dest: Path) -> None:
+    wanted = member.lower()
+    for info in zf.infolist():
+        base = info.filename.rsplit("/", 1)[-1].lower()
+        if base == wanted and not info.is_dir():
+            with zf.open(info) as src, dest.open("wb") as out:
+                shutil.copyfileobj(src, out)
+            return
+    raise BinaryFetchError(f"archive did not contain expected member {member!r}")
+# ---------- Public API ---------------------------------------------------- #
+def _binary_name(tool: str, plat: PlatformKey) -> str:
+    entry = _tool_entry(tool)
+    base = entry.get("binary", tool)
+    return f"{base}.exe" if plat.os == "windows" else base
+def _cached_path(tool: str, version: str, plat: PlatformKey) -> Path:
+    return cache_dir() / f"{tool}-{version}-{plat.key()}" / _binary_name(tool, plat)
+def _in_registry(tool: str) -> bool:
+    return tool in _registry().get("tools", {})
+def _path_lookup(tool: str) -> Path | None:
+    """Find `tool` on PATH or in this interpreter's Scripts/bin directory.
+    PyPI binary wheels (e.g. ast-grep-cli) install their console scripts into
+    sys.prefix/bin (or sys.prefix/Scripts on Windows). That directory is on
+    PATH when the venv is activated, but subprocesses started by a non-active
+    interpreter can miss it, so we check it explicitly as a fallback.
+    """
+    candidates = [tool]
+    if _in_registry(tool):
+        alias = _tool_entry(tool).get("binary")
+        if alias and alias != tool:
+            candidates.append(alias)
+    for name in candidates:
+        found = shutil.which(name)
+        if found:
+            return Path(found)
+    scripts_dir = Path(sys.prefix) / ("Scripts" if sys.platform.startswith("win") else "bin")
+    for name in candidates:
+        exe = scripts_dir / (name + (".exe" if sys.platform.startswith("win") else ""))
+        if exe.exists():
+            return exe
+    return None
+def which(tool: str) -> Path | None:
+    """Return a path to `tool` if it is on PATH or already cached, else None.
+    Never triggers a network fetch. Callers that want the tool to be installed
+    on demand should use `resolve()` instead.
+    """
+    on_path = _path_lookup(tool)
+    if on_path:
+        return on_path
+    if not _in_registry(tool):
+        return None
+    try:
+        plat = detect_platform()
+        _asset_for_platform(tool, plat)  # raises if unsupported
+    except PlatformNotSupported:
+        return None
+    path = _cached_path(tool, _tool_entry(tool)["version"], plat)
+    return path if path.exists() else None
+def resolve(tool: str) -> Path:
+    """Return a path to the tool binary, fetching it on first use.
+    Raises PlatformNotSupported if the tool is unavailable on this platform,
+    OfflineError if a fetch is required but HEADROOM_BINARIES_OFFLINE is set,
+    Sha256Mismatch if verification fails, BinaryFetchError on other IO errors.
+    """
+    on_path = _path_lookup(tool)
+    if on_path:
+        return on_path
+    if not _in_registry(tool):
+        raise KeyError(f"unknown tool {tool!r}")
+    plat = detect_platform()
+    entry = _tool_entry(tool)
+    asset = _asset_for_platform(tool, plat)
+    version = entry["version"]
+    binary_path = _cached_path(tool, version, plat)
+    if binary_path.exists():
+        return binary_path
+    # Not cached — fetch, verify, extract.
+    url = asset["url"]
+    sha256 = asset.get("sha256")
+    member = asset.get("member", _binary_name(tool, plat))
+    with tempfile.TemporaryDirectory(prefix="headroom-fetch-") as tmp:
+        tmp_dir = Path(tmp)
+        download_path = tmp_dir / Path(url).name
+        _download(url, download_path)
+        _verify_sha256(download_path, sha256)
+        staging = tmp_dir / "out"
+        _extract(download_path, member, staging)
+        binary_path.parent.mkdir(parents=True, exist_ok=True)
+        # Atomic-ish move: write to sibling then rename.
+        tmp_final = binary_path.with_suffix(binary_path.suffix + ".partial")
+        shutil.move(str(staging), tmp_final)
+        try:
+            tmp_final.chmod(0o755)
+        except OSError:
+            pass  # Windows or restricted FS — .exe is already executable
+        os.replace(tmp_final, binary_path)
+    return binary_path
+def ensure_tools(quiet: bool = False) -> dict[str, Path | None]:
+    """Install every tool in the registry if missing. Safe to call repeatedly.
+    Called at proxy startup and on first `headroom` CLI invocation so that no
+    tool fetch ever happens inside a live request. Skips tools that are on
+    PATH, already cached, or distributed via PyPI-only (ast-grep).
+    Returns a map of tool_name -> resolved Path (or None if unsupported).
+    Never raises; unsupported platforms or offline errors are logged via
+    stderr and the tool is skipped.
+    """
+    out: dict[str, Path | None] = {}
+    for name in _registry().get("tools", {}):
+        try:
+            if _is_pypi_tool(name):
+                # ast-grep ships via pip; just record whether it's on PATH.
+                out[name] = _path_lookup(name)
+                continue
+            out[name] = resolve(name)
+        except (PlatformNotSupported, OfflineError, BinaryFetchError, Sha256Mismatch) as e:
+            out[name] = None
+            if not quiet:
+                print(f"headroom: skipping {name}: {e}", file=sys.stderr)
+    return out
+def status() -> list[dict[str, Any]]:
+    """Return a list of status dicts for every tool in the registry.
+    Used by `headroom tools doctor`. Never fetches — only inspects.
+    """
+    out: list[dict[str, Any]] = []
+    plat = detect_platform()
+    for name, entry in _registry().get("tools", {}).items():
+        row: dict[str, Any] = {
+            "tool": name,
+            "version": entry.get("version"),
+            "platform": plat.key(),
+            "source": entry.get("source", "fetched"),
+            "path": None,
+            "state": "missing",
+        }
+        # Honor PATH and this interpreter's Scripts/bin directory, using the
+        # same lookup as which()/resolve() so `doctor` agrees with them.
+        on_path = _path_lookup(name)
+        if on_path:
+            row["path"] = str(on_path)
+            row["state"] = "on-path"
+            out.append(row)
+            continue
+        try:
+            _asset_for_platform(name, plat)
+        except PlatformNotSupported as e:
+            row["state"] = "unsupported-platform"
+            row["detail"] = str(e)
+            out.append(row)
+            continue
+        cached = _cached_path(name, entry["version"], plat)
+        if cached.exists():
+            row["path"] = str(cached)
+            row["state"] = "cached"
+        out.append(row)
+    return out
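
As a standalone illustration of the download-verification step in `resolve()`, the sketch below hashes a file in 64 KiB chunks and compares it with a pinned digest. It mirrors the behavior visible in `_sha256_file` and `_verify_sha256` above (case-insensitive comparison, file deleted on mismatch, silent skip when no pin exists), but the names `DigestMismatch`, `sha256_of`, and `verify_pinned`, and the boolean return value, are hypothetical and exist only for this example.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


class DigestMismatch(Exception):
    """Stand-in for the module's Sha256Mismatch (name is hypothetical)."""


def sha256_of(path: Path, chunk_size: int = 64 * 1024) -> str:
    """Hash a file in fixed-size chunks so large assets never sit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_pinned(path: Path, expected: str | None) -> bool:
    """Return True if a pin was checked, False if the asset has no pin.

    On mismatch the file is removed before raising, so a bad download
    cannot be picked up by a later step.
    """
    if not expected:
        return False
    got = sha256_of(path)
    if got.lower() != expected.lower():
        path.unlink(missing_ok=True)
        raise DigestMismatch(f"{path.name}: expected {expected}, got {got}")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        asset = Path(tmp) / "asset.bin"
        asset.write_bytes(b"example payload")
        pin = hashlib.sha256(b"example payload").hexdigest()
        print(verify_pinned(asset, pin))
```

Note that an asset without a pin is accepted on the strength of HTTPS alone, which is why the return value in this sketch distinguishes "verified" from "not pinned".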

headroom/cli/main.py CHANGED

@@ -42,6 +42,7 @@ def _register_commands() -> None:
         mcp,  # noqa: F401
         perf,  # noqa: F401
         proxy,  # noqa: F401
+        tools,  # noqa: F401
         wrap,  # noqa: F401
     )

headroom/cli/proxy.py CHANGED

@@ -44,6 +44,14 @@ from .main import main
         "Legacy aliases are accepted. Default: token. Env: HEADROOM_MODE"
     ),
 )
+@click.option(
+    "--intercept-tool-results",
+    is_flag=True,
+    help=(
+        "Opt in to tool_result interceptors (ast-grep Read outliner, etc.). "
+        "Off by default while this feature ships."
+    ),
+)
 @click.option("--no-optimize", is_flag=True, help="Disable optimization (passthrough mode)")
 @click.option("--no-cache", is_flag=True, help="Disable semantic caching")
 @click.option("--no-rate-limit", is_flag=True, help="Disable rate limiting")
@@ -211,6 +219,7 @@ def proxy(
     mode: str | None,
     host: str,
     port: int,
+    intercept_tool_results: bool,
     no_optimize: bool,
     no_cache: bool,
     no_rate_limit: bool,
@@ -267,6 +276,18 @@ def proxy(
         click.echo(f"Details: {e}")
         raise SystemExit(1) from None
+    # Ensure bundled CLI tools (ast-grep, difftastic, scc) are present before
+    # the proxy starts accepting traffic. Never happens inside a live request;
+    # tools are downloaded once at startup if missing, then cached per-user.
+    from headroom.binaries import ensure_tools
+    ensure_tools()
+    # Opt-in: turn on tool_result interceptors (ast-grep Read outline, etc.).
+    # The TransformPipeline reads this env var at construction time.
+    if intercept_tool_results:
+        os.environ["HEADROOM_INTERCEPT_ENABLED"] = "1"
     # Resolve API URL overrides: CLI flag > env var > None
     effective_anthropic_api_url = anthropic_api_url or os.environ.get("ANTHROPIC_TARGET_API_URL")
     effective_openai_api_url = openai_api_url or os.environ.get("OPENAI_TARGET_API_URL")
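
The flag-to-environment handoff in the last hunk can be illustrated in isolation. In the sketch below, `propagate_flag` mirrors what the hunk does (set `HEADROOM_INTERCEPT_ENABLED` to "1" when the flag is passed), while `interception_enabled` is only a guess at how a consumer might parse the variable: the actual parsing lives in `headroom/transforms/pipeline.py`, which is not shown here, and the accepted truthy values are an assumption.

```python
from __future__ import annotations

from typing import Mapping, MutableMapping

ENV_VAR = "HEADROOM_INTERCEPT_ENABLED"


def propagate_flag(flag: bool, env: MutableMapping[str, str]) -> None:
    """Mirror the CLI hunk: the flag can switch the feature on, never off."""
    if flag:
        env[ENV_VAR] = "1"


def interception_enabled(env: Mapping[str, str]) -> bool:
    """Hypothetical reader; the set of truthy spellings is an assumption."""
    return env.get(ENV_VAR, "").strip().lower() in {"1", "true", "yes", "on"}


if __name__ == "__main__":
    env: dict[str, str] = {}
    propagate_flag(True, env)
    print(interception_enabled(env))
```

One consequence of this design is that omitting the flag does not disable interception when the variable is already set in the environment; the flag only adds to what the environment says.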

headroom/cli/tools.py ADDED

	@@ -0,0 +1,226 @@

+"""CLI: passthrough subcommands for bundled tools and a `tools` management group.
+Exposes:
+    headroom sg …           ->  ast-grep (from the ast-grep-cli PyPI wheel)
+    headroom diff A B …     ->  difftastic
+    headroom loc [PATH] …   ->  scc
+    headroom tools install  ->  pre-fetch all bundled binaries
+    headroom tools doctor   ->  print a status table
+    headroom tools list     ->  show the registry
+The passthrough commands forward every argument, stdin, stdout, stderr, and
+the exit code verbatim, so agents can invoke them via their existing shell
+tool without any Headroom-specific protocol.
+"""
+from __future__ import annotations
+import os
+import subprocess
+import sys
+from collections.abc import Sequence
+import click
+from headroom import binaries
+from .main import main
+_PASSTHROUGH_CTX = {
+    "ignore_unknown_options": True,
+    "allow_extra_args": True,
+    "help_option_names": [],  # let the underlying tool handle --help
+}
+def _exec_tool(tool: str, argv: Sequence[str]) -> None:
+    try:
+        path = binaries.resolve(tool)
+    except binaries.PlatformNotSupported as e:
+        click.secho(f"error: {e}", fg="red", err=True)
+        sys.exit(2)
+    except binaries.OfflineError as e:
+        click.secho(
+            f"error: {e}\nHint: unset HEADROOM_BINARIES_OFFLINE and run "
+            f"`headroom tools install` while networked.",
+            fg="red",
+            err=True,
+        )
+        sys.exit(2)
+    except (binaries.Sha256Mismatch, binaries.BinaryFetchError) as e:
+        click.secho(f"error: {e}", fg="red", err=True)
+        sys.exit(2)
+    # Replace the current process on POSIX for correct signal handling;
+    # fall back to subprocess on Windows where os.execv is awkward.
+    cmd = [str(path), *argv]
+    if os.name == "posix":
+        os.execv(cmd[0], cmd)  # never returns
+    else:
+        completed = subprocess.run(cmd, check=False)
+        sys.exit(completed.returncode)
+@main.command(
+    "sg",
+    context_settings=_PASSTHROUGH_CTX,
+    short_help="Run ast-grep (AST-aware structural search/replace).",
+    add_help_option=False,
+)
+@click.argument("args", nargs=-1, type=click.UNPROCESSED)
+def sg_cmd(args: tuple[str, ...]) -> None:
+    """Forward every argument to ast-grep."""
+    _exec_tool("ast-grep", list(args))
+@main.command(
+    "diff",
+    context_settings=_PASSTHROUGH_CTX,
+    short_help="Run difftastic (structural diff).",
+    add_help_option=False,
+)
+@click.argument("args", nargs=-1, type=click.UNPROCESSED)
+def diff_cmd(args: tuple[str, ...]) -> None:
+    """Forward every argument to difftastic (`difft`)."""
+    _exec_tool("difft", list(args))
+@main.command(
+    "loc",
+    context_settings=_PASSTHROUGH_CTX,
+    short_help="Run scc (fast lines-of-code / repo-shape probe).",
+    add_help_option=False,
+)
+@click.argument("args", nargs=-1, type=click.UNPROCESSED)
+def loc_cmd(args: tuple[str, ...]) -> None:
+    """Forward every argument to scc."""
+    _exec_tool("scc", list(args))
+@main.group("tools")
+def tools_group() -> None:
+    """Manage bundled CLI tool binaries (ast-grep, difft, scc)."""
+@tools_group.command("list")
+def tools_list_cmd() -> None:
+    """Print the tool registry (versions, platforms, cache dir)."""
+    from rich.console import Console
+    from rich.table import Table
+    console = Console()
+    plat = binaries.detect_platform()
+    console.print(f"[dim]platform:[/dim] {plat.key()}")
+    console.print(f"[dim]cache:[/dim] {binaries.cache_dir()}")
+    table = Table(show_header=True, header_style="bold")
+    table.add_column("tool")
+    table.add_column("version")
+    table.add_column("source")
+    table.add_column("platforms")
+    reg = binaries._registry()  # noqa: SLF001 (intentional internal read)
+    for name, entry in reg.get("tools", {}).items():
+        platforms = ", ".join(sorted(entry.get("assets", {}).keys())) or "(pypi)"
+        table.add_row(name, str(entry.get("version")), entry.get("source", ""), platforms)
+    console.print(table)
+@tools_group.command("doctor")
+@click.option("--json", "emit_json", is_flag=True, help="Emit JSON instead of a table.")
+def tools_doctor_cmd(emit_json: bool) -> None:
+    """Check the status of every bundled tool."""
+    rows = binaries.status()
+    if emit_json:
+        import json as _json
+        click.echo(_json.dumps(rows, indent=2))
+        broken = any(r["state"] in ("missing", "unsupported-platform") for r in rows)
+        sys.exit(1 if broken else 0)
+    from rich.console import Console
+    from rich.table import Table
+    console = Console()
+    table = Table(show_header=True, header_style="bold")
+    for col in ("tool", "state", "version", "platform", "path"):
+        table.add_column(col)
+    state_style = {
+        "on-path": "green",
+        "cached": "green",
+        "missing": "yellow",
+        "unsupported-platform": "red",
+    }
+    broken = False
+    for r in rows:
+        style = state_style.get(r["state"], "white")
+        if r["state"] in ("missing", "unsupported-platform"):
+            broken = True
+        table.add_row(
+            r["tool"],
+            f"[{style}]{r['state']}[/{style}]",
+            str(r.get("version")),
+            r.get("platform", ""),
+            r.get("path") or "-",
+        )
+    console.print(table)
+    from rich.markup import escape as _escape
+    for r in rows:
+        if r.get("detail"):
+            console.print(f"[dim]{r['tool']}:[/dim] {_escape(r['detail'])}")
+    sys.exit(1 if broken else 0)
+@tools_group.command("install")
+@click.option(
+    "--tool",
+    "tools",
+    multiple=True,
+    help="Install only the named tool (repeatable). Default: all.",
+)
+@click.option(
+    "--force",
+    is_flag=True,
+    help="Re-fetch even if the binary is already cached.",
+)
+def tools_install_cmd(tools: tuple[str, ...], force: bool) -> None:
+    """Pre-fetch all bundled tool binaries into the per-user cache."""
+    reg = binaries._registry()  # noqa: SLF001
+    selected = list(tools) if tools else list(reg.get("tools", {}).keys())
+    exit_code = 0
+    for name in selected:
+        if name not in reg.get("tools", {}):
+            click.secho(f"unknown tool {name!r}; skipping", fg="yellow", err=True)
+            exit_code = 1
+            continue
+        if binaries._is_pypi_tool(name):  # noqa: SLF001
+            on_path = binaries._path_lookup(name)  # noqa: SLF001
+            if on_path:
+                click.echo(f"{name}: on PATH at {on_path} (pypi wheel)")
+            else:
+                click.secho(
+                    f"{name}: not on PATH — `pip install headroom-ai` should provide it",
+                    fg="yellow",
+                )
+                exit_code = 1
+            continue
+        if force:
+            plat = binaries.detect_platform()
+            try:
+                cached = binaries._cached_path(  # noqa: SLF001
+                    name, reg["tools"][name]["version"], plat
+                )
+                if cached.exists():
+                    cached.unlink()
+            except Exception:  # noqa: BLE001
+                pass
+        try:
+            path = binaries.resolve(name)
+            click.secho(f"{name}: installed → {path}", fg="green")
+        except binaries.PlatformNotSupported as e:
+            click.secho(f"{name}: {e}", fg="red")
+            exit_code = 1
+        except (binaries.BinaryFetchError, binaries.Sha256Mismatch, binaries.OfflineError) as e:
+            click.secho(f"{name}: {e}", fg="red")
+            exit_code = 1
+    sys.exit(exit_code)
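
The passthrough strategy used by `_exec_tool` (replace the process on POSIX, otherwise run a child and forward its exit code) can be tried out with the small sketch below. The helper name `passthrough` and its `replace_process` switch are hypothetical and added only so the subprocess path can be exercised from a script; the real command always calls `os.execv` on POSIX.

```python
from __future__ import annotations

import os
import subprocess
import sys
from typing import Sequence


def passthrough(cmd: Sequence[str], *, replace_process: bool = False) -> int:
    """Run `cmd` with inherited stdin/stdout/stderr and return its exit code.

    With replace_process=True on POSIX, the current process is replaced via
    os.execv, so signals reach the tool directly and this function does not
    return on success. os.execv does no PATH search, so cmd[0] must be a
    real path in that mode.
    """
    argv = list(cmd)
    if replace_process and os.name == "posix":
        os.execv(argv[0], argv)
    # Fallback: run as a child and hand back its exit code unchanged.
    return subprocess.run(argv, check=False).returncode


if __name__ == "__main__":
    # The Python interpreter stands in for a bundled binary.
    code = passthrough([sys.executable, "-c", "import sys; sys.exit(3)"])
    print(code)
```

Because the child inherits the parent's standard streams in both modes, an agent calling the wrapper sees the same output and exit status it would get from the underlying tool.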

headroom/proxy/interceptors/__init__.py ADDED

	@@ -0,0 +1,32 @@

+"""Tool-result interceptors.
+An interceptor rewrites a single tool_result's text before it reaches the
+model. Each interceptor is self-contained: declare a `matches()` predicate
+and a `transform()` function, register it in the `INTERCEPTORS` list, and
+the proxy pipeline will call it automatically.
+Adding a new interceptor later is one file plus one `register()` call — no
+proxy or metrics changes required.
+"""
+# Side-effect: register the built-in interceptors.
+from . import astgrep  # noqa: F401
+from .base import (
+    INTERCEPTORS,
+    InterceptionResult,
+    ToolResultInterceptor,
+    ToolResultInterceptorTransform,
+    TransformSpan,
+    apply_to_messages,
+    register,
+)
+__all__ = [
+    "INTERCEPTORS",
+    "InterceptionResult",
+    "ToolResultInterceptor",
+    "ToolResultInterceptorTransform",
+    "TransformSpan",
+    "apply_to_messages",
+    "register",
+]
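
As a sketch of the "one file plus one `register()` call" workflow described in the docstring above, a new interceptor might look roughly like this. `JsonMinifyInterceptor` is a hypothetical example and not part of this commit. `INTERCEPTORS` and `register` are local stand-ins that mirror `base.py`, so the snippet runs without Headroom installed; a real interceptor file would import them from the package instead.

```python
import json
from typing import Any, Optional

# Local stand-ins for headroom.proxy.interceptors.base (assumption: a real
# interceptor would import INTERCEPTORS/register from the package).
INTERCEPTORS: list = []


def register(interceptor: Any) -> None:
    """Add an interceptor to the registry; a repeated name is ignored."""
    if any(existing.name == interceptor.name for existing in INTERCEPTORS):
        return
    INTERCEPTORS.append(interceptor)


class JsonMinifyInterceptor:
    """Hypothetical interceptor: re-serialize pretty-printed JSON compactly."""

    name = "json-minify"

    def matches(self, tool_name: Optional[str], tool_input: dict, tool_output: str) -> bool:
        # Only consider outputs that look like a JSON object or array.
        return tool_output.lstrip().startswith(("{", "["))

    def transform(self, tool_name: Optional[str], tool_input: dict, tool_output: str) -> Optional[str]:
        try:
            parsed = json.loads(tool_output)
        except json.JSONDecodeError:
            return None  # pass through rather than raise
        compact = json.dumps(parsed, separators=(",", ":"))
        # Return None unless the rewrite is actually shorter.
        return compact if len(compact) < len(tool_output) else None


register(JsonMinifyInterceptor())
register(JsonMinifyInterceptor())  # second call is a no-op (same name)
```

The character-length check in `transform` is only a cheap pre-filter; in the framework the token-based "refuse to enlarge" guard in `apply_to_messages` has the final say.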

headroom/proxy/interceptors/astgrep.py ADDED

	@@ -0,0 +1,246 @@

+"""ast-grep interceptor: replace verbose Read outputs with function-level outlines.
+Matches Claude Code's `Read` tool (and equivalent) when the file is code and
+the output is large enough to benefit. Invokes ast-grep to locate top-level
+function and class definitions and emits a compact outline: each signature
+followed by an elided body marker. Falls back to the original text if
+ast-grep isn't available, the extension isn't supported, or ast-grep
+finds no definitions to outline.
+"""
+from __future__ import annotations
+import json
+import logging
+import os
+import subprocess
+import tempfile
+from pathlib import Path
+from typing import Any
+from headroom import binaries
+from . import base
+logger = logging.getLogger(__name__)
+# Latency floor: below this size, the subprocess cost of running ast-grep
+# isn't worth the tiny win. It is NOT a semantic threshold — the framework
+# rejects any rewrite that doesn't actually shrink tokens, so we don't need
+# a "big enough to matter" check here, only a "big enough to justify the
+# fork()" check.
+MIN_CHARS_TO_REWRITE = int(os.environ.get("HEADROOM_INTERCEPT_READ_MIN_CHARS", "500"))
+# Tool_input keys that indicate the model targeted a specific line range;
+# outlining would frustrate that intent and likely cause a re-read.
+_RANGE_KEYS = ("offset", "limit", "line_range", "start_line", "end_line", "ranges")
+# ast-grep --lang is passed these values; only extensions with a stable
+# grammar are included.
+_EXT_TO_LANG: dict[str, str] = {
+    ".py": "python",
+    ".ts": "typescript",
+    ".tsx": "tsx",
+    ".js": "javascript",
+    ".jsx": "jsx",
+    ".go": "go",
+    ".rs": "rust",
+    ".java": "java",
+    ".rb": "ruby",
+    ".c": "c",
+    ".h": "c",
+    ".cpp": "cpp",
+    ".cc": "cpp",
+    ".hpp": "cpp",
+}
+# Top-level declaration patterns per language. We emit the signature line
+# of whatever ast-grep matches here, so any pattern that anchors on a
+# declaration's starting line works.
+_PATTERNS: dict[str, list[str]] = {
+    "python": ["def $NAME", "class $NAME", "async def $NAME"],
+    "typescript": ["function $NAME", "class $NAME"],
+    "tsx": ["function $NAME", "class $NAME"],
+    "javascript": ["function $NAME", "class $NAME"],
+    "jsx": ["function $NAME", "class $NAME"],
+    "go": ["func $NAME"],
+    "rust": ["fn $NAME", "struct $NAME", "enum $NAME"],
+    "java": ["class $NAME", "interface $NAME"],
+    "ruby": ["def $NAME", "class $NAME"],
+    "c": ["$RET $NAME($$$ARGS) { $$$BODY }"],
+    "cpp": ["$RET $NAME($$$ARGS) { $$$BODY }"],
+}
+OUTLINE_MARKER = "    # ... (body elided by Headroom; Read a specific line range to see it)\n"
+class AstGrepReadOutline:
+    """Interceptor that outlines verbose code-file Read outputs."""
+    name = "ast-grep"
+    def matches(
+        self,
+        tool_name: str | None,
+        tool_input: dict[str, Any],
+        tool_output: str,
+    ) -> bool:
+        if tool_name not in ("Read", "read_file", "view", "cat"):
+            return False
+        if len(tool_output) < MIN_CHARS_TO_REWRITE:
+            return False
+        # Respect explicit line ranges — the model wants those specific lines.
+        if any(k in tool_input for k in _RANGE_KEYS):
+            return False
+        return _detect_lang_from_input(tool_input) is not None
+    def transform(
+        self,
+        tool_name: str | None,
+        tool_input: dict[str, Any],
+        tool_output: str,
+    ) -> str | None:
+        lang = _detect_lang_from_input(tool_input)
+        if not lang:
+            return None
+        try:
+            exe = binaries.resolve("ast-grep")
+        except binaries.PlatformNotSupported:
+            return None
+        matches = _run_ast_grep(exe, lang, tool_output)
+        if not matches:
+            return None
+        outline = _build_outline(matches, tool_output)
+        return outline if outline else None
+    def progressive_disclosure_key(
+        self,
+        tool_name: str | None,
+        tool_input: dict[str, Any],
+    ) -> str | None:
+        """Key by file_path so a second Read of the same file passes through."""
+        return _path_from_input(tool_input)
+def _detect_lang_from_input(tool_input: dict[str, Any]) -> str | None:
+    path = _path_from_input(tool_input)
+    if not path:
+        return None
+    ext = Path(path).suffix.lower()
+    return _EXT_TO_LANG.get(ext)
+def _path_from_input(tool_input: dict[str, Any]) -> str | None:
+    for key in ("file_path", "path", "filePath", "filename"):
+        v = tool_input.get(key)
+        if isinstance(v, str) and v:
+            return v
+    return None
+def _run_ast_grep(
+    exe: Path | str,
+    lang: str,
+    source: str,
+) -> list[dict[str, Any]]:
+    """Run ast-grep against `source` and return the JSON match records.
+    Writes `source` to a tempfile because ast-grep's CLI operates on files.
+    """
+    all_matches: list[dict[str, Any]] = []
+    patterns = _PATTERNS.get(lang, [])
+    if not patterns:
+        return []
+    # Use the canonical extension so ast-grep can pick the right grammar.
+    ext = next((e for e, L in _EXT_TO_LANG.items() if L == lang), ".txt")
+    with tempfile.NamedTemporaryFile(mode="w", suffix=ext, delete=False, encoding="utf-8") as tmp:
+        tmp.write(source)
+        tmp_path = tmp.name
+    try:
+        for pattern in patterns:
+            try:
+                completed = subprocess.run(
+                    [
+                        str(exe),
+                        "run",
+                        "--pattern",
+                        pattern,
+                        "--lang",
+                        lang,
+                        "--json=stream",
+                        tmp_path,
+                    ],
+                    capture_output=True,
+                    text=True,
+                    timeout=5,
+                    check=False,
+                )
+            except (subprocess.TimeoutExpired, OSError) as e:
+                logger.debug("ast-grep timed out or failed: %s", e)
+                continue
+            if completed.returncode != 0:
+                continue
+            for line in completed.stdout.splitlines():
+                line = line.strip()
+                if not line:
+                    continue
+                try:
+                    all_matches.append(json.loads(line))
+                except json.JSONDecodeError:
+                    continue
+    finally:
+        try:
+            Path(tmp_path).unlink()
+        except OSError:
+            pass
+    return all_matches
+def _build_outline(matches: list[dict[str, Any]], source: str) -> str | None:
+    """Build a compact outline from ast-grep matches.
+    Emits each definition's signature line + docstring (if next line is a
+    string literal) + an elision marker. Matches are sorted by byte offset
+    so the outline tracks the original file order.
+    """
+    lines = source.splitlines(keepends=True)
+    outline_chunks: list[str] = []
+    seen_starts: set[int] = set()
+    matches.sort(key=lambda m: m.get("range", {}).get("byteOffset", {}).get("start", 0))
+    for m in matches:
+        start = m.get("range", {}).get("start", {})
+        line_idx = start.get("line")
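+        if not isinstance(line_idx, int) or line_idx in seen_starts:
+            continue
+        # Bounds-check before recording, so the header count only includes
+        # definitions that were actually emitted.
+        if not 0 <= line_idx < len(lines):
+            continue
+        seen_starts.add(line_idx)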
+        signature_line = lines[line_idx].rstrip("\n")
+        outline_chunks.append(signature_line + "\n")
+        # Best-effort: if the next non-blank line is a docstring, keep it.
+        next_idx = line_idx + 1
+        while next_idx < len(lines) and not lines[next_idx].strip():
+            next_idx += 1
+        if next_idx < len(lines):
+            nl = lines[next_idx].lstrip()
+            if nl.startswith(('"""', "'''", "/**", "//", "#")):
+                outline_chunks.append(lines[next_idx])
+        outline_chunks.append(OUTLINE_MARKER)
+    if not outline_chunks:
+        return None
+    header = (
+        "[headroom: outlined by ast-grep — "
+        f"{len(seen_starts)} definition(s); "
+        "bodies elided. Re-read the file with a line range to see a specific body.]\n"
+    )
+    return header + "".join(outline_chunks)
+base.register(AstGrepReadOutline())
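
To illustrate the shape of the rewritten output, here is a simplified sketch of outline construction. The match records are hand-written stand-ins for ast-grep's JSON output and assume 0-based `range.start.line` values, as `_build_outline` above does. `build_outline`, `MARKER`, `SOURCE`, and `FAKE_MATCHES` are illustrative names, and docstring retention is left out for brevity.

```python
from typing import Any, Optional

MARKER = "    # ... (body elided)\n"  # shortened stand-in for OUTLINE_MARKER


def build_outline(matches: list, source: str) -> Optional[str]:
    """Keep each matched definition's first line, then an elision marker."""
    lines = source.splitlines(keepends=True)
    seen: set = set()
    chunks: list = []
    for m in sorted(matches, key=lambda m: m["range"]["start"]["line"]):
        idx = m["range"]["start"]["line"]
        if idx in seen or idx >= len(lines):
            continue  # skip duplicates and out-of-range records
        seen.add(idx)
        chunks.append(lines[idx].rstrip("\n") + "\n")
        chunks.append(MARKER)
    return "".join(chunks) or None


SOURCE = '''def load(path):
    with open(path) as fh:
        return fh.read()

class Parser:
    def parse(self, text):
        return text.split()
'''

# Hand-written records; real ones would come from `ast-grep run --json=stream`.
FAKE_MATCHES: list = [
    {"range": {"start": {"line": 4}}},  # class Parser:
    {"range": {"start": {"line": 0}}},  # def load(path):
    {"range": {"start": {"line": 5}}},  # def parse(self, text):
]

outline = build_outline(FAKE_MATCHES, SOURCE)
print(outline)
```

On a file this small the outline saves almost nothing, which is why the real interceptor has a size floor and the framework rejects rewrites that do not reduce the token count.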

headroom/proxy/interceptors/base.py ADDED

	@@ -0,0 +1,261 @@

+"""Protocol + registry + Transform adapter for tool_result interceptors."""
+from __future__ import annotations
+import logging
+from dataclasses import dataclass
+from typing import Any, Protocol, runtime_checkable
+from headroom.cache.compression_cache import (
+    _extract_tool_result_content,
+    _is_tool_result_message,
+    _swap_tool_result_content,
+)
+from headroom.config import TransformResult
+from headroom.tokenizer import Tokenizer
+from headroom.transforms.base import Transform
+logger = logging.getLogger(__name__)
+@runtime_checkable
+class ToolResultInterceptor(Protocol):
+    """A stateless rewriter for a single tool_result's text content.
+    Implementations MUST be idempotent and MUST return either a strictly
+    smaller string (measured in tokens) or None to pass through. Never raise
+    — errors should be caught internally and logged; the pipeline always
+    tolerates a no-op interceptor.
+    Interceptors MAY implement `progressive_disclosure_key()` to opt into
+    one-shot behavior: the framework tracks which keys have already been
+    rewritten in the current conversation, and skips subsequent matches on
+    the same key so that the model gets full content if it asks again.
+    """
+    name: str  # e.g. "ast-grep", "difft", "scc"
+    def matches(
+        self,
+        tool_name: str | None,
+        tool_input: dict[str, Any],
+        tool_output: str,
+    ) -> bool: ...
+    def transform(
+        self,
+        tool_name: str | None,
+        tool_input: dict[str, Any],
+        tool_output: str,
+    ) -> str | None: ...
+    def progressive_disclosure_key(
+        self,
+        tool_name: str | None,
+        tool_input: dict[str, Any],
+    ) -> str | None:
+        """Optional: return a stable content key (e.g. file path).
+        If a key is returned and the same (interceptor.name, key) pair was
+        already successfully rewritten earlier in the messages, subsequent
+        occurrences pass through unchanged. Return None to opt out.
+        """
+        ...
+@dataclass(frozen=True)
+class TransformSpan:
+    """Per-interceptor measurement emitted for dashboard/metrics."""
+    tool: str
+    tokens_before: int
+    tokens_after: int
+    @property
+    def tokens_saved(self) -> int:
+        return max(self.tokens_before - self.tokens_after, 0)
+@dataclass
+class InterceptionResult:
+    messages: list[dict[str, Any]]
+    spans: list[TransformSpan]
+INTERCEPTORS: list[ToolResultInterceptor] = []
+def register(interceptor: ToolResultInterceptor) -> None:
+    """Add an interceptor to the registry. Idempotent on name."""
+    for existing in INTERCEPTORS:
+        if existing.name == interceptor.name:
+            return
+    INTERCEPTORS.append(interceptor)
+def _find_tool_use(
+    messages: list[dict[str, Any]],
+    tool_use_id: str,
+) -> tuple[str | None, dict[str, Any]]:
+    """Walk prior messages to find the tool_use block that produced a given id.
+    Returns (tool_name, tool_input) or (None, {}) if not found.
+    """
+    for msg in messages:
+        content = msg.get("content")
+        if isinstance(content, list):
+            for block in content:
+                if not isinstance(block, dict):
+                    continue
+                # Anthropic: {"type": "tool_use", "id": ..., "name": ..., "input": {...}}
+                if block.get("type") == "tool_use" and block.get("id") == tool_use_id:
+                    return (
+                        block.get("name"),
+                        block.get("input") or {},
+                    )
+        # OpenAI: assistant message with `tool_calls` list
+        tool_calls = msg.get("tool_calls")
+        if isinstance(tool_calls, list):
+            for call in tool_calls:
+                if isinstance(call, dict) and call.get("id") == tool_use_id:
+                    fn = call.get("function") or {}
+                    # arguments is a JSON string in OpenAI; decode best-effort
+                    import json as _json
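+                    args: dict[str, Any] = {}
+                    if isinstance(fn.get("arguments"), str):
+                        try:
+                            decoded = _json.loads(fn["arguments"])
+                        except Exception:  # noqa: BLE001
+                            decoded = None
+                        # Valid JSON can still be a list or scalar; keep dicts only.
+                        if isinstance(decoded, dict):
+                            args = decoded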
+                    elif isinstance(fn.get("arguments"), dict):
+                        args = fn["arguments"]
+                    return fn.get("name"), args
+    return None, {}
+def _tool_use_id_for_message(msg: dict[str, Any]) -> str | None:
+    """Return the tool_use_id linked to a tool_result message."""
+    # Anthropic format
+    content = msg.get("content")
+    if isinstance(content, list):
+        for block in content:
+            if isinstance(block, dict) and block.get("type") == "tool_result":
+                tuid = block.get("tool_use_id")
+                if isinstance(tuid, str):
+                    return tuid
+    # OpenAI format
+    if msg.get("role") == "tool":
+        tcid = msg.get("tool_call_id")
+        if isinstance(tcid, str):
+            return tcid
+    return None
+def apply_to_messages(
+    messages: list[dict[str, Any]],
+    tokenizer: Tokenizer,
+) -> InterceptionResult:
+    """Run every registered interceptor against every tool_result in `messages`.
+    Returns the (possibly) rewritten message list and a list of spans that
+    actually saved tokens.
+    """
+    if not INTERCEPTORS:
+        return InterceptionResult(messages=messages, spans=[])
+    new_messages: list[dict[str, Any]] = []
+    spans: list[TransformSpan] = []
+    # Progressive disclosure: per-interceptor set of keys already rewritten
+    # earlier in this message list. Prevents the second Read of the same
+    # file from being outlined again — the model evidently came back for
+    # more, so give it the raw content.
+    fired: dict[str, set[str]] = {}
+    for msg in messages:
+        if not _is_tool_result_message(msg):
+            new_messages.append(msg)
+            continue
+        original = _extract_tool_result_content(msg)
+        if not isinstance(original, str) or not original:
+            new_messages.append(msg)
+            continue
+        tuid = _tool_use_id_for_message(msg)
+        tool_name: str | None = None
+        tool_input: dict[str, Any] = {}
+        if tuid:
+            tool_name, tool_input = _find_tool_use(messages, tuid)
+        current = original
+        for interceptor in INTERCEPTORS:
+            # Progressive disclosure: skip if already fired for this key.
+            key: str | None = None
+            key_fn = getattr(interceptor, "progressive_disclosure_key", None)
+            if callable(key_fn):
+                try:
+                    key = key_fn(tool_name, tool_input)
+                except Exception as e:  # noqa: BLE001
+                    logger.warning("interceptor %s key() failed: %s", interceptor.name, e)
+                    key = None
+            if key and key in fired.get(interceptor.name, set()):
+                continue
+            try:
+                if not interceptor.matches(tool_name, tool_input, current):
+                    continue
+                rewritten = interceptor.transform(tool_name, tool_input, current)
+            except Exception as e:  # noqa: BLE001 — never crash a request
+                logger.warning("interceptor %s failed: %s", interceptor.name, e)
+                continue
+            if not rewritten or rewritten == current:
+                continue
+            before = tokenizer.count_text(current)
+            after = tokenizer.count_text(rewritten)
+            if after >= before:
+                continue  # refuse to enlarge
+            spans.append(
+                TransformSpan(
+                    tool=interceptor.name,
+                    tokens_before=before,
+                    tokens_after=after,
+                )
+            )
+            current = rewritten
+            if key:
+                fired.setdefault(interceptor.name, set()).add(key)
+        new_messages.append(
+            _swap_tool_result_content(msg, current) if current is not original else msg
+        )
+    return InterceptionResult(messages=new_messages, spans=spans)
+class ToolResultInterceptorTransform(Transform):
+    """Pipeline-level adapter: runs interceptors as the first compression stage.
+    Placed at transforms[0] so downstream compressors operate on the already-
+    shrunk content. Each interceptor that fires is recorded in
+    `transforms_applied` as `interceptor:<name>`, so it appears in existing
+    dashboards/metrics.
+    """
+    name = "tool_result_interceptors"
+    def apply(
+        self,
+        messages: list[dict[str, Any]],
+        tokenizer: Tokenizer,
+        **kwargs: Any,
+    ) -> TransformResult:
+        result = apply_to_messages(messages, tokenizer)
+        tokens_after = tokenizer.count_messages(result.messages)
+        tokens_before = tokens_after + sum(s.tokens_saved for s in result.spans)
+        transforms_applied = [f"interceptor:{s.tool}" for s in result.spans]
+        return TransformResult(
+            messages=result.messages,
+            tokens_before=tokens_before,
+            tokens_after=tokens_after,
+            transforms_applied=transforms_applied,
+        )
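
The two guards in `apply_to_messages` (refuse to enlarge, and pass a repeated key through untouched) can be seen in isolation with a toy model. Everything below is a simplified stand-in: `count_tokens` is a whitespace split rather than Headroom's `Tokenizer`, `HeadTruncate` is a hypothetical interceptor, and tool results are plain `(key, text)` pairs instead of provider messages.

```python
from typing import Optional


def count_tokens(text: str) -> int:
    """Whitespace token count; a stand-in for the real tokenizer."""
    return len(text.split())


class HeadTruncate:
    """Hypothetical interceptor: keep only the first line of the output."""

    name = "head"

    def transform(self, text: str) -> Optional[str]:
        first = text.splitlines()[0] if text else ""
        return first + "\n[rest elided]"


def apply_once(results: list) -> list:
    """Rewrite each (key, text) pair at most once per key."""
    interceptor = HeadTruncate()
    fired: set = set()
    out: list = []
    for key, text in results:
        if key in fired:
            out.append(text)  # second read of the same key: full content
            continue
        rewritten = interceptor.transform(text)
        if not rewritten or count_tokens(rewritten) >= count_tokens(text):
            out.append(text)  # refuse to enlarge
            continue
        fired.add(key)
        out.append(rewritten)
    return out


BIG = "def f():\n" + "    x = 1\n" * 50
reads = [("a.py", BIG), ("a.py", BIG), ("b.py", "tiny")]
first, second, third = apply_once(reads)
```

Here `first` is the truncated form, `second` is the untouched original because the key `a.py` already fired, and `third` passes through because the rewrite would be larger than the input.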

headroom/tools.json ADDED

	@@ -0,0 +1,89 @@

+{
+  "_comment": "Registry of externally fetched CLI tool binaries. Bump versions and SHA256s via the weekly tools-version-check CI job (see .github/workflows/). sha256=null means HTTPS-trust-only (initial bootstrap); the CI job fills real SHAs per release.",
+  "tools": {
+    "difft": {
+      "version": "0.64.0",
+      "binary": "difft",
+      "source": "Wilfred/difftastic",
+      "homepage": "https://difftastic.wilfred.me.uk/",
+      "assets": {
+        "linux-x86_64-gnu": {
+          "url": "https://github.com/Wilfred/difftastic/releases/download/0.64.0/difft-x86_64-unknown-linux-gnu.tar.gz",
+          "member": "difft",
+          "sha256": null
+        },
+        "linux-aarch64-gnu": {
+          "url": "https://github.com/Wilfred/difftastic/releases/download/0.64.0/difft-aarch64-unknown-linux-gnu.tar.gz",
+          "member": "difft",
+          "sha256": null
+        },
+        "darwin-x86_64": {
+          "url": "https://github.com/Wilfred/difftastic/releases/download/0.64.0/difft-x86_64-apple-darwin.tar.gz",
+          "member": "difft",
+          "sha256": null
+        },
+        "darwin-aarch64": {
+          "url": "https://github.com/Wilfred/difftastic/releases/download/0.64.0/difft-aarch64-apple-darwin.tar.gz",
+          "member": "difft",
+          "sha256": null
+        },
+        "windows-x86_64": {
+          "url": "https://github.com/Wilfred/difftastic/releases/download/0.64.0/difft-x86_64-pc-windows-msvc.zip",
+          "member": "difft.exe",
+          "sha256": null
+        }
+      }
+    },
+    "scc": {
+      "version": "3.5.0",
+      "binary": "scc",
+      "source": "boyter/scc",
+      "homepage": "https://github.com/boyter/scc",
+      "assets": {
+        "linux-x86_64-gnu": {
+          "url": "https://github.com/boyter/scc/releases/download/v3.5.0/scc_Linux_x86_64.tar.gz",
+          "member": "scc",
+          "sha256": null
+        },
+        "linux-x86_64-musl": {
+          "url": "https://github.com/boyter/scc/releases/download/v3.5.0/scc_Linux_x86_64.tar.gz",
+          "member": "scc",
+          "sha256": null
+        },
+        "linux-aarch64-gnu": {
+          "url": "https://github.com/boyter/scc/releases/download/v3.5.0/scc_Linux_arm64.tar.gz",
+          "member": "scc",
+          "sha256": null
+        },
+        "linux-aarch64-musl": {
+          "url": "https://github.com/boyter/scc/releases/download/v3.5.0/scc_Linux_arm64.tar.gz",
+          "member": "scc",
+          "sha256": null
+        },
+        "darwin-x86_64": {
+          "url": "https://github.com/boyter/scc/releases/download/v3.5.0/scc_Darwin_x86_64.tar.gz",
+          "member": "scc",
+          "sha256": null
+        },
+        "darwin-aarch64": {
+          "url": "https://github.com/boyter/scc/releases/download/v3.5.0/scc_Darwin_arm64.tar.gz",
+          "member": "scc",
+          "sha256": null
+        },
+        "windows-x86_64": {
+          "url": "https://github.com/boyter/scc/releases/download/v3.5.0/scc_Windows_x86_64.zip",
+          "member": "scc.exe",
+          "sha256": null
+        }
+      }
+    },
+    "ast-grep": {
+      "version": "pypi",
+      "binary": "ast-grep",
+      "source": "ast-grep/ast-grep (PyPI: ast-grep-cli)",
+      "homepage": "https://ast-grep.github.io/",
+      "_comment": "Installed via the ast-grep-cli PyPI wheel; we never fetch from GitHub for this tool. Listed here so `headroom tools doctor` can report it.",
+      "assets": {}
+    }
+  }
+}
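
The registry layout suggests a lookup along these lines: pick the asset entry for a platform key, then verify the download only when a SHA-256 is pinned. This is a sketch under assumptions. `pick_asset` and `verify` are hypothetical helpers, not the functions in `headroom.binaries`, and the `REGISTRY` literal is a trimmed copy of the file above.

```python
import hashlib
from typing import Any, Optional

REGISTRY: dict = {
    "tools": {
        "scc": {
            "version": "3.5.0",
            "assets": {
                "linux-x86_64-gnu": {
                    "url": "https://github.com/boyter/scc/releases/download/v3.5.0/scc_Linux_x86_64.tar.gz",
                    "member": "scc",
                    "sha256": None,
                },
            },
        },
        # PyPI-only tool: listed for reporting, with no downloadable assets.
        "ast-grep": {"version": "pypi", "assets": {}},
    }
}


def pick_asset(tool: str, platform_key: str) -> Optional[dict]:
    """Return the asset entry for a platform, or None if none is listed."""
    return REGISTRY["tools"][tool]["assets"].get(platform_key)


def verify(data: bytes, expected_sha256: Optional[str]) -> bool:
    """Check a pinned digest; a null digest means HTTPS-trust-only."""
    if expected_sha256 is None:
        return True
    return hashlib.sha256(data).hexdigest() == expected_sha256.lower()
```

Until the CI job fills in real digests, every asset above takes the `None` branch, so integrity rests on HTTPS alone, as the `_comment` field notes.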

headroom/transforms/pipeline.py CHANGED

@@ -75,6 +75,18 @@ class TransformPipeline:
         # Order matters!
+        # 0. Tool-result interceptors (ast-grep Read outline, etc.) run first
+        # so downstream compressors operate on the already-shrunk content.
+        # OPT-IN: enable with HEADROOM_INTERCEPT_ENABLED=1 or `headroom proxy
+        # --intercept-tool-results`. Off by default while this ships — lets
+        # users try it and compare before we make it the default.
+        import os as _os
+        if _os.environ.get("HEADROOM_INTERCEPT_ENABLED"):
+            from headroom.proxy.interceptors import ToolResultInterceptorTransform
+            transforms.append(ToolResultInterceptorTransform())
         # 1. Cache Aligner (prefix stabilization)
         if self.config.cache_aligner.enabled:
             transforms.append(CacheAligner(self.config.cache_aligner))
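
One detail of the opt-in check: `_os.environ.get("HEADROOM_INTERCEPT_ENABLED")` is truthy for any non-empty string, so `HEADROOM_INTERCEPT_ENABLED=0` would also enable the interceptors. If only explicit "on" values should count, a stricter parse could look like the sketch below. `intercept_enabled` and `_TRUTHY` are hypothetical names, and the accepted values are an assumption rather than documented Headroom behavior.

```python
import os

_TRUTHY = {"1", "true", "yes", "on"}


def intercept_enabled(env=None) -> bool:
    """Hypothetical stricter flag parse: only explicit truthy values enable."""
    env = os.environ if env is None else env
    return env.get("HEADROOM_INTERCEPT_ENABLED", "").strip().lower() in _TRUTHY
```

Passing the environment as a parameter keeps the helper testable without mutating `os.environ`.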

pyproject.toml CHANGED

@@ -51,6 +51,7 @@ dependencies = [
     "click>=8.1.0",               # CLI framework
     "rich>=13.0.0",               # Rich terminal output
     "opentelemetry-api>=1.24.0",  # Safe no-op OTEL API for instrumentation
 ]
 [project.optional-dependencies]

     "click>=8.1.0",               # CLI framework
     "rich>=13.0.0",               # Rich terminal output
     "opentelemetry-api>=1.24.0",  # Safe no-op OTEL API for instrumentation
+    "ast-grep-cli>=0.30.0",       # AST-aware code slicing (CodeCompressor); binary wheel
 ]
 [project.optional-dependencies]

tests/test_binaries.py ADDED

	@@ -0,0 +1,281 @@

+"""Unit tests for headroom.binaries — the lazy fetcher for bundled CLI tools.
+No network access. A fake urlopen serves bytes from an in-memory fixture.
+"""
+from __future__ import annotations
+import hashlib
+import io
+import os
+import sys
+import tarfile
+import zipfile
+import pytest
+from headroom import binaries
+# -------- Fixtures -------------------------------------------------------- #
+@pytest.fixture(autouse=True)
+def _clear_caches(monkeypatch, tmp_path):
+    """Isolate every test from global state: cache dir, platform lru_cache, env."""
+    binaries.detect_platform.cache_clear()
+    binaries._registry.cache_clear()
+    monkeypatch.setenv("HEADROOM_BINARIES_CACHE", str(tmp_path / "cache"))
+    monkeypatch.delenv("HEADROOM_BINARIES_MIRROR", raising=False)
+    monkeypatch.delenv("HEADROOM_BINARIES_OFFLINE", raising=False)
+    yield
+    binaries.detect_platform.cache_clear()
+    binaries._registry.cache_clear()
+def _set_platform(monkeypatch, *, sys_plat: str, machine: str, musl: bool = False):
+    monkeypatch.setattr(sys, "platform", sys_plat)
+    monkeypatch.setattr("platform.machine", lambda: machine)
+    monkeypatch.setattr(binaries, "_is_musl", lambda: musl)
+    binaries.detect_platform.cache_clear()
+def _make_tar_gz(files: dict[str, bytes]) -> bytes:
+    buf = io.BytesIO()
+    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
+        for name, data in files.items():
+            info = tarfile.TarInfo(name=name)
+            info.size = len(data)
+            tf.addfile(info, io.BytesIO(data))
+    return buf.getvalue()
+def _make_zip(files: dict[str, bytes]) -> bytes:
+    buf = io.BytesIO()
+    with zipfile.ZipFile(buf, "w") as zf:
+        for name, data in files.items():
+            zf.writestr(name, data)
+    return buf.getvalue()
+class _FakeResponse:
+    def __init__(self, data: bytes):
+        self._data = data
+        self.headers = {"Content-Length": str(len(data))}
+    def read(self, n: int = -1) -> bytes:
+        if n < 0 or n >= len(self._data):
+            chunk, self._data = self._data, b""
+            return chunk
+        chunk, self._data = self._data[:n], self._data[n:]
+        return chunk
+    def __enter__(self):
+        return self
+    def __exit__(self, *a):
+        return False
+@pytest.fixture
+def fake_urlopen(monkeypatch):
+    """Install a fake urllib.request.urlopen that serves registered URLs."""
+    served: dict[str, bytes] = {}
+    def fake(req, timeout=None):  # noqa: ARG001
+        url = req.full_url if hasattr(req, "full_url") else req
+        if url not in served:
+            raise AssertionError(f"unexpected fetch for {url}")
+        return _FakeResponse(served[url])
+    monkeypatch.setattr(binaries.urllib.request, "urlopen", fake)
+    return served
+# -------- Platform detection --------------------------------------------- #
+def test_detect_platform_linux_gnu(monkeypatch):
+    _set_platform(monkeypatch, sys_plat="linux", machine="x86_64", musl=False)
+    p = binaries.detect_platform()
+    assert p == binaries.PlatformKey("linux", "x86_64", "gnu")
+    assert p.key() == "linux-x86_64-gnu"
+def test_detect_platform_linux_musl(monkeypatch):
+    _set_platform(monkeypatch, sys_plat="linux", machine="aarch64", musl=True)
+    assert binaries.detect_platform().key() == "linux-aarch64-musl"
+def test_detect_platform_darwin_arm64(monkeypatch):
+    _set_platform(monkeypatch, sys_plat="darwin", machine="arm64")
+    assert binaries.detect_platform().key() == "darwin-aarch64"
+def test_detect_platform_windows_amd64(monkeypatch):
+    _set_platform(monkeypatch, sys_plat="win32", machine="AMD64")
+    assert binaries.detect_platform().key() == "windows-x86_64"
+# -------- Cache dir ------------------------------------------------------ #
+def test_cache_dir_respects_env_override(monkeypatch, tmp_path):
+    monkeypatch.setenv("HEADROOM_BINARIES_CACHE", str(tmp_path / "custom"))
+    assert binaries.cache_dir() == (tmp_path / "custom").resolve()
+# -------- Registry / asset resolution ------------------------------------ #
+def test_unsupported_platform_raises(monkeypatch):
+    _set_platform(monkeypatch, sys_plat="linux", machine="riscv64")
+    with pytest.raises(binaries.PlatformNotSupported):
+        binaries._asset_for_platform("difft", binaries.detect_platform())
+def test_pypi_only_tool_raises_with_helpful_message(monkeypatch):
+    _set_platform(monkeypatch, sys_plat="darwin", machine="arm64")
+    with pytest.raises(binaries.PlatformNotSupported) as exc:
+        binaries._asset_for_platform("ast-grep", binaries.detect_platform())
+    assert "pip install headroom-ai" in str(exc.value)
+def test_unknown_tool_raises_key_error():
+    with pytest.raises(KeyError):
+        binaries._tool_entry("not-a-real-tool")
+# -------- which / resolve with PATH hits --------------------------------- #
+def test_which_finds_on_path(monkeypatch, tmp_path):
+    fake_bin = tmp_path / "difft"
+    fake_bin.write_text("#!/bin/sh\necho ok\n")
+    fake_bin.chmod(0o755)
+    monkeypatch.setattr(
+        binaries.shutil, "which", lambda name: str(fake_bin) if name == "difft" else None
+    )
+    # Because the tool is on PATH, which() returns its path without fetching.
+    assert binaries.which("difft") == fake_bin
+def test_which_returns_none_when_not_cached(monkeypatch):
+    _set_platform(monkeypatch, sys_plat="darwin", machine="arm64")
+    monkeypatch.setattr(binaries.shutil, "which", lambda _name: None)
+    assert binaries.which("difft") is None
+def test_resolve_honors_path(monkeypatch, tmp_path):
+    fake_bin = tmp_path / "scc"
+    fake_bin.write_text("")
+    fake_bin.chmod(0o755)
+    monkeypatch.setattr(
+        binaries.shutil, "which", lambda name: str(fake_bin) if name == "scc" else None
+    )
+    assert binaries.resolve("scc") == fake_bin
+# -------- Offline / mirror / fetch behavior ------------------------------ #
+def test_offline_error_when_fetch_required(monkeypatch):
+    _set_platform(monkeypatch, sys_plat="darwin", machine="arm64")
+    monkeypatch.setattr(binaries.shutil, "which", lambda _name: None)
+    monkeypatch.setenv("HEADROOM_BINARIES_OFFLINE", "1")
+    with pytest.raises(binaries.OfflineError):
+        binaries.resolve("difft")
+def test_mirror_substitution():
+    os.environ["HEADROOM_BINARIES_MIRROR"] = "https://mirror.example.com/gh"
+    try:
+        out = binaries._mirror_url(
+            "https://github.com/Wilfred/difftastic/releases/download/0.64.0/x.tar.gz"
+        )
+        assert (
+            out
+            == "https://mirror.example.com/gh/Wilfred/difftastic/releases/download/0.64.0/x.tar.gz"
+        )
+        # Non-matching URLs are left alone.
+        assert binaries._mirror_url("https://example.com/x") == "https://example.com/x"
+    finally:
+        del os.environ["HEADROOM_BINARIES_MIRROR"]
+def test_fetch_extract_and_cache_tar_gz(monkeypatch, fake_urlopen, tmp_path):
+    _set_platform(monkeypatch, sys_plat="darwin", machine="arm64")
+    monkeypatch.setattr(binaries.shutil, "which", lambda _name: None)
+    payload = b"#!/bin/sh\necho fake-difft\n"
+    archive = _make_tar_gz({"difft-0.64.0/difft": payload})
+    url = "https://github.com/Wilfred/difftastic/releases/download/0.64.0/difft-aarch64-apple-darwin.tar.gz"
+    fake_urlopen[url] = archive
+    path = binaries.resolve("difft")
+    assert path.exists()
+    assert path.read_bytes() == payload
+    # Second call should use cache (no further fetch).
+    fake_urlopen.pop(url)  # remove so a refetch would error
+    path2 = binaries.resolve("difft")
+    assert path2 == path
+def test_fetch_extract_zip(monkeypatch, fake_urlopen):
+    _set_platform(monkeypatch, sys_plat="win32", machine="AMD64")
+    monkeypatch.setattr(binaries.shutil, "which", lambda _name: None)
+    payload = b"MZfake"
+    archive = _make_zip({"scc.exe": payload})
+    url = "https://github.com/boyter/scc/releases/download/v3.5.0/scc_Windows_x86_64.zip"
+    fake_urlopen[url] = archive
+    path = binaries.resolve("scc")
+    assert path.exists()
+    assert path.name.endswith("scc.exe")
+    assert path.read_bytes() == payload
+def test_sha256_mismatch_raises_and_deletes(monkeypatch, fake_urlopen, tmp_path):
+    _set_platform(monkeypatch, sys_plat="darwin", machine="arm64")
+    monkeypatch.setattr(binaries.shutil, "which", lambda _name: None)
+    # Override the registry entry for difft to include a bogus sha256.
+    reg = binaries._registry()
+    asset = reg["tools"]["difft"]["assets"]["darwin-aarch64"]
+    original_sha = asset.get("sha256")  # remember the pinned value, if any
+    asset["sha256"] = "deadbeef" * 8  # wrong
+    archive = _make_tar_gz({"difft": b"hi"})
+    fake_urlopen[asset["url"]] = archive
+    try:
+        with pytest.raises(binaries.Sha256Mismatch):
+            binaries.resolve("difft")
+    finally:
+        asset["sha256"] = original_sha  # restore whatever was pinned before
+def test_sha256_match_passes(monkeypatch, fake_urlopen):
+    _set_platform(monkeypatch, sys_plat="darwin", machine="arm64")
+    monkeypatch.setattr(binaries.shutil, "which", lambda _name: None)
+    archive = _make_tar_gz({"difft": b"hello"})
+    good = hashlib.sha256(archive).hexdigest()
+    reg = binaries._registry()
+    asset = reg["tools"]["difft"]["assets"]["darwin-aarch64"]
+    original_sha = asset.get("sha256")  # remember the pinned value, if any
+    asset["sha256"] = good
+    fake_urlopen[asset["url"]] = archive
+    try:
+        path = binaries.resolve("difft")
+        assert path.read_bytes() == b"hello"
+    finally:
+        asset["sha256"] = original_sha  # restore whatever was pinned before
+# -------- status() ------------------------------------------------------- #
+def test_status_reports_every_registered_tool(monkeypatch):
+    _set_platform(monkeypatch, sys_plat="darwin", machine="arm64")
+    monkeypatch.setattr(binaries.shutil, "which", lambda _name: None)
+    rows = binaries.status()
+    names = {r["tool"] for r in rows}
+    assert {"difft", "scc", "ast-grep"} <= names
+    for r in rows:
+        assert r["state"] in ("on-path", "cached", "missing", "unsupported-platform")
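
The two sha256 tests above exercise a verify-after-download step. A minimal sketch of that idea follows, assuming the digest is computed over the raw archive bytes (as `test_sha256_match_passes` does) and that an unpinned asset (`sha256` of `None`) skips verification. `verify_or_delete` is a hypothetical helper and the `Sha256Mismatch` class here is a local stand-in; neither is taken from `headroom.binaries`.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


class Sha256Mismatch(Exception):
    """Local stand-in: raised when an archive does not match its pinned digest."""


def verify_or_delete(archive: Path, expected_sha256: str | None) -> None:
    """Check `archive` against a pinned digest and remove it on mismatch.

    Hypothetical helper for illustration; the real fetcher may differ.
    """
    if expected_sha256 is None:
        return  # nothing pinned for this asset, so there is nothing to check
    actual = hashlib.sha256(archive.read_bytes()).hexdigest()
    if actual != expected_sha256:
        # Drop the bad download so a later call cannot pick it up from cache.
        archive.unlink(missing_ok=True)
        raise Sha256Mismatch(
            f"{archive.name}: expected {expected_sha256}, got {actual}"
        )
```

Deleting the file before raising keeps a corrupted or tampered archive out of the per-user cache, which is the behaviour the test name `test_sha256_mismatch_raises_and_deletes` suggests.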

tests/test_bundled_tools_savings.py ADDED

	@@ -0,0 +1,367 @@

+"""Comprehensive integration tests for the bundled CLI tools.
+Proves three things end-to-end:
+    1. `headroom.binaries.ensure_tools()` actually installs every tool.
+    2. Each tool reduces token count on a realistic payload (tiktoken-measured).
+    3. A real LLM answers the same question correctly on the compressed
+       payload (LLM-as-judge).
+Live API calls are gated on OPENAI_API_KEY / ANTHROPIC_API_KEY being present
+in the environment (loaded from .env if python-dotenv is available).
+"""
+from __future__ import annotations
+import json
+import os
+import subprocess
+import textwrap
+from pathlib import Path
+import pytest
+try:
+    from dotenv import load_dotenv
+    load_dotenv(Path(__file__).resolve().parent.parent / ".env")
+except ImportError:
+    pass
+import tiktoken
+from headroom import binaries
+# ---------- Fixtures ------------------------------------------------------ #
+ENC = tiktoken.get_encoding("cl100k_base")
+def _tokens(text: str) -> int:
+    return len(ENC.encode(text))
+SAMPLE_PY = textwrap.dedent(
+    '''
+    """Payments module — illustrative fixture for compression tests."""
+    import logging
+    from dataclasses import dataclass
+    from decimal import Decimal
+    from typing import Iterable
+    log = logging.getLogger(__name__)
+    @dataclass
+    class LineItem:
+        sku: str
+        quantity: int
+        unit_price: Decimal
+    def compute_subtotal(items: Iterable[LineItem]) -> Decimal:
+        total = Decimal("0")
+        for item in items:
+            total += item.unit_price * item.quantity
+        return total
+    def apply_promo(subtotal: Decimal, code: str | None) -> Decimal:
+        if not code:
+            return subtotal
+        if code == "SAVE10":
+            return subtotal * Decimal("0.9")
+        if code == "FREESHIP":
+            return subtotal
+        log.warning("unknown promo code %s", code)
+        return subtotal
+    def compute_tax(subtotal: Decimal, rate: Decimal) -> Decimal:
+        return (subtotal * rate).quantize(Decimal("0.01"))
+    def process_payment(items: list[LineItem], promo: str | None, tax_rate: Decimal) -> Decimal:
+        """Main entry point: compute the final total for a cart."""
+        subtotal = compute_subtotal(items)
+        after_promo = apply_promo(subtotal, promo)
+        tax = compute_tax(after_promo, tax_rate)
+        total = after_promo + tax
+        log.info("processed payment: subtotal=%s tax=%s total=%s", subtotal, tax, total)
+        return total
+    def refund_payment(order_id: str, amount: Decimal) -> dict:
+        """Issue a refund for a previous order."""
+        log.info("refunding %s from %s", amount, order_id)
+        return {"order_id": order_id, "refund": str(amount), "status": "ok"}
+    def list_orders_for_user(user_id: str, limit: int = 20) -> list[dict]:
+        """Placeholder DB lookup."""
+        return [{"user": user_id, "order": i} for i in range(limit)]
+    '''
+).strip()
+SAMPLE_PY_MODIFIED = SAMPLE_PY.replace(
+    'return subtotal * Decimal("0.9")',
+    'return subtotal * Decimal("0.85")  # promo bumped from 10% to 15%',
+).replace(
+    'log.warning("unknown promo code %s", code)',
+    'log.error("unknown promo code %s — rejecting", code)\n        raise ValueError(code)',
+)
+@pytest.fixture(scope="module")
+def repo(tmp_path_factory) -> Path:
+    d = tmp_path_factory.mktemp("payments-repo")
+    (d / "payments.py").write_text(SAMPLE_PY)
+    (d / "payments_v2.py").write_text(SAMPLE_PY_MODIFIED)
+    (d / "README.md").write_text("# payments fixture\n")
+    return d
+# ---------- 1. Tool installation ----------------------------------------- #
+def test_ensure_tools_installs_every_tool():
+    """All three tools should be reachable after ensure_tools()."""
+    binaries.ensure_tools(quiet=True)
+    # ast-grep comes from the PyPI wheel (core dep); resolve() checks PATH
+    # and sys.prefix/bin so it works in non-activated venvs too.
+    assert binaries.resolve("ast-grep").exists(), "ast-grep-cli wheel not installed"
+    # difft & scc come from the GitHub-release fetcher.
+    assert binaries.which("difft") is not None, "difftastic not installed"
+    assert binaries.which("scc") is not None, "scc not installed"
+# ---------- 2. Token-savings (no API) ------------------------------------ #
+def test_ast_grep_slice_saves_tokens(repo: Path):
+    """Function-level slice vs full-file — ast-grep must reduce tokens."""
+    full = (repo / "payments.py").read_text()
+    full_tokens = _tokens(full)
+    # Extract just `process_payment` (the entry point an agent would
+    # realistically start from when reasoning about a promo-code bug).
+    result = subprocess.run(
+        [
+            str(binaries.resolve("ast-grep")),
+            "run",
+            "--pattern",
+            "def process_payment",
+            "--lang",
+            "python",
+            "--json=stream",
+            str(repo / "payments.py"),
+        ],
+        capture_output=True,
+        text=True,
+        check=True,
+    )
+    matches = [json.loads(line) for line in result.stdout.strip().splitlines() if line]
+    assert matches, "ast-grep returned no matches"
+    sliced = "\n\n".join(m["text"] for m in matches)
+    sliced_tokens = _tokens(sliced)
+    savings_pct = (1 - sliced_tokens / full_tokens) * 100
+    print(f"\n[ast-grep] full={full_tokens}t  sliced={sliced_tokens}t  savings={savings_pct:.1f}%")
+    assert sliced_tokens < full_tokens
+    assert savings_pct >= 40, f"expected ≥40% savings, got {savings_pct:.1f}%"
+def test_difftastic_saves_tokens_vs_line_diff(repo: Path):
+    """Structural diff should not be meaningfully larger than a unified line diff."""
+    # Baseline: unified line diff via the `diff` executable found on PATH.
+    line_diff = subprocess.run(
+        ["diff", "-u", str(repo / "payments.py"), str(repo / "payments_v2.py")],
+        capture_output=True,
+        text=True,
+    ).stdout
+    line_tokens = _tokens(line_diff)
+    # difftastic in a compact display mode.
+    struct = subprocess.run(
+        [
+            str(binaries.resolve("difft")),
+            "--display=inline",
+            "--color=never",
+            str(repo / "payments.py"),
+            str(repo / "payments_v2.py"),
+        ],
+        capture_output=True,
+        text=True,
+    ).stdout
+    struct_tokens = _tokens(struct)
+    savings_pct = (1 - struct_tokens / line_tokens) * 100 if line_tokens else 0.0
+    print(
+        f"\n[difftastic] line={line_tokens}t  struct={struct_tokens}t  savings={savings_pct:.1f}%"
+    )
+    # On small diffs structural output can occasionally be equal or slightly
+    # larger due to display overhead; just assert it doesn't blow up.
+    assert struct_tokens <= int(line_tokens * 1.2), (
+        f"difft output unexpectedly larger: {struct_tokens} vs {line_tokens}"
+    )
+def test_scc_repo_shape_card_is_tiny(repo: Path):
+    """scc produces a repo-shape summary that's much smaller than raw files."""
+    raw_bytes = sum(
+        (repo / p).stat().st_size for p in ("payments.py", "payments_v2.py", "README.md")
+    )
+    raw_tokens = _tokens((repo / "payments.py").read_text())
+    raw_tokens += _tokens((repo / "payments_v2.py").read_text())
+    raw_tokens += _tokens((repo / "README.md").read_text())
+    scc_out = subprocess.run(
+        [str(binaries.resolve("scc")), "--format=json", str(repo)],
+        capture_output=True,
+        text=True,
+        check=True,
+    ).stdout
+    scc_tokens = _tokens(scc_out)
+    print(f"\n[scc] raw_files={raw_tokens}t  scc_card={scc_tokens}t  bytes_scanned={raw_bytes}")
+    # scc summarizes many files into one small JSON blob; assert it's smaller
+    # than the concatenated raw file contents.
+    assert scc_tokens < raw_tokens
+# ---------- 3. Quality test (live API) ----------------------------------- #
+_NEED_OPENAI = pytest.mark.skipif(
+    not os.environ.get("OPENAI_API_KEY"),
+    reason="OPENAI_API_KEY not set",
+)
+_NEED_ANTHROPIC = pytest.mark.skipif(
+    not os.environ.get("ANTHROPIC_API_KEY"),
+    reason="ANTHROPIC_API_KEY not set",
+)
+QUESTION = (
+    "In this payments module, what discount percentage does the SAVE10 promo "
+    "currently apply? Answer with just the number (e.g. '10')."
+)
+EXPECTED = "10"
+@_NEED_OPENAI
+def test_compressed_payload_preserves_answer_openai(repo: Path):
+    """Model answers the same question correctly on ast-grep-sliced input."""
+    import openai  # lazy: only required when the key is present
+    full = (repo / "payments.py").read_text()
+    result = subprocess.run(
+        [
+            str(binaries.resolve("ast-grep")),
+            "run",
+            "--pattern",
+            "def apply_promo",
+            "--lang",
+            "python",
+            "--json=stream",
+            str(repo / "payments.py"),
+        ],
+        capture_output=True,
+        text=True,
+        check=True,
+    )
+    matches = [json.loads(line) for line in result.stdout.strip().splitlines() if line]
+    sliced = matches[0]["text"]
+    client = openai.OpenAI()
+    full_tokens = _tokens(full)
+    sliced_tokens = _tokens(sliced)
+    full_resp = client.chat.completions.create(
+        model="gpt-4o-mini",
+        messages=[
+            {"role": "system", "content": "You answer briefly and numerically."},
+            {"role": "user", "content": f"{QUESTION}\n\n---\n{full}"},
+        ],
+        max_tokens=16,
+        temperature=0,
+    )
+    sliced_resp = client.chat.completions.create(
+        model="gpt-4o-mini",
+        messages=[
+            {"role": "system", "content": "You answer briefly and numerically."},
+            {"role": "user", "content": f"{QUESTION}\n\n---\n{sliced}"},
+        ],
+        max_tokens=16,
+        temperature=0,
+    )
+    full_answer = full_resp.choices[0].message.content.strip()
+    sliced_answer = sliced_resp.choices[0].message.content.strip()
+    full_usage = full_resp.usage.prompt_tokens
+    sliced_usage = sliced_resp.usage.prompt_tokens
+    print(f"\n[openai] full_payload={full_tokens}t prompt_tokens={full_usage} → {full_answer!r}")
+    print(
+        f"[openai] sliced_payload={sliced_tokens}t prompt_tokens={sliced_usage} → {sliced_answer!r}"
+    )
+    print(f"[openai] prompt-token savings: {(1 - sliced_usage / full_usage) * 100:.1f}%")
+    assert EXPECTED in full_answer, f"baseline failed: {full_answer!r}"
+    assert EXPECTED in sliced_answer, f"compressed answer wrong: {sliced_answer!r}"
+    assert sliced_usage < full_usage, "compressed payload used more tokens than full"
+@_NEED_ANTHROPIC
+def test_compressed_payload_preserves_answer_anthropic(repo: Path):
+    import anthropic
+    full = (repo / "payments.py").read_text()
+    result = subprocess.run(
+        [
+            str(binaries.resolve("ast-grep")),
+            "run",
+            "--pattern",
+            "def apply_promo",
+            "--lang",
+            "python",
+            "--json=stream",
+            str(repo / "payments.py"),
+        ],
+        capture_output=True,
+        text=True,
+        check=True,
+    )
+    sliced = json.loads(result.stdout.strip().splitlines()[0])["text"]
+    client = anthropic.Anthropic()
+    full_resp = client.messages.create(
+        model="claude-haiku-4-5-20251001",
+        max_tokens=16,
+        system="You answer briefly and numerically.",
+        messages=[{"role": "user", "content": f"{QUESTION}\n\n---\n{full}"}],
+    )
+    sliced_resp = client.messages.create(
+        model="claude-haiku-4-5-20251001",
+        max_tokens=16,
+        system="You answer briefly and numerically.",
+        messages=[{"role": "user", "content": f"{QUESTION}\n\n---\n{sliced}"}],
+    )
+    full_answer = full_resp.content[0].text.strip()
+    sliced_answer = sliced_resp.content[0].text.strip()
+    print(f"\n[anthropic] full prompt_tokens={full_resp.usage.input_tokens} → {full_answer!r}")
+    print(f"[anthropic] sliced prompt_tokens={sliced_resp.usage.input_tokens} → {sliced_answer!r}")
+    print(
+        f"[anthropic] savings: "
+        f"{(1 - sliced_resp.usage.input_tokens / full_resp.usage.input_tokens) * 100:.1f}%"
+    )
+    assert EXPECTED in full_answer, f"baseline failed: {full_answer!r}"
+    assert EXPECTED in sliced_answer, f"compressed answer wrong: {sliced_answer!r}"
+    assert sliced_resp.usage.input_tokens < full_resp.usage.input_tokens
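
Every savings figure printed in this file comes from the same arithmetic, `(1 - after / before) * 100`. A small sketch of that calculation is shown below, using a rough 4-characters-per-token estimate in place of tiktoken so it runs without extra dependencies. `approx_tokens` and `savings_pct` are hypothetical names introduced for illustration only.

```python
def approx_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def savings_pct(before: int, after: int) -> float:
    """Percentage of tokens removed; 0.0 when there is no baseline to compare."""
    if before <= 0:
        return 0.0
    return (1 - after / before) * 100


full = "x" * 400    # stands in for a full file body
sliced = "x" * 100  # stands in for a compressed payload
print(f"savings={savings_pct(approx_tokens(full), approx_tokens(sliced)):.1f}%")
```

The zero-baseline guard mirrors the `if line_tokens else 0.0` check in the difftastic test, which avoids a division by zero when the baseline diff is empty.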

tests/test_tool_result_interceptors.py ADDED

	@@ -0,0 +1,400 @@

+"""Tests for the tool_result interceptor framework + ast-grep Read outliner."""
+from __future__ import annotations
+import textwrap
+import pytest
+from headroom.proxy.interceptors import (
+    INTERCEPTORS,
+    ToolResultInterceptor,
+    apply_to_messages,
+    register,
+)
+from headroom.proxy.interceptors.astgrep import AstGrepReadOutline
+from headroom.tokenizer import Tokenizer
+class _FakeTokenCounter:
+    """Deterministic 4-chars-per-token counter for unit tests."""
+    def count_text(self, text: str) -> int:
+        return max(1, len(text) // 4)
+    def count_messages(self, messages) -> int:
+        total = 0
+        for m in messages:
+            c = m.get("content")
+            if isinstance(c, str):
+                total += self.count_text(c)
+            elif isinstance(c, list):
+                for b in c:
+                    if isinstance(b, dict):
+                        inner = b.get("content") or b.get("text") or ""
+                        if isinstance(inner, str):
+                            total += self.count_text(inner)
+        return total
+@pytest.fixture
+def tokenizer() -> Tokenizer:
+    # Real Tokenizer wrapping the fake counter; mirrors production construction.
+    return Tokenizer(_FakeTokenCounter())  # type: ignore[arg-type]
+# -------- Framework basics ----------------------------------------------- #
+def test_astgrep_interceptor_registered_by_default():
+    assert any(i.name == "ast-grep" for i in INTERCEPTORS)
+def test_register_is_idempotent_on_name():
+    before = len(INTERCEPTORS)
+    register(AstGrepReadOutline())  # same name
+    assert len(INTERCEPTORS) == before
+def test_custom_interceptor_plugs_in(tokenizer):
+    class UpperCase:
+        name = "uppercase-test"
+        def matches(self, tool_name, tool_input, tool_output):
+            return tool_name == "Echo"
+        def transform(self, tool_name, tool_input, tool_output):
+            # Must REDUCE tokens — use a single short marker.
+            return "X"
+    dummy: ToolResultInterceptor = UpperCase()  # type: ignore[assignment]
+    register(dummy)
+    try:
+        messages = [
+            {
+                "role": "assistant",
+                "content": [{"type": "tool_use", "id": "1", "name": "Echo", "input": {}}],
+            },
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "tool_result",
+                        "tool_use_id": "1",
+                        "content": "hello " * 100,
+                    }
+                ],
+            },
+        ]
+        result = apply_to_messages(messages, tokenizer)
+        assert any(s.tool == "uppercase-test" for s in result.spans)
+        swapped = result.messages[1]["content"][0]["content"]
+        assert swapped == "X"
+    finally:
+        INTERCEPTORS[:] = [i for i in INTERCEPTORS if i.name != "uppercase-test"]
+def test_pass_through_when_no_interceptor_matches(tokenizer):
+    messages = [
+        {
+            "role": "assistant",
+            "content": [{"type": "tool_use", "id": "1", "name": "Unknown", "input": {}}],
+        },
+        {
+            "role": "user",
+            "content": [{"type": "tool_result", "tool_use_id": "1", "content": "x" * 5000}],
+        },
+    ]
+    result = apply_to_messages(messages, tokenizer)
+    assert result.spans == []
+    assert result.messages[1] is messages[1]  # untouched identity
+# -------- ast-grep interceptor ------------------------------------------- #
+_PY_FIXTURE = textwrap.dedent(
+    '''
+    """Payments module fixture."""
+    from decimal import Decimal
+    def compute_subtotal(items):
+        total = Decimal("0")
+        for item in items:
+            total += item.price * item.qty
+        return total
+    def apply_promo(subtotal, code):
+        if not code:
+            return subtotal
+        if code == "SAVE10":
+            return subtotal * Decimal("0.9")
+        return subtotal
+    def compute_tax(subtotal, rate):
+        return (subtotal * rate).quantize(Decimal("0.01"))
+    def process_payment(items, promo, tax_rate):
+        """Main entry point."""
+        subtotal = compute_subtotal(items)
+        after = apply_promo(subtotal, promo)
+        tax = compute_tax(after, tax_rate)
+        return after + tax
+    def refund(order_id, amount):
+        """Issue a refund."""
+        return {"order": order_id, "refund": str(amount)}
+    def list_orders_for_user(user_id, limit=20):
+        """Placeholder DB lookup for a user's orders."""
+        return [{"user": user_id, "order": i} for i in range(limit)]
+    def cancel_order(order_id, reason=None):
+        """Cancel an order, logging the reason if provided."""
+        return {"order": order_id, "cancelled": True, "reason": reason or "unspecified"}
+    def summarize_cart(items):
+        """Return a one-line summary of cart contents."""
+        skus = [i.sku for i in items]
+        total_qty = sum(i.qty for i in items)
+        return f"{len(items)} line items ({total_qty} units): {', '.join(skus)}"
+    def format_receipt(order_id, items, total):
+        """Render a textual receipt."""
+        lines = [f"Order {order_id}"]
+        for i in items:
+            lines.append(f"  {i.sku} x {i.qty} @ {i.unit_price} = {i.qty * i.unit_price}")
+        lines.append(f"Total: {total}")
+        return "\\n".join(lines)
+    '''
+).strip()
+def test_astgrep_outlines_large_python_read(tokenizer):
+    messages = [
+        {
+            "role": "assistant",
+            "content": [
+                {
+                    "type": "tool_use",
+                    "id": "abc",
+                    "name": "Read",
+                    "input": {"file_path": "/repo/payments.py"},
+                }
+            ],
+        },
+        {
+            "role": "user",
+            "content": [{"type": "tool_result", "tool_use_id": "abc", "content": _PY_FIXTURE}],
+        },
+    ]
+    result = apply_to_messages(messages, tokenizer)
+    assert len(result.spans) == 1
+    span = result.spans[0]
+    assert span.tool == "ast-grep"
+    assert span.tokens_after < span.tokens_before
+    new_content = result.messages[1]["content"][0]["content"]
+    assert "outlined by ast-grep" in new_content
+    assert "body elided" in new_content
+    assert "def process_payment" in new_content
+    assert "def apply_promo" in new_content
+    # Bodies should NOT leak through unchanged.
+    assert "total += item.price * item.qty" not in new_content
+def test_astgrep_skips_small_files(tokenizer):
+    small = "def foo(): return 1\n"
+    messages = [
+        {
+            "role": "assistant",
+            "content": [
+                {
+                    "type": "tool_use",
+                    "id": "x",
+                    "name": "Read",
+                    "input": {"file_path": "/a.py"},
+                }
+            ],
+        },
+        {
+            "role": "user",
+            "content": [{"type": "tool_result", "tool_use_id": "x", "content": small}],
+        },
+    ]
+    result = apply_to_messages(messages, tokenizer)
+    assert result.spans == []
+def test_astgrep_skips_non_code_extensions(tokenizer):
+    messages = [
+        {
+            "role": "assistant",
+            "content": [
+                {
+                    "type": "tool_use",
+                    "id": "r",
+                    "name": "Read",
+                    "input": {"file_path": "/notes.txt"},
+                }
+            ],
+        },
+        {
+            "role": "user",
+            "content": [{"type": "tool_result", "tool_use_id": "r", "content": "x" * 3000}],
+        },
+    ]
+    result = apply_to_messages(messages, tokenizer)
+    assert result.spans == []
+# -------- Pass-through and progressive disclosure ------------------------ #
+def test_astgrep_skips_when_line_range_requested(tokenizer):
+    """If the tool_input specifies a line range, the model wants those lines — pass through."""
+    messages = [
+        {
+            "role": "assistant",
+            "content": [
+                {
+                    "type": "tool_use",
+                    "id": "r",
+                    "name": "Read",
+                    "input": {
+                        "file_path": "/repo/payments.py",
+                        "offset": 30,
+                        "limit": 20,
+                    },
+                }
+            ],
+        },
+        {
+            "role": "user",
+            "content": [{"type": "tool_result", "tool_use_id": "r", "content": _PY_FIXTURE}],
+        },
+    ]
+    result = apply_to_messages(messages, tokenizer)
+    assert result.spans == []
+def test_progressive_disclosure_second_read_passes_through(tokenizer):
+    """First Read of a file gets outlined; second Read of the same path is untouched."""
+    messages = [
+        # Turn 1: Read payments.py → outlined
+        {
+            "role": "assistant",
+            "content": [
+                {
+                    "type": "tool_use",
+                    "id": "t1",
+                    "name": "Read",
+                    "input": {"file_path": "/repo/payments.py"},
+                }
+            ],
+        },
+        {
+            "role": "user",
+            "content": [{"type": "tool_result", "tool_use_id": "t1", "content": _PY_FIXTURE}],
+        },
+        # Turn 2: Read payments.py again (model came back for more) → pass through
+        {
+            "role": "assistant",
+            "content": [
+                {
+                    "type": "tool_use",
+                    "id": "t2",
+                    "name": "Read",
+                    "input": {"file_path": "/repo/payments.py"},
+                }
+            ],
+        },
+        {
+            "role": "user",
+            "content": [{"type": "tool_result", "tool_use_id": "t2", "content": _PY_FIXTURE}],
+        },
+    ]
+    result = apply_to_messages(messages, tokenizer)
+    # Only the first Read is rewritten; the second keeps its full body.
+    assert len(result.spans) == 1
+    first_tr = result.messages[1]["content"][0]["content"]
+    second_tr = result.messages[3]["content"][0]["content"]
+    assert "outlined by ast-grep" in first_tr
+    assert "outlined by ast-grep" not in second_tr
+    assert "def process_payment" in second_tr
+    # Second Read preserves the bodies.
+    assert "subtotal = compute_subtotal(items)" in second_tr
+def test_progressive_disclosure_different_file_still_outlined(tokenizer):
+    """Reading a DIFFERENT file after the first outline should still outline."""
+    messages = [
+        {
+            "role": "assistant",
+            "content": [
+                {
+                    "type": "tool_use",
+                    "id": "t1",
+                    "name": "Read",
+                    "input": {"file_path": "/repo/payments.py"},
+                }
+            ],
+        },
+        {
+            "role": "user",
+            "content": [{"type": "tool_result", "tool_use_id": "t1", "content": _PY_FIXTURE}],
+        },
+        {
+            "role": "assistant",
+            "content": [
+                {
+                    "type": "tool_use",
+                    "id": "t2",
+                    "name": "Read",
+                    "input": {"file_path": "/repo/other.py"},
+                }
+            ],
+        },
+        {
+            "role": "user",
+            "content": [{"type": "tool_result", "tool_use_id": "t2", "content": _PY_FIXTURE}],
+        },
+    ]
+    result = apply_to_messages(messages, tokenizer)
+    # Both files get outlined — different keys.
+    assert len(result.spans) == 2
+# -------- OpenAI-format tool_result -------------------------------------- #
+def test_openai_format_tool_result_is_rewritten(tokenizer):
+    messages = [
+        {
+            "role": "assistant",
+            "content": None,
+            "tool_calls": [
+                {
+                    "id": "call_1",
+                    "type": "function",
+                    "function": {
+                        "name": "Read",
+                        "arguments": '{"file_path": "/x/payments.py"}',
+                    },
+                }
+            ],
+        },
+        {
+            "role": "tool",
+            "tool_call_id": "call_1",
+            "content": _PY_FIXTURE,
+        },
+    ]
+    result = apply_to_messages(messages, tokenizer)
+    assert len(result.spans) == 1
+    new_content = result.messages[1]["content"]
+    assert "outlined by ast-grep" in new_content
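
The behaviours these tests pin down (registration that is idempotent on `name`, pass-through for ranged reads, full content on a repeat read of the same path, and skipping an interceptor that raises) can be sketched without the proxy. The code below is a simplified illustration, not the `headroom.proxy.interceptors` implementation: `REGISTRY`, `rewrite`, `seen_paths`, and `FirstLineOnly` are hypothetical, and length in characters stands in for the token count.

```python
from __future__ import annotations

from typing import Protocol


class Interceptor(Protocol):
    name: str

    def matches(self, tool_name: str, tool_input: dict, tool_output: str) -> bool: ...
    def transform(self, tool_name: str, tool_input: dict, tool_output: str) -> str: ...


REGISTRY: list[Interceptor] = []


def register(interceptor: Interceptor) -> None:
    """Add an interceptor unless one with the same name is already registered."""
    if any(i.name == interceptor.name for i in REGISTRY):
        return
    REGISTRY.append(interceptor)


def rewrite(tool_name: str, tool_input: dict, tool_output: str, seen_paths: set[str]) -> str:
    """Return the first rewrite that shrinks the output, else the original."""
    path = tool_input.get("file_path")
    if path is not None and path in seen_paths:
        return tool_output  # repeat read of the same file: hand back the full body
    for interceptor in REGISTRY:
        try:
            if not interceptor.matches(tool_name, tool_input, tool_output):
                continue
            candidate = interceptor.transform(tool_name, tool_input, tool_output)
        except Exception:
            continue  # a failing interceptor is skipped, never fatal
        if len(candidate) < len(tool_output):  # only accept rewrites that shrink
            if path is not None:
                seen_paths.add(path)
            return candidate
    return tool_output


class FirstLineOnly:
    """Toy interceptor: keeps the first line of a large, un-ranged Read."""

    name = "first-line-demo"

    def matches(self, tool_name, tool_input, tool_output):
        return tool_name == "Read" and "offset" not in tool_input and len(tool_output) > 40

    def transform(self, tool_name, tool_input, tool_output):
        return tool_output.splitlines()[0] + "\n[... body elided]"


register(FirstLineOnly())
```

Tracking seen paths per conversation is one plausible way to get the progressive-disclosure behaviour asserted in `test_progressive_disclosure_second_read_passes_through`; the real framework derives this from the message history passed to `apply_to_messages`.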