Garm Claude Opus 4.7 (1M context) committed on Apr 22

Commit e5edc40

1 Parent(s): 9a9a136

test(proxy): warm up TestClient before measuring /livez backpressure latency

The test drained the anthropic pre-upstream semaphore and asserted that 20
subsequent /livez calls stayed under 100 ms. With only 20 samples, the p99
computation falls through to max(latencies), so a single cold-start outlier was
enough to fail the test. This was observed on CI py3.10 runners, where the first
TestClient request paid ~330 ms of one-time ASGI lifespan / import /
route-resolution cost while every subsequent request was sub-ms.

Add 3 warm-up requests before timing starts. Preserves the test's
signal (if /livez were actually blocked on the drained semaphore, all
post-warmup samples would still be slow) while removing the
runner-speed flake.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
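
The small-sample behaviour described in the commit message can be sketched as follows. This is a minimal illustration, assuming a nearest-rank percentile. The helper `p99_nearest_rank` is hypothetical and is not the test's own p99 code, which this commit view does not show. The sample values are made up to mirror the reported ~330 ms cold start against sub-millisecond steady state.

```python
import math


def p99_nearest_rank(latencies: list[float]) -> float:
    """Nearest-rank p99 (hypothetical helper, not the test's own code)."""
    ordered = sorted(latencies)
    # 1-based nearest rank: ceil(0.99 * N). For N = 20 this is ceil(19.8) = 20,
    # i.e. the last element, so the result equals max(latencies).
    rank = math.ceil(0.99 * len(ordered))
    return ordered[rank - 1]


# Illustrative values only: 19 warm samples plus one cold-start outlier.
cold = [0.330] + [0.0005] * 19
warm = [0.0005] * 20

assert p99_nearest_rank(cold) == max(cold)  # the single outlier becomes the "p99"
assert p99_nearest_rank(cold) > 0.100       # would breach a 100 ms budget
assert p99_nearest_rank(warm) < 0.100       # passes once the cold sample is excluded
```

Under this assumption, any sample count below 100 makes the p99 equal to the maximum, which is why one slow first request is enough to fail the assertion.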

Files changed (1)

tests/test_anthropic_pre_upstream_backpressure.py +8 -0

@@ -564,6 +564,14 @@ def test_livez_unaffected_under_anthropic_backpressure():
     latencies: list[float] = []
     with TestClient(app) as client:
+        # Warm up: the first few requests pay one-time costs (TestClient
+        # ASGI lifespan, route resolution, import side effects) that are
+        # unrelated to what this test measures. Without warm-up, the
+        # single cold-start sample dominates `max(latencies)` (which is
+        # what the p99 fallback below reduces to for small N) and causes
+        # flakes on slow CI runners.
+        for _ in range(3):
+            client.get("/livez")
         for _ in range(20):
             t0 = time.perf_counter()
             resp = client.get("/livez")
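
The warm-up-then-measure pattern in this hunk can be sketched in isolation, without the proxy or `TestClient`. This is a rough sketch under stated assumptions: `measure_latencies` and `ColdStartEndpoint` are hypothetical names introduced only for illustration, and the 0.2 s cold-start cost is an arbitrary stand-in for the one-time lifespan, import, and route-resolution cost.

```python
import time
from typing import Callable


def measure_latencies(
    call: Callable[[], None], warmup: int = 3, samples: int = 20
) -> list[float]:
    """Discard `warmup` untimed calls, then time `samples` calls."""
    for _ in range(warmup):
        call()  # pays any one-time cost outside the timed window
    latencies: list[float] = []
    for _ in range(samples):
        t0 = time.perf_counter()
        call()
        latencies.append(time.perf_counter() - t0)
    return latencies


class ColdStartEndpoint:
    """Hypothetical stand-in: the first call is slow, later calls are cheap."""

    def __init__(self, cold_cost_s: float = 0.2) -> None:
        self._cold_cost_s = cold_cost_s
        self._warm = False
        self.calls = 0

    def __call__(self) -> None:
        self.calls += 1
        if not self._warm:
            time.sleep(self._cold_cost_s)  # simulated one-time startup cost
            self._warm = True


endpoint = ColdStartEndpoint()
latencies = measure_latencies(endpoint)
assert len(latencies) == 20
assert max(latencies) < 0.100  # the cold start never enters the timed samples
```

The check keeps its signal: an endpoint that is slow on every call, as `/livez` would be if it were blocked on the drained semaphore, still produces slow timed samples after the warm-up.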