File size: 6,139 Bytes
441e897
 
0fc4ec3
849ee7b
 
441e897
bd351d2
441e897
 
849ee7b
0fc4ec3
 
 
aec617b
 
 
 
 
 
0fc4ec3
 
 
 
 
 
 
441e897
 
849ee7b
 
0fc4ec3
 
 
 
 
 
 
 
 
 
 
 
 
 
f93ba31
 
0fc4ec3
 
c74d553
0fc4ec3
326ac0d
 
 
 
0fc4ec3
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
849ee7b
 
0fc4ec3
 
849ee7b
0fc4ec3
 
849ee7b
0fc4ec3
 
849ee7b
 
0fc4ec3
c8055f7
0fc4ec3
 
 
c8055f7
0fc4ec3
c8055f7
0fc4ec3
 
 
 
 
 
 
 
 
 
8457788
0fc4ec3
 
8457788
0fc4ec3
8457788
0fc4ec3
 
8457788
0fc4ec3
8457788
0fc4ec3
 
 
 
 
 
 
 
c74d553
 
8457788
0fc4ec3
 
 
8457788
0fc4ec3
 
 
 
 
 
 
6ac8ef6
0fc4ec3
 
 
 
 
 
 
849ee7b
 
0fc4ec3
 
 
 
 
849ee7b
0fc4ec3
849ee7b
0fc4ec3
 
849ee7b
 
0fc4ec3
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
---
title: Trace Field Notes
emoji: 🧭
colorFrom: green
colorTo: gray
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
pinned: false
license: mit
short_description: Qualitative field reports for coding-agent session traces.
tags:
  - build-small
  - track:backyard
  - sponsor:openbmb
  - sponsor:openai
  - sponsor:nvidia
  - achievement:offbrand
  - achievement:fieldnotes
  - gradio-server
  - zerogpu
  - coding-agents
models:
  - openbmb/MiniCPM5-1B
  - nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
  - openai/privacy-filter
---

# Trace Field Notes

Trace Field Notes turns long coding-agent session logs into qualitative field
reports: where the agent got stuck, how it detoured, what it tried, how it
recovered, and whether its final claim matched its own evidence.

Most agent traces are too long to read after the fact. Tool telemetry is noisy,
private, and often the wrong level of detail. This app focuses on a narrower
question: what did the agent *say* about its own work while it was solving a
task? The answer becomes a field notebook, not a benchmark.

## Links

- Live Space: https://huggingface.co/spaces/build-small-hackathon/trace-field-notes
- App runtime: https://build-small-hackathon-trace-field-notes.hf.space/
- GitHub: https://github.com/JacobLinCool/trace-field-notes
- Demo video: https://youtu.be/1QNZlqkl8zo
- Demo MP4 asset: https://huggingface.co/spaces/build-small-hackathon/trace-field-notes/resolve/main/assets/trace-field-notes-demo.mp4
- Article draft: [`docs/article.md`](docs/article.md)
- Social post draft: [`docs/social-post.md`](docs/social-post.md)
- Public X post: https://x.com/JacobLinCool/status/2066160425952334155

## Team

- HF username: [@JacobLinCool](https://huggingface.co/JacobLinCool)

## Who it is for

Trace Field Notes is for developers, researchers, and hackathon builders who use
Codex, Claude Code, Pi Agent, or similar coding agents and want to understand
the session narrative after the code is written:

- Was the agent blocked, or just exploring?
- Did it change strategy for a good reason?
- Did a detour produce a better route?
- Did the closeout claim overstate what was verified?
- What can the next run learn from this one?

The app does **not** claim to inspect hidden reasoning or prove that the final
code is correct. It reports the visible narrative the agent wrote.

## How to use it

1. Find a local coding-agent session log.
2. Review and redact anything sensitive before upload.
3. Upload `.jsonl`, `.json`, `.txt`, or `.log`.
4. Choose the analysis engine:
   - **Quick analysis**: `openbmb/MiniCPM5-1B`
   - **Deeper analysis**: `nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16`
   - **Rule-based**: deterministic codebook, no model
5. Choose **GPU** for the Hugging Face ZeroGPU path or **CPU** for a no-quota
   run.
6. Read the report: verdict, trail map, episode detail, terrain groups, detour
   analysis, closeout audit, and redacted narrative export.

Common local trace locations:

```bash
# Codex
ls ~/.codex/sessions

# Claude Code
ls ~/.claude/projects

# Pi Agent
ls ~/.pi/agent/sessions
```

## Technology

The frontend is a custom React field-notebook UI served through `gradio.Server`.
It deliberately avoids the default Gradio component look so the report feels
like a qualitative trail map rather than a form.

The backend pipeline is:

1. `parser.py` loads Codex, Claude Code, Pi Agent, JSONL, JSON, text, and log
   files into visible narrative messages.
2. `redaction.py` applies deterministic secret and PII patterns.
3. `privacy_filter.py` optionally adds `openai/privacy-filter` on the Space GPU.
4. `analyzer.py` identifies difficulty episodes and classifies them with a
   deterministic codebook.
5. `model_runtime.py` optionally asks MiniCPM5 1B or Nemotron 3 Nano 30B-A3B to
   rewrite the analysis into a richer structured field report.
6. `view_model.py` adapts the result into the JSON shape rendered by the UI.
7. `profiling.py` logs per-stage timing and resource snapshots to server logs.

The app streams real progress events so long runs do not look frozen: upload,
extract, redact, chart, classify, synthesize, and model analysis.

## Build Small fit

Trace Field Notes targets the **Backyard AI** track: it solves a specific,
practical problem for people already using coding agents.

It also targets these Build Small prizes / badges:

- **Best Use of Codex**: Codex helped develop, debug, package, document, and
  produce the demo video. The connected GitHub history includes Codex-attributed
  commits.
- **Best MiniCPM Build**: Quick analysis uses `openbmb/MiniCPM5-1B`.
- **Nemotron Hardware Prize**: Deeper analysis uses
  `nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16`.
- **Off Brand**: the app uses `gradio.Server` with a custom React trail-map UI,
  not stock Gradio blocks.
- **Best Demo**: the repo includes a polished demo video, article draft, and
  public X post.

It does **not** target Tiny Titan because the optional Nemotron path is 30B, and
it does **not** target Best Use of Modal because the runtime is Hugging Face
ZeroGPU / CPU, not Modal.

## Privacy posture

Agent traces can include prompts, tool inputs, command output, local paths,
screenshots, secrets, private source code, and personal data. Review and redact
before uploading or sharing.

By default, Trace Field Notes:

- ignores raw tool-call contents;
- analyzes only visible assistant narrative messages plus optional user context;
- runs deterministic secret redaction;
- can run `openai/privacy-filter` for a second PII pass;
- exports only redacted narrative text.

## Local development

```bash
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py
```

Run tests:

```bash
python3.11 -m unittest discover -s tests
```

Optional environment settings are listed in [`.env.example`](.env.example).

## Codex contribution

Codex assisted with repository inspection, implementation debugging, test
verification, privacy/README hardening, Hugging Face deployment preparation,
demo-video scripting, voiceover generation, video composition, frame/ASR
verification, and hackathon submission packaging.