Kirana Detective — Build Progress
Deadline: June 15, 2026 (5 days remaining as of June 10)
Status Legend
- [x] Done
- [~] In progress / partial
- [ ] Not started
- [!] Blocked / needs action
Pre-build: Fine-tune & Publish Models
| Task | Status | File | Notes |
|---|---|---|---|
| 0.1 Fine-tune YOLO26n | [ ] | finetune/train_yolo26n.py | Run `modal run finetune/train_yolo26n.py` |
| 0.2a Generate 500 synthetic invoices | [x] | finetune/generate_invoices.py | 500 images in 4 formats; pure Pillow, no native deps |
| 0.2b Fine-tune MiniCPM-V 4.6 | [ ] | finetune/train_minicpm_v.py | Run `modal run finetune/train_minicpm_v.py` after uploading invoices |
| 0.3 Fine-tune MiniCPM5-1B | [ ] | finetune/train_minicpm5_1b.py | Run `modal run finetune/train_minicpm5_1b.py` |
Action needed: run all three Modal jobs today (June 10); each takes 1-3 hours.
Core Implementation
| Task | Status | File | Notes |
|---|---|---|---|
| 1. Project scaffolding | [x] | requirements.txt, README.md, dirs | Done |
| 2. Data models | [x] | models.py | All dataclasses + CANONICAL_AGENT_ORDER |
| 3.1 FMCG catalog JSON | [x] | data/fmcg_catalog.json | 200 SKUs, 7 categories |
| 3.2 FMCGCatalog class | [x] | catalog.py | Alias lookup, GST prefix match, singleton |
| 4. Storage layer | [x] | storage.py | SQLite + degraded mode + 90-day retention |
| 5. Agent Tracer | [x] | tracer.py | HF Hub publish with retry + daemon thread |
| 6. Agent 1: Invoice Extractor | [x] | agents/invoice_extractor.py | MiniCPM-V 4.6 |
| 7. Agent 2: Product Matcher | [x] | agents/product_matcher.py | MiniCPM5-1B |
| 8. Agent 3: Pricing Agent | [x] | agents/pricing_agent.py | Rule-based |
| 9. Agent 4: Visual Counter | [x] | agents/visual_counter.py | YOLO26n ONNX |
| 10. Agent 5: Reconciliation Agent | [x] | agents/reconciliation_agent.py | Rule-based |
| 11. Agent 6: Savings Agent | [x] | agents/savings_agent.py | MiniCPM5-1B |
| 12. Checkpoint (unit tests pass) | [ ] | tests/ | |
| 13. Pipeline Orchestrator | [x] | pipeline.py | |
| 14. Backend entry point | [x] | app.py | Gradio gr.Server |
| 15. Custom frontend | [x] | static/index.html | Off-Brand badge |
| 16. Property-based tests | [ ] | tests/test_properties.py | Hypothesis |
| 17. Checkpoint (all tests pass) | [ ] | — | |
| 18. HF Space deployment | [ ] | README.md + verify_models.py | |
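
The Agent Tracer note above ("HF Hub publish with retry + daemon thread") describes a pattern that could look roughly like the sketch below. This is not the code in tracer.py: the names `publish_with_retry` and `publish_in_background`, the attempt count, and the exponential backoff are all assumptions made for illustration. The `publish` argument stands in for whatever call actually uploads a trace.

```python
import threading
import time
from typing import Callable


def publish_with_retry(
    publish: Callable[[], None],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call publish() until it succeeds or attempts run out; return success."""
    for attempt in range(attempts):
        try:
            publish()
            return True
        except Exception:
            # Back off exponentially before the next try (1s, 2s, 4s, ...),
            # but do not sleep after the final failed attempt.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return False


def publish_in_background(
    publish: Callable[[], None], **retry_kwargs
) -> threading.Thread:
    """Run the retrying publish on a daemon thread so an audit never blocks on it."""
    thread = threading.Thread(
        target=publish_with_retry,
        args=(publish,),
        kwargs=retry_kwargs,
        daemon=True,  # a stuck upload will not keep the process alive at exit
    )
    thread.start()
    return thread
```

A daemon thread keeps the audit response fast, at the cost that a trace still in flight is dropped if the process exits first. Whether that trade-off matches tracer.py is not stated in this file.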
Fine-tune Scripts Status
| Script | Modal Secret | Ready to run? |
|---|---|---|
| train_yolo26n.py | roboflow-secret, hf-secret | Yes (Modal secrets set) |
| generate_invoices.py | None (local) | Needs `pip install weasyprint augraphy` |
| train_minicpm_v.py | hf-secret | After generate_invoices.py |
| train_minicpm5_1b.py | hf-secret | After catalog is on Modal volume |
Modal Setup
# Secrets (already done):
modal secret create roboflow-secret ROBOFLOW_API_KEY=<key>
modal secret create hf-secret HF_TOKEN=<token>
# Upload catalog to Modal volume (needed for train_minicpm5_1b):
modal volume put kirana-synth-data data/fmcg_catalog.json fmcg_catalog.json
# Run fine-tune jobs (start all three training jobs today; they can run in parallel):
modal run finetune/train_yolo26n.py
python finetune/generate_invoices.py # runs locally, not on Modal
modal run finetune/train_minicpm_v.py # after invoices are generated
modal run finetune/train_minicpm5_1b.py
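
To start the three training jobs at the same time from one terminal, a small launcher along these lines might help. It is only a sketch: `TRAINING_SCRIPTS`, `build_commands`, and `run_parallel` are hypothetical names, and it assumes the `modal` CLI is on PATH, the invoices are already uploaded, and the catalog is already on the Modal volume.

```python
import subprocess
from typing import Callable, Dict, List, Sequence

# Script paths as listed in this file.
TRAINING_SCRIPTS = [
    "finetune/train_yolo26n.py",
    "finetune/train_minicpm_v.py",
    "finetune/train_minicpm5_1b.py",
]


def build_commands(scripts: Sequence[str]) -> List[List[str]]:
    """Turn each script path into a `modal run <script>` argument list."""
    return [["modal", "run", script] for script in scripts]


def run_parallel(
    commands: Sequence[List[str]],
    popen: Callable = subprocess.Popen,
) -> Dict[str, int]:
    """Start every command, then wait for all of them; map command to exit code."""
    # Launch everything before waiting so the 1-3 hour jobs overlap.
    procs = [(" ".join(cmd), popen(cmd)) for cmd in commands]
    return {name: proc.wait() for name, proc in procs}
```

Calling `run_parallel(build_commands(TRAINING_SCRIPTS))` returns a dict of exit codes, where any non-zero value marks a job that needs a rerun.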
HF Repos to Create
After fine-tuning publishes, verify these exist:
- build-small-hackathon/yolo26n-indian-fmcg-detection
- build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction
- build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer
- build-small-hackathon/kirana-detective-traces (dataset; create manually before the first audit run)
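
The existence check could be scripted roughly as follows. This is a sketch, not the contents of verify_models.py: `EXPECTED_REPOS`, `hub_repo_exists`, and `missing_repos` are hypothetical names. It assumes `huggingface_hub` is installed and that the first three repos are model repos, with the traces repo being a dataset as noted above.

```python
from typing import Callable, List, Sequence, Tuple

# (repo_id, repo_type) pairs for the repos named in this section.
EXPECTED_REPOS: List[Tuple[str, str]] = [
    ("build-small-hackathon/yolo26n-indian-fmcg-detection", "model"),
    ("build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction", "model"),
    ("build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer", "model"),
    ("build-small-hackathon/kirana-detective-traces", "dataset"),
]


def hub_repo_exists(repo_id: str, repo_type: str) -> bool:
    """Ask the Hugging Face Hub whether a repo exists (needs network access)."""
    # Imported lazily so the pure logic below works without the package.
    from huggingface_hub import HfApi

    return HfApi().repo_exists(repo_id, repo_type=repo_type)


def missing_repos(
    repos: Sequence[Tuple[str, str]],
    exists: Callable[[str, str], bool] = hub_repo_exists,
) -> List[str]:
    """Return the repo ids that the `exists` check reports as absent."""
    return [repo_id for repo_id, repo_type in repos if not exists(repo_id, repo_type)]
```

An empty list from `missing_repos(EXPECTED_REPOS)` means all four repos are present. Private repos would additionally need a token, which this sketch does not handle.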
Badge Checklist
| Badge | Requirement | Status |
|---|---|---|
| Off the Grid | Zero cloud API calls in inference | [ ] |
| Well-Tuned | 3 fine-tuned models on HF Hub | [ ] |
| Off-Brand | Custom gr.Server frontend | [ ] |
| Llama Champion | MiniCPM models via llama.cpp | [ ] |
| Sharing is Caring | Agent trace to HF Dataset after each audit | [ ] |
| Field Notes | Blog post | [ ] |
Day-by-Day Remaining Plan
| Day | Date | Focus |
|---|---|---|
| Day 6 | June 10 | Kick off Modal fine-tune jobs + implement models.py, catalog.py |
| Day 7 | June 11 | storage.py, tracer.py, agents 1-3 |
| Day 8 | June 12 | agents 4-6, pipeline.py |
| Day 9 | June 13 | app.py, static/index.html, full pipeline test |
| Day 10 | June 14 | Tests, demo video, blog post, HF Space deploy |
| Deadline | June 15 | Submit |