# Kirana Detective: Build Progress

Deadline: **June 15, 2026** (5 days remaining as of June 10)

---

## Status Legend

- [x] Done
- [~] In progress / partial
- [ ] Not started
- [!] Blocked / needs action

---

## Pre-build: Fine-tune & Publish Models

| Task | Status | File | Notes |
|---|---|---|---|
| 0.1 Fine-tune YOLO26n | [ ] | `finetune/train_yolo26n.py` | Run `modal run finetune/train_yolo26n.py` |
| 0.2a Generate 500 synthetic invoices | [x] | `finetune/generate_invoices.py` | 500 images in 4 formats; pure Pillow, no native deps |
| 0.2b Fine-tune MiniCPM-V 4.6 | [ ] | `finetune/train_minicpm_v.py` | Run `modal run finetune/train_minicpm_v.py` after uploading invoices |
| 0.3 Fine-tune MiniCPM5-1B | [ ] | `finetune/train_minicpm5_1b.py` | Run `modal run finetune/train_minicpm5_1b.py` |

**Action needed**: Run all three Modal jobs TODAY (June 10); each takes 1-3 hours.

---

## Core Implementation

| Task | Status | File | Notes |
|---|---|---|---|
| 1. Project scaffolding | [x] | `requirements.txt`, `README.md`, dirs | Done |
| 2. Data models | [x] | `models.py` | All dataclasses + CANONICAL_AGENT_ORDER |
| 3.1 FMCG catalog JSON | [x] | `data/fmcg_catalog.json` | 200 SKUs, 7 categories |
| 3.2 FMCGCatalog class | [x] | `catalog.py` | Alias lookup, GST prefix match, singleton |
| 4. Storage layer | [x] | `storage.py` | SQLite + degraded mode + 90-day retention |
| 5. Agent Tracer | [x] | `tracer.py` | HF Hub publish with retry + daemon thread |
| 6. Agent 1: Invoice Extractor | [x] | `agents/invoice_extractor.py` | MiniCPM-V 4.6 |
| 7. Agent 2: Product Matcher | [x] | `agents/product_matcher.py` | MiniCPM5-1B |
| 8. Agent 3: Pricing Agent | [x] | `agents/pricing_agent.py` | Rule-based |
| 9. Agent 4: Visual Counter | [x] | `agents/visual_counter.py` | YOLO26n ONNX |
| 10. Agent 5: Reconciliation Agent | [x] | `agents/reconciliation_agent.py` | Rule-based |
| 11. Agent 6: Savings Agent | [x] | `agents/savings_agent.py` | MiniCPM5-1B |
| 12. Checkpoint (unit tests pass) | [ ] | `tests/` | |
| 13. Pipeline Orchestrator | [x] | `pipeline.py` | |
| 14. Backend entry point | [x] | `app.py` | Gradio gr.Server |
| 15. Custom frontend | [x] | `static/index.html` | Off-Brand badge |
| 16. Property-based tests | [ ] | `tests/test_properties.py` | Hypothesis |
| 17. Checkpoint (all tests pass) | [ ] | - | |
| 18. HF Space deployment | [ ] | `README.md` + `verify_models.py` | |

---

## Fine-tune Scripts Status

| Script | Modal Secret | Ready to run? |
|---|---|---|
| `train_yolo26n.py` | `roboflow-secret`, `hf-secret` | Yes, Modal secrets are set |
| `generate_invoices.py` | None (local) | Needs `pip install weasyprint augraphy` |
| `train_minicpm_v.py` | `hf-secret` | After `generate_invoices.py` |
| `train_minicpm5_1b.py` | `hf-secret` | After catalog is on Modal volume |

---

## Modal Setup

```bash
# Secrets (already done):
modal secret create roboflow-secret ROBOFLOW_API_KEY=
modal secret create hf-secret HF_TOKEN=

# Upload catalog to Modal volume (needed for train_minicpm5_1b):
modal volume put kirana-synth-data data/fmcg_catalog.json fmcg_catalog.json

# Generate synthetic invoices (runs locally, not on Modal):
python finetune/generate_invoices.py

# Run fine-tune jobs (run all 3 in parallel today):
modal run finetune/train_yolo26n.py
modal run finetune/train_minicpm_v.py    # after invoices are generated
modal run finetune/train_minicpm5_1b.py
```

---

## HF Repos to Create

After the fine-tune jobs publish, verify that these repos exist:

- `build-small-hackathon/yolo26n-indian-fmcg-detection`
- `build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction`
- `build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer`
- `build-small-hackathon/kirana-detective-traces` (dataset; create manually before the first audit run)

---

## Badge Checklist

| Badge | Requirement | Status |
|---|---|---|
| Off the Grid | Zero cloud API calls in inference | [ ] |
| Well-Tuned | 3 fine-tuned models on HF Hub | [ ] |
| Off-Brand | Custom gr.Server frontend | [ ] |
| Llama Champion | MiniCPM models via llama.cpp | [ ] |
| Sharing is Caring | Agent trace to HF Dataset after each audit | [ ] |
| Field Notes | Blog post | [ ] |

---

## Day-by-Day Remaining Plan

| Day | Date | Focus |
|---|---|---|
| Day 6 | June 10 | Kick off Modal fine-tune jobs + implement models.py, catalog.py |
| Day 7 | June 11 | storage.py, tracer.py, agents 1-3 |
| Day 8 | June 12 | agents 4-6, pipeline.py |
| Day 9 | June 13 | app.py, static/index.html, full pipeline test |
| Day 10 | June 14 | Tests, demo video, blog post, HF Space deploy |
| Deadline | June 15 | Submit |
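
The days-remaining figure and the status counts in the tables above can be recomputed rather than updated by hand. The following is a minimal sketch, not part of the repository: `tally_status`, `days_remaining`, `MARKERS`, and `STATUS_CELL` are hypothetical names, and it assumes status markers appear as whole cells in Markdown table rows, as they do in this file.

```python
import re
from collections import Counter
from datetime import date

# Status markers from the legend, mapped to readable names (names are illustrative).
MARKERS = {"[x]": "done", "[~]": "partial", "[ ]": "not_started", "[!]": "blocked"}

# Matches a table cell that holds only a status marker, e.g. "| [x] |".
STATUS_CELL = re.compile(r"\|\s*(\[[x~ !]\])\s*(?=\|)")


def tally_status(markdown: str) -> Counter:
    """Count status markers that appear as whole cells in table rows."""
    counts = Counter()
    for line in markdown.splitlines():
        # Only table rows count; legend bullets like "- [x] Done" are skipped.
        if not line.lstrip().startswith("|"):
            continue
        for marker in STATUS_CELL.findall(line):
            counts[MARKERS[marker]] += 1
    return counts


def days_remaining(today: date, deadline: date = date(2026, 6, 15)) -> int:
    """Whole days between today and the submission deadline."""
    return (deadline - today).days


if __name__ == "__main__":
    sample = """
| Task | Status | File |
|---|---|---|
| 13. Pipeline Orchestrator | [x] | `pipeline.py` |
| 16. Property-based tests | [ ] | `tests/test_properties.py` |
"""
    print(dict(tally_status(sample)))
    print(days_remaining(date(2026, 6, 10)))
```

Passing the contents of this file to `tally_status` would give one count per legend state across all tables; the legend bullets themselves are ignored because they are not table rows.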