# Kirana Detective — Hackathon Submission

**HuggingFace Build Small Hackathon 2026**

| | |
|---|---|
| **Space** | [build-small-hackathon/kirana-detective](https://huggingface.co/spaces/build-small-hackathon/kirana-detective) |
| **Demo Video** | [YouTube](https://youtu.be/8TVZP4sfesI) |
| **Blog Post** | [How I Built an AI Auditor for India's 12 Million Kirana Stores](https://huggingface.co/blog/build-small-hackathon/kirana-detective) |
| **Social Post** | [X / Twitter](https://x.com/naazimhussain02/status/2065966381657633161) |
| **Track** | Track 1: Backyard AI |
| **Total Parameters** | ~2.38B (Tiny Titan ✅) |

---

## Track: Backyard AI

**Problem**: India's 12 million kirana store owners receive 3–5 distributor invoices per week via WhatsApp, printed bills, or Tally exports. Checking every line by hand is impractical. Distributors overcharge, deliver short quantities, and apply wrong GST rates. A single store silently loses ₹3,000–₹8,000 per month.

**Solution**: Upload an invoice + delivery photos → receive a ₹ leakage report in under 60 seconds. Every finding (overcharge, shortage, GST error, duplicate) maps to a rupee amount and an action step.
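As a rough illustration of how a finding can carry both a rupee amount and an action step, here is a minimal sketch for the shortage case. The `Finding` fields, the `shortage_finding` helper, the sample SKU, and the action wording are all hypothetical and are not taken from the project code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    # Hypothetical shape: one anomaly, its rupee impact, and what to do about it.
    kind: str        # "overcharge", "shortage", "gst_error" or "duplicate"
    sku: str
    loss_inr: float
    action: str


def shortage_finding(sku: str, billed_qty: int, counted_qty: int,
                     unit_price: float) -> Optional[Finding]:
    """Return a shortage Finding, or None when nothing is missing."""
    missing = billed_qty - counted_qty
    if missing <= 0:
        return None
    return Finding(
        kind="shortage",
        sku=sku,
        loss_inr=round(missing * unit_price, 2),
        action=f"Ask the distributor for a credit note covering {missing} missing unit(s).",
    )


# Illustrative numbers only: 48 units billed, 42 counted in the delivery photo.
finding = shortage_finding("MAGGI 70G", billed_qty=48, counted_qty=42, unit_price=13.0)
print(finding.loss_inr, "-", finding.action)
```

The same structure would apply to the other anomaly types, with only the loss calculation changing.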

**Real user**: Tested against real invoice formats from kirana distributors in India (HUL, ITC, Nestlé, Britannia).

**Model constraint fit**: Entire pipeline runs on CPU — no GPU required at inference. Designed for Tier 2/3 city deployment where GPU hardware is absent and internet is patchy.

---

## Merit Badges

### ✅ Off the Grid
Zero cloud API calls. All inference runs locally:
- MiniCPM-V 4.6 via `transformers` (merged bfloat16 weights)
- MiniCPM5-1B via `llama-cpp-python` (GGUF Q4_K_M)
- YOLO26n via ONNX Runtime

Invoice data never leaves the device. Suitable for privacy-sensitive business data.

### ✅ Well-Tuned
Three custom models, each fine-tuned from a pretrained base and published on HF Hub:

| Model | Repo | Task |
|---|---|---|
| MiniCPM-V 4.6 | [`build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged`](https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged) | Invoice OCR → structured JSON |
| MiniCPM5-1B | [`build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer`](https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer) | Product name normalisation + savings report |
| YOLO26n | [`build-small-hackathon/yolo26n-indian-fmcg-detection`](https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection) | Product counting from delivery photos |

Training dataset: [`build-small-hackathon/kirana-invoice-train-data`](https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data) — 500 synthetic Indian invoices (printed GST, Tally PDF, handwritten, WhatsApp).

### ✅ Off-Brand
Custom Gradio UI — not default Gradio. Features:
- Rupee savings cards with colour-coded anomaly type (overcharge = red, shortage = amber, duplicate = purple, GST = orange)
- Agent progress stream with per-agent timing
- Collapsible raw JSON view per agent
- Dark/warm colour scheme themed around the kirana store context

### ✅ Llama Champion
MiniCPM5-1B is served entirely via `llama-cpp-python` as a GGUF Q4_K_M quantised model. It powers both Agent 2 (product normalisation) and Agent 6 (savings report generation). Neither agent uses `transformers` at runtime; both run purely on llama.cpp.
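The sketch below shows one way a GGUF model can be called through `llama-cpp-python` for a normalisation-style task. It is not the project's actual code: the prompt wording, the helper names, and the `n_ctx` and `max_tokens` values are assumptions chosen for illustration.

```python
def build_normalizer_prompt(raw_name: str) -> str:
    """Build a completion prompt. The wording is illustrative, not the project's prompt."""
    return (
        "Normalise this Indian FMCG product name to a canonical SKU name.\n"
        f"Input: {raw_name.strip()}\n"
        "Output:"
    )


def load_llm(model_path: str):
    """Load a local GGUF file. Context size is an assumed value."""
    # Imported lazily so the prompt helper works without llama-cpp-python installed.
    from llama_cpp import Llama

    return Llama(model_path=model_path, n_ctx=2048, verbose=False)


def normalise(llm, raw_name: str) -> str:
    """Run one completion and return the first line of generated text."""
    result = llm(build_normalizer_prompt(raw_name), max_tokens=32, stop=["\n"])
    return result["choices"][0]["text"].strip()


prompt = build_normalizer_prompt("  PARLE-G GLUC 100G  ")
print(prompt)
```

With a GGUF file on disk, `normalise(load_llm("path/to/model.gguf"), "PARLE-G GLUC 100G")` would return the model's completion; the path here is a placeholder.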

### ✅ Sharing is Caring
11 raw Claude Code (Sonnet 4.6) JSONL build sessions published as a public trace dataset, viewable in HF Data Studio's native agent trace viewer:

[`build-small-hackathon/kirana-detective-build-traces`](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces) — complete design, coding, debugging, and documentation sessions from blank repo to submission.

### ✅ Field Notes
Full blog post: [How I Built an AI Auditor for India's 12 Million Kirana Stores](https://huggingface.co/blog/build-small-hackathon/kirana-detective)

Covers: the problem, the 6-agent pipeline design, all three fine-tuned models with training details, the local-inference rationale, the hardest bug, and the full stack.

---

## Prize Categories Targeted

### Special Awards

**🏋️ Tiny Titan** — ~2.38B total parameters across all three models combined.

| Component | Parameters |
|---|---|
| MiniCPM-V 4.6 (merged bfloat16) | ~1.3B |
| MiniCPM5-1B (GGUF Q4_K_M) | ~1.08B |
| YOLO26n (ONNX) | ~2.4M |
| **Total** | **~2.38B** |

Well within the ≤4B Tiny Titan threshold.
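The total can be checked with a few lines of arithmetic. The counts below are the approximate figures from the table above; the dictionary keys and constant name are just labels for this sketch.

```python
# Approximate parameter counts copied from the table above.
PARAMS = {
    "MiniCPM-V 4.6": 1.3e9,
    "MiniCPM5-1B": 1.08e9,
    "YOLO26n": 2.4e6,
}
TINY_TITAN_LIMIT = 4e9  # threshold quoted in this document

total = sum(PARAMS.values())
print(f"total ≈ {total / 1e9:.2f}B, within limit: {total <= TINY_TITAN_LIMIT}")
# prints: total ≈ 2.38B, within limit: True
```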

**🤖 Best Agent** — Fully modular 6-agent pipeline. Each agent has a single responsibility, a defined input/output contract, and produces an `AgentTraceEntry` with timing. Generator-based streaming shows live agent progress in the UI.
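A minimal sketch of how a trace entry and generator-based streaming could fit together. Only the `AgentTraceEntry` name and the presence of timing come from this document; the field names, `run_pipeline`, and the two stand-in agents are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Iterator

Agent = Callable[[dict], dict]


@dataclass
class AgentTraceEntry:
    # Field names are assumptions; the document only states that timing is recorded.
    agent: str
    elapsed_s: float
    output: Any


def run_pipeline(agents: list[tuple[str, Agent]], state: dict) -> Iterator[AgentTraceEntry]:
    """Run agents in order, yielding one trace entry as soon as each finishes."""
    for name, agent in agents:
        start = time.perf_counter()
        result = agent(state)
        state.update(result)  # later agents can read earlier outputs
        yield AgentTraceEntry(name, time.perf_counter() - start, result)


# Hypothetical stand-in agents with a dict-in, dict-out contract.
def extract(state: dict) -> dict:
    return {"lines": [{"sku": "parle g 100g", "qty": 10}]}


def count_lines(state: dict) -> dict:
    return {"n_lines": len(state["lines"])}


trace = list(run_pipeline([("extractor", extract), ("counter", count_lines)], {}))
for entry in trace:
    print(f"{entry.agent}: {entry.elapsed_s * 1000:.2f} ms")
```

Because `run_pipeline` is a generator, a UI callback can iterate over it and update the progress display after every agent instead of waiting for the whole run.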

**🎨 Off-Brand** — Custom Gradio UI with rupee savings cards, colour-coded anomaly flags, and agent-by-agent progress stream. Distinctly different from the default Gradio look.

**📊 Best Demo** — End-to-end demo covering: invoice upload → extraction → normalisation → price check → delivery photo counting → shortage reconciliation → ₹ savings report with action items.

**🎖️ Bonus Quest Champion** — All 6 merit badges claimed on a single submission:

| # | Badge | Evidence |
|---|---|---|
| 1 | Off the Grid | Zero cloud API calls — MiniCPM-V (transformers) + MiniCPM5-1B (llama.cpp) + YOLO26n (ONNX), all CPU |
| 2 | Well-Tuned | 3 custom fine-tuned models published on HF Hub (MiniCPM-V 4.6, MiniCPM5-1B, YOLO26n) |
| 3 | Off-Brand | Custom Gradio UI — rupee savings cards, colour-coded anomaly flags, per-agent streaming progress |
| 4 | Llama Champion | MiniCPM5-1B served via llama-cpp-python (GGUF Q4_K_M) for both Agent 2 and Agent 6 |
| 5 | Sharing is Caring | 11 Claude Code build sessions published at [`build-small-hackathon/kirana-detective-build-traces`](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces) |
| 6 | Field Notes | Blog post at [huggingface.co/blog/build-small-hackathon/kirana-detective](https://huggingface.co/blog/build-small-hackathon/kirana-detective) |

Full sash. All badges earned independently, each with verifiable evidence.

**🗳️ Community Choice** — Kirana Detective is built around a problem that 12 million Indian shopkeepers face every week. It is demonstrable to anyone who has ever received a bill they couldn't verify, which is most of the world. The live Space requires no setup and no account, and it produces a tangible rupee number in under a minute. The blog post and X post are live for community sharing, and community votes are welcome.

### Sponsor Awards

**OpenBMB ($10,000 pool)**

Both language models are from OpenBMB's MiniCPM family:
- MiniCPM-V 4.6 (`openbmb/MiniCPM-V-4.6`) — fine-tuned for Indian invoice extraction
- MiniCPM5-1B (`openbmb/MiniCPM5-1B`) — fine-tuned for FMCG product normalisation and report generation

Both are fine-tuned, pushed to HF Hub, and used in the live Space. MiniCPM5-1B runs as GGUF via llama.cpp, which also qualifies it for the Llama Champion badge.

**Modal ($20,000 in credits)**

All three models were trained on Modal A10G GPUs using Modal's `@app.function` decorator with GPU provisioning. Total compute: ~4.5 hours of A10G time, ~$5.80 total cost.

Training scripts in `finetune/`:
- `finetune/train_minicpm_v.py` — MiniCPM-V 4.6 fine-tuning (51 min, A10G)
- `finetune/train_minicpm5_1b.py` — MiniCPM5-1B fine-tuning (~1 hr, A10G)
- `finetune/train_yolo26n.py` — YOLO26n fine-tuning (~2 hrs, A10G)
- `finetune/generate_invoices.py` — synthetic invoice generation (Modal function)
- `finetune/export_minicpm_v_gguf.py` — LoRA merge + GGUF export (Modal function)

**NVIDIA**

YOLO26n is exported to ONNX and can use NVIDIA GPU acceleration through ONNX Runtime when available, falling back to CPU otherwise. ONNX Runtime's CUDA and TensorRT execution providers cover deployment on NVIDIA hardware, and the A10G GPUs used for all training are NVIDIA hardware.
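The GPU-with-CPU-fallback behaviour can be sketched as a provider preference list. The provider strings and `get_available_providers()` are standard ONNX Runtime names; the helper functions and the preference order are assumptions for illustration, not the project's code.

```python
def pick_providers(available: list[str]) -> list[str]:
    """Prefer NVIDIA execution providers when present, always ending with CPU."""
    preferred = ["TensorrtExecutionProvider", "CUDAExecutionProvider"]
    chosen = [p for p in preferred if p in available]
    return chosen + ["CPUExecutionProvider"]


def load_detector(model_path: str):
    """Create an inference session using whatever providers this machine offers."""
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


print(pick_providers(["CPUExecutionProvider"]))
# prints: ['CPUExecutionProvider']
print(pick_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
# prints: ['CUDAExecutionProvider', 'CPUExecutionProvider']
```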

---

## Six-Agent Pipeline Summary

```
Agent 1 — Invoice Extractor     MiniCPM-V 4.6 (OpenBMB, fine-tuned)  →  Structured JSON
Agent 2 — Product Matcher       MiniCPM5-1B (OpenBMB, GGUF, llama.cpp) →  Canonical SKU names
Agent 3 — Pricing Agent         Rule-based (SQLite history)             →  Price / GST flags
Agent 4 — Visual Counter        YOLO26n (ONNX Runtime)                  →  Product counts
Agent 5 — Reconciliation Agent  Rule-based                              →  Shortage flags + ₹ loss
Agent 6 — Savings Agent         MiniCPM5-1B (OpenBMB, GGUF, llama.cpp) →  ₹ report + actions
```
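To make the rule-based Agent 3 step concrete, here is a minimal sketch of a price check against a SQLite history table. The `price_history` schema, the 2% tolerance, the function name, and the sample rows are all assumptions; the document only states that pricing rules run against SQLite history.

```python
import sqlite3


def flag_overcharges(conn: sqlite3.Connection, invoice_lines: list[dict],
                     tolerance: float = 0.02) -> list[dict]:
    """Flag lines priced above the most recent recorded price by more than `tolerance`."""
    flags = []
    for line in invoice_lines:
        row = conn.execute(
            "SELECT unit_price FROM price_history "
            "WHERE sku = ? ORDER BY seen_on DESC LIMIT 1",
            (line["sku"],),
        ).fetchone()
        if row is None:
            continue  # no history yet, so nothing to compare against
        last_price = row[0]
        if line["unit_price"] > last_price * (1 + tolerance):
            excess = (line["unit_price"] - last_price) * line["qty"]
            flags.append({"sku": line["sku"], "type": "overcharge",
                          "loss_inr": round(excess, 2)})
    return flags


# In-memory database with made-up history rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price_history (sku TEXT, unit_price REAL, seen_on TEXT)")
conn.executemany(
    "INSERT INTO price_history VALUES (?, ?, ?)",
    [("PARLE-G 100G", 9.0, "2026-01-05"), ("PARLE-G 100G", 9.5, "2026-02-02")],
)
invoice = [
    {"sku": "PARLE-G 100G", "unit_price": 10.5, "qty": 24},
    {"sku": "MAGGI 70G", "unit_price": 13.0, "qty": 12},  # no history, skipped
]
flags = flag_overcharges(conn, invoice)
print(flags)
# prints: [{'sku': 'PARLE-G 100G', 'type': 'overcharge', 'loss_inr': 24.0}]
```

A GST-rate check could follow the same pattern, comparing the billed rate with an expected rate stored per SKU.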

---

## Constraints Met

| Constraint | Status |
|---|---|
| Models ≤ 32B parameters | ✅ ~2.38B total |
| Gradio UI | ✅ Custom Gradio 6.16 |
| Hosted as HF Space | ✅ [build-small-hackathon/kirana-detective](https://huggingface.co/spaces/build-small-hackathon/kirana-detective) |
| Demo video | ✅ [YouTube](https://youtu.be/8TVZP4sfesI) |
| Social media post | ✅ [X post](https://x.com/naazimhussain02/status/2065966381657633161) |

---

## Links

| Resource | URL |
|---|---|
| Space | https://huggingface.co/spaces/build-small-hackathon/kirana-detective |
| Blog | https://huggingface.co/blog/build-small-hackathon/kirana-detective |
| X Post | https://x.com/naazimhussain02/status/2065966381657633161 |
| Invoice Extractor | https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged |
| Product Normalizer | https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer |
| Product Detector | https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection |
| Training Dataset | https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data |
| Build Traces | https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces |
| GitHub | https://github.com/naazimsnh02/kirana-detective |