Spaces:

build-small-hackathon
/

kirana-detective

Sleeping

App Files Files Community

kirana-detective / SUBMISSION.md

naazimsnh02

Final Submission

e0446f7 5 days ago

preview code

Raw

History Blame Contribute Delete

9.91 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

Kirana Detective — Hackathon Submission

HuggingFace Build Small Hackathon 2026


Space	build-small-hackathon/kirana-detective
Demo Video	YouTube
Blog Post	How I Built an AI Auditor for India's 12 Million Kirana Stores
Social Post	X / Twitter
Track	Track 1: Backyard AI
Total Parameters	~2.38B (Tiny Titan ✅)

Track: Backyard AI

Problem: India's 12 million kirana store owners receive 3–5 distributor invoices per week via WhatsApp, printed bills, or Tally exports. Manual verification is impossible. Distributors overcharge, deliver short quantities, and apply wrong GST rates. A single store loses ₹3,000–₹8,000 per month silently.

Solution: Upload an invoice + delivery photos → receive a ₹ leakage report in under 60 seconds. Every finding (overcharge, shortage, GST error, duplicate) maps to a rupee amount and an action step.

Real user: Tested against real invoice formats from kirana distributors in India (HUL, ITC, Nestlé, Britannia).

Model constraint fit: Entire pipeline runs on CPU — no GPU required at inference. Designed for Tier 2/3 city deployment where GPU hardware is absent and internet is patchy.

Merit Badges

✅ Off the Grid

Zero cloud API calls. All inference runs locally:

MiniCPM-V 4.6 via transformers (merged bfloat16 weights)
MiniCPM5-1B via llama-cpp-python (GGUF Q4_K_M)
YOLO26n via ONNX Runtime

Invoice data never leaves the device. Suitable for privacy-sensitive business data.

✅ Well-Tuned

Three custom models fine-tuned from scratch and published on HF Hub:

Model	Repo	Task
MiniCPM-V 4.6	`build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged`	Invoice OCR → structured JSON
MiniCPM5-1B	`build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer`	Product name normalisation + savings report
YOLO26n	`build-small-hackathon/yolo26n-indian-fmcg-detection`	Product counting from delivery photos

Training dataset: build-small-hackathon/kirana-invoice-train-data — 500 synthetic Indian invoices (printed GST, Tally PDF, handwritten, WhatsApp).

✅ Off-Brand

Custom Gradio UI — not default Gradio. Features:

Rupee savings cards with colour-coded anomaly type (overcharge = red, shortage = amber, duplicate = purple, GST = orange)
Agent progress stream with per-agent timing
Collapsible raw JSON view per agent
Dark/warm colour scheme themed around the kirana store context

✅ Llama Champion

MiniCPM5-1B is served entirely via llama-cpp-python using a GGUF Q4_K_M quantised model. Used for both Agent 2 (product normalisation) and Agent 6 (savings report generation). No transformers at runtime for these two agents — pure llama.cpp.

✅ Sharing is Caring

11 raw Claude Code (Sonnet 4.6) JSONL build sessions published as a public trace dataset, viewable in HF Data Studio's native agent trace viewer:

build-small-hackathon/kirana-detective-build-traces — complete design, coding, debugging, and documentation sessions from blank repo to submission.

✅ Field Notes

Full blog post: How I Built an AI Auditor for India's 12 Million Kirana Stores

Covers: the problem, the 6-agent pipeline design, all three fine-tuned models with training details, the local-inference rationale, the hardest bug, and the full stack.

Prize Categories Targeted

Special Awards

🏋️ Tiny Titan — ~2.38B total parameters across all three models combined.

Component	Parameters
MiniCPM-V 4.6 (merged bfloat16)	~1.3B
MiniCPM5-1B (GGUF Q4_K_M)	~1.08B
YOLO26n (ONNX)	~2.4M
Total	~2.38B

Well within the ≤4B Tiny Titan threshold.

🤖 Best Agent — Fully modular 6-agent pipeline. Each agent has a single responsibility, a defined input/output contract, and produces an AgentTraceEntry with timing. Generator-based streaming shows live agent progress in the UI.

🎨 Off-Brand — Custom Gradio UI with rupee savings cards, colour-coded anomaly flags, and agent-by-agent progress stream. Distinctly different from the default Gradio look.

📊 Best Demo — End-to-end demo covering: invoice upload → extraction → normalisation → price check → delivery photo counting → shortage reconciliation → ₹ savings report with action items.

🎖️ Bonus Quest Champion — All 6 merit badges claimed on a single submission:

#	Badge	Evidence
1	Off the Grid	Zero cloud API calls — MiniCPM-V (transformers) + MiniCPM5-1B (llama.cpp) + YOLO26n (ONNX), all CPU
2	Well-Tuned	3 custom fine-tuned models published on HF Hub (MiniCPM-V 4.6, MiniCPM5-1B, YOLO26n)
3	Off-Brand	Custom Gradio UI — rupee savings cards, colour-coded anomaly flags, per-agent streaming progress
4	Llama Champion	MiniCPM5-1B served via llama-cpp-python (GGUF Q4_K_M) for both Agent 2 and Agent 6
5	Sharing is Caring	11 Claude Code build sessions published at `build-small-hackathon/kirana-detective-build-traces`
6	Field Notes	Blog post at huggingface.co/blog/build-small-hackathon/kirana-detective

Full sash. All badges earned independently, each with verifiable evidence.

🗳️ Community Choice — Kirana Detective is built around a problem that 12 million Indian shopkeepers face every week. It is demonstrable to anyone who has ever received a bill they couldn't verify — which is most of the world. The live Space requires no setup, no account, and produces a tangible rupee number in under a minute. The blog post and X post are live for community sharing. Encouraging votes from the community.

Sponsor Awards

OpenBMB ($10,000 pool)

Both language models are from OpenBMB's MiniCPM family:

MiniCPM-V 4.6 (openbmb/MiniCPM-V-4.6) — fine-tuned for Indian invoice extraction
MiniCPM5-1B (openbmb/MiniCPM5-1B) — fine-tuned for FMCG product normalisation and report generation

Both are fine-tuned, pushed to HF Hub, and used in production in the Space. MiniCPM5-1B runs as GGUF via llama.cpp (cross-qualifying for Llama Champion badge).

Modal ($20,000 in credits)

All three models were trained on Modal A10G GPUs using Modal's @app.function decorator with GPU provisioning. Total compute: ~4.5 hours of A10G time, ~$5.80 total cost.

Training scripts in finetune/:

finetune/train_minicpm_v.py — MiniCPM-V 4.6 fine-tuning (51 min, A10G)
finetune/train_minicpm5_1b.py — MiniCPM5-1B fine-tuning (~1 hr, A10G)
finetune/train_yolo26n.py — YOLO26n fine-tuning (~2 hrs, A10G)
finetune/generate_invoices.py — synthetic invoice generation (Modal function)
finetune/export_minicpm_v_gguf.py — LoRA merge + GGUF export (Modal function)

NVIDIA

YOLO26n is exported to ONNX and can leverage NVIDIA GPU acceleration via ONNX Runtime when available (falls back to CPU). The A10G GPU used for all training is NVIDIA hardware. ONNX Runtime GPU execution provider supports CUDA/TensorRT for deployment on NVIDIA hardware.

Six-Agent Pipeline Summary

Agent 1 — Invoice Extractor     MiniCPM-V 4.6 (OpenBMB, fine-tuned)  →  Structured JSON
Agent 2 — Product Matcher       MiniCPM5-1B (OpenBMB, GGUF, llama.cpp) →  Canonical SKU names
Agent 3 — Pricing Agent         Rule-based (SQLite history)             →  Price / GST flags
Agent 4 — Visual Counter        YOLO26n (ONNX Runtime)                  →  Product counts
Agent 5 — Reconciliation Agent  Rule-based                              →  Shortage flags + ₹ loss
Agent 6 — Savings Agent         MiniCPM5-1B (OpenBMB, GGUF, llama.cpp) →  ₹ report + actions

Constraints Met

Constraint	Status
Models ≤ 32B parameters	✅ ~2.38B total
Gradio UI	✅ Custom Gradio 6.16
Hosted as HF Space	✅ build-small-hackathon/kirana-detective
Demo video	✅ YouTube
Social media post	✅ X post

Links

Resource	URL
Space	https://huggingface.co/spaces/build-small-hackathon/kirana-detective
Blog	https://huggingface.co/blog/build-small-hackathon/kirana-detective
X Post	https://x.com/naazimhussain02/status/2065966381657633161
Invoice Extractor	https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged
Product Normalizer	https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer
Product Detector	https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection
Training Dataset	https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data
Build Traces	https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces
GitHub	https://github.com/naazimsnh02/kirana-detective