---
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
title: Kirana Detective
short_description: AI invoice auditor for Indian kirana stores
license: mit
tags:
  - invoice-audit
  - llm
  - yolo
  - gguf
  - gradio
  - indian-fmcg
  - kirana
  - minicpm
  - multimodal
  - backyard-ai
  - local-first
  - fine-tuned
  - custom-ui
  - llama.cpp
  - open-trace
  - blog-post
  - openbmb
  - modal.com
  - track:backyard
  - sponsor:openbmb
  - sponsor:modal
  - achievement:offgrid
  - achievement:welltuned
  - achievement:offbrand
  - achievement:llama
  - achievement:sharing
  - achievement:fieldnotes
colorFrom: purple
colorTo: pink
pinned: true
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/639548cc276ff8643fab34ac/p5tj50-FcFouVAlbUk8KJ.png
---
# 🔍 Kirana Detective

### AI-Powered Invoice & Inventory Auditor for Indian Kirana Stores

*Find where money is being lost, in under 60 seconds*

[![Try It Live](https://img.shields.io/badge/🤗%20Try%20It-Live%20Demo-blue)](https://huggingface.co/spaces/build-small-hackathon/kirana-detective)
[![Watch Demo](https://img.shields.io/badge/▶%20Watch-Demo-red)](https://youtu.be/8TVZP4sfesI)
[![Blog Post](https://img.shields.io/badge/📝%20Blog-How%20I%20Built%20It-orange)](https://huggingface.co/blog/build-small-hackathon/kirana-detective)
[![X Post](https://img.shields.io/badge/𝕏-View%20Post-black)](https://x.com/naazimhussain02/status/2065966381657633161)

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
[![Python 3.11](https://img.shields.io/badge/Python-3.11-blue.svg)](https://python.org)
[![Training: Modal](https://img.shields.io/badge/Training-Modal%20A10G-orange)](https://modal.com)
[![Models: OpenBMB](https://img.shields.io/badge/Models-OpenBMB%20MiniCPM-purple)](https://huggingface.co/openbmb)
[![Hackathon: Build Small 2026](https://img.shields.io/badge/Hackathon-HF%20Build%20Small%202026-yellow)](https://huggingface.co/build-small-hackathon)
---

## What It Does

Indian kirana store owners receive 3–5 distributor invoices every week via WhatsApp, printed bills, or Tally exports. Verifying every line item by hand is rarely practical.

**Upload an invoice and delivery photos, and Kirana Detective finds:**

| Finding | Example |
|---|---|
| Price overcharge | Surf Excel 1kg charged ₹255; historical price ₹220 (+15.9%) |
| Delivery shortage | Invoice says 24 Coke bottles; photo shows 20 |
| Duplicate charge | Parle-G 80g appears twice on the same invoice |
| GST mismatch | Aashirvaad Atta billed at 12% instead of 5% |

Every finding converts to a **rupee leakage number** with an actionable follow-up step.

---

## Demo

```
Upload Invoice (photo/PDF/WhatsApp) + Delivery Photos (up to 5)
        ↓
Agent 1: MiniCPM-V 4.6 extracts structured JSON from invoice image
Agent 2: MiniCPM5-1B normalises "SURF XL 1K" → "Surf Excel Washing Powder 1kg"
Agent 3: Rule engine checks price vs. stored invoice history
Agent 4: YOLO26n counts products in delivery photos
Agent 5: Reconciliation, invoice qty vs. counted qty → shortage flags
Agent 6: MiniCPM5-1B generates rupee savings report + action items
        ↓
₹ TOTAL LEAKAGE DETECTED: ₹858
```

---

## Six-Agent Pipeline

```
┌──────────────────────────────────┐
│  Agent 1: Invoice Extractor      │  MiniCPM-V 4.6 (merged, bfloat16)
│  Invoice image/PDF → JSON        │  OCR + structured field extraction
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 2: Product Matcher        │  MiniCPM5-1B (GGUF Q4_K_M)
│  Raw names → canonical SKU IDs   │  "MAGGI NDL" → Nestle Maggi 70g
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 3: Pricing Agent          │  Rule-based (SQLite history)
│  Normalized invoice → price flags│  Detects overcharges & GST errors
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 4: Visual Counter         │  YOLO26n (ONNX, 1,831 classes)
│  Delivery photos → product counts│  mAP50 = 0.428 on merged dataset
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 5: Reconciliation Agent   │  Rule-based
│  Invoice qty vs. counted qty     │  Shortage flags + ₹ loss
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 6: Savings Agent          │  MiniCPM5-1B (GGUF Q4_K_M)
│  All flags → ₹ report + actions  │  "Call HUL rep. Request credit note."
└──────────────────────────────────┘
```

Every agent run is stored locally in SQLite for audit history.

---

## Fine-Tuned Models

All three models were fine-tuned from existing base checkpoints on Modal A10G GPUs and published to HuggingFace. Total training cost: ~$5.80.

### Model 1: MiniCPM-V 4.6 (Invoice Extractor)

| | |
|---|---|
| **Repo** | [`build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged`](https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged) |
| **Base** | openbmb/MiniCPM-V-4.6 |
| **Method** | QLoRA rank 16 (PEFT + bitsandbytes), then merged to full bfloat16 weights |
| **Data** | 500 synthetic Indian invoices: printed GST, Tally PDF, handwritten, WhatsApp |
| **Eval loss** | 0.212 (epoch 3 / 3) |
| **Training** | 51 min 50 sec on A10G · 87 steps · 9.5M trainable params (0.72%) |
| **Inference** | `transformers` AutoModel · `model.chat()` (no PEFT at runtime) |

### Model 2: MiniCPM5-1B (Product Normalizer)

| | |
|---|---|
| **Repo** | [`build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer`](https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer) |
| **Base** | openbmb/MiniCPM5-1B |
| **Method** | QLoRA rank 16 via Unsloth, exported to GGUF Q4_K_M |
| **Data** | 2,000 synthetic (raw_name → canonical_name) pairs · 200 Indian FMCG SKUs |
| **Training** | ~1 hour on A10G |
| **Inference** | llama-cpp-python · `create_chat_completion()` |

### Model 3: YOLO26n (Product Detector)

| | |
|---|---|
| **Repo** | [`build-small-hackathon/yolo26n-indian-fmcg-detection`](https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection) |
| **Base** | Ultralytics YOLO26n |
| **Method** | Supervised fine-tuning on 3 merged Roboflow datasets |
| **Data** | ~11,400 images · 1,831 unified classes |
| **Metrics** | mAP50 = **0.428** · mAP50-95 = **0.302** · 100 epochs (A10G) |
| **Inference** | ONNX Runtime · CPU or GPU |

### Training Dataset

| | |
|---|---|
| **Repo** | [`build-small-hackathon/kirana-invoice-train-data`](https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data) |
| **Contents** | 500 synthetic invoice images (450 train / 50 eval) with structured JSON annotations |

---

## Running Locally

```bash
git clone https://github.com/naazimsnh02/kirana-detective.git
cd kirana-detective
pip install -r requirements.txt
python app.py
```

**First run:** downloads ~3 GB of model weights (cached after that).

**Requirements:** ~6 GB RAM · Python 3.11 · optional CUDA GPU for faster MiniCPM-V inference.

### Environment Variables

| Variable | Required | Purpose |
|---|---|---|
| `HF_TOKEN` | Optional | Faster downloads from HF Hub (avoids rate limits) |

---

## Re-Training the Models

All training scripts are in `finetune/`. Training is orchestrated on Modal.

```bash
export HF_TOKEN=
export ROBOFLOW_API_KEY=

modal run finetune/generate_invoices.py       # ~10 min: generate 500 synthetic invoices
modal run finetune/train_minicpm_v.py         # ~52 min: fine-tune invoice extractor
modal run finetune/export_minicpm_v_gguf.py   # ~10 min: merge LoRA → push HF weights
modal run finetune/train_minicpm5_1b.py       # ~1 hour: fine-tune product normalizer
modal run finetune/train_yolo26n.py           # ~2 hours: fine-tune YOLO26n detector
```

Scripts publish to the `naazimsnh02/` namespace first; transfer the repos to `build-small-hackathon/` manually afterwards. See [`finetune/README.md`](finetune/README.md) for the full workflow.
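
Agents 3 and 5 are rule-based, so they need no retraining. The four checks listed under What It Does can be sketched roughly as below. This is a minimal illustration, not the code in `agents/pricing_agent.py` or `agents/reconciliation_agent.py`: the `LineItem` shape, the `audit` function, the 5% tolerance, and the rupee-loss formulas are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class LineItem:
    """One normalised invoice line (hypothetical shape, for illustration only)."""
    sku: str
    qty: int
    unit_price: float  # rupees per unit
    gst_rate: float    # percent, e.g. 5.0


# (kind, sku, estimated rupee loss)
Finding = Tuple[str, str, float]


def overcharge_pct(charged: float, historical: float) -> float:
    """Percentage by which the charged price exceeds the historical price."""
    return (charged - historical) / historical * 100.0


def audit(
    items: List[LineItem],
    history: Dict[str, float],      # sku -> last known unit price
    catalog_gst: Dict[str, float],  # sku -> expected GST percent
    counted: Dict[str, int],        # sku -> units counted in delivery photos
    tolerance_pct: float = 5.0,     # assumed threshold, not taken from the project
) -> List[Finding]:
    """Apply duplicate, overcharge, GST, and shortage checks to one invoice."""
    findings: List[Finding] = []
    seen = set()
    for item in items:
        # Duplicate charge: the same SKU billed again on this invoice.
        if item.sku in seen:
            findings.append(("duplicate", item.sku, item.qty * item.unit_price))
            continue
        seen.add(item.sku)

        # Price overcharge: compare against stored price history.
        past = history.get(item.sku)
        if past is not None and overcharge_pct(item.unit_price, past) > tolerance_pct:
            findings.append(("overcharge", item.sku, (item.unit_price - past) * item.qty))

        # GST mismatch: billed rate differs from the catalog rate.
        expected = catalog_gst.get(item.sku)
        if expected is not None and item.gst_rate != expected:
            extra = max(0.0, item.gst_rate - expected)
            findings.append(("gst_mismatch", item.sku, item.qty * item.unit_price * extra / 100.0))

        # Delivery shortage: fewer units counted than invoiced.
        if item.sku in counted and counted[item.sku] < item.qty:
            missing = item.qty - counted[item.sku]
            findings.append(("shortage", item.sku, missing * item.unit_price))
    return findings


if __name__ == "__main__":
    # Quantities, the Coke price, and the GST values here are made-up sample data.
    invoice = [
        LineItem("surf-excel-1kg", qty=10, unit_price=255.0, gst_rate=18.0),
        LineItem("coke-bottle", qty=24, unit_price=40.0, gst_rate=28.0),
    ]
    for finding in audit(invoice, {"surf-excel-1kg": 220.0}, {}, {"coke-bottle": 20}):
        print(finding)
```

Keeping these checks as plain rules rather than model calls makes each flagged rupee amount reproducible from the stored history, which fits the README's description of Agents 3 and 5 as rule-based.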
---

## Model Architecture

| Component | Model | Parameters | Runtime |
|---|---|---|---|
| Invoice OCR & extraction | MiniCPM-V 4.6 (merged) | 1.3B | transformers |
| Product normalisation + report | MiniCPM5-1B (GGUF Q4_K_M) | 1.08B | llama-cpp-python |
| Product detection & counting | YOLO26n (ONNX) | ~2.4M | onnxruntime |
| **Total** | | **~2.38B** | |

Comfortably within the **Tiny Titan** ≤4B threshold. Zero cloud API calls: inference is fully local.

---

## Hackathon Badges

| Badge | How |
|---|---|
| 🎯 Well-Tuned | 3 custom fine-tuned models published on HF Hub |
| 🔌 Off the Grid | 100% local inference: MiniCPM-V (transformers) + MiniCPM5-1B (GGUF) + YOLO26n (ONNX) |
| 🦙 Llama Champion | MiniCPM5-1B served via llama-cpp-python (GGUF Q4_K_M) |
| 🎨 Off-Brand | Custom Gradio UI: rupee savings cards, colour-coded anomaly flags |
| 📑 Sharing is Caring | Claude Code build sessions (11 JSONL sessions) published as a public trace dataset |
| 📓 Field Notes | *"How I built an AI auditor for India's 12 million kirana stores"* |
| 🏋️ Tiny Titan | ~2.38B total parameters covering OCR + normalization + counting + report |

---

## Sharing is Caring: Build Trace Dataset

This dataset contains the 11 raw Claude Code (Sonnet 4.6) JSONL sessions used to design, code, debug, and document the entire project, from blank repo to hackathon submission. The sessions are viewable in HF Data Studio's native agent trace viewer.

| Dataset | Contents | Format |
|---|---|---|
| [`build-small-hackathon/kirana-detective-build-traces`](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces) | 11 Claude Code build sessions · ~8.9 MB | Native JSONL trace viewer |

To upload build traces:

```bash
export HF_TOKEN=
python finetune/upload_build_traces.py
```

---

## Project Structure

```
kirana-detective/
├── app.py                        # Gradio + FastAPI server
├── pipeline.py                   # AuditOrchestrator (6-agent runner)
├── models.py                     # Dataclasses: InvoiceJSON, LeakageReport, etc.
├── catalog.py                    # FMCG product catalog + alias lookup
├── storage.py                    # SQLite price history + audit log
├── tracer.py                     # Agent trace logging → HF Hub
├── agents/
│   ├── invoice_extractor.py      # Agent 1: MiniCPM-V 4.6
│   ├── product_matcher.py        # Agent 2: MiniCPM5-1B (alias + LLM)
│   ├── pricing_agent.py          # Agent 3: rule-based price checks
│   ├── visual_counter.py         # Agent 4: YOLO26n ONNX
│   ├── reconciliation_agent.py   # Agent 5: invoice vs. photo reconciliation
│   └── savings_agent.py          # Agent 6: MiniCPM5-1B report generator
├── finetune/
│   ├── README.md                 # Training workflow guide
│   ├── generate_invoices.py      # Synthetic invoice generator
│   ├── train_minicpm_v.py        # Fine-tune MiniCPM-V
│   ├── train_minicpm5_1b.py      # Fine-tune MiniCPM5-1B
│   ├── train_yolo26n.py          # Fine-tune YOLO26n
│   ├── export_minicpm_v_gguf.py  # Merge LoRA → push HF weights
│   └── upload_build_traces.py    # Upload Claude Code sessions → HF Hub
├── data/
│   └── fmcg_catalog.json         # 200 canonical SKU names + GST rates
└── MODEL_CARD.md                 # Full training + evaluation documentation
```

---

## Links

- **Demo Video**: https://youtu.be/8TVZP4sfesI
- **HF Space**: [build-small-hackathon/kirana-detective](https://huggingface.co/spaces/build-small-hackathon/kirana-detective)
- **Training dataset**: [build-small-hackathon/kirana-invoice-train-data](https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data)
- **Invoice extractor**: [build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged](https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged)
- **Product normalizer**: [build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer](https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer)
- **Product detector**: [build-small-hackathon/yolo26n-indian-fmcg-detection](https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection)
- **Full model card**: [MODEL_CARD.md](MODEL_CARD.md)
- **Build sessions (Claude Code)**: [build-small-hackathon/kirana-detective-build-traces](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces)

---

## License

- **Code**: MIT
- **MiniCPM-V / MiniCPM5-1B**: Apache 2.0 (OpenBMB)
- **YOLO26n**: AGPL-3.0 (Ultralytics)

---

*HuggingFace Build Small Hackathon 2026 · Track 1: Backyard AI · [naazimsnh02](https://github.com/naazimsnh02)*