---
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
title: Kirana Detective
short_description: AI invoice auditor for Indian kirana stores
license: mit
tags:
- invoice-audit
- llm
- yolo
- gguf
- gradio
- indian-fmcg
- kirana
- minicpm
- multimodal
- backyard-ai
- local-first
- fine-tuned
- custom-ui
- llama.cpp
- open-trace
- blog-post
- openbmb
- modal.com
- track:backyard
- sponsor:openbmb
- sponsor:modal
- achievement:offgrid
- achievement:welltuned
- achievement:offbrand
- achievement:llama
- achievement:sharing
- achievement:fieldnotes
colorFrom: purple
colorTo: pink
pinned: true
thumbnail: >-
https://cdn-uploads.huggingface.co/production/uploads/639548cc276ff8643fab34ac/p5tj50-FcFouVAlbUk8KJ.png
---
# π Kirana Detective
### AI-Powered Invoice & Inventory Auditor for Indian Kirana Stores
*Find where money is being lost β in under 60 seconds*
[](https://huggingface.co/spaces/build-small-hackathon/kirana-detective)
[](https://youtu.be/8TVZP4sfesI)
[](https://huggingface.co/blog/build-small-hackathon/kirana-detective)
[](https://x.com/naazimhussain02/status/2065966381657633161)
[](LICENSE)
[](https://python.org)
[](https://modal.com)
[](https://huggingface.co/openbmb)
[](https://huggingface.co/build-small-hackathon)
---
## What It Does
Indian kirana store owners receive 3β5 distributor invoices every week via WhatsApp, printed bills, or Tally exports. Verifying them manually is impossible.
**Kirana Detective uploads an invoice + delivery photos and finds:**
| Finding | Example |
|---|---|
| Price overcharge | Surf Excel 1kg charged βΉ255 β historical price βΉ220 (+15.9%) |
| Delivery shortage | Invoice says 24 Coke bottles β photo shows 20 |
| Duplicate charge | Parle-G 80g appears twice on the same invoice |
| GST mismatch | Aashirvaad Atta billed at 12% instead of 5% |
Every finding converts to a **rupee leakage number** with an actionable follow-up step.
---
## Demo
```
Upload Invoice (photo/PDF/WhatsApp) + Delivery Photos (up to 5)
β
Agent 1 β MiniCPM-V 4.6 extracts structured JSON from invoice image
Agent 2 β MiniCPM5-1B normalises "SURF XL 1K" β "Surf Excel Washing Powder 1kg"
Agent 3 β Rule engine checks price vs. stored invoice history
Agent 4 β YOLO26n counts products in delivery photos
Agent 5 β Reconciliation: invoice qty vs. counted qty β shortage flags
Agent 6 β MiniCPM5-1B generates rupee savings report + action items
β
βΉ TOTAL LEAKAGE DETECTED: βΉ858
```
---
## Six-Agent Pipeline
```
ββββββββββββββββββββββββββββββββββββ
β Agent 1 β Invoice Extractor β MiniCPM-V 4.6 (merged, bfloat16)
β Invoice image/PDF β JSON β OCR + structured field extraction
ββββββββββββββββ¬ββββββββββββββββββββ
β
ββββββββββββββββββββββββββββββββββββ
β Agent 2 β Product Matcher β MiniCPM5-1B (GGUF Q4_K_M)
β Raw names β canonical SKU IDs β "MAGGI NDL" β Nestle Maggi 70g
ββββββββββββββββ¬ββββββββββββββββββββ
β
ββββββββββββββββββββββββββββββββββββ
β Agent 3 β Pricing Agent β Rule-based (SQLite history)
β Normalized invoice β price flagsβ Detects overcharges & GST errors
ββββββββββββββββ¬ββββββββββββββββββββ
β
ββββββββββββββββββββββββββββββββββββ
β Agent 4 β Visual Counter β YOLO26n (ONNX, 1,831 classes)
β Delivery photos β product countsβ mAP50 = 0.428 on merged dataset
ββββββββββββββββ¬ββββββββββββββββββββ
β
ββββββββββββββββββββββββββββββββββββ
β Agent 5 β Reconciliation Agent β Rule-based
β Invoice qty vs. counted qty β Shortage flags + βΉ loss
ββββββββββββββββ¬ββββββββββββββββββββ
β
ββββββββββββββββββββββββββββββββββββ
β Agent 6 β Savings Agent β MiniCPM5-1B (GGUF Q4_K_M)
β All flags β βΉ report + actions β "Call HUL rep. Request credit note."
ββββββββββββββββββββββββββββββββββββ
```
Every agent run is stored locally in SQLite for audit history.
---
## Fine-Tuned Models
All three models were trained from scratch on Modal A10G GPUs and published to HuggingFace. Total training cost: ~$5.80.
### Model 1 β MiniCPM-V 4.6 (Invoice Extractor)
| | |
|---|---|
| **Repo** | [`build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged`](https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged) |
| **Base** | openbmb/MiniCPM-V-4.6 |
| **Method** | QLoRA rank 16 (PEFT + bitsandbytes), then merged to full bfloat16 weights |
| **Data** | 500 synthetic Indian invoices β printed GST, Tally PDF, handwritten, WhatsApp |
| **Eval loss** | 0.212 (epoch 3 / 3) |
| **Training** | 51 min 50 sec on A10G Β· 87 steps Β· 9.5M trainable params (0.72%) |
| **Inference** | `transformers` AutoModel Β· `model.chat()` β no PEFT at runtime |
### Model 2 β MiniCPM5-1B (Product Normalizer)
| | |
|---|---|
| **Repo** | [`build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer`](https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer) |
| **Base** | openbmb/MiniCPM5-1B |
| **Method** | QLoRA rank 16 via Unsloth, exported to GGUF Q4_K_M |
| **Data** | 2,000 synthetic (raw_name β canonical_name) pairs Β· 200 Indian FMCG SKUs |
| **Training** | ~1 hour on A10G |
| **Inference** | llama-cpp-python Β· `create_chat_completion()` |
### Model 3 β YOLO26n (Product Detector)
| | |
|---|---|
| **Repo** | [`build-small-hackathon/yolo26n-indian-fmcg-detection`](https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection) |
| **Base** | Ultralytics YOLO26n |
| **Method** | Supervised fine-tuning on 3 merged Roboflow datasets |
| **Data** | ~11,400 images Β· 1,831 unified classes |
| **Metrics** | mAP50 = **0.428** Β· mAP50-95 = **0.302** Β· 100 epochs (A10G) |
| **Inference** | ONNX Runtime Β· CPU or GPU |
### Training Dataset
| | |
|---|---|
| **Repo** | [`build-small-hackathon/kirana-invoice-train-data`](https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data) |
| **Contents** | 500 synthetic invoice images (450 train / 50 eval) with structured JSON annotations |
---
## Running Locally
```bash
git clone https://github.com/naazimsnh02/kirana-detective.git
cd kirana-detective
pip install -r requirements.txt
python app.py
```
**First run:** downloads ~3 GB of model weights (cached after that).
**Requirements:** ~6 GB RAM Β· Python 3.11 Β· optional CUDA GPU for faster MiniCPM-V inference.
### Environment Variables
| Variable | Required | Purpose |
|---|---|---|
| `HF_TOKEN` | Optional | Faster downloads from HF Hub (avoids rate limits) |
---
## Re-Training the Models
All training scripts are in `finetune/`. Training is orchestrated on Modal.
```bash
export HF_TOKEN=
export ROBOFLOW_API_KEY=
modal run finetune/generate_invoices.py # ~10 min β generate 500 synthetic invoices
modal run finetune/train_minicpm_v.py # ~52 min β fine-tune invoice extractor
modal run finetune/export_minicpm_v_gguf.py # ~10 min β merge LoRA β push HF weights
modal run finetune/train_minicpm5_1b.py # ~1 hour β fine-tune product normalizer
modal run finetune/train_yolo26n.py # ~2 hours β fine-tune YOLO26n detector
```
Scripts publish to `naazimsnh02/` first; transfer to `build-small-hackathon/` manually after.
See [`finetune/README.md`](finetune/README.md) for the full workflow.
---
## Model Architecture
| Component | Model | Parameters | Runtime |
|---|---|---|---|
| Invoice OCR & extraction | MiniCPM-V 4.6 (merged) | 1.3B | transformers |
| Product normalisation + report | MiniCPM5-1B (GGUF Q4_K_M) | 1.08B | llama-cpp-python |
| Product detection & counting | YOLO26n (ONNX) | ~2.4M | onnxruntime |
| **Total** | β | **~2.38B** | β |
Comfortably within the **Tiny Titan** β€4B threshold. Zero cloud API calls β fully local inference.
---
## Hackathon Badges
| Badge | How |
|---|---|
| π― Well-Tuned | 3 custom fine-tuned models published on HF Hub |
| π Off the Grid | 100% local inference β MiniCPM-V (transformers) + MiniCPM5-1B (GGUF) + YOLO26n (ONNX) |
| π¦ Llama Champion | MiniCPM5-1B served via llama-cpp-python (GGUF Q4_K_M) |
| π¨ Off-Brand | Custom Gradio UI β rupee savings cards, colour-coded anomaly flags |
| π‘ Sharing is Caring | Claude Code build sessions (11 JSONL sessions) published as a public trace dataset |
| π Field Notes | *"How I built an AI auditor for India's 12 million kirana stores"* |
| ποΈ Tiny Titan | ~2.38B total parameters β OCR + normalization + counting + report |
---
## Sharing is Caring β Build Trace Dataset
The 11 raw Claude Code (Sonnet 4.6) JSONL sessions used to design, code, debug, and document this entire project β from blank repo to hackathon submission. Viewable in HF Data Studio's native agent trace viewer.
| Dataset | Contents | Format |
|---|---|---|
| [`build-small-hackathon/kirana-detective-build-traces`](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces) | 11 Claude Code build sessions Β· ~8.9 MB | Native JSONL trace viewer |
To upload build traces:
```bash
export HF_TOKEN=
python finetune/upload_build_traces.py
```
---
## Project Structure
```
kirana-detective/
βββ app.py # Gradio + FastAPI server
βββ pipeline.py # AuditOrchestrator (6-agent runner)
βββ models.py # Dataclasses: InvoiceJSON, LeakageReport, etc.
βββ catalog.py # FMCG product catalog + alias lookup
βββ storage.py # SQLite price history + audit log
βββ tracer.py # Agent trace logging β HF Hub
βββ agents/
β βββ invoice_extractor.py # Agent 1 β MiniCPM-V 4.6
β βββ product_matcher.py # Agent 2 β MiniCPM5-1B (alias + LLM)
β βββ pricing_agent.py # Agent 3 β rule-based price checks
β βββ visual_counter.py # Agent 4 β YOLO26n ONNX
β βββ reconciliation_agent.py # Agent 5 β invoice vs. photo reconciliation
β βββ savings_agent.py # Agent 6 β MiniCPM5-1B report generator
βββ finetune/
β βββ README.md # Training workflow guide
β βββ generate_invoices.py # Synthetic invoice generator
β βββ train_minicpm_v.py # Fine-tune MiniCPM-V
β βββ train_minicpm5_1b.py # Fine-tune MiniCPM5-1B
β βββ train_yolo26n.py # Fine-tune YOLO26n
β βββ export_minicpm_v_gguf.py# Merge LoRA β push HF weights
β βββ upload_build_traces.py # Upload Claude Code sessions β HF Hub
βββ data/
β βββ fmcg_catalog.json # 200 canonical SKU names + GST rates
βββ MODEL_CARD.md # Full training + evaluation documentation
```
---
## Links
- **Demo Video**: https://youtu.be/8TVZP4sfesI
- **HF Space**: [build-small-hackathon/kirana-detective](https://huggingface.co/spaces/build-small-hackathon/kirana-detective)
- **Training dataset**: [build-small-hackathon/kirana-invoice-train-data](https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data)
- **Invoice extractor**: [build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged](https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged)
- **Product normalizer**: [build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer](https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer)
- **Product detector**: [build-small-hackathon/yolo26n-indian-fmcg-detection](https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection)
- **Full model card**: [MODEL_CARD.md](MODEL_CARD.md)
- **Build sessions (Claude Code)**: [build-small-hackathon/kirana-detective-build-traces](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces)
---
## License
- **Code**: MIT
- **MiniCPM-V / MiniCPM5-1B**: Apache 2.0 (OpenBMB)
- **YOLO26n**: AGPL-3.0 (Ultralytics)
---
*HuggingFace Build Small Hackathon 2026 Β· Track 1: Backyard AI Β· [naazimsnh02](https://github.com/naazimsnh02)*