---
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
title: Kirana Detective
short_description: AI invoice auditor for Indian kirana stores
license: mit
tags:
- invoice-audit
- llm
- yolo
- gguf
- gradio
- indian-fmcg
- kirana
- minicpm
- multimodal
- backyard-ai
- local-first
- fine-tuned
- custom-ui
- llama.cpp
- open-trace
- blog-post
- openbmb
- modal.com
- track:backyard
- sponsor:openbmb
- sponsor:modal
- achievement:offgrid
- achievement:welltuned
- achievement:offbrand
- achievement:llama
- achievement:sharing
- achievement:fieldnotes
colorFrom: purple
colorTo: pink
pinned: true
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/639548cc276ff8643fab34ac/p5tj50-FcFouVAlbUk8KJ.png
---

<div align="center">

# 🔍 Kirana Detective

### AI-Powered Invoice & Inventory Auditor for Indian Kirana Stores

*Find where money is being lost — in under 60 seconds*

[![Try It Live](https://img.shields.io/badge/🤗%20Try%20It-Live%20Demo-blue)](https://huggingface.co/spaces/build-small-hackathon/kirana-detective)
[![Watch Demo](https://img.shields.io/badge/▶%20Watch-Demo-red)](https://youtu.be/8TVZP4sfesI)
[![Blog Post](https://img.shields.io/badge/📝%20Blog-How%20I%20Built%20It-orange)](https://huggingface.co/blog/build-small-hackathon/kirana-detective)
[![X Post](https://img.shields.io/badge/𝕏-View%20Post-black)](https://x.com/naazimhussain02/status/2065966381657633161)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
[![Python 3.11](https://img.shields.io/badge/Python-3.11-blue.svg)](https://python.org)
[![Training: Modal](https://img.shields.io/badge/Training-Modal%20A10G-orange)](https://modal.com)
[![Models: OpenBMB](https://img.shields.io/badge/Models-OpenBMB%20MiniCPM-purple)](https://huggingface.co/openbmb)
[![Hackathon: Build Small 2026](https://img.shields.io/badge/Hackathon-HF%20Build%20Small%202026-yellow)](https://huggingface.co/build-small-hackathon)

</div>

---

## What It Does

Indian kirana store owners receive 3–5 distributor invoices every week via WhatsApp, printed bills, or Tally exports. Verifying them manually is impossible.

**Kirana Detective uploads an invoice + delivery photos and finds:**

| Finding | Example |
|---|---|
| Price overcharge | Surf Excel 1kg charged ₹255 — historical price ₹220 (+15.9%) |
| Delivery shortage | Invoice says 24 Coke bottles — photo shows 20 |
| Duplicate charge | Parle-G 80g appears twice on the same invoice |
| GST mismatch | Aashirvaad Atta billed at 12% instead of 5% |

Every finding converts to a **rupee leakage number** with an actionable follow-up step.

---

## Demo

```
Upload Invoice (photo/PDF/WhatsApp) + Delivery Photos (up to 5)
        ↓
Agent 1 — MiniCPM-V 4.6 extracts structured JSON from invoice image
Agent 2 — MiniCPM5-1B normalises "SURF XL 1K" → "Surf Excel Washing Powder 1kg"
Agent 3 — Rule engine checks price vs. stored invoice history
Agent 4 — YOLO26n counts products in delivery photos
Agent 5 — Reconciliation: invoice qty vs. counted qty → shortage flags
Agent 6 — MiniCPM5-1B generates rupee savings report + action items
        ↓
₹ TOTAL LEAKAGE DETECTED: ₹858
```

---

## Six-Agent Pipeline

```
┌──────────────────────────────────┐
│  Agent 1 — Invoice Extractor     │  MiniCPM-V 4.6 (merged, bfloat16)
│  Invoice image/PDF → JSON        │  OCR + structured field extraction
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 2 — Product Matcher       │  MiniCPM5-1B (GGUF Q4_K_M)
│  Raw names → canonical SKU IDs   │  "MAGGI NDL" → Nestle Maggi 70g
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 3 — Pricing Agent         │  Rule-based (SQLite history)
│  Normalized invoice → price flags│  Detects overcharges & GST errors
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 4 — Visual Counter        │  YOLO26n (ONNX, 1,831 classes)
│  Delivery photos → product counts│  mAP50 = 0.428 on merged dataset
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 5 — Reconciliation Agent  │  Rule-based
│  Invoice qty vs. counted qty     │  Shortage flags + ₹ loss
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 6 — Savings Agent         │  MiniCPM5-1B (GGUF Q4_K_M)
│  All flags → ₹ report + actions  │  "Call HUL rep. Request credit note."
└──────────────────────────────────┘
```

Every agent run is stored locally in SQLite for audit history.

---

## Fine-Tuned Models

All three models were trained from scratch on Modal A10G GPUs and published to HuggingFace. Total training cost: ~$5.80.

### Model 1 — MiniCPM-V 4.6 (Invoice Extractor)

| | |
|---|---|
| **Repo** | [`build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged`](https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged) |
| **Base** | openbmb/MiniCPM-V-4.6 |
| **Method** | QLoRA rank 16 (PEFT + bitsandbytes), then merged to full bfloat16 weights |
| **Data** | 500 synthetic Indian invoices — printed GST, Tally PDF, handwritten, WhatsApp |
| **Eval loss** | 0.212 (epoch 3 / 3) |
| **Training** | 51 min 50 sec on A10G · 87 steps · 9.5M trainable params (0.72%) |
| **Inference** | `transformers` AutoModel · `model.chat()` — no PEFT at runtime |

### Model 2 — MiniCPM5-1B (Product Normalizer)

| | |
|---|---|
| **Repo** | [`build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer`](https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer) |
| **Base** | openbmb/MiniCPM5-1B |
| **Method** | QLoRA rank 16 via Unsloth, exported to GGUF Q4_K_M |
| **Data** | 2,000 synthetic (raw_name → canonical_name) pairs · 200 Indian FMCG SKUs |
| **Training** | ~1 hour on A10G |
| **Inference** | llama-cpp-python · `create_chat_completion()` |

### Model 3 — YOLO26n (Product Detector)

| | |
|---|---|
| **Repo** | [`build-small-hackathon/yolo26n-indian-fmcg-detection`](https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection) |
| **Base** | Ultralytics YOLO26n |
| **Method** | Supervised fine-tuning on 3 merged Roboflow datasets |
| **Data** | ~11,400 images · 1,831 unified classes |
| **Metrics** | mAP50 = **0.428** · mAP50-95 = **0.302** · 100 epochs (A10G) |
| **Inference** | ONNX Runtime · CPU or GPU |

### Training Dataset

| | |
|---|---|
| **Repo** | [`build-small-hackathon/kirana-invoice-train-data`](https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data) |
| **Contents** | 500 synthetic invoice images (450 train / 50 eval) with structured JSON annotations |

---

## Running Locally

```bash
git clone https://github.com/naazimsnh02/kirana-detective.git
cd kirana-detective
pip install -r requirements.txt
python app.py
```

**First run:** downloads ~3 GB of model weights (cached after that).  
**Requirements:** ~6 GB RAM · Python 3.11 · optional CUDA GPU for faster MiniCPM-V inference.

### Environment Variables

| Variable | Required | Purpose |
|---|---|---|
| `HF_TOKEN` | Optional | Faster downloads from HF Hub (avoids rate limits) |

---

## Re-Training the Models

All training scripts are in `finetune/`. Training is orchestrated on Modal.

```bash
export HF_TOKEN=<your-token>
export ROBOFLOW_API_KEY=<your-key>

modal run finetune/generate_invoices.py        # ~10 min — generate 500 synthetic invoices
modal run finetune/train_minicpm_v.py          # ~52 min — fine-tune invoice extractor
modal run finetune/export_minicpm_v_gguf.py   # ~10 min — merge LoRA → push HF weights
modal run finetune/train_minicpm5_1b.py        # ~1 hour — fine-tune product normalizer
modal run finetune/train_yolo26n.py            # ~2 hours — fine-tune YOLO26n detector
```

Scripts publish to `naazimsnh02/` first; transfer to `build-small-hackathon/` manually after.  
See [`finetune/README.md`](finetune/README.md) for the full workflow.

---

## Model Architecture

| Component | Model | Parameters | Runtime |
|---|---|---|---|
| Invoice OCR & extraction | MiniCPM-V 4.6 (merged) | 1.3B | transformers |
| Product normalisation + report | MiniCPM5-1B (GGUF Q4_K_M) | 1.08B | llama-cpp-python |
| Product detection & counting | YOLO26n (ONNX) | ~2.4M | onnxruntime |
| **Total** | — | **~2.38B** | — |

Comfortably within the **Tiny Titan** ≤4B threshold. Zero cloud API calls — fully local inference.

---

## Hackathon Badges

| Badge | How |
|---|---|
| 🎯 Well-Tuned | 3 custom fine-tuned models published on HF Hub |
| 🔌 Off the Grid | 100% local inference — MiniCPM-V (transformers) + MiniCPM5-1B (GGUF) + YOLO26n (ONNX) |
| 🦙 Llama Champion | MiniCPM5-1B served via llama-cpp-python (GGUF Q4_K_M) |
| 🎨 Off-Brand | Custom Gradio UI — rupee savings cards, colour-coded anomaly flags |
| 📡 Sharing is Caring | Claude Code build sessions (11 JSONL sessions) published as a public trace dataset |
| 📓 Field Notes | *"How I built an AI auditor for India's 12 million kirana stores"* |
| 🏋️ Tiny Titan | ~2.38B total parameters — OCR + normalization + counting + report |

---

## Sharing is Caring — Build Trace Dataset

The 11 raw Claude Code (Sonnet 4.6) JSONL sessions used to design, code, debug, and document this entire project — from blank repo to hackathon submission. Viewable in HF Data Studio's native agent trace viewer.

| Dataset | Contents | Format |
|---|---|---|
| [`build-small-hackathon/kirana-detective-build-traces`](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces) | 11 Claude Code build sessions · ~8.9 MB | Native JSONL trace viewer |

To upload build traces:

```bash
export HF_TOKEN=<your-token>
python finetune/upload_build_traces.py
```

---

## Project Structure

```
kirana-detective/
├── app.py                      # Gradio + FastAPI server
├── pipeline.py                 # AuditOrchestrator (6-agent runner)
├── models.py                   # Dataclasses: InvoiceJSON, LeakageReport, etc.
├── catalog.py                  # FMCG product catalog + alias lookup
├── storage.py                  # SQLite price history + audit log
├── tracer.py                   # Agent trace logging → HF Hub
├── agents/
│   ├── invoice_extractor.py    # Agent 1 — MiniCPM-V 4.6
│   ├── product_matcher.py      # Agent 2 — MiniCPM5-1B (alias + LLM)
│   ├── pricing_agent.py        # Agent 3 — rule-based price checks
│   ├── visual_counter.py       # Agent 4 — YOLO26n ONNX
│   ├── reconciliation_agent.py # Agent 5 — invoice vs. photo reconciliation
│   └── savings_agent.py        # Agent 6 — MiniCPM5-1B report generator
├── finetune/
│   ├── README.md               # Training workflow guide
│   ├── generate_invoices.py    # Synthetic invoice generator
│   ├── train_minicpm_v.py      # Fine-tune MiniCPM-V
│   ├── train_minicpm5_1b.py    # Fine-tune MiniCPM5-1B
│   ├── train_yolo26n.py        # Fine-tune YOLO26n
│   ├── export_minicpm_v_gguf.py# Merge LoRA → push HF weights
│   └── upload_build_traces.py  # Upload Claude Code sessions → HF Hub
├── data/
│   └── fmcg_catalog.json       # 200 canonical SKU names + GST rates
└── MODEL_CARD.md               # Full training + evaluation documentation
```

---

## Links

- **Demo Video**: https://youtu.be/8TVZP4sfesI
- **HF Space**: [build-small-hackathon/kirana-detective](https://huggingface.co/spaces/build-small-hackathon/kirana-detective)
- **Training dataset**: [build-small-hackathon/kirana-invoice-train-data](https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data)
- **Invoice extractor**: [build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged](https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged)
- **Product normalizer**: [build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer](https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer)
- **Product detector**: [build-small-hackathon/yolo26n-indian-fmcg-detection](https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection)
- **Full model card**: [MODEL_CARD.md](MODEL_CARD.md)
- **Build sessions (Claude Code)**: [build-small-hackathon/kirana-detective-build-traces](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces)

---

## License

- **Code**: MIT
- **MiniCPM-V / MiniCPM5-1B**: Apache 2.0 (OpenBMB)
- **YOLO26n**: AGPL-3.0 (Ultralytics)

---

*HuggingFace Build Small Hackathon 2026 · Track 1: Backyard AI · [naazimsnh02](https://github.com/naazimsnh02)*