| --- |
| sdk: gradio |
| sdk_version: 6.16.0 |
| app_file: app.py |
| title: Kirana Detective AI |
| short_description: AI invoice auditor for Indian kirana stores |
| license: mit |
| tags: |
| - invoice-audit |
| - llm |
| - yolo |
| - gguf |
| - gradio |
| - indian-fmcg |
| - kirana |
| - minicpm |
| - multimodal |
| - backyard-ai |
| - local-first |
| - fine-tuned |
| - custom-ui |
| - llama.cpp |
| - open-trace |
| - blog-post |
| - openbmb |
| - modal.com |
| --- |
| |
| <div align="center"> |
|
|
| # π Kirana Detective AI |
|
|
| ### AI-Powered Invoice & Inventory Auditor for Indian Kirana Stores |
|
|
| *Find where money is being lost β in under 60 seconds* |
|
|
| [](https://huggingface.co/spaces/build-small-hackathon/kirana-detective) |
| [](LICENSE) |
| [](https://python.org) |
| [](https://modal.com) |
| [](https://huggingface.co/openbmb) |
| [](https://huggingface.co/build-small-hackathon) |
|
|
| </div> |
|
|
| --- |
|
|
| ## What It Does |
|
|
| Indian kirana store owners receive 3β5 distributor invoices every week via WhatsApp, printed bills, or Tally exports. Verifying them manually is impossible. |
|
|
| **Kirana Detective uploads an invoice + delivery photos and finds:** |
|
|
| | Finding | Example | |
| |---|---| |
| | Price overcharge | Surf Excel 1kg charged βΉ255 β historical price βΉ220 (+15.9%) | |
| | Delivery shortage | Invoice says 24 Coke bottles β photo shows 20 | |
| | Duplicate charge | Parle-G 80g appears twice on the same invoice | |
| | GST mismatch | Aashirvaad Atta billed at 12% instead of 5% | |
|
|
| Every finding converts to a **rupee leakage number** with an actionable follow-up step. |
|
|
| --- |
|
|
| ## Demo |
|
|
| ``` |
| Upload Invoice (photo/PDF/WhatsApp) + Delivery Photos (up to 5) |
| β |
| Agent 1 β MiniCPM-V 4.6 extracts structured JSON from invoice image |
| Agent 2 β MiniCPM5-1B normalises "SURF XL 1K" β "Surf Excel Washing Powder 1kg" |
| Agent 3 β Rule engine checks price vs. stored invoice history |
| Agent 4 β YOLO26n counts products in delivery photos |
| Agent 5 β Reconciliation: invoice qty vs. counted qty β shortage flags |
| Agent 6 β MiniCPM5-1B generates rupee savings report + action items |
| β |
| βΉ TOTAL LEAKAGE DETECTED: βΉ858 |
| ``` |
|
|
| --- |
|
|
| ## Six-Agent Pipeline |
|
|
| ``` |
| ββββββββββββββββββββββββββββββββββββ |
| β Agent 1 β Invoice Extractor β MiniCPM-V 4.6 (merged, bfloat16) |
| β Invoice image/PDF β JSON β OCR + structured field extraction |
| ββββββββββββββββ¬ββββββββββββββββββββ |
| β |
| ββββββββββββββββββββββββββββββββββββ |
| β Agent 2 β Product Matcher β MiniCPM5-1B (GGUF Q4_K_M) |
| β Raw names β canonical SKU IDs β "MAGGI NDL" β Nestle Maggi 70g |
| ββββββββββββββββ¬ββββββββββββββββββββ |
| β |
| ββββββββββββββββββββββββββββββββββββ |
| β Agent 3 β Pricing Agent β Rule-based (SQLite history) |
| β Normalized invoice β price flagsβ Detects overcharges & GST errors |
| ββββββββββββββββ¬ββββββββββββββββββββ |
| β |
| ββββββββββββββββββββββββββββββββββββ |
| β Agent 4 β Visual Counter β YOLO26n (ONNX, 1,831 classes) |
| β Delivery photos β product countsβ mAP50 = 0.428 on merged dataset |
| ββββββββββββββββ¬ββββββββββββββββββββ |
| β |
| ββββββββββββββββββββββββββββββββββββ |
| β Agent 5 β Reconciliation Agent β Rule-based |
| β Invoice qty vs. counted qty β Shortage flags + βΉ loss |
| ββββββββββββββββ¬ββββββββββββββββββββ |
| β |
| ββββββββββββββββββββββββββββββββββββ |
| β Agent 6 β Savings Agent β MiniCPM5-1B (GGUF Q4_K_M) |
| β All flags β βΉ report + actions β "Call HUL rep. Request credit note." |
| ββββββββββββββββββββββββββββββββββββ |
| ``` |
|
|
| Every agent run is traced and logged for the **Sharing is Caring** badge. |
|
|
| --- |
|
|
| ## Fine-Tuned Models |
|
|
| All three models were trained from scratch on Modal A10G GPUs and published to HuggingFace. Total training cost: ~$5.80. |
|
|
| ### Model 1 β MiniCPM-V 4.6 (Invoice Extractor) |
|
|
| | | | |
| |---|---| |
| | **Repo** | [`build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged`](https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged) | |
| | **Base** | openbmb/MiniCPM-V-4.6 | |
| | **Method** | QLoRA rank 16 (PEFT + bitsandbytes), then merged to full bfloat16 weights | |
| | **Data** | 500 synthetic Indian invoices β printed GST, Tally PDF, handwritten, WhatsApp | |
| | **Eval loss** | 0.212 (epoch 3 / 3) | |
| | **Training** | 51 min 50 sec on A10G Β· 87 steps Β· 9.5M trainable params (0.72%) | |
| | **Inference** | `transformers` AutoModel Β· `model.chat()` β no PEFT at runtime | |
|
|
| ### Model 2 β MiniCPM5-1B (Product Normalizer) |
|
|
| | | | |
| |---|---| |
| | **Repo** | [`build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer`](https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer) | |
| | **Base** | openbmb/MiniCPM5-1B | |
| | **Method** | QLoRA rank 16 via Unsloth, exported to GGUF Q4_K_M | |
| | **Data** | 2,000 synthetic (raw_name β canonical_name) pairs Β· 200 Indian FMCG SKUs | |
| | **Training** | ~1 hour on A10G | |
| | **Inference** | llama-cpp-python Β· `create_chat_completion()` | |
|
|
| ### Model 3 β YOLO26n (Product Detector) |
|
|
| | | | |
| |---|---| |
| | **Repo** | [`build-small-hackathon/yolo26n-indian-fmcg-detection`](https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection) | |
| | **Base** | Ultralytics YOLO26n | |
| | **Method** | Supervised fine-tuning on 3 merged Roboflow datasets | |
| | **Data** | ~11,400 images Β· 1,831 unified classes | |
| | **Metrics** | mAP50 = **0.428** Β· mAP50-95 = **0.302** Β· 100 epochs (A10G) | |
| | **Inference** | ONNX Runtime Β· CPU or GPU | |
|
|
| ### Training Dataset |
|
|
| | | | |
| |---|---| |
| | **Repo** | [`build-small-hackathon/kirana-invoice-train-data`](https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data) | |
| | **Contents** | 500 synthetic invoice images (450 train / 50 eval) with structured JSON annotations | |
|
|
| --- |
|
|
| ## Running Locally |
|
|
| ```bash |
| git clone https://github.com/naazimsnh02/kirana-detective.git |
| cd kirana-detective |
| pip install -r requirements.txt |
| python app.py |
| ``` |
|
|
| **First run:** downloads ~3 GB of model weights (cached after that). |
| **Requirements:** ~6 GB RAM Β· Python 3.11 Β· optional CUDA GPU for faster MiniCPM-V inference. |
|
|
| ### Environment Variables |
|
|
| | Variable | Required | Purpose | |
| |---|---|---| |
| | `HF_TOKEN` | Optional | Faster downloads from HF Hub (avoids rate limits) | |
|
|
| --- |
|
|
| ## Re-Training the Models |
|
|
| All training scripts are in `finetune/`. Training is orchestrated on Modal. |
|
|
| ```bash |
| export HF_TOKEN=<your-token> |
| export ROBOFLOW_API_KEY=<your-key> |
| |
| modal run finetune/generate_invoices.py # ~10 min β generate 500 synthetic invoices |
| modal run finetune/train_minicpm_v.py # ~52 min β fine-tune invoice extractor |
| modal run finetune/export_minicpm_v_gguf.py # ~10 min β merge LoRA β push HF weights |
| modal run finetune/train_minicpm5_1b.py # ~1 hour β fine-tune product normalizer |
| modal run finetune/train_yolo26n.py # ~2 hours β fine-tune YOLO26n detector |
| ``` |
|
|
| Scripts publish to `naazimsnh02/` first; transfer to `build-small-hackathon/` manually after. |
| See [`finetune/README.md`](finetune/README.md) for the full workflow. |
|
|
| --- |
|
|
| ## Model Architecture |
|
|
| | Component | Model | Parameters | Runtime | |
| |---|---|---|---| |
| | Invoice OCR & extraction | MiniCPM-V 4.6 (merged) | 1.3B | transformers | |
| | Product normalisation + report | MiniCPM5-1B (GGUF Q4_K_M) | 1.08B | llama-cpp-python | |
| | Product detection & counting | YOLO26n (ONNX) | ~2.4M | onnxruntime | |
| | **Total** | β | **~2.38B** | β | |
|
|
| Comfortably within the **Tiny Titan** β€4B threshold. Zero cloud API calls β fully local inference. |
|
|
| --- |
|
|
| ## Hackathon Badges |
|
|
| | Badge | How | |
| |---|---| |
| | π― Well-Tuned | 3 custom fine-tuned models published on HF Hub | |
| | π Off the Grid | 100% local inference β MiniCPM-V (transformers) + MiniCPM5-1B (GGUF) + YOLO26n (ONNX) | |
| | π¦ Llama Champion | MiniCPM5-1B served via llama-cpp-python (GGUF Q4_K_M) | |
| | π¨ Off-Brand | Custom Gradio UI β rupee savings cards, colour-coded anomaly flags | |
| | π‘ Sharing is Caring | Two public trace datasets: runtime audit traces per run + full Claude Code build sessions | |
| | π Field Notes | *"How I built an AI auditor for India's 12 million kirana stores"* | |
| | ποΈ Tiny Titan | ~2.38B total parameters β OCR + normalization + counting + report | |
|
|
| --- |
|
|
| ## Sharing is Caring β Trace Datasets |
|
|
| Two public datasets document everything that happened in this project: |
|
|
| ### 1. Runtime Audit Traces β `build-small-hackathon/kirana-detective-traces` |
|
|
| Published automatically after every invoice audit run by [`tracer.py`](tracer.py). Each file records one complete pipeline execution β all six agents, inputs, outputs, and timings. |
|
|
| ### 2. Claude Code Build Sessions β `build-small-hackathon/kirana-detective-build-traces` |
|
|
| The 11 raw Claude Code (Sonnet 4.6) JSONL sessions used to design, code, debug, and document this entire project β from blank repo to hackathon submission. Viewable in HF Data Studio's native agent trace viewer. |
|
|
| | Dataset | Contents | Format | |
| |---|---|---| |
| | [`build-small-hackathon/kirana-detective-traces`](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-traces) | Per-audit runtime traces (6 agents per run) | JSONL, auto-published by app | |
| | [`build-small-hackathon/kirana-detective-build-traces`](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces) | 11 Claude Code build sessions Β· ~8.9 MB | Native JSONL trace viewer | |
|
|
| To upload build traces: |
|
|
| ```bash |
| export HF_TOKEN=<your-token> |
| python finetune/upload_build_traces.py |
| ``` |
|
|
| --- |
|
|
| ## Project Structure |
|
|
| ``` |
| kirana-detective/ |
| βββ app.py # Gradio + FastAPI server |
| βββ pipeline.py # AuditOrchestrator (6-agent runner) |
| βββ models.py # Dataclasses: InvoiceJSON, LeakageReport, etc. |
| βββ catalog.py # FMCG product catalog + alias lookup |
| βββ storage.py # SQLite price history + audit log |
| βββ tracer.py # Agent trace logging β HF Hub |
| βββ agents/ |
| β βββ invoice_extractor.py # Agent 1 β MiniCPM-V 4.6 |
| β βββ product_matcher.py # Agent 2 β MiniCPM5-1B (alias + LLM) |
| β βββ pricing_agent.py # Agent 3 β rule-based price checks |
| β βββ visual_counter.py # Agent 4 β YOLO26n ONNX |
| β βββ reconciliation_agent.py # Agent 5 β invoice vs. photo reconciliation |
| β βββ savings_agent.py # Agent 6 β MiniCPM5-1B report generator |
| βββ finetune/ |
| β βββ README.md # Training workflow guide |
| β βββ generate_invoices.py # Synthetic invoice generator |
| β βββ train_minicpm_v.py # Fine-tune MiniCPM-V |
| β βββ train_minicpm5_1b.py # Fine-tune MiniCPM5-1B |
| β βββ train_yolo26n.py # Fine-tune YOLO26n |
| β βββ export_minicpm_v_gguf.py# Merge LoRA β push HF weights |
| β βββ upload_build_traces.py # Upload Claude Code sessions β HF Hub |
| βββ data/ |
| β βββ fmcg_catalog.json # 200 canonical SKU names + GST rates |
| βββ MODEL_CARD.md # Full training + evaluation documentation |
| ``` |
|
|
| --- |
|
|
| ## Links |
|
|
| - **HF Space**: [build-small-hackathon/kirana-detective](https://huggingface.co/spaces/build-small-hackathon/kirana-detective) |
| - **Training dataset**: [build-small-hackathon/kirana-invoice-train-data](https://huggingface.co/datasets/build-small-hackathon/kirana-invoice-train-data) |
| - **Invoice extractor**: [build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged](https://huggingface.co/build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged) |
| - **Product normalizer**: [build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer](https://huggingface.co/build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer) |
| - **Product detector**: [build-small-hackathon/yolo26n-indian-fmcg-detection](https://huggingface.co/build-small-hackathon/yolo26n-indian-fmcg-detection) |
| - **Full model card**: [MODEL_CARD.md](MODEL_CARD.md) |
| - **Runtime audit traces**: [build-small-hackathon/kirana-detective-traces](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-traces) |
| - **Build sessions (Claude Code)**: [build-small-hackathon/kirana-detective-build-traces](https://huggingface.co/datasets/build-small-hackathon/kirana-detective-build-traces) |
| - **PRD**: [docs/kirana-detective-prd.md](docs/kirana-detective-prd.md) |
|
|
| --- |
|
|
| ## License |
|
|
| - **Code**: MIT |
| - **MiniCPM-V / MiniCPM5-1B**: Apache 2.0 (OpenBMB) |
| - **YOLO26n**: AGPL-3.0 (Ultralytics) |
|
|
| --- |
|
|
| *HuggingFace Build Small Hackathon 2026 Β· Track 1: Backyard AI Β· [naazimsnh02](https://github.com/naazimsnh02)* |
|
|