Kirana Detective AI

AI-Powered Inventory & Invoice Auditor for Indian Kirana Stores

| Field | Value |
| --- | --- |
| Version | MVP v1.0 |
| Hackathon | Hugging Face Build Small Hackathon 2026 |
| Track | Track 1: Backyard AI |
| Deadline | June 15, 2026 |

Executive Summary

Kirana Detective AI helps Indian kirana store owners detect profit leakage by automatically auditing invoices, validating deliveries, identifying pricing anomalies, and comparing invoice quantities against actual products visible in shelf or carton photos.

The system acts as an AI-powered business auditor that helps small retailers identify billing errors, missing products, supplier discrepancies, and inventory issues that would otherwise go unnoticed.

Unlike generic AI assistants, Kirana Detective solves a highly specific problem for a clearly defined user group and produces measurable financial value: a rupee savings number that is concrete, judge-friendly, and immediately relatable to any Indian evaluator.


Problem Statement

India has approximately 12 million kirana stores. Most operate with:

  • Printed invoices from distributors
  • WhatsApp invoice screenshots
  • Manual or no delivery verification
  • Informal bookkeeping

Common Loss Sources

| Issue | Example |
| --- | --- |
| Supplier overcharging | Charged ₹255 for Surf Excel, should be ₹220 |
| Missing delivery items | Invoice says 50 Coke bottles, 46 delivered |
| Incorrect GST applied | Aashirvaad Atta at 12% instead of 5% |
| Duplicate invoice lines | Same product charged twice |
| Unclaimed distributor discounts | Buy-10-get-1 offer never applied |
| Dead inventory | Corn Flakes unsold for 75 days |

Each mistake is small, but monthly losses accumulate to ₹2,000–₹20,000 per store.

Store owners rarely have time to manually audit invoices and deliveries. Kirana Detective becomes their AI auditor.


Vision

"Find where money is being lost."

The goal is not accounting. The goal is detecting profit leakage and converting every finding into a rupee value.


Primary User

Ravi, Kirana Store Owner, Chennai

  • Runs a neighbourhood provision store
  • Receives 3–5 distributor invoices per week
  • Gets most invoices via WhatsApp
  • Uses an Android phone
  • Low technical skill: needs a tap-and-see interface
  • Loses approximately ₹3,000–₹8,000/month to undetected billing errors

Success Metrics

| Metric | Target |
| --- | --- |
| Detected savings per audit | ≥ ₹500 shown to user |
| Invoice audit time | < 60 seconds |
| Delivery verification accuracy | ≥ 80% on carton photos |
| "Actually used it" proof | Demo video with real kirana owner |

MVP Scope (Must-Build for Hackathon)

Focus ruthlessly on this single killer workflow:

Invoice Upload → Delivery Photo Upload → Missing Product Detection → ₹ Savings Report

Must Have

  • ✅ Invoice image / PDF upload
  • ✅ Invoice OCR and structured extraction
  • ✅ Product name normalization
  • ✅ Price anomaly detection vs. historical invoices
  • ✅ Delivery photo upload and product counting (YOLO26n)
  • ✅ Invoice vs. delivery reconciliation
  • ✅ Profit leakage dashboard with ₹ savings total
  • ✅ Agent trace logging (for Sharing is Caring badge)
  • ✅ Custom Gradio UI (not default theme)

Deferred to Future

  • Expiry date detection
  • Dead stock / slow-moving inventory alerts
  • Supplier trust score
  • Supplier negotiation insights
  • Multi-store analytics
  • WhatsApp bot integration
  • Demand forecasting

Core Features (MVP)

Feature 1: Invoice Understanding

Input: Invoice image (photo, PDF, WhatsApp screenshot)

Model: MiniCPM-V 4.6 (fine-tuned on Indian invoice formats)

AI Tasks:

  • OCR extraction of all invoice fields
  • Handling mixed English + Tamil/Hindi/Telugu text
  • Parsing Tally printouts, handwritten bills, GST invoices

Output: Structured Invoice JSON

{
  "invoice_number": "INV-2024-8821",
  "supplier": "Hindustan Unilever Ltd",
  "date": "2026-06-08",
  "items": [
    {
      "product_raw": "SURF EXCEL 1KG",
      "product_normalized": "Surf Excel Washing Powder 1kg",
      "quantity": 10,
      "unit_price": 255.00,
      "gst_rate": 18,
      "line_total": 2550.00
    }
  ],
  "grand_total": 2550.00
}
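
Because the extracted JSON carries its own arithmetic, a cheap sanity check can catch OCR misreads before any auditing starts. The sketch below is an assumption about how such a check might look, not part of the described pipeline: `check_invoice_totals` and `sample_invoice` are hypothetical names, and totals are compared before tax, as in the sample above.

```python
from decimal import Decimal


def check_invoice_totals(invoice, tolerance=Decimal("0.01")):
    """Return a list of arithmetic problems found in an extracted invoice dict."""
    problems = []
    lines_sum = Decimal("0")
    for item in invoice["items"]:
        # Decimal(str(...)) avoids binary floating-point drift on rupee amounts.
        expected = Decimal(str(item["quantity"])) * Decimal(str(item["unit_price"]))
        actual = Decimal(str(item["line_total"]))
        if abs(expected - actual) > tolerance:
            problems.append(
                f"{item['product_normalized']}: line_total {actual}, expected {expected}"
            )
        lines_sum += actual
    grand_total = Decimal(str(invoice["grand_total"]))
    if abs(lines_sum - grand_total) > tolerance:
        problems.append(f"grand_total {grand_total}, but lines sum to {lines_sum}")
    return problems


# Same figures as the sample invoice JSON above (fields trimmed for brevity).
sample_invoice = {
    "invoice_number": "INV-2024-8821",
    "items": [
        {
            "product_normalized": "Surf Excel Washing Powder 1kg",
            "quantity": 10,
            "unit_price": 255.00,
            "line_total": 2550.00,
        }
    ],
    "grand_total": 2550.00,
}

print(check_invoice_totals(sample_invoice))  # prints []
```

An empty list means the line totals and grand total agree; any entry is a reason to re-run extraction or ask the user to confirm the figure.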

Feature 2: Product Name Normalization

Problem: Distributor invoices use inconsistent product names.

| Invoice Text | Normalized Name |
| --- | --- |
| MAGGI 70GM | Nestle Maggi Masala Noodles 70g |
| MAGGI NDL | Nestle Maggi Masala Noodles 70g |
| SURF XL 1K | Surf Excel Washing Powder 1kg |
| PARLE G 80 | Parle-G Biscuit 80g |
| COLGAT 100G | Colgate Strong Teeth Toothpaste 100g |

Model: Fine-tuned MiniCPM5-1B on Indian FMCG SKU normalization dataset

Output: Consistent product catalog entries that allow historical price comparisons across different invoices from the same supplier.
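
To illustrate why normalization matters for price history, here is a minimal sketch in which a plain dictionary stands in for the fine-tuned model. The mapping comes from the table above; the function names and the unit prices are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Stand-in for the fine-tuned normalizer: mappings copied from the table above.
NORMALIZED = {
    "MAGGI 70GM": "Nestle Maggi Masala Noodles 70g",
    "MAGGI NDL": "Nestle Maggi Masala Noodles 70g",
    "SURF XL 1K": "Surf Excel Washing Powder 1kg",
    "PARLE G 80": "Parle-G Biscuit 80g",
    "COLGAT 100G": "Colgate Strong Teeth Toothpaste 100g",
}


def normalize(raw_name):
    """Map a raw invoice string to its catalog name; unknown names pass through."""
    key = " ".join(raw_name.upper().split())
    return NORMALIZED.get(key, raw_name)


def build_price_history(invoice_lines):
    """Group unit prices by normalized product so invoices become comparable."""
    history = defaultdict(list)
    for raw_name, unit_price in invoice_lines:
        history[normalize(raw_name)].append(unit_price)
    return dict(history)


# Hypothetical prices: two spellings of the same product land in one history.
lines = [("MAGGI 70GM", 12.0), ("maggi ndl", 12.5), ("SURF XL 1K", 220.0)]
history = build_price_history(lines)
```

Without the normalization step, "MAGGI 70GM" and "MAGGI NDL" would each have a one-entry history and no price comparison would be possible.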


Feature 3: Pricing Anomaly Detection

Logic: Rule-based comparison against stored historical invoice data.

Example:

Product: Surf Excel Washing Powder 1kg

Historical price (last 3 invoices):
  ₹220 | ₹220 | ₹222

Current invoice price: ₹255

⚠ Price increase detected: +15.9% (vs. usual price of ₹220)
Estimated excess charge (10 units, vs. highest recent price of ₹222): ₹330

No ML is needed here: arithmetic plus a historical lookup is sufficient and more trustworthy than a model for financial comparisons.
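
The example figures are consistent with a percentage measured against the usual price (₹220) and an excess measured against the highest recent price (₹222). The sketch below assumes that reading; the function name, the use of the median as the "usual" price, and the 5% threshold are illustrative assumptions rather than the project's confirmed rules.

```python
from statistics import median


def price_anomaly(history, current_price, quantity, threshold_pct=5.0):
    """Flag a price well above the usual level; return None when within threshold."""
    usual = median(history)  # assumption: median of past prices is the baseline
    increase_pct = (current_price - usual) / usual * 100
    if increase_pct <= threshold_pct:
        return None
    # Conservative excess: measured against the highest price already paid.
    excess = max((current_price - max(history)) * quantity, 0)
    return {
        "usual_price": usual,
        "increase_pct": round(increase_pct, 1),
        "excess_charge": excess,
    }


flag = price_anomaly([220, 220, 222], current_price=255, quantity=10)
print(flag["increase_pct"], flag["excess_charge"])  # prints 15.9 330
```

Measuring the excess against the highest historical price keeps the rupee claim defensible when the store owner raises it with the distributor.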


Feature 4: Duplicate Charge Detection

Logic: Rule-based scan of extracted invoice JSON.

Detects:

  • Same product appearing twice in one invoice
  • Same invoice number submitted twice across sessions
  • Repeated line items with identical product + qty + price

Output:

⚠ Duplicate detected: Parle-G 80g appears twice on this invoice.
Combined quantity: 40 units | Possible duplicate charge: ₹320
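
A minimal sketch of these rule-based checks follows. The function names are hypothetical, and the line values (20 units at ₹16 each) are assumptions picked to reproduce the 40-unit, ₹320 example above.

```python
from collections import defaultdict


def find_duplicate_lines(items):
    """Flag products that appear on more than one line of the same invoice."""
    groups = defaultdict(list)
    for item in items:
        groups[item["product_normalized"]].append(item)
    flags = []
    for product, lines in groups.items():
        if len(lines) > 1:
            flags.append({
                "product": product,
                "occurrences": len(lines),
                "combined_quantity": sum(line["quantity"] for line in lines),
                # Every line after the first is treated as the possible duplicate.
                "possible_duplicate_charge": sum(line["line_total"] for line in lines[1:]),
            })
    return flags


def is_resubmitted(invoice_number, seen_invoice_numbers):
    """True when this invoice number was already audited in an earlier session."""
    if invoice_number in seen_invoice_numbers:
        return True
    seen_invoice_numbers.add(invoice_number)
    return False


items = [
    {"product_normalized": "Parle-G Biscuit 80g", "quantity": 20, "unit_price": 16.0, "line_total": 320.0},
    {"product_normalized": "Surf Excel Washing Powder 1kg", "quantity": 10, "unit_price": 255.0, "line_total": 2550.0},
    {"product_normalized": "Parle-G Biscuit 80g", "quantity": 20, "unit_price": 16.0, "line_total": 320.0},
]
duplicates = find_duplicate_lines(items)
```

Grouping on the normalized name rather than the raw invoice text means "PARLE G 80" and a differently abbreviated second line are still caught as the same product.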

Feature 5: Delivery Verification (Visual Counting)

This is the centrepiece feature and the most visually impressive part of the demo.

Input: Invoice JSON (from Feature 1) + 1–5 delivery photos

Model: YOLO26n fine-tuned on Indian FMCG products (see Model Stack section)

Pipeline:

Delivery Photo
      ↓
YOLO26n (ONNX, local)
  → Detect bounding boxes
  → Count instances per product class
  → Output: {Coke 200ml: 20, Maggi 70g: 48}
      ↓
MiniCPM-V 4.6
  → Cross-verify with invoice context
  → Generate natural-language summary
      ↓
Reconciliation Agent
  → Invoice qty vs detected qty
  → Calculate ₹ shortage value

Example Output:

Invoice expects: Coke 200ml × 24
Detected in photo: 20 bottles

⚠ Shortage: 4 bottles
Estimated loss: ₹180

Important scope note: Multi-image counting (Feature 6 in the original PRD) is simplified. The user uploads up to 5 photos of the same delivery, the counts are aggregated, and the total is reconciled against the invoice. No complex carton-stacking estimation is attempted.
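
The aggregate-then-reconcile step can be sketched as follows. This assumes that each photo shows a different part of the delivery, so per-photo counts are summed; the function names are hypothetical, and the unit prices (₹45 and ₹14) and the invoiced Maggi quantity of 50 are implied by the document's shortage figures rather than stated.

```python
from collections import Counter


def aggregate_counts(per_photo_counts):
    """Sum detector counts across photos (assumes photos do not overlap)."""
    total = Counter()
    for counts in per_photo_counts:
        total.update(counts)
    return dict(total)


def reconcile(invoice_items, detected):
    """Compare invoiced quantities with detected counts and price each shortage."""
    shortages = []
    for item in invoice_items:
        missing = item["quantity"] - detected.get(item["product"], 0)
        if missing > 0:
            shortages.append({
                "product": item["product"],
                "missing": missing,
                "loss": missing * item["unit_price"],
            })
    return shortages


photos = [{"Coke 200ml": 12, "Maggi 70g": 48}, {"Coke 200ml": 8}]
invoice_items = [
    {"product": "Coke 200ml", "quantity": 24, "unit_price": 45},
    {"product": "Maggi 70g", "quantity": 50, "unit_price": 14},
]
detected = aggregate_counts(photos)
shortages = reconcile(invoice_items, detected)
```

If users photograph the same carton twice, summing will overcount and hide shortages, so the upload screen would need to ask for non-overlapping photos.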


Feature 6: Profit Leakage Dashboard

The "wow" output: every finding is converted to ₹.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  KIRANA DETECTIVE - AUDIT REPORT
  Supplier: HUL | Invoice: INV-8821
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  ⚠  Pricing Issues
     Surf Excel 1kg: +15.9% vs history ...... ₹330

  ⚠  Delivery Shortage
     Coke 200ml: 4 bottles missing ........... ₹180
     Maggi 70g: 2 packets missing ............. ₹28

  ⚠  Duplicate Charge
     Parle-G 80g: possible duplicate ........ ₹320

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  💰 TOTAL LEAKAGE DETECTED:    ₹858
  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  Actions:
  → Contact HUL rep about price increase
  → Request credit note for 4 Coke bottles
  → Verify Parle-G line item with distributor
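
The totals in the report are plain sums over the individual findings. A minimal sketch, using the figures from the report above and hypothetical function names:

```python
from collections import defaultdict


def summarize_leakage(findings):
    """Return per-category rupee totals and the overall leakage total."""
    by_category = defaultdict(int)
    for category, _detail, amount in findings:
        by_category[category] += amount
    return dict(by_category), sum(by_category.values())


def render_report(findings):
    """Render one line per finding followed by the total."""
    _by_category, total = summarize_leakage(findings)
    lines = [f"{detail} ... ₹{amount}" for _category, detail, amount in findings]
    lines.append(f"TOTAL LEAKAGE DETECTED: ₹{total}")
    return "\n".join(lines)


findings = [
    ("Pricing Issues", "Surf Excel 1kg: +15.9% vs history", 330),
    ("Delivery Shortage", "Coke 200ml: 4 bottles missing", 180),
    ("Delivery Shortage", "Maggi 70g: 2 packets missing", 28),
    ("Duplicate Charge", "Parle-G 80g: possible duplicate", 320),
]
by_category, total = summarize_leakage(findings)
print(render_report(findings).splitlines()[-1])  # prints TOTAL LEAKAGE DETECTED: ₹858
```

The per-category totals (₹330, ₹208, ₹320) are what the three summary cards on the results screen display.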

AI Agent Workflow

This multi-agent pipeline is explicitly designed for the Best Agent award.

┌─────────────────────────────────────────┐
│           USER UPLOADS                  │
│   Invoice Image + Delivery Photos       │
└──────────────┬──────────────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 1: Invoice Extraction     │
│  Model: MiniCPM-V 4.6 (ft)       │
│  Input:  Invoice image/PDF       │
│  Output: Structured invoice JSON │
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 2: Product Matching       │
│  Model: MiniCPM5-1B (ft)         │
│  Input:  Raw product names       │
│  Output: Normalized product IDs  │
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 3: Pricing Agent          │
│  Logic:  Rule-based              │
│  Input:  Normalized invoice      │
│  Output: Price anomaly flags     │
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 4: Visual Counting Agent  │
│  Model: YOLO26n-FMCG (ft)        │
│  Input:  Delivery photos         │
│  Output: {product: count} dict   │
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 5: Reconciliation Agent   │
│  Logic:  Rule-based              │
│  Input:  Invoice qty + Photo qty │
│  Output: Shortage flags + ₹ loss │
└──────────────┬───────────────────┘
               ↓
┌──────────────────────────────────┐
│  Agent 6: Savings Agent          │
│  Model: MiniCPM5-1B              │
│  Input:  All flags               │
│  Output: ₹ report + action items │
└──────────────────────────────────┘

The agent trace is logged and shared on the Hugging Face Hub, which qualifies for the Sharing is Caring badge.
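
A sequential pipeline with trace logging can be sketched in a few lines. Everything here is an assumption for illustration: the stage functions are stubs standing in for the real agents, the trace fields are hypothetical, and uploading the trace to the Hub is left out.

```python
import json
import time


def run_pipeline(stages, payload):
    """Run agents in order, recording one trace entry per stage."""
    trace = []
    for name, agent in stages:
        started = time.perf_counter()
        payload = agent(payload)
        trace.append({
            "agent": name,
            "seconds": round(time.perf_counter() - started, 4),
            "output_keys": sorted(payload),
        })
    return payload, trace


# Stub agents (hypothetical); each adds its result to the shared state dict.
def extract_invoice(state):
    return {**state, "invoice": {"items": []}}


def count_products(state):
    return {**state, "counts": {}}


def reconcile_delivery(state):
    return {**state, "shortages": []}


stages = [
    ("invoice_extraction", extract_invoice),
    ("visual_counting", count_products),
    ("reconciliation", reconcile_delivery),
]
result, trace = run_pipeline(stages, {"invoice_path": "invoice.jpg"})

# One JSON object per line is a convenient format for a dataset artifact.
trace_jsonl = "\n".join(json.dumps(step) for step in trace)
```

Passing a single state dict between stages keeps each agent independent of the others and makes the trace easy to inspect after a run.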


Model Stack

Primary Vision Model: MiniCPM-V 4.6

| Property | Value |
| --- | --- |
| Developer | OpenBMB (Tsinghua University) |
| Parameters | 1.3B |
| Release status | Current MiniCPM-V 4.6 family model; released in 2026 |
| Strengths | OCR-heavy document understanding, image/video inputs, edge-friendly multimodal reasoning |
| Architecture note | SigLIP2-400M vision encoder + Qwen3.5-0.8B LLM |
| GGUF / local support | Yes; supports llama.cpp/GGUF deployment for Off the Grid + Llama Champion badges |
| Why chosen | Best fit for invoice OCR under the 32B cap and directly targets OpenBMB sponsor prize |

Tasks: Invoice OCR, final report generation, cross-verification narration


Counting Model: YOLO26n (Fine-tuned)

| Property | Value |
| --- | --- |
| Developer | Ultralytics |
| Parameters | ~2.4M (fused model) |
| Strengths | Faster CPU ONNX inference than YOLO11n, accurate object detection + counting, edge-friendly |
| Export format | ONNX (local inference, no llama.cpp needed) |
| Why chosen | Latest Ultralytics nano detector; purpose-built for counting, whereas VLMs hallucinate on dense product scenes |

Tasks: Detect and count FMCG products in delivery photos

Design decision: YOLO26n handles counting because VLMs like MiniCPM-V underperform on dense shelf scenes with 20–50 identical objects. Each model does what it does best.


Agent Orchestration Model: MiniCPM5-1B

| Property | Value |
| --- | --- |
| Developer | OpenBMB |
| Parameters | 1.08B |
| Context length | 131,072 tokens |
| Strengths | Tool use, reasoning, code/JSON generation, workflow orchestration, report generation |
| GGUF support | Yes; official GGUF release supports llama.cpp/Ollama/LM Studio workflows |
| Why chosen | Current OpenBMB 1B-class model; better aligned than the older MiniCPM3 reference, and strengthens OpenBMB prize positioning |

Tasks: Product normalization, agent orchestration, savings report text generation


Parameter Budget

| Component | Model | Parameters |
| --- | --- | --- |
| Invoice/document vision | MiniCPM-V 4.6 | 1.3B |
| Product normalization + agent text | MiniCPM5-1B | 1.08B |
| Product detection/counting | YOLO26n | ~2.4M |
| Total active model budget | Combined stack | ~2.38B |

This keeps the app far below the hackathon's 32B cap and within the Tiny Titan special-award range (<=4B), while still using separate models for the tasks they handle best.
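
The budget arithmetic can be checked directly. The parameter counts are the ones listed in the table; the variable names are illustrative.

```python
# Parameter counts as listed in the Parameter Budget table.
PARAMS = {
    "MiniCPM-V 4.6": 1.3e9,
    "MiniCPM5-1B": 1.08e9,
    "YOLO26n": 2.4e6,
}
TINY_TITAN_LIMIT = 4e9   # <=4B special-award range
HACKATHON_CAP = 32e9     # overall 32B cap

total_params = sum(PARAMS.values())
print(f"{total_params / 1e9:.2f}B")  # prints 2.38B
```

The detector contributes well under 1% of the total, so the budget is effectively set by the two MiniCPM models.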

Current Model References

  • MiniCPM-V 4.6: openbmb/MiniCPM-V-4.6
  • MiniCPM5-1B: openbmb/MiniCPM5-1B and openbmb/MiniCPM5-1B-GGUF
  • YOLO26n: Ultralytics YOLO26 nano detector, exported to ONNX after fine-tuning

Fine-Tuning Strategy

What to Fine-Tune (and Why)

1. MiniCPM-V 4.6: Invoice Extraction

Why: Indian invoice formats (Tally printouts, WhatsApp screenshots, handwritten GST bills, mixed-language text) are not well represented in the base model's training data. Fine-tuning on 300–500 synthetic Indian invoices is expected to improve structured JSON output substantially.

Dataset: 500 invoices, synthetically generated using Claude/GPT, across:

  • 10 major Indian FMCG suppliers (HUL, Nestlé, Parle, Britannia, ITC, Amul, Dabur, Marico, Emami, Godrej)
  • 4 invoice formats (printed GST bill, handwritten, Tally export, WhatsApp screenshot)
  • Intentional errors: wrong GST, duplicate lines, price spikes
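
One way to produce the intentional errors is to start from a clean synthetic invoice and corrupt one field while recording which error was injected. The sketch below is an assumption about how that could be done, not the project's actual generator; `inject_error`, the 15% spike, and the `injected_error` label field are all hypothetical.

```python
import copy
import random

GST_SLABS = (0, 5, 12, 18, 28)


def inject_error(invoice, kind, rng):
    """Return a copy of `invoice` with one labelled error of the given kind."""
    noisy = copy.deepcopy(invoice)
    item = rng.choice(noisy["items"])
    if kind == "wrong_gst":
        item["gst_rate"] = rng.choice([r for r in GST_SLABS if r != item["gst_rate"]])
    elif kind == "duplicate_line":
        noisy["items"].append(copy.deepcopy(item))
    elif kind == "price_spike":
        item["unit_price"] = round(item["unit_price"] * 1.15, 2)  # arbitrary 15% spike
        item["line_total"] = round(item["unit_price"] * item["quantity"], 2)
    else:
        raise ValueError(f"unknown error kind: {kind}")
    noisy["grand_total"] = round(sum(i["line_total"] for i in noisy["items"]), 2)
    noisy["injected_error"] = kind  # ground-truth label for training and evaluation
    return noisy


clean = {
    "items": [{"product_raw": "SURF EXCEL 1KG", "quantity": 10,
               "unit_price": 255.0, "gst_rate": 18, "line_total": 2550.0}],
    "grand_total": 2550.0,
}
rng = random.Random(0)  # fixed seed so the dataset is reproducible
with_duplicate = inject_error(clean, "duplicate_line", rng)
with_wrong_gst = inject_error(clean, "wrong_gst", rng)
```

Keeping the error label next to each generated invoice also gives a ready-made test set for the rule-based checks.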

Platform: Modal + Unsloth QLoRA (~2–3 hours training time)

Publish to: build-small-hackathon/minicpm-v-4-6-indian-invoice-extraction-merged


2. YOLO26n: Indian FMCG Product Detection

Why: Base YOLO26n is not trained on Indian grocery products. Fine-tuning on the existing Indian Grocery Object Detection dataset (Roboflow) lets the model reliably detect Parle-G, Maggi, Amul, Britannia, and HUL products in kirana shelf and delivery photos.

Dataset: Indian Grocery Object Detection (Roboflow), already annotated with bounding boxes for common Indian FMCG SKUs.

Training: Ultralytics fine-tune on Modal GPU (~1–2 hours)

Export: ONNX for local CPU inference

Publish to: build-small-hackathon/yolo26n-indian-fmcg-detection


3. MiniCPM5-1B: Product Name Normalization

Why: "MAGGI NDL 70GM", "MAGGI MASALA", and "MAGGI 70G" should all map to "Nestle Maggi Masala Noodles 70g". This requires Indian FMCG domain knowledge a general 1B model lacks.

Dataset: 2,000 synthetic (raw_name, normalized_name) pairs covering top 200 Indian FMCG SKUs

Publish to: build-small-hackathon/minicpm5-1b-indian-fmcg-normalizer


What NOT to Fine-Tune

| Task | Why Not |
| --- | --- |
| GST rate validation | Pure lookup table by HSN code. 0/5/12/18/28%. Deterministic. |
| Price anomaly detection | Simple arithmetic vs. stored history. More trustworthy without ML. |
| Duplicate detection | String matching + invoice ID comparison. |
| Savings calculation | Arithmetic. No model needed. |
| Supplier trust scoring | Aggregation of existing rule-based signals. |

Indian Context β€” Training Data Coverage

FMCG Brands (Invoice Normalization Dataset)

Food: Parle-G, Good Day, Britannia Marie, Maggi, Yippee, Aashirvaad Atta, Tata Salt, Amul Butter, Mother Dairy, Aavin

Home Care: Surf Excel, Rin, Vim, Harpic, Lizol, Domex, Scotch-Brite, Mortein

Personal Care: Colgate, Pepsodent, Clinic Plus, Pantene, Lux, Dove, Lifebuoy, Dettol, Parachute

Beverages: Coca-Cola, Pepsi, Sprite, Thums Up, Frooti, Maaza, Bovonto (South India)

GST Rate Lookup (Rule-Based, Not Fine-Tuned)

| Rate | Example Products |
| --- | --- |
| 0% | Fresh milk, eggs, vegetables |
| 5% | Packaged food, Atta, Dal, edible oil |
| 12% | Butter, ghee, packaged dry fruits |
| 18% | Soap, shampoo, toothpaste, detergent |
| 28% | Aerated drinks, tobacco |
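
A rule-based GST check reduces to a dictionary lookup. The sketch below keys the table by product category for readability, whereas the design calls for lookup by HSN code; the rates are copied from the table above and should be treated as configuration to be confirmed against the current schedule. The function name and return shape are hypothetical.

```python
# Rates copied from the GST table above, keyed by category for illustration.
GST_RATES = {
    "fresh milk": 0, "eggs": 0, "vegetables": 0,
    "packaged food": 5, "atta": 5, "dal": 5, "edible oil": 5,
    "butter": 12, "ghee": 12, "packaged dry fruits": 12,
    "soap": 18, "shampoo": 18, "toothpaste": 18, "detergent": 18,
    "aerated drinks": 28, "tobacco": 28,
}


def check_gst(category, charged_rate):
    """Return (status, expected_rate); status is 'ok', 'mismatch' or 'unknown'."""
    expected = GST_RATES.get(category.strip().lower())
    if expected is None:
        return ("unknown", None)  # no rule for this category, so no flag is raised
    if charged_rate == expected:
        return ("ok", expected)
    return ("mismatch", expected)


print(check_gst("Atta", 12))  # prints ('mismatch', 5)
```

Returning "unknown" instead of guessing keeps the auditor from raising false alarms on products the table does not cover.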

Regional Language Support

Invoice OCR handles mixed-language text including English, Tamil, Hindi, and Telugu, which is common in South Indian distributor invoices.


Award Strategy

OpenBMB Award

How: MiniCPM-V 4.6 is the primary vision model for OCR, cross-verification, and report generation. MiniCPM5-1B handles orchestration, normalization, and report text. Both are current OpenBMB models, making the product visibly built around the sponsor's ecosystem.

OpenAI Track

How: The project is built with Codex as the primary coding agent, with Codex-authored commits and implementation traces included in the submission materials. The demo should explicitly show how Codex accelerated the build and helped produce the final Gradio app, making OpenAI's contribution load-bearing without adding a cloud API dependency.

Modal Awards

How: Modal is used for the fine-tuning runs for MiniCPM-V 4.6, MiniCPM5-1B, and YOLO26n, with training logs, artifacts, and published Hugging Face model links included in the Field Notes post. Modal is not just incidental infrastructure; it is the training engine that makes the local-first app domain-specific.

Best Agent Award

How: Six-agent pipeline with clear separation of concerns and a visible agent trace logged to the Hugging Face Hub. It is not a single LLM call but a genuine tool-using agent workflow.

Well-Tuned Badge 🎯

How: Three fine-tuned models published on HuggingFace:

  1. minicpm-v-4-6-indian-invoice-extraction
  2. yolo26n-indian-fmcg-detection
  3. minicpm5-1b-indian-fmcg-normalizer

Off the Grid Badge 🔌

How: MiniCPM-V 4.6 GGUF via llama.cpp + MiniCPM5-1B GGUF via llama.cpp + YOLO26n ONNX. The entire pipeline runs locally with zero cloud API calls.

Llama Champion Badge 🦙

How: MiniCPM-V 4.6 and MiniCPM5-1B are served via llama.cpp using their GGUF quantized versions.

Off-Brand Badge 🎨

How: Custom Gradio UI rather than the default theme: an audit report card design with ₹ savings prominently displayed, colour-coded anomaly flags, and a clean mobile-friendly layout.

Sharing is Caring Badge 📡

How: Agent trace logged after each audit run and shared as a HuggingFace dataset artifact.

Field Notes Badge 📓

How: Blog post, "How I built an AI auditor for India's 12 million kirana stores", covering dataset creation, fine-tuning decisions, and real-world testing with a store owner.

Bonus Quest Champion

How: Stack the largest credible set of badges on one polished submission: Off the Grid, Well-Tuned, Off-Brand, Llama Champion, Sharing is Caring, and Field Notes.

Tiny Titan

How: Total active model budget is approximately 2.38B parameters, comfortably below the <=4B Tiny Titan threshold while still handling OCR, agentic reasoning, normalization, and product counting.

Best Demo

How: The video centers on one concrete, emotional story: a real kirana owner finds a rupee-denominated loss, sees the missing items visually highlighted, and gets a practical supplier action list. The demo should show the app working, the owner reaction, the agent trace, and the final savings number.

Community Choice

How: Make the Space immediately understandable: upload sample invoice, upload sample delivery photos, run audit, see rupee savings. Pair the Space with a short social post using the India kirana angle and the "find where money is being lost" tagline.

NVIDIA Nemotron Quest

Decision: Explicitly not targeted. Chasing Nemotron would force a major stack change and weaken the OpenBMB/local-first Tiny Titan story. The submission focuses on Backyard AI, OpenBMB, OpenAI, Modal, and the bonus badges instead.


Gradio UI Design

Screen 1: Upload

┌─────────────────────────────────────────┐
│  🔍 KIRANA DETECTIVE                    │
│  Your AI Business Auditor               │
├─────────────────────────────────────────┤
│                                         │
│  [ 📄 Upload Invoice ]                  │
│  Photo / PDF / WhatsApp screenshot      │
│                                         │
│  [ 📷 Upload Delivery Photos ]          │
│  Up to 5 photos of received goods       │
│                                         │
│  Supplier Name: ___________________     │
│                                         │
│  [ 🔍 Run Audit ]                       │
│                                         │
└─────────────────────────────────────────┘

Screen 2: Results Dashboard

┌─────────────────────────────────────────┐
│  AUDIT COMPLETE - HUL | INV-8821        │
├─────────────────────────────────────────┤
│                                         │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  │
│  │ ⚠ Price │  │ ⚠ Short │  │ ⚠ Dupli │  │
│  │  ₹330   │  │  ₹208   │  │  ₹320   │  │
│  └─────────┘  └─────────┘  └─────────┘  │
│                                         │
│  ┌─────────────────────────────────┐    │
│  │  💰 TOTAL LEAKAGE: ₹858         │    │
│  └─────────────────────────────────┘    │
│                                         │
│  [ 📋 Full Report ]  [ 📤 Share ]       │
│                                         │
└─────────────────────────────────────────┘

Demo Story (For Submission Video)

Ravi, a kirana store owner in Chennai, uploads one invoice from his HUL distributor and three photos of the goods delivered that morning.

In 45 seconds, Kirana Detective finds:

  • Surf Excel is being charged 15.9% above the historical price
  • 4 Coke bottles and 2 Maggi packets are missing from the delivery
  • A Parle-G line item appears to be duplicated

Total leakage detected: β‚Ή858

Ravi calls his distributor. The credit note is issued the same day.

This outcome is specific, measurable, and achievable in a real demo, which is exactly what Backyard AI judges want to see.


10-Day Build Plan

| Day | Task | Model/Tool | Risk |
| --- | --- | --- | --- |
| 1 | Fine-tune YOLO26n on Roboflow Indian Grocery dataset | Modal GPU, Ultralytics | Low |
| 2 | Generate 500 synthetic Indian invoices; fine-tune MiniCPM-V 4.6 extraction | Modal + Unsloth | Medium |
| 3 | Fine-tune MiniCPM5-1B product normalizer; publish all 3 models to HF | Modal + Unsloth | Low |
| 4 | Build invoice OCR pipeline in Gradio: upload → MiniCPM-V → JSON | Python + Gradio | Medium |
| 5 | Build YOLO26n delivery counting pipeline: photo → count dict | ONNX Runtime | Medium |
| 6 | Build reconciliation agent + pricing anomaly detection | Rule-based Python | Low |
| 7 | Build custom Gradio dashboard UI with ₹ savings cards | Gradio + CSS | Low |
| 8 | Wire all agents together; implement trace logging; deploy to HF Space | LangGraph / custom | Medium |
| 9 | Test with real kirana owner; record demo video; capture Codex-authored commit/story proof | Codex + real user testing | Low |
| 10 | Write Field Notes blog; share agent trace; include Modal logs and final submission assets | HF Dataset + Modal logs | Low |

Technical Stack Summary

| Component | Technology |
| --- | --- |
| Frontend | Gradio (custom theme, Off-Brand) |
| Hosting | Hugging Face Spaces |
| Primary VLM | MiniCPM-V 4.6 (GGUF via llama.cpp) |
| Agent Orchestrator | MiniCPM5-1B (GGUF via llama.cpp) |
| Counting Model | YOLO26n fine-tuned (ONNX, local) |
| Fine-tuning Platform | Modal + Unsloth (training engine for sponsor eligibility) |
| Build Agent | OpenAI Codex (commit author + build trace for OpenAI Track positioning) |
| Invoice parsing | PyMuPDF (PDF) + Gradio Image input |
| Data storage | Local JSON / SQLite (no cloud DB) |
| Agent tracing | Custom trace logger → HF Dataset |

Risk Register

| Risk | Likelihood | Mitigation |
| --- | --- | --- |
| MiniCPM-V GGUF has high latency on CPU | Medium | Use 4-bit quantized Q4_K_M; fall back to float16 on HF Space GPU |
| YOLO26n misses products not in Roboflow dataset | Medium | Limit demo to top 10 products; expand post-hackathon |
| Delivery photo quality too low for counting | High | Show demo with clean carton photos; add "photo quality tip" in UI |
| Fine-tuning time exceeds budget | Low | All 3 models trainable in < 6 hours total on Modal A10G |
| OpenAI Track story looks indirect | Medium | Make Codex visible in commit metadata, implementation trace, Field Notes, and demo narrative |
| Modal usage looks incidental | Low | Publish Modal training logs/artifacts and explicitly link fine-tuned models to Modal runs |
| Scope creep during build week | High | Freeze scope at Day 3; no new features after Day 6 |

Kirana Detective AI - Build Small Hackathon 2026 - Track 1: Backyard AI