metadata
title: KCC AgriAdvisor
emoji: 🌾
colorFrom: green
colorTo: yellow
sdk: docker
pinned: false
app_port: 7860

🌾 KCC AgriAdvisor

AI-powered agricultural advisory for Indian farmers. The most comprehensive open Indian agriculture AI stack: RAG over 16.5M Kisan Call Center records, a national pest forecast for 700 districts × 20 crops, block-level satellite features, and ICAR-cited dose recommendations.

📊 Performance

Independent benchmark: a 100-query stratified sample from IndiaAgriBench-498, covering Indian farmer questions across pest, disease, nutrient, irrigation, scheme, price, and crop-selection problem types.

| Metric | This Stack | Vanilla Gemini | Vanilla Groq |
| --- | --- | --- | --- |
| Citation rate | 0.91 | 0.00 | 0.00 |
| Banned-chemical leakage | 0.00 | 0.00 | 0.04 |
| Latency p50 (ms) | 787 | 9,978 | 1,015 |
| Judge: Citation (1-10) | 9.12 | 1.14 | 0.26 |
| Judge: Safety (1-10) | 8.48 | 8.67 | 5.50 |

A smoke re-run on the v4-merged stack (5-query sample, 2026-05-11) confirmed a citation rate of 1.0, banned-chemical leakage of 0.0, and latency p50 of 627 ms, all matching or beating the baseline above. Full benchmark: see the in-app /proof tab or eval/benchmark_full_2026-05-10.json in the IP bundle (NDA-gated).
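
For readers who want to reproduce the three headline numbers on their own query logs, the sketch below shows one plausible way to compute them. It is an illustration, not the Space's evaluation harness: the record fields (`answer`, `latency_ms`, `banned_hit`), the `[n]` citation pattern, and the toy records are all assumptions.

```python
import re
from statistics import median

# Assumption: an answer counts as "cited" if it contains at least one [n] marker.
CITATION = re.compile(r"\[\d+\]")


def summarize(records):
    """Compute citation rate, banned-chemical leakage, and p50 latency.

    `records` is a list of dicts with hypothetical keys:
    answer (str), latency_ms (number), banned_hit (bool).
    """
    if not records:
        raise ValueError("need at least one record")
    n = len(records)
    cited = sum(1 for r in records if CITATION.search(r["answer"]))
    leaked = sum(1 for r in records if r["banned_hit"])
    return {
        "citation_rate": cited / n,
        "banned_leakage": leaked / n,
        "latency_p50_ms": median(r["latency_ms"] for r in records),
    }


# Toy records, made up for illustration only (not benchmark data).
results = [
    {"answer": "Install pheromone traps early [1].", "latency_ms": 600, "banned_hit": False},
    {"answer": "Check local mandi prices first.", "latency_ms": 800, "banned_hit": False},
    {"answer": "Scout the field weekly [1][2].", "latency_ms": 700, "banned_hit": False},
]
summary = summarize(results)
print(summary)
```

The median is used for p50 because a single slow backend call would otherwise dominate a mean latency.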

Features

  • RAG chatbot over 16.5M KCC Q&A: Hindi / English / 8 regional languages
  • Pest Early Warning: district stacking model (AUC 0.937), 1-month lead time
  • Price Forecast: presow_v4 P25/P50/P75, 290 crops
  • National Pest Heatmap: 700 districts × 20 crops × 3 months, interactive
  • Citation Guard: post-generation safety filter, negation-aware
  • Enterprise Dashboard: B2B pest + price analytics

Architecture (v4)

client → /query
  ↓
 topic guard → classify → v2 multi_step_retrieve (FAISS+BM25+rerank)
  ↓
 prompt build (system + safety + few-shot + retrieved context)
  ↓
 kcc_core.llm cascade: Groq → Gemini → local Llama (first success wins)
  ↓
 kcc_core.citation_guard.review (negation-aware banned-chem strikethrough,
                                  hard overrides, [1][2] citation check)
  ↓
 JSON { answer, sources, safety_warnings, backend, ... }
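
The "first success wins" cascade in the diagram can be sketched as a loop over backends in priority order. This is a minimal illustration under stated assumptions, not the code in `kcc_core.llm`: the `run_cascade` function, the stub backends, and the rule that an empty response counts as a failure are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A backend is a (name, callable) pair; the callable takes a prompt and returns text.
Backend = Tuple[str, Callable[[str], str]]


def run_cascade(prompt: str, backends: Sequence[Backend]) -> Dict[str, str]:
    """Try each backend in order and return the first non-empty answer.

    The returned dict records which backend actually answered, so the
    `backend` field reflects the real model rather than a fixed label.
    """
    errors: List[str] = []
    for name, call in backends:
        try:
            answer = call(prompt)
        except Exception as exc:  # any backend failure falls through to the next one
            errors.append(f"{name}: {exc}")
            continue
        if answer and answer.strip():
            return {"answer": answer, "backend": name}
        errors.append(f"{name}: empty response")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Stub backends standing in for real API clients (hypothetical).
def groq_stub(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def gemini_stub(prompt: str) -> str:
    return "Stub answer with a citation [1]."


def llama_stub(prompt: str) -> str:
    return "Stub answer from the local model."


result = run_cascade(
    "How do I manage stem borer in paddy?",
    [("groq", groq_stub), ("gemini", gemini_stub), ("local-llama", llama_stub)],
)
print(result["backend"])
```

Here the first stub fails, so the second one answers and the third is never called. Returning the backend name alongside the answer is what lets the response JSON report the model that actually produced it.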

HF Spaces Secrets required

| Secret | Purpose |
| --- | --- |
| GROQ_API_KEY | Primary LLM (Groq llama-4-scout, fast) |
| GEMINI_API_KEY | Fallback LLM + image diagnosis |
| SECRET_KEY | JWT signing (≥32 hex chars) |
| B2B_DEMO_PASSWORD | Demo enterprise login |
| HF_TOKEN | (Optional) Pull large data files from HF Dataset at boot |
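
A startup check along the following lines can catch a missing or malformed secret before the app serves traffic. It is a sketch only: the `check_secrets` helper is hypothetical and not part of the Space, and it assumes the secrets are exposed to the container as environment variables.

```python
import os
import re

# Required secrets from the table above; HF_TOKEN is optional and not enforced.
REQUIRED = ("GROQ_API_KEY", "GEMINI_API_KEY", "SECRET_KEY", "B2B_DEMO_PASSWORD")

# SECRET_KEY must be at least 32 hexadecimal characters.
HEX_32_PLUS = re.compile(r"[0-9a-fA-F]{32,}")


def check_secrets(env=None):
    """Return a list of human-readable problems; an empty list means all good."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    secret = env.get("SECRET_KEY", "")
    if secret and not HEX_32_PLUS.fullmatch(secret):
        problems.append("SECRET_KEY must be at least 32 hex characters")
    return problems


if __name__ == "__main__":
    for problem in check_secrets():
        print("config error:", problem)
```

One way to generate a suitable `SECRET_KEY` is `python -c "import secrets; print(secrets.token_hex(32))"`, which yields 64 hex characters.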

What's in this v4 merge (vs v2)

| Area | v2 (original) | v4 (this) |
| --- | --- | --- |
| LLM cascade | inline in api.py | kcc_core.llm (one canonical path) |
| Backend tracking | always reported "groq", even when another model answered | actual model that answered |
| Banned chemicals | 26-item list, substring match | 33-item list, negation-aware |
| Citation guard | none | post-generation, runs on every /query + /query/stream |
| Hard overrides | none | BPH ≠ Imidacloprid |
| Rate limiting | none | slowapi (120/min, 2000/hr default) |
| Request tracing | none | X-Request-ID middleware |
| /pest-risk | state-only, open | requires B2B JWT |
| Frontend | Farmer + B2B tabs | adds /proof tab with benchmark charts and India pest heatmap |
| Tests | none | 19/19 passing |
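
To illustrate why the move from substring matching to a negation-aware check matters, the sketch below contrasts the two on a sentence that warns against a chemical. It is a simplified stand-in for `kcc_core.citation_guard`, whose actual rules are not shown here: the two chemical names, the negation cue list, and the sentence-level heuristic are all assumptions.

```python
import re

# Illustrative entries only; the Space's actual 33-item list is not reproduced here.
BANNED = ("endosulfan", "monocrotophos")

# Assumed negation cues; a real guard would likely need a richer, multilingual set.
NEGATION = re.compile(
    r"\b(?:do not|don't|never|avoid|banned|prohibited|not recommended)\b",
    re.IGNORECASE,
)


def substring_hits(text):
    """v2-style check: flag any mention, even inside a warning."""
    lowered = text.lower()
    return [chem for chem in BANNED if chem in lowered]


def negation_aware_hits(text):
    """Flag a mention only if its sentence contains no negation cue."""
    hits = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        for chem in BANNED:
            if chem in lowered and not NEGATION.search(sentence):
                hits.append(chem)
    return hits


warning = "Do not use endosulfan. Try neem-based sprays instead."
advice = "Spray endosulfan at dusk."

print(substring_hits(warning))
print(negation_aware_hits(warning))
print(negation_aware_hits(advice))
```

The substring check flags the warning as leakage even though it tells the farmer to avoid the chemical, while the negation-aware check flags only the sentence that recommends it. Treating any negation cue in the sentence as covering every chemical in it is a deliberate simplification and would miss mixed sentences.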

The full IP bundle (LoRA v2, BGE-M3 index, 498-query eval, ICAR JSON) is NOT in this Space; it is kept local for NDA-gated buyer sharing.