---
title: KCC AgriAdvisor
emoji: 🌾
colorFrom: green
colorTo: yellow
sdk: docker
pinned: false
app_port: 7860
---
# 🌾 KCC AgriAdvisor
AI-powered agricultural advisory for Indian farmers. The most comprehensive
open Indian agriculture AI stack: RAG over 16.5M Kisan Call Center records,
national pest forecast for 700 districts × 20 crops, block-level satellite
features, ICAR-cited dose recommendations.
## 📊 Performance
Independent benchmark: **100-query stratified sample** from
`IndiaAgriBench-498`, Indian farmer questions across pest, disease, nutrient,
irrigation, scheme, price, and crop-selection problem types.
| Metric | This Stack | Vanilla Gemini | Vanilla Groq |
|---|---|---|---|
| Citation rate | **0.91** | 0.00 | 0.00 |
| Banned-chemical leakage | **0.00** | 0.00 | 0.04 |
| Latency p50 (ms) | **787** | 9,978 | 1,015 |
| Judge: Citation (1-10) | **9.12** | 1.14 | 0.26 |
| Judge: Safety (1-10) | **8.48** | 8.67 | 5.50 |
Smoke re-run on the v4-merged stack (5-query sample, 2026-05-11) confirmed
citation rate 1.0, banned-chemical leakage 0.0, and latency p50 627 ms, all at
or better than baseline. Full benchmark: see the in-app **/proof** tab or
`eval/benchmark_full_2026-05-10.json` in the IP bundle (NDA-gated).
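
For readers who want to sanity-check numbers like these on their own logs, here is a minimal sketch of how a citation rate and a p50 latency could be computed. The definition used here (an answer counts as cited if it contains at least one `[n]` marker) is an assumption, not necessarily the one used by the benchmark, and the sample answers are placeholders.

```python
import re
from statistics import median

# Assumed citation format: bracketed numbers such as [1] or [2].
CITATION_MARKER = re.compile(r"\[\d+\]")


def citation_rate(answers):
    """Fraction of answers with at least one [n] marker (assumed definition)."""
    if not answers:
        return 0.0
    cited = sum(1 for answer in answers if CITATION_MARKER.search(answer))
    return cited / len(answers)


def latency_p50(latencies_ms):
    """Median (50th percentile) latency in milliseconds."""
    return median(latencies_ms)


# Placeholder data, not taken from the benchmark.
sample_answers = [
    "Placeholder answer with one source [1].",
    "Placeholder answer with two sources [2][3].",
    "Placeholder answer without any source.",
    "Another cited placeholder answer [1].",
]
sample_latencies = [610, 627, 640, 700, 590]

print(citation_rate(sample_answers))  # 0.75
print(latency_p50(sample_latencies))  # 627
```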
## Features
- **RAG chatbot** over 16.5M KCC Q&A (Hindi / English / 8 regional languages)
- **Pest Early Warning**: district stacking model (AUC 0.937), 1-month lead time
- **Price Forecast**: presow_v4 P25/P50/P75, 290 crops
- **National Pest Heatmap**: 700 districts × 20 crops × 3 months, interactive
- **Citation Guard**: post-generation safety filter, negation-aware
- **Enterprise Dashboard**: B2B pest + price analytics
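
To illustrate what "negation-aware" can mean for the Citation Guard, the sketch below strikes through a banned chemical name unless a negation cue appears shortly before it in the same sentence. This is not the code in `kcc_core.citation_guard`. The function name, the cue list, the window size, and the one-item banned list are all assumptions made for illustration.

```python
import re

# Hypothetical cue list; the real guard may use different rules.
NEGATION_CUES = ("not", "never", "avoid", "don't", "banned", "prohibited")


def review(answer, banned, window=4):
    """Strike through banned terms unless a negation cue precedes them.

    Returns the rewritten answer and the list of flagged terms.
    """
    warnings = []

    def check(match):
        # Only consider text before the match, within the same sentence.
        prefix = re.split(r"[.!?]", answer[:match.start()])[-1]
        preceding = re.findall(r"[a-z']+", prefix.lower())[-window:]
        if any(cue in preceding for cue in NEGATION_CUES):
            return match.group(0)  # negated mention such as "do not use X"
        warnings.append(match.group(0).lower())
        return f"~~{match.group(0)}~~"

    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, banned)) + r")\b", re.IGNORECASE
    )
    return pattern.sub(check, answer), warnings


# Illustrative one-item list; the project's list is not reproduced here.
demo_banned = ["endosulfan"]
print(review("Spray endosulfan weekly.", demo_banned))
print(review("Do not use endosulfan. Ask your extension officer.", demo_banned))
```

A plain substring match would flag both sentences. The window check lets the second one through because it warns against the chemical rather than recommending it. This sketch only looks at cues before the term, so a sentence such as "Endosulfan is banned" would still be struck through.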
## Architecture (v4)
```
client → /query
↓
topic guard → classify → v2 multi_step_retrieve (FAISS+BM25+rerank)
↓
prompt build (system + safety + few-shot + retrieved context)
↓
kcc_core.llm cascade: Groq → Gemini → local Llama (first success wins)
↓
kcc_core.citation_guard.review (negation-aware banned-chem strikethrough,
    hard overrides, [1][2] citation check)
↓
JSON { answer, sources, safety_warnings, backend, ... }
```
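
The "first success wins" cascade in the diagram can be sketched as a loop over backends that also records which one actually answered, which is what the `backend` field in the response reports. This is a minimal sketch, not the `kcc_core.llm` implementation. The `cascade` function and the stub backends are hypothetical stand-ins for real API clients.

```python
from typing import Callable, Sequence, Tuple

Backend = Tuple[str, Callable[[str], str]]


def cascade(prompt: str, backends: Sequence[Backend]) -> dict:
    """Try each backend in order and return the first non-empty answer."""
    errors = []
    for name, call in backends:
        try:
            answer = call(prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{name}: {exc}")
            continue
        if answer:
            # Report the backend that actually produced the answer.
            return {"answer": answer, "backend": name, "errors": errors}
        errors.append(f"{name}: empty response")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Stub backends standing in for Groq, Gemini, and a local Llama.
def groq_stub(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def gemini_stub(prompt: str) -> str:
    return f"stub answer to: {prompt}"


def llama_stub(prompt: str) -> str:
    return "local stub answer"


result = cascade(
    "example question",
    [("groq", groq_stub), ("gemini", gemini_stub), ("local-llama", llama_stub)],
)
print(result["backend"])  # gemini
print(result["errors"])   # ['groq: simulated timeout']
```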
## HF Spaces Secrets required
| Secret | Purpose |
|---|---|
| `GROQ_API_KEY` | Primary LLM (Groq llama-4-scout, fast) |
| `GEMINI_API_KEY` | Fallback LLM + image diagnosis |
| `SECRET_KEY` | JWT signing (≥32 hex chars) |
| `B2B_DEMO_PASSWORD` | Demo enterprise login |
| `HF_TOKEN` | (Optional) Pull large data files from HF Dataset at boot |
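
A startup check along these lines can catch a missing or malformed secret before the app serves traffic. It is a sketch under assumptions: `check_secrets` is a hypothetical helper, not part of this repository, and `HF_TOKEN` is skipped because the table marks it optional.

```python
import re
import secrets

# Secrets the table lists as required; HF_TOKEN is optional and not checked.
REQUIRED = ("GROQ_API_KEY", "GEMINI_API_KEY", "SECRET_KEY", "B2B_DEMO_PASSWORD")


def check_secrets(env):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    key = env.get("SECRET_KEY", "")
    # The table asks for at least 32 hex characters.
    if key and not re.fullmatch(r"[0-9a-fA-F]{32,}", key):
        problems.append("SECRET_KEY must be at least 32 hex characters")
    return problems


# Demo with placeholder values; in the app you would pass os.environ instead.
demo_env = {
    "GROQ_API_KEY": "placeholder",
    "GEMINI_API_KEY": "placeholder",
    "SECRET_KEY": secrets.token_hex(32),  # 64 hex characters
    "B2B_DEMO_PASSWORD": "placeholder",
}
print(check_secrets(demo_env))  # []
print(check_secrets({"SECRET_KEY": "too-short"}))
```

`secrets.token_hex(32)` from the standard library is one way to generate a suitable `SECRET_KEY` value.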
## What's in this v4 merge (vs v2)
| Area | v2 (original) | v4 (this) |
|---|---|---|
| LLM cascade | inline in api.py | `kcc_core.llm` (one canonical path) |
| Backend tracking | always reported "groq", even when another model answered | actual model that answered |
| Banned chemicals | 26-item list, substring | 33-item, negation-aware |
| Citation guard | none post-gen | runs on every /query + /query/stream |
| Hard overrides | none | BPH ≠ Imidacloprid |
| Rate limiting | none | slowapi (120/min, 2000/hr default) |
| Request tracing | none | X-Request-ID middleware |
| `/pest-risk` state-only | open | requires B2B JWT |
| Frontend | Farmer + B2B tabs | + new **/proof** tab with benchmark charts + India pest heatmap |
| Tests | none | 19/19 passing |
Full IP bundle (LoRA v2, BGE-M3 index, 498-query eval, ICAR JSON) is
NOT in this Space; it is kept local for NDA-gated buyer sharing.
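
As background for the "Request tracing" row in the table above, here is a dependency-free sketch of an ASGI middleware that attaches an `X-Request-ID` header to every HTTP response. It is not the middleware shipped in this Space. The class name is hypothetical, and reusing an incoming `X-Request-ID` when the client sends one is an assumption about the desired behaviour.

```python
import asyncio
import uuid


class RequestIDMiddleware:
    """Pure-ASGI middleware that adds X-Request-ID to HTTP responses."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        # ASGI header names are lower-cased bytes.
        incoming = dict(scope.get("headers", []))
        request_id = (
            incoming.get(b"x-request-id", b"").decode("latin-1")
            or uuid.uuid4().hex
        )

        async def send_with_id(message):
            if message["type"] == "http.response.start":
                message = dict(message)
                message["headers"] = list(message.get("headers", [])) + [
                    (b"x-request-id", request_id.encode("latin-1"))
                ]
            await send(message)

        await self.app(scope, receive, send_with_id)


async def demo_app(scope, receive, send):
    """Tiny stand-in application used only for the demo."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def run_once(request_headers):
    """Drive one fake request through the middleware, return response headers."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    scope = {"type": "http", "headers": request_headers}
    await RequestIDMiddleware(demo_app)(scope, receive, send)
    return dict(sent[0]["headers"])


print(asyncio.run(run_once([(b"x-request-id", b"abc123")])))
```

Frameworks built on ASGI, such as FastAPI or Starlette, can wrap an app with a class of this shape, so the same ID can then be written to logs and returned to the caller.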