Commit d25193e (verified) by summerdevlin46
1 parent: 3207165

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +74 -13
README.md CHANGED
@@ -1,21 +1,82 @@
  ---
- base_model: unsloth/Llama-3.2-3B-Instruct
- tags:
- - text-generation-inference
- - transformers
- - unsloth
- - llama
- license: apache-2.0
  language:
  - en
  ---

- # Uploaded finetuned model

- - **Developed by:** summerdevlin46
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/Llama-3.2-3B-Instruct

- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
  ---
  language:
  - en
+ - te
+ license: mit
+ base_model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
+ tags:
+ - receipt-parsing
+ - kirana
+ - inventory
+ - fine-tuned
+ - lora
+ - indian-retail
+ pipeline_tag: text-generation
  ---

+ # Dukaan Saathi — Receipt Parser (Llama-3.2-3B fine-tune)
+
+ Fine-tuned **Llama-3.2-3B-Instruct** for structured receipt parsing in Indian kirana (convenience) store workflows.
+
+ Part of the [Dukaan Saathi](https://huggingface.co/spaces/summerdevlin46/dukaan-saathi) inventory copilot demo.
+
+ ## What it does
+
+ Takes noisy supplier receipt OCR text and returns a structured JSON object with line items, quantities, prices, and supplier info. Designed for messy real-world receipts: handwritten bills, printed tax invoices, informal tally notes.
+
+ ## Training data
+
+ - 6 hand-authored examples from real kirana receipt formats
+ - 22 Modal LLM-generated synthetic examples augmenting edge cases
+ - Total: 28 examples; training focuses on format consistency over broad generalisation
+
+ ## Example
+
+ **Input:**
+ ```
+ MAHALAKSHMI MARKETING
+ No. 2816 Date: 27/5/26
+ Parle 1 X 2450 = 2450
+ Bingo(C) 4 X 870 = 3480
+ Subtotal 5930 Discount 612 Total 6542
+ ```
+
+ **Output:**
+ ```json
+ {
+   "supplier": "Mahalakshmi Marketing",
+   "invoice_no": "2816",
+   "date": "2026-05-27",
+   "items": [
+     {"product_raw": "Parle", "qty_cases": 1, "qty_units": 1, "unit_cost": 2450.0, "total": 2450.0},
+     {"product_raw": "Bingo(C)", "qty_cases": 4, "qty_units": 4, "unit_cost": 870.0, "total": 3480.0}
+   ],
+   "subtotal": 5930.0,
+   "discount": 612.0,
+   "gst": 0.0,
+   "net_total": 6542.0
+ }
+ ```
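
A parsed receipt like the sample output can be sanity-checked arithmetically before anyone relies on it. The sketch below is not part of the model card: the helper name `check_receipt` and the tolerance are assumptions, and it assumes each line's `total` should equal `qty_units * unit_cost` (in the sample, `qty_cases` and `qty_units` are identical, so the README does not say which one is the multiplier). It deliberately skips `net_total`, because in the sample the discount figure is added to the subtotal (5930 + 612 = 6542), so the sign convention is unclear.

```python
import json
import math


def check_receipt(parsed: dict, tol: float = 0.01) -> list[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    items = parsed.get("items", [])

    # Check 1: each line total matches quantity times unit cost (assumed rule).
    for i, item in enumerate(items):
        expected = item["qty_units"] * item["unit_cost"]
        if not math.isclose(expected, item["total"], abs_tol=tol):
            problems.append(
                f"item {i} ({item['product_raw']}): {expected} != {item['total']}"
            )

    # Check 2: line totals add up to the stated subtotal.
    line_sum = sum(item["total"] for item in items)
    if not math.isclose(line_sum, parsed["subtotal"], abs_tol=tol):
        problems.append(
            f"subtotal: items sum to {line_sum}, receipt says {parsed['subtotal']}"
        )
    return problems


# The sample output from the Example section, trimmed to the fields used here.
sample = json.loads("""
{
  "items": [
    {"product_raw": "Parle", "qty_units": 1, "unit_cost": 2450.0, "total": 2450.0},
    {"product_raw": "Bingo(C)", "qty_units": 4, "unit_cost": 870.0, "total": 3480.0}
  ],
  "subtotal": 5930.0
}
""")

print(check_receipt(sample))  # prints []
```

A non-empty result is a reasonable signal to route the receipt to manual review rather than to reject it outright, since OCR noise can corrupt either side of the comparison.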
+
+ ## Inference
+
+ ```python
+ from huggingface_hub import InferenceClient
+
+ client = InferenceClient()
+ prompt = """### Instruction:
+ You are a receipt parser for an Indian convenience store. Extract all line items. Return ONLY valid JSON, no markdown.
+
+ ### Input:
+ <paste receipt text here>
 
+ ### Response:
+ """
+ result = client.text_generation(prompt, model="summerdevlin46/dukaan-saathi-receipt-lora", max_new_tokens=768)
+ ```
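
The prompt asks for JSON only, but a model trained on 28 examples may still surround its answer with extra text. A minimal sketch of a defensive parser follows, assuming `result` is the plain string returned by `text_generation`. The helper name `extract_json` and the `raw` string are made up for illustration and are not real model output.

```python
import json


def extract_json(text: str) -> dict:
    """Parse the first JSON object found in `text`; raise ValueError if none parses."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            obj, _ = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass  # not valid JSON from this brace; try the next one
        start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


# Illustrative string only: JSON wrapped in leading and trailing text.
raw = 'Sure! {"supplier": "Mahalakshmi Marketing", "items": []}\n### Instruction:'
print(extract_json(raw)["supplier"])  # prints Mahalakshmi Marketing
```

Raising on failure, rather than returning an empty dict, keeps an unparseable response from being mistaken for a receipt with no items.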
 
+ ## Limitations
 
+ - Small training set; overfits to known receipt styles (Mahalakshmi Marketing, Sri Venkateshwara Marketing, Brundavan Buns)
+ - Owner approval gate always required before any inventory write
+ - Not a general-purpose receipt parser
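
The approval-gate limitation can be enforced in code rather than left to convention. The sketch below is hypothetical: `apply_receipt`, `ApprovalRequired`, and the dict-based `inventory` are illustrative names and structures, not taken from the Dukaan Saathi code. It also assumes stock is counted in `qty_units`.

```python
class ApprovalRequired(Exception):
    """Raised when an inventory write is attempted without owner approval."""


def apply_receipt(inventory: dict, parsed: dict, owner_approved: bool) -> dict:
    """Return a new inventory with the receipt's quantities added, only if approved."""
    if not owner_approved:
        raise ApprovalRequired("owner must approve the parsed receipt before any write")
    updated = dict(inventory)  # copy, so the caller's inventory is never mutated
    for item in parsed["items"]:
        name = item["product_raw"]
        updated[name] = updated.get(name, 0) + item["qty_units"]
    return updated


parsed = {
    "items": [
        {"product_raw": "Parle", "qty_units": 1},
        {"product_raw": "Bingo(C)", "qty_units": 4},
    ]
}
stock = apply_receipt({"Parle": 2}, parsed, owner_approved=True)
print(stock)  # prints {'Parle': 3, 'Bingo(C)': 4}
```

Making approval a required argument means no code path can write to inventory without stating explicitly whether the owner signed off.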