---
library_name: transformers
tags:
- llama
- invoice-extraction
- sft
- gguf
---

# Llama-base-3.1-8B-invoice-gguf-sft

A fine-tuned Llama-3.1-8B model optimized for **invoice understanding and extraction**.
This version is exported in **GGUF** format for efficient local inference with tools such as **llama.cpp**, **Ollama**, and **text-generation-webui**.

---

## Model Details

### Model Description

This model adapts Llama-3.1-8B for structured invoice field extraction. It is intended for tasks such as reading invoice text and identifying key fields (amount, date, vendor, tax, line items, etc.).

- **Developed by:** *muhammed-afsal-p-m*
- **Model type:** Autoregressive language model (decoder-only)
- **Languages:** English (primary); other languages not verified
- **License:** *Fill in (e.g., MIT, Apache-2.0, or other)*
- **Fine-tuned from:** Llama-3.1-8B (Meta)

### Model Sources

- **Repository:** https://huggingface.co/muhammed-afsal-p-m/Llama-base-3.1-8B-invoice-gguf-sft

---

## Uses

### Direct Use

Useful for:

- Invoice text understanding
- Extracting structured fields
- Document parsing prototypes
- Local inference via GGUF

### Downstream Use

Can be integrated into:

- RPA invoice pipelines
- Accounting automation
- OCR → LLM extraction stages
- Document indexing/search systems

### Out-of-Scope Use

Not suited for:

- Legal/financial decision-making without human review
- High-stakes extraction requiring guaranteed accuracy
- Multi-language invoice parsing (not validated)
- Vision-based tasks (the model takes text only; run OCR separately)

---

## Bias, Risks, and Limitations

- Model accuracy depends heavily on the **quality and consistency** of the invoice text.
- The model may hallucinate values for missing fields instead of explicitly reporting them as absent.
- Invoices vary widely in structure; unseen formats may reduce reliability.
- Any biases in the training data (invoice styles, languages, domain distribution) affect the output.

### Recommendations

- Always verify extracted results.
- Use deterministic decoding when consistent outputs are required.
- Validate outputs with rule-based post-processing.

---

## How to Get Started

```python
# For GGUF, use llama.cpp / ctransformers:
from ctransformers import AutoModelForCausalLM

model_name = "muhammed-afsal-p-m/Llama-base-3.1-8B-invoice-gguf-sft"

# Load the GGUF file from the Hub repository.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    model_file="model.gguf",  # replace with your file name
    model_type="llama",       # architecture hint for ctransformers
)

# Calling the model with a prompt returns the generated text.
print(model("Extract invoice total from: ..."))
```
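
The recommendation above to validate outputs with rule-based post-processing could look roughly like the sketch below. It assumes the model was prompted to answer with a JSON object; the field names (`vendor`, `date`, `total`), the ISO date format, and the `validate_invoice` helper are hypothetical and are not defined by this model card, so adapt them to your own prompt and schema.

```python
import json
import re
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical field names; this model card does not define an output schema.
REQUIRED_FIELDS = ("vendor", "date", "total")


def validate_invoice(raw_output):
    """Parse model output as JSON and return (cleaned_fields, problems)."""
    problems = []

    # Take the span from the first "{" to the last "}" in case the model
    # wraps the JSON object in extra text.
    match = re.search(r"\{.*\}", raw_output, flags=re.DOTALL)
    if match is None:
        return {}, ["no JSON object found"]
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError as exc:
        return {}, [f"invalid JSON: {exc.msg}"]

    # Report absent or empty fields instead of silently accepting them.
    cleaned = {}
    for field in REQUIRED_FIELDS:
        value = data.get(field)
        if value in (None, ""):
            problems.append(f"missing field: {field}")
        else:
            cleaned[field] = value

    # Totals must parse as a non-negative decimal once currency symbols
    # and thousands separators are stripped.
    if "total" in cleaned:
        text = re.sub(r"[^\d.\-]", "", str(cleaned["total"]))
        try:
            amount = Decimal(text)
            if amount < 0:
                problems.append("total is negative")
            cleaned["total"] = amount
        except InvalidOperation:
            problems.append(f"total is not a number: {cleaned.pop('total')!r}")

    # Dates are assumed to be ISO formatted (YYYY-MM-DD).
    if "date" in cleaned:
        try:
            cleaned["date"] = datetime.strptime(str(cleaned["date"]), "%Y-%m-%d").date()
        except ValueError:
            problems.append(f"date is not YYYY-MM-DD: {cleaned.pop('date')!r}")

    return cleaned, problems


# Example with a made-up model response.
fields, problems = validate_invoice(
    'Result: {"vendor": "Acme Ltd", "date": "2024-03-15", "total": "$1,234.50"}'
)
print(fields, problems)
```

Returning a list of problems rather than raising on the first one lets a pipeline route incomplete extractions to human review, in line with the recommendation to always verify extracted results.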