---
library_name: peft
tags:
- Prediction
- Summarisation
- Legal NLP
- LoRA
- Flan-T5
datasets:
- L-NLProc/PredEx_Instruction-Tuning_Pred-Exp
language:
- en
metrics:
- bertscore
- rouge
- accuracy
base_model:
- google/flan-t5-large
pipeline_tag: summarization
---
# Legal Text Summarisation & Outcome Prediction Engine

## Model Details

### Model Description

This model is a LoRA adapter for `google/flan-t5-large`, trained with parameter-efficient fine-tuning (PEFT) on dense Indian legal documents. It handles two tasks in a single output:
1. **Binary Classification:** Predicting the final verdict of a legal case (`1 = Accepted`, `0 = Rejected`).
2. **Abstractive Summarisation:** Generating a concise, legally accurate explanation detailing the reasoning behind the verdict.

Standard summarisation models often struggle with lengthy legalese and can hallucinate facts. This model is fine-tuned to process dense legal context and produce the legal reasoning behind a verdict, and its outputs are evaluated with semantic (BERTScore) and lexical (ROUGE) metrics.
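To make the idea of a lexical metric concrete, the sketch below computes a simplified unigram-overlap F1 in the spirit of ROUGE-1. It is an illustration only: the function name `rouge1_f1` and the whitespace tokenisation are assumptions made for this example, and it is not the scoring code used to evaluate this model. Official ROUGE implementations add their own tokenisation and optional stemming.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified unigram-overlap F1 (ROUGE-1 style), lowercased and whitespace-tokenised."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: a token counts only as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference and generated explanations, for illustration only.
score = rouge1_f1("the appeal is dismissed", "the appeal is allowed")
print(f"ROUGE-1 style F1: {score:.2f}")
```

A lexical score like this rewards shared wording, while a semantic metric such as BERTScore compares contextual embeddings and can give credit to paraphrases.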

- **Developed by:** Swastik Sharma
- **Model type:** Text-to-Text Generation (T5 Architecture with LoRA adapter)
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from base model:** `google/flan-t5-large`

## Uses

### Direct Use

This model is intended for legal tech developers, researchers, and law students who need to parse Indian legal judgments quickly. Given the raw text of a court case, the model outputs a predicted label and a generated summary of the legal rationale.
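Because the label and the explanation arrive together in one generated string, downstream code usually needs to split them. The helper below is a minimal sketch under a stated assumption: this card does not document the exact output format, so the code assumes the label appears as the first standalone `0` or `1` and treats the remaining text as the explanation. `Prediction` and `parse_prediction` are hypothetical names introduced for this example.

```python
import re
from typing import NamedTuple, Optional


class Prediction(NamedTuple):
    label: Optional[int]  # 1 = accepted, 0 = rejected, None if no label was found
    explanation: str


def parse_prediction(generated: str) -> Prediction:
    """Split decoded model output into a label and an explanation.

    Assumption: the first standalone 0 or 1 in the text is the label.
    """
    match = re.search(r"\b([01])\b", generated)
    if match is None:
        # No recognisable label: return the whole text as the explanation.
        return Prediction(None, generated.strip())

    label = int(match.group(1))
    # Drop separators such as ".", ":" or "-" that follow the label.
    explanation = generated[match.end():].lstrip(" \t\n.:,;-").strip()
    return Prediction(label, explanation)


# Hypothetical output string, for illustration only.
result = parse_prediction("1. The appeal is allowed because the order was passed without notice.")
print(result.label, "|", result.explanation)
```

If the real output places text such as "Section 1" before the label, this heuristic would misread it, so check the pattern against actual generations before relying on it.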

### Out-of-Scope Use

This model **does not provide legal advice**. It is an AI research tool intended for summarisation and prediction based on historical precedent. It should not be used as a substitute for professional legal counsel or human judicial oversight. 

## Bias, Risks, and Limitations

Legal documents often contain sensitive histories and societal biases. As this model was fine-tuned on historical Indian court cases (`PredEx`), it may reflect historical biases present in the judicial system. Furthermore, generative language models can occasionally hallucinate; therefore, the generated explanations should always be verified against the original court documents before being used in any official legal capacity.

## How to Get Started with the Model

Because this repository contains only a LoRA adapter, load the base `google/flan-t5-large` model first and then attach the adapter weights with the `peft` library.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer
from peft import PeftModel
import torch

base_model_id = "google/flan-t5-large"
peft_model_id = "Beelzi/legal-flan-t5-large-lora"

# Load Tokenizer
tokenizer = T5Tokenizer.from_pretrained(peft_model_id)

# Load the base model in FP16 to save VRAM
# (device_map="auto" requires the `accelerate` package)
base_model = T5ForConditionalGeneration.from_pretrained(
    base_model_id, 
    torch_dtype=torch.float16,
    device_map="auto"
)

# Apply LoRA Adapter
model = PeftModel.from_pretrained(base_model, peft_model_id)

# Example Inference
legal_text = "the High Court under sub-section (4) thereof. The expression appeal has not been defined in the CPC..."
prompt = f"Predict the legal outcome of this Indian court case. Output the label (0=rejected, 1=accepted) and a brief explanation.\n\nCase: {legal_text}"

# Move the inputs to whichever device the model was placed on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    # Greedy decoding; `early_stopping` only affects beam search, so it is omitted
    outputs = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))