Gemma 4 E4B — CUAD Contract Review (LoRA)
A LoRA adapter that fine-tunes Gemma 4 E4B (instruct) for legal contract clause review, trained on CUAD (the Contract Understanding Atticus Dataset).
Given a contract excerpt and a clause question (e.g. "Highlight the parts related to 'Governing Law'…"), the model extracts the exact relevant span from the excerpt, or answers "Not found" when the clause is absent.
- Base model: unsloth/gemma-4-E4B-it (trained on the unsloth/gemma-4-E4B-it-unsloth-bnb-4bit 4-bit variant)
- Method: QLoRA SFT with Unsloth + TRL (r=16, alpha=16), language layers only
- Task: extractive QA / clause highlighting over 41 CUAD clause categories
- Adapter size: ~37M trainable params (0.46% of the base)
Results
Evaluated on held-out contracts (10% of CUAD, split by contract so that no contract text appears in both training and evaluation), using SQuAD-2.0-style scoring on 80 questions (40 answerable / 40 not-found):
| Metric | Base Gemma 4 | + this LoRA | Δ |
|---|---|---|---|
| Overall Exact Match | 56.2 | 76.2 | +20.0 |
| Overall token-F1 | 73.9 | 89.4 | +15.6 |
| Answerable Exact Match | 15.0 | 55.0 | +40.0 |
| Answerable token-F1 | 50.2 | 81.4 | +31.1 |
| Not-found abstention acc | 97.5 | 97.5 | +0.0 |
The base model already abstains well on absent clauses; the fine-tune's main gain is returning exact clause spans instead of verbose paraphrases, with no regression on the not-found class.
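The scoring script itself is not part of this card. As a rough sketch, SQuAD-2.0-style Exact Match and token-F1 can be computed as below. The normalization follows the usual SQuAD convention (lowercase, strip punctuation and English articles, collapse whitespace); treating the literal reply "Not found" as the empty "no answer" case is an assumption, and the helper names are illustrative.

```python
import re
import string
from collections import Counter

NOT_FOUND = "Not found"  # abstention string from the system prompt


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def is_abstention(text: str) -> bool:
    """Assumption: an empty reply or 'Not found' both count as 'no answer'."""
    return normalize(text) in ("", normalize(NOT_FOUND))


def exact_match(prediction: str, gold: str) -> float:
    # If either side abstains, the score is 1 only when both abstain.
    if is_abstention(prediction) or is_abstention(gold):
        return float(is_abstention(prediction) and is_abstention(gold))
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    if is_abstention(prediction) or is_abstention(gold):
        return float(is_abstention(prediction) and is_abstention(gold))
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this scoring, a verbose answer that contains the gold span still earns partial token-F1 but zero Exact Match, which is consistent with the gap between the two answerable rows for the base model.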
Usage
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor
from peft import PeftModel
BASE = "unsloth/gemma-4-E4B-it"
ADAPTER = "soybelli/gemma-4-E4B-it-cuad-lora"
processor = AutoProcessor.from_pretrained(BASE)
model = AutoModelForImageTextToText.from_pretrained(BASE, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, ADAPTER).eval()
SYSTEM = (
    "You are a legal contract review assistant. Read the contract excerpt and answer the question. "
    "If the relevant clause is present, quote the exact text from the excerpt. "
    "If it is not present, reply exactly: Not found."
)
excerpt = "...THIS AGREEMENT shall be governed by and construed in accordance with the laws of the State of New York..."
question = (
    'Highlight the parts (if any) of this contract related to "Governing Law" that should be reviewed by a '
    "lawyer. Details: Which state/country's law governs the interpretation of the contract?"
)
prompt = f"{SYSTEM}\n\n### Contract excerpt:\n{excerpt}\n\n### Question:\n{question}\n\n### Answer:"
messages = [{"role": "user", "content": [{"type": "text", "text": prompt}]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True, return_tensors="pt", return_dict=True
).to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(processor.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True).strip())
Training
- Data: CUAD SQuAD-2.0 JSON (CUAD_v1/CUAD_v1.json), flattened to (excerpt, question, answer) pairs. Excerpts are ~2,400-char windows centered on the answer span; the natural ~32% answerable / 68% not-found distribution is kept. 18,759 training examples (10% of contracts held out for eval).
- Config: 1 epoch, max_seq_length=1024, effective batch 16, lr 2e-4, adamw_8bit, bf16, trained only on the assistant response. Final train loss ≈ 0.14.
- Hardware: single NVIDIA RTX 5090 (~95 min).
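The preprocessing code is not included here. The windowing described in the Data bullet could look roughly like the sketch below, which cuts a window of up to 2,400 characters centered on the answer span and clamps it to the contract boundaries. The function name and the clamping behavior are assumptions, and the card does not say how windows are chosen for not-found examples, so that case is left out.

```python
from typing import Tuple


def centered_window(
    contract: str, answer_start: int, answer_len: int, window_chars: int = 2400
) -> Tuple[str, int]:
    """Return (excerpt, offset): a window of up to `window_chars` characters
    centered on the answer span, plus the window's start offset in `contract`."""
    center = answer_start + answer_len // 2
    start = max(0, center - window_chars // 2)
    end = min(len(contract), start + window_chars)
    # If the window ran into the end of the contract, shift it left so it
    # keeps its full width whenever the contract is long enough.
    start = max(0, end - window_chars)
    return contract[start:end], start
```

The returned offset lets the answer's position be re-expressed relative to the excerpt (`answer_start - offset`), which is useful for checking that the gold span survived the cut intact.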
Limitations
- Trained on ~2,400-char (≈600-token) excerpts; for full contracts, chunk the document and query each chunk.
- English commercial contracts only; not legal advice.
- Inherits the base model's and CUAD's biases. CUAD is licensed CC-BY-4.0.
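For full contracts, the chunk-and-query approach from the first limitation can be sketched as follows. This is a minimal illustration, not the author's pipeline: the 400-character overlap, the function names, and the `ask` callable (a stand-in for whatever wraps the model call from the Usage section) are all assumptions.

```python
from typing import Callable, List, Tuple


def chunk_contract(
    text: str, chunk_chars: int = 2400, overlap: int = 400
) -> List[Tuple[int, str]]:
    """Split `text` into overlapping chunks; returns (start_offset, chunk) pairs."""
    if overlap >= chunk_chars:
        raise ValueError("overlap must be smaller than chunk_chars")
    step = chunk_chars - overlap
    chunks = []
    start = 0
    while True:
        chunks.append((start, text[start : start + chunk_chars]))
        if start + chunk_chars >= len(text):
            break
        start += step
    return chunks


def review_contract(
    text: str, question: str, ask: Callable[[str, str], str]
) -> List[Tuple[int, str]]:
    """Query every chunk and collect (position_in_contract, span) hits.

    `ask(excerpt, question)` is a hypothetical wrapper around the model call.
    The position is -1 when the reply is not a verbatim substring of the chunk.
    """
    hits: List[Tuple[int, str]] = []
    for offset, chunk in chunk_contract(text):
        answer = ask(chunk, question).strip()
        if not answer or answer.rstrip(".").lower() == "not found":
            continue
        pos = chunk.find(answer)
        hit = (offset + pos if pos >= 0 else -1, answer)
        if hit not in hits:  # overlapping chunks can return the same span twice
            hits.append(hit)
    return hits
```

The overlap exists so that a clause straddling a chunk boundary still appears whole in at least one chunk. Replies that do not occur verbatim in the chunk are kept but flagged with position -1, since they may be paraphrases rather than exact spans.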