---
language:
- en
license: apache-2.0
base_model: unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit
tags:
- legal
- indian-law
- BNS
- BNSS
- BSA
- criminal-law
- qwen
- qwen2.5
- gguf
- llama.cpp
- ollama
- qlora
- unsloth
- domain-adaptation
- instruction-tuning
- question-answering
- law
- india
datasets:
- GSMS-B/Indian-Legal-QA-BNS-BNSS-BSA
pipeline_tag: text-generation
---

# ⚖️🐉 Indian Legal Qwen 2.5 — 1.5B (GGUF)

<p align="center">
  <img src="https://img.shields.io/badge/Base%20Model-Qwen%202.5%201.5B-6366F1?style=for-the-badge" alt="Base Model"/>
  <img src="https://img.shields.io/badge/Type-GGUF%20Quantized-A855F7?style=for-the-badge" alt="Type"/>
  <img src="https://img.shields.io/badge/Domain-Indian%20Criminal%20Law-DC2626?style=for-the-badge" alt="Domain"/>
  <img src="https://img.shields.io/badge/Method-QLoRA-2563EB?style=for-the-badge" alt="Method"/>
  <img src="https://img.shields.io/badge/Acts-BNS%20%7C%20BNSS%20%7C%20BSA-16A34A?style=for-the-badge" alt="Acts"/>
  <img src="https://img.shields.io/badge/License-Apache%202.0-F59E0B?style=for-the-badge" alt="License"/>
</p>

> 🟡 **This is the GGUF-quantized version** — for CPU inference via Ollama or llama.cpp. For full-precision inference see the [Merged Model](https://huggingface.co/GSMS-B/Indian-Legal-Qwen2.5-1.5B) · For lightweight adapter loading see the [Adapter](https://huggingface.co/GSMS-B/Indian-Legal-Qwen2.5-1.5B-Adapter).

---

## 📖 Model Description

**Indian Legal Qwen 2.5 — 1.5B (GGUF)** is a quantized, CPU-friendly version of [`GSMS-B/Indian-Legal-Qwen2.5-1.5B`](https://huggingface.co/GSMS-B/Indian-Legal-Qwen2.5-1.5B), a domain-adapted model fine-tuned using QLoRA on a structured question-answer dataset covering all **1,059 sections** of India's three landmark 2023 criminal justice reform acts:

| Act | Full Name | Replaces | Sections |
|---|---|---|---|
| 📕 **BNS 2023** | Bharatiya Nyaya Sanhita | IPC 1860 | 358 |
| 📗 **BNSS 2023** | Bharatiya Nagarik Suraksha Sanhita | CrPC 1973 | 531 |
| 📘 **BSA 2023** | Bharatiya Sakshya Adhiniyam | Indian Evidence Act 1872 | 170 |

The model was trained on **6,354 instruction-format QA pairs**: 6 question types per section, covering definitions, scenarios, legal elements, exceptions, and consequences. This gives it broad, structured coverage of India's reformed criminal law framework. As the smallest model in the family, this GGUF build is well suited to fast, fully offline CPU inference.
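
The pair count follows directly from the section counts in the table above. A quick arithmetic check, using only the figures stated in this card:

```python
# Section counts per act, as listed in the table above.
SECTIONS = {"BNS 2023": 358, "BNSS 2023": 531, "BSA 2023": 170}
QUESTION_TYPES_PER_SECTION = 6

total_sections = sum(SECTIONS.values())
total_pairs = total_sections * QUESTION_TYPES_PER_SECTION

print(total_sections)  # 1059
print(total_pairs)     # 6354
```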

---

## 🔗 Model Family — Qwen 2.5 1.5B

| Variant | Repo | Best For |
|---|---|---|
| 🟢 **Merged** | [GSMS-B/Indian-Legal-Qwen2.5-1.5B](https://huggingface.co/GSMS-B/Indian-Legal-Qwen2.5-1.5B) | Out-of-the-box inference, Gradio / API deployment |
| 🔵 **LoRA Adapter** | [GSMS-B/Indian-Legal-Qwen2.5-1.5B-Adapter](https://huggingface.co/GSMS-B/Indian-Legal-Qwen2.5-1.5B-Adapter) | Lightweight loading on top of base model |
| 🟡 **GGUF (this repo)** | `GSMS-B/Indian-Legal-Qwen2.5-1.5B-GGUF` | CPU inference via Ollama / llama.cpp |

---

## 🚀 Quick Start

### 💻 Run with Ollama
```bash
ollama run hf.co/GSMS-B/Indian-Legal-Qwen2.5-1.5B-GGUF
```

### ⚙️ Run with llama.cpp
```bash
./llama-cli \
  -hf GSMS-B/Indian-Legal-Qwen2.5-1.5B-GGUF \
  -p "What is a Zero FIR under BNSS 2023?" \
  -n 300 \
  --temp 0.1
```

### 🐍 Run with llama-cpp-python
```python
from llama_cpp import Llama

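# Downloads the GGUF file from the Hugging Face Hub on first use
# (requires the huggingface-hub package).
# `filename` is a glob pattern and must match exactly one file in the repo.
llm = Llama.from_pretrained(
    repo_id="GSMS-B/Indian-Legal-Qwen2.5-1.5B-GGUF",
    filename="*.gguf",
)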

SYSTEM = "You are an expert legal assistant specializing in Indian criminal law — BNS, BNSS, and BSA 2023."

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "What is a Zero FIR under BNSS 2023?"}
    ],
    temperature=0.1,
    max_tokens=300
)

print(response["choices"][0]["message"]["content"])
```

---

## 🎯 Recommended Use Cases

> ⚠️ **Important Note:** This model has been domain-adapted on structured QA data and works best as a **component in a larger pipeline** rather than a standalone answer engine. Direct usage without retrieval context may produce incomplete or imprecise answers on complex legal queries.

### ✅ Where this model excels

| Use Case | 💡 How to Use |
|---|---|
| 🔍 **RAG Pipeline** | Pair with a BM25 or vector retriever over BNS/BNSS/BSA texts; feed retrieved sections as context for grounded, citation-backed answers |
| 🤖 **Legal Chatbot Backend** | Use as the generation backbone of a legal assistant app with a ChromaDB / FAISS document store |
| 📚 **Legal Education Tool** | Build interactive Q&A apps for law students and practitioners learning the 2023 criminal justice reforms |
| 🔎 **Section Lookup Assistant** | Combine with a section index to surface the exact BNS / BNSS / BSA provision relevant to a given situation |
| 💻 **Offline / Edge Deployment** | Smallest model in the family, runnable on consumer CPUs without a GPU — ideal for local apps, kiosks, or low-resource environments |
| 📝 **Structured Legal Summarization** | Summarize individual sections when the section text is supplied as input context |
| 🏛️ **Legal NLP Research** | Benchmark Indian criminal law understanding across model families (Qwen vs Llama) |
| ⚖️ **Comparative Law Analysis** | Highlight differences between old acts (IPC/CrPC/IEA) and their 2023 replacements |
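
As a rough illustration of the RAG pattern in the first row, the sketch below retrieves the best-matching section and packs it into a chat prompt. It makes several assumptions: the retriever is a naive token-overlap scorer standing in for BM25 or a vector store, the corpus entries are placeholders rather than real statutory text, and `tokenize`, `retrieve`, and `build_messages` are hypothetical helper names, not part of any library.

```python
import re

# Placeholder corpus: replace the values with real section text from your own
# copy of the acts. The keys and strings here are illustrative only.
CORPUS = {
    "section-a": "PLACEHOLDER text about zero FIR registration",
    "section-b": "PLACEHOLDER text about bail conditions",
    "section-c": "PLACEHOLDER text about electronic evidence",
}

SYSTEM = (
    "You are an expert legal assistant specializing in Indian criminal law: "
    "BNS, BNSS, and BSA 2023."
)


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, corpus, k=2):
    """Return the k (section_id, text) pairs sharing the most tokens with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        corpus.items(),
        key=lambda item: len(query_tokens & tokenize(item[1])),
        reverse=True,
    )
    return ranked[:k]


def build_messages(query, corpus, k=2):
    """Build a chat message list with the retrieved sections as context."""
    hits = retrieve(query, corpus, k)
    context = "\n\n".join(f"[{section_id}] {text}" for section_id, text in hits)
    user = (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the context above."
    )
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": user},
    ]


messages = build_messages("What is a Zero FIR under BNSS 2023?", CORPUS, k=1)
print(messages[1]["content"])
```

The resulting `messages` list can be passed to `llm.create_chat_completion(messages=messages, temperature=0.1)` as in the Quick Start. In a real pipeline the overlap scorer would be swapped for a proper retriever.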

### ❌ Not recommended for
- Standalone legal advice without a retrieval component
- High-stakes legal decisions without qualified human review
- Jurisdictions or acts outside BNS / BNSS / BSA 2023

---

## 🏋️ Training Details

| Property | Value |
|---|---|
| 🤖 Base model | `unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit` |
| 🔧 Fine-tuning method | QLoRA |
| 🎛️ LoRA rank | 64 |
| 🎛️ LoRA alpha | 128 |
| 🧩 Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| 📊 Training data | 6,354 QA pairs — 1,059 sections × 6 question types |
| 🔁 Epochs | 3 |
| 📦 Batch size (per device) | 4 |
| 📈 Learning rate | 2e-4 |
| ⚙️ Optimizer | adamw_8bit |
| 💻 Hardware | Google Colab T4 GPU |
| 🛠️ Framework | Unsloth + TRL SFTTrainer |
| 💬 Prompt format | ChatML |
| 🗜️ Quantization | GGUF (converted from merged FP16 model) |
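
Two quantities can be derived from the table above: the LoRA scaling factor and an estimate of optimizer steps. The sketch assumes standard (not rank-stabilized) LoRA, where the adapter update is scaled by alpha / rank. The step estimate further assumes a single GPU, no gradient accumulation, and that all 6,354 pairs were used for training; none of these three is stated in this card, so treat the step counts as illustrative.

```python
import math

# Values taken from the training table.
LORA_RANK = 64
LORA_ALPHA = 128
NUM_EXAMPLES = 6354
PER_DEVICE_BATCH = 4
EPOCHS = 3

# ASSUMPTIONS: not stated in the card.
GRAD_ACCUM_STEPS = 1
NUM_DEVICES = 1

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scale = LORA_ALPHA / LORA_RANK

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS

print(lora_scale)       # 2.0
print(steps_per_epoch)  # 1589
print(total_steps)      # 4767
```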

---

## 📊 Training Dataset

| 📂 Dataset | 🔗 Link |
|---|---|
| Indian Legal QA — BNS + BNSS + BSA 2023 | [GSMS-B/Indian-Legal-QA-BNS-BNSS-BSA](https://huggingface.co/datasets/GSMS-B/Indian-Legal-QA-BNS-BNSS-BSA) |

**6 question types per section:**
`definitional_topic` · `definitional_section` · `scenario` · `elements` · `exceptions` · `consequence`
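
To make the instruction format concrete, here is a minimal sketch of how one QA record might be rendered in ChatML. The exact template and dataset schema used in training are not shown in this card: the field names (`question_type`, `question`, `answer`), the placeholder strings, and the `to_chatml` helper are all hypothetical, and the layout follows the common ChatML convention used by Qwen2.5.

```python
# The six question types listed above.
QUESTION_TYPES = [
    "definitional_topic",
    "definitional_section",
    "scenario",
    "elements",
    "exceptions",
    "consequence",
]


def to_chatml(system, question, answer=None):
    """Render one exchange as ChatML text.

    With an answer, the result is a complete training example. Without one,
    it ends with an open assistant turn, ready for generation.
    """
    parts = [
        f"<|im_start|>system\n{system}<|im_end|>",
        f"<|im_start|>user\n{question}<|im_end|>",
    ]
    if answer is None:
        parts.append("<|im_start|>assistant\n")
    else:
        parts.append(f"<|im_start|>assistant\n{answer}<|im_end|>")
    return "\n".join(parts)


# Hypothetical record; the field names and contents are placeholders.
record = {
    "question_type": "scenario",
    "question": "PLACEHOLDER question",
    "answer": "PLACEHOLDER answer",
}
assert record["question_type"] in QUESTION_TYPES

text = to_chatml("PLACEHOLDER system prompt", record["question"], record["answer"])
print(text)
```

When running the GGUF file through `create_chat_completion` or Ollama, the chat template embedded in the model is normally applied automatically, so manual formatting like this is mainly useful for raw-prompt inference or for inspecting training data.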

---

## 👤 Author

**GSMS-B** — Bugatha Ganasyam Mani Sankar
🤗 [Hugging Face Profile](https://huggingface.co/GSMS-B)

---

## ⚠️ Disclaimer

This model is intended for **research and educational purposes only**. It does not constitute legal advice. Outputs should not be relied upon for any legal decision without review by a qualified legal professional. The model's responses reflect patterns in training data and may contain errors or omissions.

---

*⚡ Fine-tuned using [Unsloth](https://github.com/unslothai/unsloth) for training efficiency.*