---
language:
- en

license: apache-2.0

tags:
- llama
- llama-3.2
- lora
- peft
- unsloth
- fine-tuning
- generative-ai
- conversational-ai
- personas
- india
- 4-bit
- instruction-tuning

datasets:
- NVIDIA/Nemotron-Personas

base_model: unsloth/Llama-3.2-3B-bnb-4bit

model-index:
- name: llama-Nemotron-Personas-India-finetuned
  results: []
---

# Llama 3.2 3B: Nemotron Personas India Fine-tuned

A parameter-efficient fine-tune of Llama 3.2 3B, trained on the NVIDIA Nemotron Personas dataset and tuned for Indian context and conversational style.

The model was fine-tuned with Unsloth for faster, more memory-efficient LoRA training.

---

# Model Overview

This model is designed to provide:

- More natural conversational responses
- Better understanding of Indian context
- Persona-aware dialogue
- Human-like assistant interactions
- More personalized responses

The model is fine-tuned using LoRA adapters on top of:

`unsloth/Llama-3.2-3B-bnb-4bit`
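
To give a sense of why LoRA is parameter-efficient, the sketch below counts the trainable parameters a rank-r adapter adds to a single weight matrix. The rank (16) and the 3072 x 3072 matrix shape are illustrative assumptions, not values taken from this model's training run, and the function names are hypothetical.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix the adapter sits on."""
    return d_in * d_out


# Assumed, illustrative values (not from this model card).
d_in, d_out, rank = 3072, 3072, 16

adapter = lora_param_count(d_in, d_out, rank)
full = full_param_count(d_in, d_out)
print(f"adapter params: {adapter:,}")
print(f"full matrix params: {full:,}")
print(f"trainable fraction: {adapter / full:.2%}")
```

Under these assumptions the adapter trains only a small fraction of the parameters in the matrix it modifies, which is what keeps fine-tuning within reach of a single consumer GPU.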

---

# Quick Start

## Install Requirements

```bash
pip install unsloth
pip install torch transformers peft accelerate bitsandbytes
```

---

## Load Model

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Sachin016/llama-Nemotron-Personas-India-finetuned",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)

FastLanguageModel.for_inference(model)
```

---

## Run Inference

```python
alpaca_prompt = """### Instruction:
{}

### Input:
{}

### Response:
{}"""

question = "Explain AI in simple words"

inputs = tokenizer(
    [alpaca_prompt.format(question, "", "")],
    return_tensors = "pt"
).to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens = 128,
    use_cache = True
)

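# Skip special tokens (such as end-of-text) so they do not leak into the answer,
# then keep only the text after the response marker.
decoded = tokenizer.batch_decode(outputs, skip_special_tokens = True)[0]
response = decoded.split("### Response:")[1].strip()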

print(response)
```

---

# Model Details

| Property | Value |
|---|---|
| Base Model | unsloth/Llama-3.2-3B-bnb-4bit |
| Fine-tuning Method | LoRA |
| Quantization | 4-bit |
| Framework | Unsloth + HuggingFace PEFT |
| Max Sequence Length | 2048 |
| Language | English |
| Training Type | Instruction Fine-tuning |
| Domain | Conversational AI |
| Specialization | Persona-based responses |

---

# Dataset

This model was fine-tuned on the NVIDIA Nemotron Personas dataset.

The dataset is intended to improve:

- Conversational quality
- Personality consistency
- Human-like dialogue generation
- Contextual response handling

---

# Training Configuration

```python
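from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir = "outputs",              # placeholder path; not part of the original config
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 4,     # effective batch size: 2 x 4 = 8 per device
    warmup_steps = 5,
    num_train_epochs = 3,
    learning_rate = 2e-4,
    optim = "adamw_8bit",                # 8-bit AdamW, provided by bitsandbytes
)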
```
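
The batch settings above combine into an effective batch size, which in turn determines roughly how many optimizer steps a run takes. The sketch below works through that arithmetic. The dataset size of 1,000 examples and the single-GPU setup are hypothetical, since this card does not state them, and the trainer's own rounding may differ slightly.

```python
import math

# Values taken from the training configuration above.
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
num_train_epochs = 3

# Hypothetical values for illustration only (not stated in this card).
num_gpus = 1
num_examples = 1_000

# Examples consumed per optimizer update.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus

# Approximate number of optimizer updates per epoch and over the whole run.
steps_per_epoch = math.ceil(num_examples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(f"effective batch size: {effective_batch_size}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps: {total_steps}")
```

With a step count in this range, `warmup_steps = 5` covers only the first few updates, so the learning rate reaches its peak almost immediately.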

---

# Prompt Format

This model uses the Alpaca instruction format.

```text
### Instruction:
<your instruction>

### Input:
<optional context>

### Response:
<model output>
```
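
The format above can be wrapped in two small helpers, one to build the prompt and one to pull the answer out of the decoded text. This is a minimal sketch: `build_prompt` and `extract_response` are hypothetical names, not part of Unsloth or Transformers, and the example string stands in for real model output.

```python
ALPACA_TEMPLATE = """### Instruction:
{}

### Input:
{}

### Response:
{}"""

RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str, context: str = "") -> str:
    """Fill the template, leaving the response slot empty for generation."""
    return ALPACA_TEMPLATE.format(instruction, context, "")


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker, or "" if it is missing."""
    _, marker, tail = decoded.partition(RESPONSE_MARKER)
    return tail.strip() if marker else ""


prompt = build_prompt("Explain AI in simple words")
print(prompt)

# Stand-in for decoded model output: the prompt followed by generated text.
print(extract_response(prompt + "AI lets computers learn from data."))
```

Returning an empty string when the marker is missing avoids the `IndexError` that indexing the result of `split` would raise on unexpected output.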

---

# Upload to Hugging Face

```python
from huggingface_hub import login
from unsloth import FastLanguageModel

# Replace the placeholder with your own token; never commit a real token.
login(token="YOUR_HUGGINGFACE_TOKEN")

# Load the locally saved fine-tuned model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "/content/drive/MyDrive/llama_project/finetuned_model",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)

# After login(), push_to_hub uses the stored credentials.
repo_id = "Sachin016/llama-Nemotron-Personas-India-finetuned"
model.push_to_hub(repo_id)
tokenizer.push_to_hub(repo_id)
```

---

# Recommended Hardware

Recommended GPUs:

- NVIDIA T4
- NVIDIA A100
- RTX 3060+
- Any CUDA GPU with 8 GB or more of VRAM

The model can be fine-tuned on Google Colab using Unsloth.
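
As a rough check on the 8 GB guideline, the sketch below estimates the memory taken by the weights alone at two precisions. It assumes a round 3 billion parameters (from the "3B" in the model name) and ignores activations, the KV cache, LoRA adapters, optimizer state, and quantization overhead, so real usage will be higher. The helper name is hypothetical.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


# Assumed round figure taken from the "3B" in the model name.
num_params = 3_000_000_000

for bits in (16, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gib(num_params, bits):.1f} GiB")
```

The 4-bit weights take about a quarter of the 16-bit footprint, which leaves headroom on an 8 GB card for the other memory costs listed above.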

---

# Limitations

- May generate hallucinated information
- Performance depends on prompt quality
- Not intended for high-stakes uses such as medical or legal decision-making
- Knowledge is limited to the base model's training cutoff

---

# License

This project is released under the Apache 2.0 License.

The base model is governed by Meta's Llama 3.2 Community License.

---

# Acknowledgements

- Unsloth: optimized LLM fine-tuning
- Meta AI: Llama 3.2 base model
- NVIDIA: Nemotron Personas dataset
- Hugging Face: model hosting platform

---

# Author

Made with ❤️ by Sachin016

Hugging Face:
https://huggingface.co/Sachin016