---
tags:
- llama
- alpaca
- grit
- lora
- qlora
- instruction-tuning
- fine-tuned
base_model: openlm-research/open_llama_3b_v2
library_name: peft
license: apache-2.0
datasets:
- tatsu-lab/alpaca
language:
- en
pipeline_tag: text-generation
---

# OpenLlama-3B-v2 Fine-tuned with GRIT and QLoRA

This model is a fine-tuned version of [openlm-research/open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2) using the **GRIT** (Gradient Regularized Instruction Tuning) algorithm and **QLoRA** on the [Alpaca dataset](https://huggingface.co/datasets/tatsu-lab/alpaca).

The base model is quantized to 4-bit (NF4) to enable efficient fine-tuning.
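The snippet below is a minimal sketch of how the adapter might be loaded on top of the 4-bit base model. It assumes that the repository named in the citation URL holds a standard PEFT LoRA adapter, that `transformers`, `peft`, `bitsandbytes`, and `sentencepiece` are installed, and that prompts follow the standard Alpaca no-input template. This card does not state the prompt format, so the template is an assumption.

```python
BASE_MODEL_ID = "openlm-research/open_llama_3b_v2"
ADAPTER_ID = "Pritish92/open-llama-3b-v2-grit-alpaca"  # taken from the citation URL

# Assumption: the standard Alpaca no-input template (not confirmed by this card).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Wrap a bare instruction in the assumed Alpaca template."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


def load_model():
    """Load the NF4-quantized base model and attach the LoRA adapter."""
    # Heavy imports stay local so build_prompt works without the GPU stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,  # matches "float16 compute" in this card
    )
    # The OpenLLaMA authors recommend the slow tokenizer, hence use_fast=False.
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, use_fast=False)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID, quantization_config=quant_config, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return tokenizer, model


def generate(instruction: str, max_new_tokens: int = 128) -> str:
    """Generate a response for one instruction and return only the new text."""
    import torch

    tokenizer, model = load_model()
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Explain what LoRA is in one sentence.")` downloads the weights on first use and needs a CUDA GPU for the `bitsandbytes` kernels. For repeated calls, load the model once with `load_model()` and reuse it.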

## 🚀 Training Details

### GRIT Algorithm
- **K-FAC Updates**: Every 200 steps for second-order preconditioning
- **Neural Reprojection**: Every 500 steps for rank optimization
- **Optimized LoRA Modules**: Attention layers plus key MLP layers, following the GRIT design
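To make the update cadence concrete, the sketch below lists the steps at which each periodic operation would fire during the 732-step run reported under Training Metrics. It assumes steps are counted from 1 and that an operation triggers whenever the step number is a multiple of its interval. The actual trigger condition in the GRIT training loop may differ.

```python
KFAC_INTERVAL = 200          # from this card
REPROJECTION_INTERVAL = 500  # from this card
TOTAL_STEPS = 732            # from the Training Metrics section


def scheduled_steps(interval: int, total_steps: int) -> list[int]:
    """Return the 1-based steps that are multiples of `interval` (assumed trigger rule)."""
    return [step for step in range(1, total_steps + 1) if step % interval == 0]


kfac_steps = scheduled_steps(KFAC_INTERVAL, TOTAL_STEPS)
reprojection_steps = scheduled_steps(REPROJECTION_INTERVAL, TOTAL_STEPS)

print(kfac_steps)          # [200, 400, 600]
print(reprojection_steps)  # [500]
```

Under these assumptions the run sees three K-FAC updates and a single reprojection.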

### Fine-tuning Configuration
- **Base Model**: OpenLlama 3B v2
- **Quantization**: 4-bit (NF4) with float16 compute
- **LoRA Rank**: 64  
- **LoRA Alpha**: 128  
- **Batch Size**: 16 (per device)  
- **Gradient Accumulation**: 4 (Effective batch = 64)  
- **Learning Rate**: 5.0e-05  
- **Precision**: bf16 mixed precision  
- **Sequence Length**: 512 tokens  
- **Gradient Checkpointing**: Enabled
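The derived quantities in this configuration can be checked with a few lines of arithmetic. The sketch below assumes the conventional LoRA scaling factor `alpha / rank` and a single training device. The card gives the per-device batch size but not the device count, so `num_devices` is a hypothetical parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters copied from this card."""

    lora_rank: int = 64
    lora_alpha: int = 128
    per_device_batch_size: int = 16
    gradient_accumulation_steps: int = 4
    learning_rate: float = 5.0e-05
    max_seq_length: int = 512

    @property
    def lora_scaling(self) -> float:
        # Conventional LoRA scaling: the adapter output is multiplied by alpha / rank.
        return self.lora_alpha / self.lora_rank

    def effective_batch_size(self, num_devices: int = 1) -> int:
        # num_devices is an assumption; the card does not report it.
        return self.per_device_batch_size * self.gradient_accumulation_steps * num_devices

    def max_tokens_per_update(self, num_devices: int = 1) -> int:
        # Upper bound: every sequence padded or truncated to max_seq_length.
        return self.effective_batch_size(num_devices) * self.max_seq_length


config = FineTuneConfig()
print(config.lora_scaling)             # 2.0
print(config.effective_batch_size())   # 64
print(config.max_tokens_per_update())  # 32768
```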

### Performance Improvements
- ✅ **Faster Convergence**: K-FAC preconditioning aligns updates with curvature
- ✅ **Memory-Efficient**: 4-bit quantization (QLoRA) and gradient checkpointing reduce GPU memory use
- ✅ **Efficient Training**: Runs on Hugging Face `accelerate`

## 📊 Training Metrics
- **Total Steps**: 732
- **Final Training Loss**: 0.2282
- **Final Validation Loss**: 0.22849
- **BLEU (val)**: 0.2452
- **Trainable Params**: 42,598,400 (1.23% of total)
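As a sanity check on the trainable-parameter figure, the sketch below counts LoRA parameters for rank-64 adapters on the four attention projections of every decoder layer. The hidden size (3200) and layer count (26) are assumed from the published base model configuration and should be verified against its `config.json`. That the result matches the reported count is an observation about the arithmetic, not a confirmed description of which modules were adapted.

```python
HIDDEN_SIZE = 3200  # assumed from the base model config
NUM_LAYERS = 26     # assumed from the base model config
LORA_RANK = 64      # from this card
ATTENTION_PROJECTIONS = ("q_proj", "k_proj", "v_proj", "o_proj")

REPORTED_TRAINABLE = 42_598_400  # from this card
REPORTED_FRACTION = 0.0123       # 1.23%, from this card
TOTAL_STEPS = 732                # from this card
EFFECTIVE_BATCH = 64             # from this card, assuming one device


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """A LoRA adapter adds A (rank x in) and B (out x rank) to one linear layer."""
    return rank * (in_features + out_features)


per_layer = sum(
    lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK) for _ in ATTENTION_PROJECTIONS
)
attention_only_total = per_layer * NUM_LAYERS
print(attention_only_total)  # 42598400

# Total parameter count implied by the reported 1.23% share (about 3.46 billion).
implied_total = REPORTED_TRAINABLE / REPORTED_FRACTION
print(round(implied_total / 1e9, 2))  # 3.46

# Examples processed over the run, assuming a single device.
print(TOTAL_STEPS * EFFECTIVE_BATCH)  # 46848
```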

## 🏷️ Model Tags
- Instruction-tuned with GRIT and QLoRA
- GRIT-tuned Model
- 4-bit Quantized Model
- LoRA rank 64  
- Mixed precision (bf16)  
- Alpaca dataset fine-tuning  

## 📝 Algorithm Details
- **K-FAC Preconditioning** (natural gradient) and **Neural Reprojection** follow the GRIT method
- **Memory Efficient**: K-FAC covariance matrices are kept on the CPU to reduce GPU memory load
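For readers unfamiliar with K-FAC, the sketch below shows the generic preconditioning step on a toy 2x2 layer. K-FAC approximates a layer's Fisher matrix as a Kronecker product of the input covariance `A` and the output-gradient covariance `G`, so the natural-gradient update becomes `G^{-1} @ grad @ A^{-1}`. This is textbook K-FAC in pure Python, not the GRIT implementation. The helper names and the damping term are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def damped_inverse_2x2(m, damping):
    """Invert (m + damping * I) for a 2x2 matrix; damping keeps the inverse stable."""
    a, b = m[0][0] + damping, m[0][1]
    c, d = m[1][0], m[1][1] + damping
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def kfac_precondition(grad, input_cov, output_cov, damping=1e-3):
    """Return G^{-1} @ grad @ A^{-1} for a weight gradient of shape (out, in)."""
    g_inv = damped_inverse_2x2(output_cov, damping)
    a_inv = damped_inverse_2x2(input_cov, damping)
    return matmul(matmul(g_inv, grad), a_inv)


# Toy example: the first input feature has 4x the variance of the second,
# so its gradient column is scaled down by 4.
input_cov = [[4.0, 0.0], [0.0, 1.0]]
output_cov = [[1.0, 0.0], [0.0, 1.0]]
grad = [[4.0, 1.0], [4.0, 1.0]]

preconditioned = kfac_precondition(grad, input_cov, output_cov, damping=0.0)
print(preconditioned)  # [[1.0, 1.0], [1.0, 1.0]]
```

The raw gradient is dominated by the high-variance input direction. After preconditioning, both directions receive updates of equal size, which is the sense in which K-FAC aligns updates with curvature.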

## 🏆 Results
In benchmark comparisons, GRIT has shown **faster convergence and better stability** than standard LoRA or full fine-tuning, which makes it well suited to efficient single-epoch training.

## 📝 Citation
If you use this model, please cite:
```bibtex
@misc{grit-openllama-3b-alpaca,
  title={OpenLlama 3B v2 Fine-tuned with GRIT on Alpaca},
  author={Pritish92},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/Pritish92/open-llama-3b-v2-grit-alpaca}
}
```

## ⚖️ License
This model inherits the Apache 2.0 license from the base model.