---
tags:
- llama3
- alpaca
- grit
- lora
- qlora
- instruction-tuning
- fine-tuned
base_model: openlm-research/open_llama_3b_v2
library_name: peft
license: apache-2.0
datasets:
- tatsu-lab/alpaca
language:
- en
pipeline_tag: text-generation
---

# OpenLlama-3B-v2 Fine-tuned with GRIT and QLoRA

This model is a fine-tuned version of [openlm-research/open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2) using the **GRIT** (Gradient Regularized Instruction Tuning) algorithm and **QLoRA** on the [Alpaca dataset](https://huggingface.co/datasets/tatsu-lab/alpaca).

The base model is quantized to 4-bit (NF4) to enable efficient fine-tuning.

## 🚀 Training Details

### GRIT Algorithm
- **K-FAC Updates**: Every 200 steps for second-order preconditioning
- **Neural Reprojection**: Every 500 steps for rank optimization
- **Optimized LoRA Modules**: Attention and key MLP layers (as per design)

### Fine-tuning Configuration
- **Base Model**: OpenLlama 3B v2
- **Quantization**: 4-bit (NF4) with float16 compute
- **LoRA Rank**: 64
- **LoRA Alpha**: 128
- **Batch Size**: 16 (per device)
- **Gradient Accumulation**: 4 (effective batch size = 64)
- **Learning Rate**: 5.0e-05
- **Precision**: bf16 mixed precision
- **Sequence Length**: 512 tokens
- **Gradient Checkpointing**: Enabled

### Performance Improvements
- ✅ **Faster Convergence**: K-FAC preconditioning aligns updates with curvature.
- ✅ **Memory-Efficient**: Uses 4-bit quantization (QLoRA) and gradient checkpointing.
- ✅ **Efficient Training**: Uses `accelerate` to run training.

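
The configuration listed above can be sketched in code, which also makes the derived numbers easy to check. This is a minimal sketch, not the training script used for this model: the helper names are hypothetical, and the target module names (`q_proj`, `k_proj`, `v_proj`, `o_proj`), the hidden size (3200), and the layer count (26) are assumptions that this card does not state. Under those assumptions, rank-64 adapters on the four attention projections account for exactly 42,598,400 trainable parameters; adapters on MLP projections would add to that count.

```python
# Hyperparameters copied from the "Fine-tuning Configuration" list.
HPARAMS = {
    "base_model": "openlm-research/open_llama_3b_v2",
    "lora_rank": 64,
    "lora_alpha": 128,
    "per_device_batch_size": 16,
    "gradient_accumulation_steps": 4,
    "learning_rate": 5.0e-5,
    "max_seq_length": 512,
}

# ASSUMPTIONS (not stated in this card): which modules carry adapters, and
# the base model's dimensions. Verify both against the base model's config.
ASSUMED_TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj"]
ASSUMED_HIDDEN_SIZE = 3200
ASSUMED_NUM_LAYERS = 26


def effective_batch_size(hp, num_devices=1):
    """Samples consumed per optimizer step."""
    return (
        hp["per_device_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def lora_scaling(hp):
    """Standard LoRA scaling factor applied to the low-rank update: alpha / r."""
    return hp["lora_alpha"] / hp["lora_rank"]


def lora_param_count(rank, d_in, d_out, num_modules, num_layers):
    """Each adapted linear layer adds A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out) * num_modules * num_layers


def build_peft_configs(hp):
    """Build quantization and LoRA configs (imports are lazy on purpose)."""
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",             # 4-bit NF4, as listed above
        bnb_4bit_compute_dtype=torch.float16,  # "float16 compute", as listed above
    )
    lora_config = LoraConfig(
        r=hp["lora_rank"],
        lora_alpha=hp["lora_alpha"],
        target_modules=ASSUMED_TARGET_MODULES,  # assumption, see above
        task_type="CAUSAL_LM",
    )
    return quant_config, lora_config


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(HPARAMS))
    print("LoRA scaling (alpha / r):", lora_scaling(HPARAMS))
    print(
        "LoRA parameters under the stated assumptions:",
        lora_param_count(
            HPARAMS["lora_rank"],
            ASSUMED_HIDDEN_SIZE,
            ASSUMED_HIDDEN_SIZE,
            len(ASSUMED_TARGET_MODULES),
            ASSUMED_NUM_LAYERS,
        ),
    )
```

`build_peft_configs` imports its dependencies lazily, so the arithmetic helpers run without `torch`, `transformers`, or `peft` installed.
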
## 📊 Training Metrics
- **Total Steps**: 732
- **Final Training Loss**: 0.2282
- **Final Validation Loss**: 0.22849
- **BLEU (val)**: 0.2452
- **Trainable Params**: 42,598,400 (1.23% of total)

## 🏷️ Model Tags
- Instruction-tuned with GRIT and QLoRA
- GRIT-tuned Model
- 4-bit Quantized Model
- LoRA rank 64
- Mixed precision (bf16)
- Alpaca dataset fine-tuning

## 📝 Algorithm Details
- **K-FAC Preconditioning** (Natural Gradient) and **Neural Reprojection**, as per the GRIT method
- **Memory Efficient**: Covariance matrices are kept on the CPU to reduce GPU load

## 🏆 Results
In benchmark comparisons, GRIT has shown **faster convergence and better stability** than standard LoRA or fine-tuning, making it well-suited for efficient single-epoch training.

## 📝 Citation
If you use this model, please cite:

```bibtex
@misc{grit-openllama-3b-alpaca,
  title={OpenLlama 3B v2 Fine-tuned with GRIT on Alpaca},
  author={Pritish92},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/Pritish92/open-llama-3b-v2-grit-alpaca}
}
```

## ⚖️ License
This model inherits the Apache 2.0 license.
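
## 💻 Usage
Since this card lists `peft` as the library and the model was tuned on Alpaca, loading it might look like the sketch below. Treat it as a starting point under assumptions rather than a verified recipe: it assumes the repository named in the citation URL hosts a LoRA adapter for the base model, and that prompts follow the standard Alpaca template, which this card does not confirm. The function names are hypothetical.

```python
BASE_ID = "openlm-research/open_llama_3b_v2"
ADAPTER_ID = "Pritish92/open-llama-3b-v2-grit-alpaca"  # taken from the citation URL

# Standard Alpaca prompt templates. ASSUMPTION: the model was trained with them.
_PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
_PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_alpaca_prompt(instruction, input_text=""):
    """Format an instruction (and optional input) as an Alpaca-style prompt."""
    if input_text:
        return _PROMPT_WITH_INPUT.format(instruction=instruction, input=input_text)
    return _PROMPT_NO_INPUT.format(instruction=instruction)


def load_model():
    """Load the 4-bit base model and attach the adapter (needs a GPU)."""
    import torch
    from peft import PeftModel
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    # The base model's card advises against the auto-converted fast tokenizer.
    tokenizer = AutoTokenizer.from_pretrained(BASE_ID, use_fast=False)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_ID, quantization_config=quant_config, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return tokenizer, model


def generate(tokenizer, model, instruction, max_new_tokens=128):
    """Generate a response and return only the newly produced text."""
    prompt = build_alpaca_prompt(instruction)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_model()` followed by `generate(tokenizer, model, "Explain what LoRA is in one sentence.")`.
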