---
license: apache-2.0
base_model: meta-llama/Llama-2-7b-hf
tags:
- llama
- lora
- instruction-tuning
- dolly
- peft
datasets:
- databricks/databricks-dolly-15k
language:
- en
---

# Llama 2 7B Fine-tuned on Dolly

This is a LoRA adapter for Llama 2 7B (`meta-llama/Llama-2-7b-hf`), fine-tuned on the Databricks Dolly 15k dataset for instruction following.

## Model Details

- **Base Model**: Llama 2 7B (`meta-llama/Llama-2-7b-hf`)
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: Databricks Dolly 15k
- **Adapter Type**: PEFT LoRA

## LoRA Configuration

- **Rank (r)**: 16
- **Alpha**: 32
- **Dropout**: 0.05
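- **Target Modules**: Query and value projection layers (`q_proj` and `v_proj` in the Transformers Llama implementation)
- **Trainable Parameters**: ~8.4M at rank 16 on these two projections (adapters only, <1% of base model)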
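
The trainable parameter count follows from the configuration above. The sketch below works through the arithmetic, assuming the standard Llama 2 7B dimensions (hidden size 4096, 32 decoder layers, square query and value projections); those dimensions are not stated in this card.

```python
# Assumed Llama 2 7B dimensions (not stated in this card).
HIDDEN_SIZE = 4096      # input and output width of q_proj and v_proj
NUM_LAYERS = 32         # decoder layers

# Values taken from the LoRA configuration above.
RANK = 16
ALPHA = 32
TARGETS_PER_LAYER = 2   # query and value projections


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


per_module = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
total = per_module * TARGETS_PER_LAYER * NUM_LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update B @ A

print(f"per adapted projection: {per_module:,}")
print(f"total trainable:        {total:,}")
print(f"update scaling alpha/r: {scaling}")
```

Under these assumptions the total is 8,388,608 parameters, and the low-rank update is scaled by 2.0 before being added to the frozen weights.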

## Training Configuration

- **Epochs**: 5
- **Batch Size**: 4
- **Learning Rate**: 5e-04
- **Gradient Accumulation**: 1
- **GPUs**: 2
- **Training Steps**: 6810
- **Optimizer**: AdamW
- **Weight Decay**: 0.01
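
As a rough consistency check, the listed values can be related to each other. This sketch assumes the batch size of 4 is per GPU, which the card does not state; if it is a global batch size, the numbers change accordingly.

```python
# Values from the training configuration above.
EPOCHS = 5
BATCH_SIZE = 4          # assumed to be per GPU (not stated in the card)
GRAD_ACCUM = 1
NUM_GPUS = 2
TRAINING_STEPS = 6810

# Examples consumed per optimizer step across all GPUs.
effective_batch = BATCH_SIZE * GRAD_ACCUM * NUM_GPUS

# Optimizer steps per epoch, and the training-set size that implies.
steps_per_epoch = TRAINING_STEPS // EPOCHS
approx_examples_per_epoch = steps_per_epoch * effective_batch

print(f"effective batch size: {effective_batch}")
print(f"steps per epoch:      {steps_per_epoch}")
print(f"examples per epoch:   ~{approx_examples_per_epoch}")
```

The last figure is only an estimate of how many examples were seen per epoch under the per-GPU assumption; the exact training split size is not given here.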

## Usage

Install the required packages:

```bash
pip install transformers peft torch
```

Then load and use the model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model (gated on the Hub; accept Meta's license terms first)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/llama-2.7b-fine-tuned-on-dolly"
)

# Optional: Merge adapter for faster inference
# model = model.merge_and_unload()

tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/llama-2.7b-fine-tuned-on-dolly")

# Generate
prompt = "Instruction: Write a short poem about AI.\n\nResponse:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,  # caps generated tokens only; max_length would also count the prompt
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
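
The prompt in the example above follows an `Instruction: ... Response:` layout. A small helper can build such prompts and strip the echoed prompt from the decoded output. This is a sketch: the exact template used during training is not documented in this card, and the `Context:` section and both function names are hypothetical.

```python
def build_prompt(instruction: str, context: str = "") -> str:
    """Format an instruction in the layout used by the usage example."""
    parts = [f"Instruction: {instruction.strip()}"]
    if context.strip():
        # Hypothetical: Dolly records have a context field, but how (or
        # whether) it was included in training prompts is not stated.
        parts.append(f"Context: {context.strip()}")
    parts.append("Response:")
    return "\n\n".join(parts)


def extract_response(decoded: str) -> str:
    """Return the text after the first 'Response:' marker, if present."""
    return decoded.split("Response:", 1)[-1].strip()


prompt = build_prompt("Write a short poem about AI.")
print(repr(prompt))
print(extract_response(prompt + " Silicon dreams awake."))
```

Since `tokenizer.decode` returns the prompt followed by the completion, `extract_response` is a convenient way to keep only the generated part.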

## Key Benefits

- **Efficiency**: Only ~8.4M trainable parameters (vs billions in full fine-tuning)
- **Storage**: Small adapter files (~30-60MB vs multi-GB full models)
- **Modularity**: Can swap adapters on the same base model
- **Quality**: LoRA is reported to perform competitively with full fine-tuning (Hu et al., 2022)

## Limitations

- Requires the base Llama 2 7B model weights at load time
- Performance depends on base model quality
- Trained primarily on English instruction-following tasks
- May generate biased or incorrect responses

## Training Details

This model was fine-tuned using:
- **PEFT/LoRA**: Parameter-efficient fine-tuning
- **Training Data**: 15k instruction-response pairs from Dolly
- **Task**: General instruction following and question answering
- **Learning Rate Schedule**: Cosine decay with warmup
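
The schedule named above can be sketched in plain Python as linear warmup followed by a half-cosine decay to zero. The peak learning rate and total steps come from the training configuration; the warmup length is not stated in this card, so `WARMUP_STEPS` is a hypothetical value, and the exact schedule implementation used in training is also an assumption.

```python
import math

PEAK_LR = 5e-4        # from the training configuration
TOTAL_STEPS = 6810    # from the training configuration
WARMUP_STEPS = 200    # hypothetical; not stated in this card


def lr_at(step: int,
          peak_lr: float = PEAK_LR,
          warmup_steps: int = WARMUP_STEPS,
          total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Half cosine from peak_lr down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 100, 200, 3505, 6810):
    print(step, f"{lr_at(step):.2e}")
```

This has the same shape as `get_cosine_schedule_with_warmup` in Transformers with its default half cycle, which may or may not be what was used here.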

## Citation

```bibtex
@inproceedings{lora,
  title={LoRA: Low-Rank Adaptation of Large Language Models},
  author={Hu, Edward J and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
  booktitle={International Conference on Learning Representations},
  year={2022}
}
```

## License

This adapter is released under the Apache 2.0 license. The base Llama 2 model is distributed under Meta's Llama 2 Community License, whose terms govern use of the base weights.