---
license: apache-2.0
base_model: meta-llama/Llama-2-7b-hf
tags:
- llama
- lora
- instruction-tuning
- dolly
- peft
datasets:
- databricks/databricks-dolly-15k
language:
- en
---

# Llama 2 7B Fine-tuned on Dolly

This is a LoRA adapter for Llama 2 7B (`meta-llama/Llama-2-7b-hf`), fine-tuned on the Databricks Dolly 15k dataset for instruction-following tasks.

## Model Details

- **Base Model**: Llama 2 7B (`meta-llama/Llama-2-7b-hf`)
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: Databricks Dolly 15k
- **Adapter Type**: PEFT LoRA

## LoRA Configuration

- **Rank (r)**: 16
- **Alpha**: 32
- **Dropout**: 0.05
- **Target Modules**: Query and Value projection layers
- **Trainable Parameters**: ~8-16M (adapters only, <1% of base model)

## Training Configuration

- **Epochs**: 5
- **Batch Size**: 4
- **Learning Rate**: 5e-04
- **Gradient Accumulation**: 1
- **GPUs**: 2
- **Training Steps**: 6810
- **Optimizer**: AdamW
- **Weight Decay**: 0.01

## Usage

Install the required packages (`accelerate` is needed for `device_map="auto"`):

```bash
pip install transformers peft torch accelerate
```

Then load and use the model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model named in the card metadata
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/llama-2.7b-fine-tuned-on-dolly",
)

# Optional: merge the adapter into the base weights for faster inference
# model = model.merge_and_unload()

# Assumes the tokenizer files were uploaded with the adapter;
# if not, load the tokenizer from "meta-llama/Llama-2-7b-hf" instead
tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/llama-2.7b-fine-tuned-on-dolly")

# Generate a response
prompt = "Instruction: Write a short poem about AI.\n\nResponse:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,  # limits generated tokens only; max_length would also count the prompt
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Key Benefits

- **Efficiency**: Only ~8-16M
trainable parameters (vs. billions in full fine-tuning)
- **Storage**: Small adapter files (~30-60MB vs. multi-GB full models)
- **Modularity**: Adapters can be swapped on the same base model
- **Quality**: LoRA has been reported to perform competitively with full fine-tuning (see Citation)

## Limitations

- Requires the base Llama 2 model at inference time
- Performance depends on the quality of the base model
- Trained primarily on English instruction-following tasks
- May generate biased or incorrect responses

## Training Details

This model was fine-tuned using:

- **PEFT/LoRA**: Parameter-efficient fine-tuning
- **Training Data**: 15k instruction-response pairs from Dolly
- **Task**: General instruction following and question answering
- **Learning Rate Schedule**: Cosine decay with warmup

## Citation

```bibtex
@inproceedings{lora,
  title={LoRA: Low-Rank Adaptation of Large Language Models},
  author={Hu, Edward J and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
  booktitle={International Conference on Learning Representations},
  year={2022}
}
```

## License

This adapter is released under the Apache 2.0 license. Note that the base Llama 2 model is subject to Meta's own license and usage terms.
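
## Appendix: Checking the Parameter Count

The trainable-parameter and adapter-size figures quoted in this card can be sanity-checked with a little arithmetic. The sketch below is only an estimate under stated assumptions: it uses the published Llama 2 7B dimensions (hidden size 4096, 32 decoder layers), which this card does not state, and it assumes "Query and Value projection layers" means two square projections per layer. The helper `lora_param_count` is a hypothetical name, not part of PEFT.

```python
def lora_param_count(hidden_size: int, num_layers: int, rank: int, modules_per_layer: int) -> int:
    """Count LoRA parameters for square (hidden_size x hidden_size) projections.

    Each adapted module adds two matrices: A (rank x d_in) and B (d_out x rank).
    """
    per_module = rank * hidden_size + hidden_size * rank
    return num_layers * modules_per_layer * per_module


# Assumed Llama 2 7B dimensions (not stated in this card)
HIDDEN_SIZE = 4096
NUM_LAYERS = 32

# Values from the LoRA Configuration section
RANK = 16
TARGETS_PER_LAYER = 2  # query and value projections

trainable = lora_param_count(HIDDEN_SIZE, NUM_LAYERS, RANK, TARGETS_PER_LAYER)
size_fp32_mb = trainable * 4 / 1e6  # 4 bytes per float32 parameter

print(f"trainable parameters: {trainable:,}")
print(f"adapter size at fp32: {size_fp32_mb:.1f} MB")
```

Under these assumptions the count comes to roughly 8.4M parameters and about 34 MB at float32, which sits at the lower end of the ranges given above. Adapting additional modules or saving in half precision would shift both numbers.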