---
base_model: unsloth/llama-3-8b
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/llama-3-8b
- lora
- sft
- transformers
- trl
- unsloth
---

# new-llama3-tky-div-0.5

This model is a fine-tuned version of [unsloth/llama-3-8b](https://huggingface.co/unsloth/llama-3-8b) using LoRA (Low-Rank Adaptation) and quantization techniques.

## Model Details

- **Base Model:** unsloth/llama-3-8b
- **Fine-tuned Model:** comp5331poi/new-llama3-tky-div-0.5
- **Training Run:** new-llama3-tky-div-0.5
- **Device:** cuda

## Training Configuration

### Hyperparameters

- **Number of Epochs:** 8
- **Batch Size:** 4
- **Gradient Accumulation Steps:** 2
- **Effective Batch Size:** 8
- **Learning Rate:** 1e-05
- **Learning Rate Scheduler:** constant
- **Warmup Steps:** 20
- **Max Sequence Length:** 2048
- **Optimizer:** paged_adamw_8bit
- **Max Gradient Norm:** 0.3
- **Random Seed:** 2024

### LoRA Configuration

- **LoRA Rank (r):** 16
- **LoRA Alpha:** 32
- **LoRA Dropout:** 0.1
- **Target Modules:** gate_proj, o_proj, down_proj, k_proj, v_proj, up_proj, q_proj
- **Task Type:** CAUSAL_LM

### Quantization

- **Quantization Bits:** 4-bit

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained("unsloth/llama-3-8b")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "comp5331poi/new-llama3-tky-div-0.5")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("unsloth/llama-3-8b")

# Generate text
inputs = tokenizer("Your prompt here", return_tensors="pt")
outputs = model.generate(**inputs, max_length=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Framework Versions

- Transformers
- PEFT
- TRL
- PyTorch
- BitsAndBytes
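
## Derived Quantities

The Training Configuration values imply a few derived quantities that are easy to sanity-check by hand. The sketch below recomputes the effective batch size, the LoRA scaling factor, and a rough count of trainable adapter parameters. It rests on several assumptions that this card does not state: a single training device, the standard LoRA scaling `alpha / r` (not a rank-stabilized variant), and the layer shapes of the public Llama 3 8B architecture (hidden size 4096, MLP size 14336, key/value projection width 1024, 32 layers). The helper names are illustrative and belong to no library.

```python
# Values copied from the Training Configuration section of this card.
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2
LORA_RANK = 16
LORA_ALPHA = 32
TARGET_MODULES = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
]

# ASSUMPTION: shapes of the public Llama 3 8B architecture (not stated in this card).
HIDDEN = 4096        # model hidden size
INTERMEDIATE = 14336 # MLP intermediate size
KV_DIM = 1024        # 8 key/value heads x 128 head dimension
NUM_LAYERS = 32

# (in_features, out_features) of each adapted linear layer.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def effective_batch_size(batch_size, grad_accum_steps):
    """Samples contributing to one optimizer step on a single device."""
    return batch_size * grad_accum_steps


def lora_scaling(alpha, rank):
    """Standard LoRA scaling applied to the low-rank update (alpha / r)."""
    return alpha / rank


def lora_param_count(rank, modules, shapes, num_layers):
    """Each adapted layer adds A (r x in) and B (out x r): r * (in + out) weights."""
    per_layer = sum(rank * (shapes[m][0] + shapes[m][1]) for m in modules)
    return per_layer * num_layers


if __name__ == "__main__":
    print(effective_batch_size(BATCH_SIZE, GRAD_ACCUM_STEPS))
    print(lora_scaling(LORA_ALPHA, LORA_RANK))
    print(lora_param_count(LORA_RANK, TARGET_MODULES, MODULE_SHAPES, NUM_LAYERS))
```

Under these assumptions the effective batch size is 8, matching the value listed above, the LoRA update is scaled by 2.0, and the adapter holds 41,943,040 trainable weights. LoRA dropout does not add parameters, so it does not enter the count.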