---
library_name: peft
license: llama3.2
base_model: unsloth/Llama-3.2-1B-Instruct
tags:
- axolotl
- generated_from_trainer
model-index:
- name: miner_id_1_383a850e-bb15-45a2-8f4b-fc96eb001a74_1729770655
  results: []
---

[Built with Axolotl](https://github.com/axolotl-ai-cloud/axolotl)
See axolotl config

axolotl version: `0.4.1`
```yaml
adapter: lora
base_model: unsloth/Llama-3.2-1B-Instruct
bf16: auto
chat_template: llama3
dataset_prepared_path: null
datasets:
- path: mhenrichsen/alpaca_2k_test
  type: alpaca
debug: null
deepspeed: null
early_stopping_patience: 3
eval_max_new_tokens: 128
eval_steps: 5
eval_table_size: null
flash_attention: true
fp16: null
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 1
gradient_checkpointing: true
group_by_length: false
hub_model_id: besimray/miner_id_1_383a850e-bb15-45a2-8f4b-fc96eb001a74_1729770655
hub_strategy: checkpoint
hub_token: null
learning_rate: 2.0e-05
load_in_4bit: false
load_in_8bit: true
local_rank: null
logging_steps: 1
lora_alpha: 16
lora_dropout: 0.05
lora_fan_in_fan_out: null
lora_model_dir: null
lora_r: 8
lora_target_linear: true
lr_scheduler: cosine
max_steps: 10000
micro_batch_size: 10
mlflow_experiment_name: mhenrichsen/alpaca_2k_test
model_type: LlamaForCausalLM
num_epochs: 5
optimizer: adamw_bnb_8bit
output_dir: miner_id_besimray
pad_to_sequence_len: true
resume_from_checkpoint: null
s2_attention: null
sample_packing: false
save_steps: 5
save_strategy: steps
sequence_len: 4096
strict: false
tf32: false
tokenizer_type: AutoTokenizer
train_on_inputs: false
val_set_size: 0.05
wandb_entity: besimray24-rayon
wandb_mode: online
wandb_project: Public_TuningSN
wandb_run: miner_id_24
wandb_runid: 383a850e-bb15-45a2-8f4b-fc96eb001a74
warmup_steps: 10
weight_decay: 0.0
xformers_attention: null
```
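The config above implies a step schedule that can be checked with a little arithmetic. The sketch below is illustrative only: the batch, epoch, split, and LoRA values are copied from the config, while the dataset row count (2000, guessed from the name `alpaca_2k_test`) and the single-GPU setup are assumptions that this card does not state. The `planned_schedule` helper is a hypothetical name, not part of axolotl.

```python
import math

# Values copied from the axolotl config above.
MICRO_BATCH_SIZE = 10
GRADIENT_ACCUMULATION_STEPS = 1
NUM_EPOCHS = 5
VAL_SET_SIZE = 0.05
LORA_ALPHA = 16
LORA_R = 8

# Assumptions, NOT stated in this card: dataset size and GPU count.
ASSUMED_DATASET_ROWS = 2000
ASSUMED_NUM_GPUS = 1


def planned_schedule(rows=ASSUMED_DATASET_ROWS, num_gpus=ASSUMED_NUM_GPUS):
    """Derive split sizes, step counts, and the LoRA scaling factor."""
    eval_rows = round(rows * VAL_SET_SIZE)
    train_rows = rows - eval_rows
    effective_batch = MICRO_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_gpus
    steps_per_epoch = math.ceil(train_rows / effective_batch)
    return {
        "train_rows": train_rows,
        "eval_rows": eval_rows,
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * NUM_EPOCHS,
        # Standard LoRA scaling applied to the low-rank update: alpha / r.
        "lora_scale": LORA_ALPHA / LORA_R,
    }


schedule = planned_schedule()
for key, value in schedule.items():
    print(f"{key}: {value}")
```

Under these assumptions the sketch gives 190 steps per epoch and 950 steps in total, which agrees with the `training_steps: 950` value and the epoch 1.0 row at step 190 reported further down in this card. That suggests `num_epochs: 5`, rather than `max_steps: 10000`, determined the planned run length.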

# miner_id_1_383a850e-bb15-45a2-8f4b-fc96eb001a74_1729770655

This model is a fine-tuned version of [unsloth/Llama-3.2-1B-Instruct](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct) on the mhenrichsen/alpaca_2k_test dataset.
It achieves the following results on the evaluation set:
- Loss: 1.1623

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 10
- eval_batch_size: 10
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- training_steps: 950

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.3316        | 0.0053 | 1    | 1.2586          |
| 1.1351        | 0.0263 | 5    | 1.2596          |
| 1.2604        | 0.0526 | 10   | 1.2566          |
| 1.5396        | 0.0789 | 15   | 1.2454          |
| 1.4895        | 0.1053 | 20   | 1.2336          |
| 1.1625        | 0.1316 | 25   | 1.2236          |
| 1.3554        | 0.1579 | 30   | 1.2150          |
| 1.3275        | 0.1842 | 35   | 1.2100          |
| 1.1912        | 0.2105 | 40   | 1.2058          |
| 1.2335        | 0.2368 | 45   | 1.2030          |
| 1.0253        | 0.2632 | 50   | 1.1979          |
| 1.1242        | 0.2895 | 55   | 1.1970          |
| 0.9963        | 0.3158 | 60   | 1.1910          |
| 1.0977        | 0.3421 | 65   | 1.1919          |
| 1.1263        | 0.3684 | 70   | 1.1880          |
| 1.2144        | 0.3947 | 75   | 1.1860          |
| 1.3055        | 0.4211 | 80   | 1.1839          |
| 1.1513        | 0.4474 | 85   | 1.1818          |
| 1.0702        | 0.4737 | 90   | 1.1819          |
| 1.2561        | 0.5    | 95   | 1.1797          |
| 1.1373        | 0.5263 | 100  | 1.1775          |
| 1.2136        | 0.5526 | 105  | 1.1780          |
| 1.3591        | 0.5789 | 110  | 1.1771          |
| 1.5703        | 0.6053 | 115  | 1.1744          |
| 1.1601        | 0.6316 | 120  | 1.1754          |
| 1.1412        | 0.6579 | 125  | 1.1748          |
| 1.1449        | 0.6842 | 130  | 1.1731          |
| 1.1706        | 0.7105 | 135  | 1.1736          |
| 1.0503        | 0.7368 | 140  | 1.1730          |
| 1.1938        | 0.7632 | 145  | 1.1730          |
| 1.4802        | 0.7895 | 150  | 1.1710          |
| 1.1359        | 0.8158 | 155  | 1.1688          |
| 1.3575        | 0.8421 | 160  | 1.1709          |
| 1.0188        | 0.8684 | 165  | 1.1685          |
| 1.147         | 0.8947 | 170  | 1.1684          |
| 0.9949        | 0.9211 | 175  | 1.1668          |
| 1.3082        | 0.9474 | 180  | 1.1673          |
| 1.1995        | 0.9737 | 185  | 1.1654          |
| 1.2346        | 1.0    | 190  | 1.1654          |
| 1.0948        | 1.0263 | 195  | 1.1660          |
| 1.3838        | 1.0526 | 200  | 1.1643          |
| 0.9594        | 1.0789 | 205  | 1.1644          |
| 1.1423        | 1.1053 | 210  | 1.1635          |
| 1.1774        | 1.1316 | 215  | 1.1645          |
| 1.0085        | 1.1579 | 220  | 1.1642          |
| 1.0912        | 1.1842 | 225  | 1.1611          |
| 1.193         | 1.2105 | 230  | 1.1627          |
| 1.2437        | 1.2368 | 235  | 1.1640          |
| 1.1814        | 1.2632 | 240  | 1.1623          |

### Framework versions

- PEFT 0.13.2
- Transformers 4.45.2
- Pytorch 2.3.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.1
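The training results table ends at step 240 even though 950 training steps were planned, and the card does not say why. One plausible reading is the `early_stopping_patience: 3` setting in the config. The sketch below replays a generic patience rule (stop after three consecutive evaluations that fail to beat the best validation loss so far) over the validation losses copied from the table. It is a simplified illustration, not the trainer's exact callback logic, and `first_stop_step` is a hypothetical helper name.

```python
# Validation losses copied from the training results table, in step order.
VAL_LOSSES = [
    1.2586, 1.2596, 1.2566, 1.2454, 1.2336, 1.2236, 1.2150, 1.2100, 1.2058,
    1.2030, 1.1979, 1.1970, 1.1910, 1.1919, 1.1880, 1.1860, 1.1839, 1.1818,
    1.1819, 1.1797, 1.1775, 1.1780, 1.1771, 1.1744, 1.1754, 1.1748, 1.1731,
    1.1736, 1.1730, 1.1730, 1.1710, 1.1688, 1.1709, 1.1685, 1.1684, 1.1668,
    1.1673, 1.1654, 1.1654, 1.1660, 1.1643, 1.1644, 1.1635, 1.1645, 1.1642,
    1.1611, 1.1627, 1.1640, 1.1623,
]

# Evaluations were logged at step 1 and then every 5 steps up to step 240.
STEPS = [1] + list(range(5, 245, 5))
EVALS = list(zip(STEPS, VAL_LOSSES))


def first_stop_step(evals, patience):
    """Return the first step where `patience` evaluations in a row failed
    to improve on the best loss seen so far, or None if that never happens."""
    best = float("inf")
    stale = 0  # consecutive evaluations without improvement
    for step, loss in evals:
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return step
    return None


stop_step = first_stop_step(EVALS, patience=3)
best_step, best_loss = min(EVALS, key=lambda pair: pair[1])
print(f"patience rule triggers at step {stop_step}")   # step 240
print(f"best validation loss {best_loss} at step {best_step}")  # 1.1611 at step 225
```

With the values as printed in the table, the rule first triggers at step 240, the last logged step. Note that the headline evaluation loss of 1.1623 is the value at that final step, while the lowest logged validation loss is 1.1611 at step 225.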