---
tags:
- generated_from_trainer
model-index:
- name: tinyllama-1.1B-intermediate-step-715k-1.5T-dpo-lora-v4
  results: []
---

# tinyllama-1.1B-intermediate-step-715k-1.5T-dpo-lora-v4

This model was trained from scratch on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6904
- Rewards/chosen: -3.5271
- Rewards/rejected: -5.6475
- Rewards/accuracies: 0.7393
- Rewards/margins: 2.1205
- Logps/rejected: -394.1334
- Logps/chosen: -478.6117
- Logits/rejected: -3.8937
- Logits/chosen: -4.0184

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 32
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.02
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.5491        | 0.34  | 300  | 0.5719          | -0.5176        | -1.3357          | 0.7015             | 0.8181          | -351.0149      | -448.5167    | -4.0592         | -4.2257       |
| 0.5906        | 0.68  | 600  | 0.5625          | -0.3365        | -1.2779          | 0.7191             | 0.9414          | -350.4370      | -446.7061    | -4.0731         | -4.2239       |
| 0.2857        | 1.02  | 900  | 0.5723          | -0.3882        | -1.5979          | 0.7141             | 1.2097          | -353.6368      | -447.2226    | -4.0753         | -4.2332       |
| 0.2679        | 1.36  | 1200 | 0.5883          | -1.1630        | -2.3423          | 0.7234             | 1.1793          | -361.0811      | -454.9714    | -4.0115         | -4.1888       |
| 0.231         | 1.71  | 1500 | 0.5895          | -1.3278        | -2.7966          | 0.7338             | 1.4688          | -365.6242      | -456.6194    | -4.0069         | -4.1696       |
| 0.0862        | 2.05  | 1800 | 0.6626          | -2.7764        | -4.6708          | 0.7284             | 1.8944          | -384.3661      | -471.1047    | -3.9624         | -4.0992       |
| 0.0804        | 2.39  | 2100 | 0.6818          | -3.0330        | -5.1156          | 0.7410             | 2.0826          | -388.8140      | -473.6706    | -3.9128         | -4.0467       |
| 0.0925        | 2.73  | 2400 | 0.6947          | -3.5621        | -5.6537          | 0.7371             | 2.0916          | -394.1956      | -478.9623    | -3.8908         | -4.0137       |

### Framework versions

- Transformers 4.35.0
- Pytorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1
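
### Checking the reported numbers

The figures on this card can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the training code. It rests on two assumptions that the card does not state: that Rewards/margins is the mean difference between the chosen and rejected rewards, and that the total train batch size is the per-device batch size times the gradient accumulation steps times the number of processes. The process count is not listed on the card, so the value `1` used here is a guess that happens to reproduce the reported total of 64. The helper names `margin` and `effective_batch_size` are hypothetical.

```python
# Rows copied from the training results table on this card:
# (step, rewards/chosen, rewards/rejected, reported rewards/margins)
ROWS = [
    (300, -0.5176, -1.3357, 0.8181),
    (600, -0.3365, -1.2779, 0.9414),
    (900, -0.3882, -1.5979, 1.2097),
    (1200, -1.1630, -2.3423, 1.1793),
    (1500, -1.3278, -2.7966, 1.4688),
    (1800, -2.7764, -4.6708, 1.8944),
    (2100, -3.0330, -5.1156, 2.0826),
    (2400, -3.5621, -5.6537, 2.0916),
]

# The card rounds to four decimals, so allow a small rounding gap.
TOLERANCE = 2e-4


def margin(chosen: float, rejected: float) -> float:
    """Assumed definition: reward margin = chosen reward - rejected reward."""
    return chosen - rejected


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_processes: int) -> int:
    """Assumed definition of total_train_batch_size."""
    return per_device * accumulation_steps * num_processes


if __name__ == "__main__":
    for step, chosen, rejected, reported in ROWS:
        gap = abs(margin(chosen, rejected) - reported)
        status = "ok" if gap < TOLERANCE else "mismatch"
        print(f"step {step}: computed {margin(chosen, rejected):.4f}, "
              f"reported {reported:.4f} ({status})")

    # num_processes=1 is an assumption, not a value taken from the card.
    print("effective batch size:", effective_batch_size(2, 32, 1))
```

Under these assumptions every row of the table agrees with its reported margin to within rounding, as does the final evaluation summary (-3.5271 and -5.6475 against a reported margin of 2.1205).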