Yasette committed on
Commit 9a77eca · verified · 1 Parent(s): 90795af

Update README.md

Files changed (1)
  1. README.md +26 -16
README.md CHANGED
@@ -7,25 +7,35 @@ base_model:
  pipeline_tag: text-generation
  ---
  ### Meta-Llama-3.1-Math-QA-finetuning-Group-3
- This code uses meta-math/MetaMathQA dataset to fine-tune Meta-Llama-3.1 Large Language Model.
+ This model is a fine-tuned version of Meta-Llama-3.1-8B on the MetaMathQA dataset for mathematical reasoning tasks.
+ Training Details

- LoRA was utilized in order to significantly decrease training time.
-
- Random (seed = 42) 50.000 lines were selected from the database to be used in training.
-
- Unsloth framework allows the fine-tuning process to be more memory and time efficient.
-
- Training hyperparameters:
+ Method: QLoRA (4-bit quantization with LoRA adapters)
+ Framework: Unsloth for memory and time efficient fine-tuning
+ Dataset: 50,000 randomly selected samples from MetaMathQA (seed=42)
+ Hardware: Google Colab T4 GPU

+ Hyperparameters
  ```
- num_train_epochs = 5
- max_steps = 50
- learning_rate = 5e-5
- logging_steps = 1
- optim = "adamw_8bit"
- weight_decay = 0.001
- lr_scheduler_type = "linear"
- seed = 3407
+ # QLoRA Configuration
+ load_in_4bit = True
+ lora_r = 16
+ lora_alpha = 16
+ lora_dropout = 0
+
+ # Training Configuration
+ num_train_epochs = 5
+ max_steps = 50
+ learning_rate = 5e-5
+ per_device_train_batch_size = 2
+ gradient_accumulation_steps = 4
+ warmup_steps = 5
+ weight_decay = 0.001
+ lr_scheduler_type = "linear"
+ optim = "adamw_8bit"
+ seed = 3407
  ```

+
+ Training Results
  - 50th Epoch training loss: 0.551400
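Review note: the hyperparameters in the new README can be sanity-checked with a little arithmetic. The sketch below is not the author's training code. It assumes a single GPU (the README lists one Colab T4), the Hugging Face `Trainer` convention that a positive `max_steps` takes precedence over `num_train_epochs`, the standard LoRA scaling factor `lora_alpha / lora_r`, and a linear warmup followed by linear decay to zero. The helper names (`effective_batch_size`, `samples_seen`, `linear_schedule_lr`, `lora_scaling`) are hypothetical and exist only for illustration.

```python
# Values copied from the README in this commit.
PER_DEVICE_TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
MAX_STEPS = 50
WARMUP_STEPS = 5
LEARNING_RATE = 5e-5
DATASET_SIZE = 50_000
LORA_R = 16
LORA_ALPHA = 16

NUM_DEVICES = 1  # assumption: one Colab T4, no data parallelism


def effective_batch_size(per_device, grad_accum, num_devices=NUM_DEVICES):
    """Samples contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


def samples_seen(max_steps, batch_size):
    """Total samples processed when training stops at max_steps."""
    return max_steps * batch_size


def linear_schedule_lr(step, base_lr=LEARNING_RATE,
                       warmup=WARMUP_STEPS, total=MAX_STEPS):
    """Learning rate at a given step: linear warmup, then linear decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


def lora_scaling(alpha=LORA_ALPHA, r=LORA_R):
    """Standard LoRA scaling applied to the adapter update (alpha / r)."""
    return alpha / r


if __name__ == "__main__":
    batch = effective_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE,
                                 GRADIENT_ACCUMULATION_STEPS)
    seen = samples_seen(MAX_STEPS, batch)
    print("effective batch size:", batch)
    print("samples seen:", seen)
    print("fraction of dataset:", seen / DATASET_SIZE)
    print("LoRA scaling:", lora_scaling())
    for step in (0, 1, 5, 25, 50):
        print("step", step, "lr", linear_schedule_lr(step))
```

Under these assumptions the effective batch size is 8, so 50 steps cover 400 samples, or 0.8% of the 50,000-sample subset. That suggests the run stopped well short of one epoch, and the reported "50th Epoch training loss" may actually be the loss logged at step 50.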