
Leeyuyu committed on Aug 7, 2025

Commit 4b9e83e (verified), 1 Parent(s): e3ac28d

Model save

Files changed (4)

README.md +2 -4
all_results.json +6 -6
train_results.json +6 -6
trainer_state.json +112 -280

README.md CHANGED

@@ -1,10 +1,8 @@
 ---
-datasets: Leeyuyu/fundo_600
 library_name: transformers
 model_name: Qwen2.5-SFT2-GRPO-fundo-nothink
 tags:
 - generated_from_trainer
-- R1-V
 - trl
 - sft
 licence: license
@@ -12,7 +10,7 @@ licence: license
 # Model Card for Qwen2.5-SFT2-GRPO-fundo-nothink
-This model is a fine-tuned version of [None](https://huggingface.co/None) on the [Leeyuyu/fundo_600](https://huggingface.co/datasets/Leeyuyu/fundo_600) dataset.
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start
@@ -28,7 +26,7 @@ print(output["generated_text"])
 ## Training procedure
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/brightlight720720_lee/huggingface/runs/8blxzirv)
 This model was trained with SFT.

 ---
 library_name: transformers
 model_name: Qwen2.5-SFT2-GRPO-fundo-nothink
 tags:
 - generated_from_trainer
 - trl
 - sft
 licence: license
 # Model Card for Qwen2.5-SFT2-GRPO-fundo-nothink
+This model is a fine-tuned version of [None](https://huggingface.co/None).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start
 ## Training procedure
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/brightlight720720_lee/huggingface/runs/6ld05gs5)
 This model was trained with SFT.

all_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 1.0,
-    "total_flos": 1.13782877585408e+17,
-    "train_loss": 5.960167303085327,
-    "train_runtime": 999.5097,
-    "train_samples": 3195,
-    "train_samples_per_second": 3.197,
-    "train_steps_per_second": 0.05
 }

 {
     "epoch": 1.0,
+    "total_flos": 6.191745166684979e+16,
+    "train_loss": 6.366142969865066,
+    "train_runtime": 588.67,
+    "train_samples": 1661,
+    "train_samples_per_second": 2.822,
+    "train_steps_per_second": 0.044
 }
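
The throughput fields above can be cross-checked against each other. The sketch below is a sanity check, not part of the commit: the sample counts, runtimes, and rates are copied from the two versions of `all_results.json`, the step counts come from `global_step` in `trainer_state.json`, and the effective batch size of 64 is an assumption inferred from those numbers (the diff only records a per-device `train_batch_size` of 4).

```python
import math

# Values copied from the old ("before") and new ("after") sides of this commit.
# "steps" is global_step from trainer_state.json.
runs = {
    "before": {"train_samples": 3195, "train_runtime": 999.5097, "steps": 50,
               "samples_per_second": 3.197, "steps_per_second": 0.05},
    "after": {"train_samples": 1661, "train_runtime": 588.67, "steps": 26,
              "samples_per_second": 2.822, "steps_per_second": 0.044},
}

# Assumption: one optimizer step consumes 64 samples in total
# (per-device batch of 4 times devices and/or gradient accumulation).
ASSUMED_EFFECTIVE_BATCH = 64


def derived_rates(run):
    """Recompute the logged rates from sample count, step count, and runtime."""
    samples_per_second = round(run["train_samples"] / run["train_runtime"], 3)
    steps_per_second = round(run["steps"] / run["train_runtime"], 3)
    return samples_per_second, steps_per_second


def steps_for_one_epoch(run, effective_batch=ASSUMED_EFFECTIVE_BATCH):
    """Number of optimizer steps needed to see every sample once."""
    return math.ceil(run["train_samples"] / effective_batch)


for name, run in runs.items():
    sps, stps = derived_rates(run)
    print(name, sps, stps, steps_for_one_epoch(run))
```

Under that assumption the recomputed rates and step counts agree with the logged ones for both runs, which suggests the drop from 50 to 26 steps simply reflects the smaller training set rather than a change in batch configuration.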

train_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 1.0,
-    "total_flos": 1.13782877585408e+17,
-    "train_loss": 5.960167303085327,
-    "train_runtime": 999.5097,
-    "train_samples": 3195,
-    "train_samples_per_second": 3.197,
-    "train_steps_per_second": 0.05
 }

 {
     "epoch": 1.0,
+    "total_flos": 6.191745166684979e+16,
+    "train_loss": 6.366142969865066,
+    "train_runtime": 588.67,
+    "train_samples": 1661,
+    "train_samples_per_second": 2.822,
+    "train_steps_per_second": 0.044
 }
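
The reported `train_loss` appears to be the average of the per-step losses. As a rough check, the sketch below averages the 26 `loss` values logged for the new run in `trainer_state.json` (copied by hand from the diff that follows, including the unchanged `5.4898` entry at step 25) and compares the result with the reported value. The tolerance is an arbitrary choice that allows for the four-decimal rounding of the logged losses.

```python
from statistics import fmean

# Per-step "loss" values for the new run (steps 1..26), copied from
# the log_history entries in trainer_state.json.
step_losses = [
    6.8089, 6.8084, 6.8417, 6.8013, 6.7958, 6.7356, 6.7095, 6.6932,
    6.8341, 6.4932, 6.4684, 6.4583, 6.465, 6.382, 6.4209, 6.3742,
    6.2748, 6.3653, 6.3297, 6.3471, 6.3124, 5.7721, 5.547, 5.4868,
    5.4898, 5.5044,
]

# train_loss as reported in train_results.json for the new run.
reported_train_loss = 6.366142969865066

mean_step_loss = fmean(step_losses)
difference = abs(mean_step_loss - reported_train_loss)

print(f"mean of logged losses: {mean_step_loss:.6f}")
print(f"reported train_loss:   {reported_train_loss:.6f}")
print(f"within tolerance:      {difference < 1e-4}")
```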

trainer_state.json CHANGED

@@ -3,373 +3,205 @@
   "best_model_checkpoint": null,
   "epoch": 1.0,
   "eval_steps": 50,
-  "global_step": 50,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
   "log_history": [
     {
-      "epoch": 0.02,
-      "grad_norm": 725.3245239257812,
-      "learning_rate": 6.666666666666665e-08,
-      "loss": 6.7377,
       "step": 1
     },
     {
-      "epoch": 0.04,
-      "grad_norm": 725.6357421875,
-      "learning_rate": 1.333333333333333e-07,
-      "loss": 6.7392,
       "step": 2
     },
     {
-      "epoch": 0.06,
-      "grad_norm": 727.9962768554688,
-      "learning_rate": 2e-07,
-      "loss": 6.8578,
       "step": 3
     },
     {
-      "epoch": 0.08,
-      "grad_norm": 725.8609008789062,
-      "learning_rate": 1.997766878623153e-07,
-      "loss": 6.747,
       "step": 4
     },
     {
-      "epoch": 0.1,
-      "grad_norm": 723.0309448242188,
-      "learning_rate": 1.99107748815478e-07,
-      "loss": 6.6956,
       "step": 5
     },
     {
-      "epoch": 0.12,
-      "grad_norm": 724.730224609375,
-      "learning_rate": 1.9799617050365866e-07,
-      "loss": 6.7179,
       "step": 6
     },
     {
-      "epoch": 0.14,
-      "grad_norm": 737.9241333007812,
-      "learning_rate": 1.9644691750543764e-07,
-      "loss": 6.7365,
       "step": 7
     },
     {
-      "epoch": 0.16,
-      "grad_norm": 719.1249389648438,
-      "learning_rate": 1.9446690916079187e-07,
-      "loss": 6.6703,
       "step": 8
     },
     {
-      "epoch": 0.18,
-      "grad_norm": 717.1757202148438,
-      "learning_rate": 1.9206498866764286e-07,
-      "loss": 6.6361,
       "step": 9
     },
     {
-      "epoch": 0.2,
-      "grad_norm": 715.4703369140625,
-      "learning_rate": 1.8925188358598812e-07,
-      "loss": 6.6534,
       "step": 10
     },
     {
-      "epoch": 0.22,
-      "grad_norm": 715.7579345703125,
-      "learning_rate": 1.8604015792601393e-07,
-      "loss": 6.6334,
       "step": 11
     },
     {
-      "epoch": 0.24,
-      "grad_norm": 714.6165161132812,
-      "learning_rate": 1.8244415603417603e-07,
-      "loss": 6.6195,
       "step": 12
     },
     {
-      "epoch": 0.26,
-      "grad_norm": 699.39990234375,
-      "learning_rate": 1.784799385278661e-07,
-      "loss": 6.4717,
       "step": 13
     },
     {
-      "epoch": 0.28,
-      "grad_norm": 690.8859252929688,
-      "learning_rate": 1.7416521056479576e-07,
-      "loss": 6.473,
       "step": 14
     },
     {
-      "epoch": 0.3,
-      "grad_norm": 698.036865234375,
-      "learning_rate": 1.6951924276746424e-07,
-      "loss": 6.4254,
       "step": 15
     },
     {
-      "epoch": 0.32,
-      "grad_norm": 697.6669921875,
-      "learning_rate": 1.6456278515588022e-07,
-      "loss": 6.3941,
       "step": 16
     },
     {
-      "epoch": 0.34,
-      "grad_norm": 699.1641235351562,
-      "learning_rate": 1.593179744729355e-07,
-      "loss": 6.4053,
       "step": 17
     },
     {
-      "epoch": 0.36,
-      "grad_norm": 699.9284057617188,
-      "learning_rate": 1.5380823531633727e-07,
-      "loss": 6.4122,
       "step": 18
     },
     {
-      "epoch": 0.38,
-      "grad_norm": 690.093017578125,
-      "learning_rate": 1.4805817551866838e-07,
-      "loss": 6.3299,
       "step": 19
     },
     {
-      "epoch": 0.4,
-      "grad_norm": 682.5932006835938,
-      "learning_rate": 1.420934762428335e-07,
-      "loss": 6.3191,
       "step": 20
     },
     {
-      "epoch": 0.42,
-      "grad_norm": 689.18896484375,
-      "learning_rate": 1.3594077728375127e-07,
-      "loss": 6.3315,
       "step": 21
     },
     {
-      "epoch": 0.44,
-      "grad_norm": 685.6888427734375,
-      "learning_rate": 1.296275580885634e-07,
-      "loss": 6.2934,
       "step": 22
     },
     {
-      "epoch": 0.46,
-      "grad_norm": 680.367431640625,
-      "learning_rate": 1.2318201502675282e-07,
-      "loss": 6.233,
       "step": 23
     },
     {
-      "epoch": 0.48,
-      "grad_norm": 684.5553588867188,
-      "learning_rate": 1.1663293545831301e-07,
-      "loss": 6.2894,
       "step": 24
     },
     {
-      "epoch": 0.5,
-      "grad_norm": 673.451904296875,
-      "learning_rate": 1.1000956916240985e-07,
-      "loss": 6.0602,
-      "step": 25
-    },
-    {
-      "epoch": 0.52,
-      "grad_norm": 616.918701171875,
-      "learning_rate": 1.0334149770076746e-07,
-      "loss": 5.5602,
-      "step": 26
-    },
-    {
-      "epoch": 0.54,
-      "grad_norm": 597.7125244140625,
-      "learning_rate": 9.665850229923257e-08,
       "loss": 5.4898,
-      "step": 27
-    },
-    {
-      "epoch": 0.56,
-      "grad_norm": 596.3068237304688,
-      "learning_rate": 8.999043083759016e-08,
-      "loss": 5.5096,
-      "step": 28
-    },
-    {
-      "epoch": 0.58,
-      "grad_norm": 598.629638671875,
-      "learning_rate": 8.3367064541687e-08,
-      "loss": 5.4643,
-      "step": 29
-    },
-    {
-      "epoch": 0.6,
-      "grad_norm": 597.4500122070312,
-      "learning_rate": 7.681798497324716e-08,
-      "loss": 5.4378,
-      "step": 30
-    },
-    {
-      "epoch": 0.62,
-      "grad_norm": 595.5535278320312,
-      "learning_rate": 7.037244191143661e-08,
-      "loss": 5.4292,
-      "step": 31
-    },
-    {
-      "epoch": 0.64,
-      "grad_norm": 591.805419921875,
-      "learning_rate": 6.405922271624873e-08,
-      "loss": 5.4496,
-      "step": 32
-    },
-    {
-      "epoch": 0.66,
-      "grad_norm": 590.4281005859375,
-      "learning_rate": 5.790652375716652e-08,
-      "loss": 5.3804,
-      "step": 33
-    },
-    {
-      "epoch": 0.68,
-      "grad_norm": 592.5811157226562,
-      "learning_rate": 5.194182448133162e-08,
-      "loss": 5.4223,
-      "step": 34
-    },
-    {
-      "epoch": 0.7,
-      "grad_norm": 597.3646240234375,
-      "learning_rate": 4.6191764683662737e-08,
-      "loss": 5.4356,
-      "step": 35
-    },
-    {
-      "epoch": 0.72,
-      "grad_norm": 596.3975219726562,
-      "learning_rate": 4.0682025527064476e-08,
-      "loss": 5.4306,
-      "step": 36
-    },
-    {
-      "epoch": 0.74,
-      "grad_norm": 601.251953125,
-      "learning_rate": 3.543721484411975e-08,
-      "loss": 5.4372,
-      "step": 37
-    },
-    {
-      "epoch": 0.76,
-      "grad_norm": 589.8883056640625,
-      "learning_rate": 3.048075723253577e-08,
-      "loss": 5.366,
-      "step": 38
-    },
-    {
-      "epoch": 0.78,
-      "grad_norm": 588.721923828125,
-      "learning_rate": 2.583478943520424e-08,
-      "loss": 5.3937,
-      "step": 39
-    },
-    {
-      "epoch": 0.8,
-      "grad_norm": 589.63232421875,
-      "learning_rate": 2.1520061472133898e-08,
-      "loss": 5.3778,
-      "step": 40
-    },
-    {
-      "epoch": 0.82,
-      "grad_norm": 586.4066772460938,
-      "learning_rate": 1.7555843965823992e-08,
-      "loss": 5.3718,
-      "step": 41
-    },
-    {
-      "epoch": 0.84,
-      "grad_norm": 587.221923828125,
-      "learning_rate": 1.3959842073986083e-08,
-      "loss": 5.3872,
-      "step": 42
-    },
-    {
-      "epoch": 0.86,
-      "grad_norm": 583.54296875,
-      "learning_rate": 1.0748116414011887e-08,
-      "loss": 5.3537,
-      "step": 43
-    },
-    {
-      "epoch": 0.88,
-      "grad_norm": 585.6083374023438,
-      "learning_rate": 7.935011332357112e-09,
-      "loss": 5.3518,
-      "step": 44
-    },
-    {
-      "epoch": 0.9,
-      "grad_norm": 587.9339599609375,
-      "learning_rate": 5.5330908392081325e-09,
-      "loss": 5.3518,
-      "step": 45
-    },
-    {
-      "epoch": 0.92,
-      "grad_norm": 588.0381469726562,
-      "learning_rate": 3.553082494562354e-09,
-      "loss": 5.3738,
-      "step": 46
-    },
-    {
-      "epoch": 0.94,
-      "grad_norm": 585.6659545898438,
-      "learning_rate": 2.003829496341325e-09,
-      "loss": 5.3698,
-      "step": 47
-    },
-    {
-      "epoch": 0.96,
-      "grad_norm": 584.0103149414062,
-      "learning_rate": 8.922511845219971e-10,
-      "loss": 5.3456,
-      "step": 48
-    },
-    {
-      "epoch": 0.98,
-      "grad_norm": 584.41796875,
-      "learning_rate": 2.233121376846836e-10,
-      "loss": 5.3216,
-      "step": 49
     },
     {
       "epoch": 1.0,
-      "grad_norm": 586.73974609375,
       "learning_rate": 0.0,
-      "loss": 5.3149,
-      "step": 50
     },
     {
       "epoch": 1.0,
-      "step": 50,
-      "total_flos": 1.13782877585408e+17,
-      "train_loss": 5.960167303085327,
-      "train_runtime": 999.5097,
-      "train_samples_per_second": 3.197,
-      "train_steps_per_second": 0.05
     }
   ],
   "logging_steps": 1,
-  "max_steps": 50,
   "num_input_tokens_seen": 0,
   "num_train_epochs": 1,
   "save_steps": 100,
@@ -385,7 +217,7 @@
       "attributes": {}
     }
   },
-  "total_flos": 1.13782877585408e+17,
   "train_batch_size": 4,
   "trial_name": null,
   "trial_params": null

   "best_model_checkpoint": null,
   "epoch": 1.0,
   "eval_steps": 50,
+  "global_step": 26,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
   "log_history": [
     {
+      "epoch": 0.038461538461538464,
+      "grad_norm": 779.536865234375,
+      "learning_rate": 1.5e-07,
+      "loss": 6.8089,
       "step": 1
     },
     {
+      "epoch": 0.07692307692307693,
+      "grad_norm": 779.4625854492188,
+      "learning_rate": 3e-07,
+      "loss": 6.8084,
       "step": 2
     },
     {
+      "epoch": 0.11538461538461539,
+      "grad_norm": 780.5224609375,
+      "learning_rate": 2.987167292060716e-07,
+      "loss": 6.8417,
       "step": 3
     },
     {
+      "epoch": 0.15384615384615385,
+      "grad_norm": 779.1381225585938,
+      "learning_rate": 2.9488887394336024e-07,
+      "loss": 6.8013,
       "step": 4
     },
     {
+      "epoch": 0.19230769230769232,
+      "grad_norm": 777.2201538085938,
+      "learning_rate": 2.88581929876693e-07,
+      "loss": 6.7958,
       "step": 5
     },
     {
+      "epoch": 0.23076923076923078,
+      "grad_norm": 772.9076538085938,
+      "learning_rate": 2.799038105676658e-07,
+      "loss": 6.7356,
       "step": 6
     },
     {
+      "epoch": 0.2692307692307692,
+      "grad_norm": 771.179931640625,
+      "learning_rate": 2.6900300104368527e-07,
+      "loss": 6.7095,
       "step": 7
     },
     {
+      "epoch": 0.3076923076923077,
+      "grad_norm": 768.6026000976562,
+      "learning_rate": 2.560660171779821e-07,
+      "loss": 6.6932,
       "step": 8
     },
     {
+      "epoch": 0.34615384615384615,
+      "grad_norm": 765.8309326171875,
+      "learning_rate": 2.413142143513081e-07,
+      "loss": 6.8341,
       "step": 9
     },
     {
+      "epoch": 0.38461538461538464,
+      "grad_norm": 750.5145263671875,
+      "learning_rate": 2.25e-07,
+      "loss": 6.4932,
       "step": 10
     },
     {
+      "epoch": 0.4230769230769231,
+      "grad_norm": 748.4296264648438,
+      "learning_rate": 2.0740251485476348e-07,
+      "loss": 6.4684,
       "step": 11
     },
     {
+      "epoch": 0.46153846153846156,
+      "grad_norm": 748.2885131835938,
+      "learning_rate": 1.888228567653781e-07,
+      "loss": 6.4583,
       "step": 12
     },
     {
+      "epoch": 0.5,
+      "grad_norm": 753.1572265625,
+      "learning_rate": 1.6957892883300777e-07,
+      "loss": 6.465,
       "step": 13
     },
     {
+      "epoch": 0.5384615384615384,
+      "grad_norm": 742.08349609375,
+      "learning_rate": 1.5e-07,
+      "loss": 6.382,
       "step": 14
     },
     {
+      "epoch": 0.5769230769230769,
+      "grad_norm": 737.9681396484375,
+      "learning_rate": 1.304210711669923e-07,
+      "loss": 6.4209,
       "step": 15
     },
     {
+      "epoch": 0.6153846153846154,
+      "grad_norm": 732.2109985351562,
+      "learning_rate": 1.1117714323462187e-07,
+      "loss": 6.3742,
       "step": 16
     },
     {
+      "epoch": 0.6538461538461539,
+      "grad_norm": 735.941162109375,
+      "learning_rate": 9.259748514523654e-08,
+      "loss": 6.2748,
       "step": 17
     },
     {
+      "epoch": 0.6923076923076923,
+      "grad_norm": 739.151611328125,
+      "learning_rate": 7.500000000000004e-08,
+      "loss": 6.3653,
       "step": 18
     },
     {
+      "epoch": 0.7307692307692307,
+      "grad_norm": 733.9830322265625,
+      "learning_rate": 5.86857856486919e-08,
+      "loss": 6.3297,
       "step": 19
     },
     {
+      "epoch": 0.7692307692307693,
+      "grad_norm": 731.849609375,
+      "learning_rate": 4.3933982822017876e-08,
+      "loss": 6.3471,
       "step": 20
     },
     {
+      "epoch": 0.8076923076923077,
+      "grad_norm": 740.6678466796875,
+      "learning_rate": 3.099699895631474e-08,
+      "loss": 6.3124,
       "step": 21
     },
     {
+      "epoch": 0.8461538461538461,
+      "grad_norm": 676.9359130859375,
+      "learning_rate": 2.0096189432334193e-08,
+      "loss": 5.7721,
       "step": 22
     },
     {
+      "epoch": 0.8846153846153846,
+      "grad_norm": 656.2088623046875,
+      "learning_rate": 1.141807012330699e-08,
+      "loss": 5.547,
       "step": 23
     },
     {
+      "epoch": 0.9230769230769231,
+      "grad_norm": 655.7007446289062,
+      "learning_rate": 5.1111260566397696e-09,
+      "loss": 5.4868,
       "step": 24
     },
     {
+      "epoch": 0.9615384615384616,
+      "grad_norm": 654.3903198242188,
+      "learning_rate": 1.2832707939284427e-09,
       "loss": 5.4898,
+      "step": 25
     },
     {
       "epoch": 1.0,
+      "grad_norm": 651.3972778320312,
       "learning_rate": 0.0,
+      "loss": 5.5044,
+      "step": 26
     },
     {
       "epoch": 1.0,
+      "step": 26,
+      "total_flos": 6.191745166684979e+16,
+      "train_loss": 6.366142969865066,
+      "train_runtime": 588.67,
+      "train_samples_per_second": 2.822,
+      "train_steps_per_second": 0.044
     }
   ],
   "logging_steps": 1,
+  "max_steps": 26,
   "num_input_tokens_seen": 0,
   "num_train_epochs": 1,
   "save_steps": 100,
       "attributes": {}
     }
   },
+  "total_flos": 6.191745166684979e+16,
   "train_batch_size": 4,
   "trial_name": null,
   "trial_params": null
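
The logged `learning_rate` values in both runs are consistent with a linear warmup followed by cosine decay to zero. The sketch below reproduces them under that assumption. The peak learning rates (2e-07 before, 3e-07 after) and warmup lengths (3 steps before, 2 steps after) are inferred from the log entries, not stated anywhere in this commit, and `lr_at` is a hypothetical helper written for this check.

```python
import math


def lr_at(step, peak_lr, warmup_steps, total_steps):
    """Learning rate logged at `step`, assuming linear warmup then cosine decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup steps.
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Inferred (not documented) schedule parameters for the two runs.
before = {"peak_lr": 2e-07, "warmup_steps": 3, "total_steps": 50}
after = {"peak_lr": 3e-07, "warmup_steps": 2, "total_steps": 26}

# A few (step, logged learning_rate) pairs copied from log_history.
logged_before = [(1, 6.666666666666665e-08), (4, 1.997766878623153e-07),
                 (49, 2.233121376846836e-10), (50, 0.0)]
logged_after = [(1, 1.5e-07), (3, 2.987167292060716e-07),
                (10, 2.25e-07), (18, 7.500000000000004e-08), (26, 0.0)]

for label, params, logged in (("before", before, logged_before),
                              ("after", after, logged_after)):
    for step, expected in logged:
        got = lr_at(step, **params)
        matches = math.isclose(got, expected, rel_tol=1e-6, abs_tol=1e-18)
        print(f"{label} step {step:2d}: {got:.6e} (logged {expected:.6e}) match={matches}")
```

If the inference holds, the new run used a higher peak learning rate and one fewer warmup step than the old one, in addition to training on fewer samples.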