Leeyuyu committed on Aug 7, 2025

Commit fa34124 (verified) · 1 Parent(s): 0d26a6b

Model save

Files changed (4)

README.md +2 -5
all_results.json +4 -4
train_results.json +4 -4
trainer_state.json +82 -82

README.md CHANGED

@@ -1,11 +1,8 @@
 ---
-datasets: Leeyuyu/fundo_600
 library_name: transformers
 model_name: Qwen2.5-SFT2-GRPO-fundo-nothink
 tags:
 - generated_from_trainer
-- R1-V
-- balanced-filtered-0-2-100pct-others-20pct
 - trl
 - sft
 licence: license
@@ -13,7 +10,7 @@ licence: license
 # Model Card for Qwen2.5-SFT2-GRPO-fundo-nothink
-This model is a fine-tuned version of [None](https://huggingface.co/None) on the [Leeyuyu/fundo_600](https://huggingface.co/datasets/Leeyuyu/fundo_600) dataset.
+This model is a fine-tuned version of [None](https://huggingface.co/None).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start
@@ -29,7 +26,7 @@ print(output["generated_text"])
 ## Training procedure
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/brightlight720720_lee/huggingface/runs/tc404qff)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/brightlight720720_lee/huggingface/runs/womcol2v)
 This model was trained with SFT.

all_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 1.0,
-    "total_flos": 6.191745166684979e+16,
-    "train_loss": 2.359536478152642,
-    "train_runtime": 593.7041,
+    "total_flos": 6.212373250362573e+16,
+    "train_loss": 1.2203706781594799,
+    "train_runtime": 584.4858,
     "train_samples": 1661,
-    "train_samples_per_second": 2.798,
+    "train_samples_per_second": 2.842,
     "train_steps_per_second": 0.044
 }
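
The throughput fields in all_results.json can be cross-checked against the other values in the file. The sketch below assumes that train_samples_per_second is train_samples divided by train_runtime, and that train_steps_per_second is the step count (26, taken from trainer_state.json) divided by train_runtime, each rounded to three decimals. The helper name `throughput` is illustrative only and is not part of any library.

```python
def throughput(train_samples, train_runtime, steps):
    """Recompute throughput figures from raw counts (assumed definitions)."""
    samples_per_second = round(train_samples / train_runtime, 3)
    steps_per_second = round(steps / train_runtime, 3)
    return samples_per_second, steps_per_second

# Values before and after this commit; 26 is the final step in trainer_state.json.
old = throughput(1661, 593.7041, 26)
new = throughput(1661, 584.4858, 26)
print("old:", old)
print("new:", new)
```

Under these assumptions the recomputed figures agree with the logged ones, so the small throughput gain follows directly from the shorter runtime rather than from a change in sample or step counts.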

train_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 1.0,
-    "total_flos": 6.191745166684979e+16,
-    "train_loss": 2.359536478152642,
-    "train_runtime": 593.7041,
+    "total_flos": 6.212373250362573e+16,
+    "train_loss": 1.2203706781594799,
+    "train_runtime": 584.4858,
     "train_samples": 1661,
-    "train_samples_per_second": 2.798,
+    "train_samples_per_second": 2.842,
     "train_steps_per_second": 0.044
 }
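
The train_loss summary appears to be the average of the per-step losses logged in trainer_state.json. The sketch below checks that reading; it is an assumption about how the summary is computed, not something this commit states. The per-step values are rounded to four decimals in the log, so only approximate agreement is expected.

```python
# Per-step "loss" values copied from the log_history in trainer_state.json.
old_losses = [
    6.8089, 6.8084, 6.6899, 5.3962, 5.056, 2.8926, 2.6136, 1.9375, 1.8646,
    1.803, 1.7388, 1.6809, 1.6406, 1.5987, 1.3709, 1.2417, 1.1567, 1.1046,
    1.0653, 1.0332, 1.007, 0.9884, 0.9728, 0.9635, 0.9585, 0.9556,
]
new_losses = [
    6.788, 6.7875, 6.3773, 2.9091, 1.9227, 1.7623, 1.3012, 1.1133, 0.9558,
    0.6683, 0.3948, 0.2383, 0.12, 0.0766, 0.0501, 0.0352, 0.0303, 0.0273,
    0.0248, 0.0232, 0.0218, 0.0209, 0.0207, 0.0202, 0.0198, 0.0199,
]

# Plain arithmetic mean over the 26 logged steps.
old_mean = sum(old_losses) / len(old_losses)
new_mean = sum(new_losses) / len(new_losses)

# Compare against the reported train_loss before and after the commit.
print(old_mean, "vs reported", 2.359536478152642)
print(new_mean, "vs reported", 1.2203706781594799)
```

Because the summary averages over every step, the large losses of the first few steps dominate it: the new run ends near 0.02 per step but still reports a train_loss above 1.2.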

trainer_state.json CHANGED

@@ -10,193 +10,193 @@
   "log_history": [
     {
       "epoch": 0.038461538461538464,
-      "grad_norm": 779.4563598632812,
-      "learning_rate": 1.5e-06,
-      "loss": 6.8089,
+      "grad_norm": 775.9435424804688,
+      "learning_rate": 3.3333333333333333e-06,
+      "loss": 6.788,
       "step": 1
     },
     {
       "epoch": 0.07692307692307693,
-      "grad_norm": 779.59716796875,
-      "learning_rate": 3e-06,
-      "loss": 6.8084,
+      "grad_norm": 776.0359497070312,
+      "learning_rate": 6.666666666666667e-06,
+      "loss": 6.7875,
       "step": 2
     },
     {
       "epoch": 0.11538461538461539,
-      "grad_norm": 764.6455688476562,
-      "learning_rate": 2.9871672920607156e-06,
-      "loss": 6.6899,
+      "grad_norm": 741.0389404296875,
+      "learning_rate": 1e-05,
+      "loss": 6.3773,
       "step": 3
     },
     {
       "epoch": 0.15384615384615385,
-      "grad_norm": 647.0328369140625,
-      "learning_rate": 2.9488887394336023e-06,
-      "loss": 5.3962,
+      "grad_norm": 272.2483825683594,
+      "learning_rate": 9.953429730181653e-06,
+      "loss": 2.9091,
       "step": 4
     },
     {
       "epoch": 0.19230769230769232,
-      "grad_norm": 602.569091796875,
-      "learning_rate": 2.88581929876693e-06,
-      "loss": 5.056,
+      "grad_norm": 17.248239517211914,
+      "learning_rate": 9.814586436738998e-06,
+      "loss": 1.9227,
       "step": 5
     },
     {
       "epoch": 0.23076923076923078,
-      "grad_norm": 269.05499267578125,
-      "learning_rate": 2.7990381056766585e-06,
-      "loss": 2.8926,
+      "grad_norm": 13.99523639678955,
+      "learning_rate": 9.586056507527266e-06,
+      "loss": 1.7623,
       "step": 6
     },
     {
       "epoch": 0.2692307692307692,
-      "grad_norm": 197.8215789794922,
-      "learning_rate": 2.690030010436853e-06,
-      "loss": 2.6136,
+      "grad_norm": 7.265631675720215,
+      "learning_rate": 9.272097022732444e-06,
+      "loss": 1.3012,
       "step": 7
     },
     {
       "epoch": 0.3076923076923077,
-      "grad_norm": 17.3836727142334,
-      "learning_rate": 2.5606601717798212e-06,
-      "loss": 1.9375,
+      "grad_norm": 6.176580429077148,
+      "learning_rate": 8.8785564535221e-06,
+      "loss": 1.1133,
       "step": 8
     },
     {
       "epoch": 0.34615384615384615,
-      "grad_norm": 16.178003311157227,
-      "learning_rate": 2.4131421435130812e-06,
-      "loss": 1.8646,
+      "grad_norm": 13.723352432250977,
+      "learning_rate": 8.412765716093273e-06,
+      "loss": 0.9558,
       "step": 9
     },
     {
       "epoch": 0.38461538461538464,
-      "grad_norm": 14.688905715942383,
-      "learning_rate": 2.25e-06,
-      "loss": 1.803,
+      "grad_norm": 7.0516743659973145,
+      "learning_rate": 7.883401610574338e-06,
+      "loss": 0.6683,
       "step": 10
     },
     {
       "epoch": 0.4230769230769231,
-      "grad_norm": 13.383400917053223,
-      "learning_rate": 2.074025148547635e-06,
-      "loss": 1.7388,
+      "grad_norm": 4.867112636566162,
+      "learning_rate": 7.300325188655762e-06,
+      "loss": 0.3948,
       "step": 11
     },
     {
       "epoch": 0.46153846153846156,
-      "grad_norm": 12.454666137695312,
-      "learning_rate": 1.888228567653781e-06,
-      "loss": 1.6809,
+      "grad_norm": 4.434175491333008,
+      "learning_rate": 6.674398060854931e-06,
+      "loss": 0.2383,
       "step": 12
     },
     {
       "epoch": 0.5,
-      "grad_norm": 11.727381706237793,
-      "learning_rate": 1.6957892883300778e-06,
-      "loss": 1.6406,
+      "grad_norm": 6.5230712890625,
+      "learning_rate": 6.0172800652631706e-06,
+      "loss": 0.12,
       "step": 13
     },
     {
       "epoch": 0.5384615384615384,
-      "grad_norm": 11.150338172912598,
-      "learning_rate": 1.5e-06,
-      "loss": 1.5987,
+      "grad_norm": 1.5070806741714478,
+      "learning_rate": 5.341212066823356e-06,
+      "loss": 0.0766,
       "step": 14
     },
     {
       "epoch": 0.5769230769230769,
-      "grad_norm": 9.359735488891602,
-      "learning_rate": 1.304210711669923e-06,
-      "loss": 1.3709,
+      "grad_norm": 0.9546927809715271,
+      "learning_rate": 4.6587879331766465e-06,
+      "loss": 0.0501,
       "step": 15
     },
     {
       "epoch": 0.6153846153846154,
-      "grad_norm": 8.161463737487793,
-      "learning_rate": 1.1117714323462188e-06,
-      "loss": 1.2417,
+      "grad_norm": 1.0180060863494873,
+      "learning_rate": 3.982719934736832e-06,
+      "loss": 0.0352,
       "step": 16
     },
     {
       "epoch": 0.6538461538461539,
-      "grad_norm": 6.2837347984313965,
-      "learning_rate": 9.259748514523654e-07,
-      "loss": 1.1567,
+      "grad_norm": 0.9739271998405457,
+      "learning_rate": 3.3256019391450696e-06,
+      "loss": 0.0303,
       "step": 17
     },
     {
       "epoch": 0.6923076923076923,
-      "grad_norm": 6.563652992248535,
-      "learning_rate": 7.500000000000003e-07,
-      "loss": 1.1046,
+      "grad_norm": 0.6057882905006409,
+      "learning_rate": 2.6996748113442397e-06,
+      "loss": 0.0273,
       "step": 18
     },
     {
       "epoch": 0.7307692307692307,
-      "grad_norm": 7.914792060852051,
-      "learning_rate": 5.868578564869191e-07,
-      "loss": 1.0653,
+      "grad_norm": 1.0546596050262451,
+      "learning_rate": 2.1165983894256647e-06,
+      "loss": 0.0248,
       "step": 19
     },
     {
       "epoch": 0.7692307692307693,
-      "grad_norm": 6.446779251098633,
-      "learning_rate": 4.3933982822017883e-07,
-      "loss": 1.0332,
+      "grad_norm": 1.1213421821594238,
+      "learning_rate": 1.5872342839067305e-06,
+      "loss": 0.0232,
       "step": 20
     },
     {
       "epoch": 0.8076923076923077,
-      "grad_norm": 6.5320000648498535,
-      "learning_rate": 3.0996998956314745e-07,
-      "loss": 1.007,
+      "grad_norm": 0.8204516172409058,
+      "learning_rate": 1.1214435464779006e-06,
+      "loss": 0.0218,
       "step": 21
     },
     {
       "epoch": 0.8461538461538461,
-      "grad_norm": 5.943300247192383,
-      "learning_rate": 2.0096189432334195e-07,
-      "loss": 0.9884,
+      "grad_norm": 0.41814181208610535,
+      "learning_rate": 7.279029772675572e-07,
+      "loss": 0.0209,
       "step": 22
     },
     {
       "epoch": 0.8846153846153846,
-      "grad_norm": 5.847461700439453,
-      "learning_rate": 1.141807012330699e-07,
-      "loss": 0.9728,
+      "grad_norm": 0.59604811668396,
+      "learning_rate": 4.139434924727359e-07,
+      "loss": 0.0207,
       "step": 23
     },
     {
       "epoch": 0.9230769230769231,
-      "grad_norm": 5.818417549133301,
-      "learning_rate": 5.11112605663977e-08,
-      "loss": 0.9635,
+      "grad_norm": 0.43155673146247864,
+      "learning_rate": 1.8541356326100436e-07,
+      "loss": 0.0202,
       "step": 24
     },
     {
       "epoch": 0.9615384615384616,
-      "grad_norm": 5.898157119750977,
-      "learning_rate": 1.2832707939284426e-08,
-      "loss": 0.9585,
+      "grad_norm": 0.3965984284877777,
+      "learning_rate": 4.657026981834623e-08,
+      "loss": 0.0198,
       "step": 25
     },
     {
       "epoch": 1.0,
-      "grad_norm": 6.020485877990723,
+      "grad_norm": 0.37188079953193665,
       "learning_rate": 0.0,
-      "loss": 0.9556,
+      "loss": 0.0199,
       "step": 26
     },
     {
       "epoch": 1.0,
       "step": 26,
-      "total_flos": 6.191745166684979e+16,
-      "train_loss": 2.359536478152642,
-      "train_runtime": 593.7041,
-      "train_samples_per_second": 2.798,
+      "total_flos": 6.212373250362573e+16,
+      "train_loss": 1.2203706781594799,
+      "train_runtime": 584.4858,
+      "train_samples_per_second": 2.842,
       "train_steps_per_second": 0.044
     }
   ],
@@ -217,7 +217,7 @@
       "attributes": {}
     }
   },
-  "total_flos": 6.191745166684979e+16,
+  "total_flos": 6.212373250362573e+16,
   "train_batch_size": 4,
   "trial_name": null,
   "trial_params": null
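
The logged learning_rate values look like a linear warmup followed by cosine decay to zero. The sketch below reproduces them under that assumption. The peak rates (3e-6 before, 1e-5 after) and warmup lengths (2 steps before, 3 steps after) are inferred from the numbers in log_history, not read from any training config in this commit, and the function name `lr_at` is illustrative only.

```python
import math

def lr_at(step, peak_lr, warmup_steps, total_steps):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed schedule)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Parameters inferred from the logs, not taken from a config file.
old = dict(peak_lr=3e-6, warmup_steps=2, total_steps=26)
new = dict(peak_lr=1e-5, warmup_steps=3, total_steps=26)

# Print a few steps to compare against the learning_rate entries above.
for step in (1, 4, 14, 26):
    print(step, lr_at(step, **old), lr_at(step, **new))
```

If this reading is right, the commit raises the peak learning rate by roughly a factor of three and lengthens warmup by one step, which is consistent with the loss falling faster in the new log (below 2.0 by step 5 rather than step 8).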