How to use with vLLM
Install from pip and serve the model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "taki555/Qwen3-0.6B-Art"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "taki555/Qwen3-0.6B-Art",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
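The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening at its default address (localhost:8000). The helper names `build_payload` and `ask` are hypothetical and chosen only for illustration.

```python
import json
import urllib.request

MODEL = "taki555/Qwen3-0.6B-Art"
# Assumption: the server started with `vllm serve` is reachable at its default address.
URL = "http://localhost:8000/v1/chat/completions"


def build_payload(prompt, model=MODEL):
    """Build the same JSON body as the curl example (hypothetical helper)."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def ask(prompt, url=URL, timeout=120):
    """POST the prompt to the OpenAI-compatible endpoint and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses carry the generated text in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("What is the capital of France?"))` should print the model's reply. Nothing is sent over the network until `ask` is called.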
Use Docker
docker model run hf.co/taki555/Qwen3-0.6B-Art

Qwen3-0.6B-Art

This is a Chain-of-Thought (CoT)-efficient version of the Qwen3-0.6B model, trained on the DeepScaleR-Easy dataset.

This model was introduced in the paper "The Art of Efficient Reasoning: Data, Reward, and Optimization" (https://arxiv.org/pdf/2602.20945). See the project page for more details.

Model Description

Large Language Models (LLMs) consistently benefit from scaled Chain-of-Thought (CoT) reasoning but also suffer from heavy computational overhead. Efficient reasoning aims to incentivize short yet accurate thinking trajectories, typically through reward shaping with Reinforcement Learning (RL).

This model follows a two-stage training paradigm: length adaptation and reasoning refinement. It is optimized to maintain a sufficient density of positive reward signals while avoiding the "short-is-correct" trap, demonstrating robust and generalized efficient reasoning capabilities.
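To make the idea of reward shaping concrete, here is a generic sketch of a length-aware reward. It is not the reward used in the paper: the function name, the `budget` and `alpha` parameters, and the linear penalty are all assumptions made for illustration. Incorrect answers receive zero reward regardless of length, so brevity alone is never rewarded.

```python
def length_aware_reward(is_correct, num_tokens, budget=4096, alpha=0.5):
    """Illustrative length-shaped reward (hypothetical, not the paper's formula).

    Correct answers earn 1.0 minus a linear penalty that grows with the
    fraction of the token budget used; incorrect answers always earn 0.0.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    if not is_correct:
        return 0.0
    # Clamp the used fraction of the budget to the range [0, 1].
    used = min(max(num_tokens, 0), budget) / budget
    return 1.0 - alpha * used
```

In this sketch, keeping `alpha` below 1 means that even a correct answer using the full budget still scores above any incorrect answer, which is one simple way to preserve positive reward signal while still favoring shorter trajectories.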

Citation

@inproceedings{wu2026art,
  title={The Art of Efficient Reasoning: Data, Reward, and Optimization},
  author={Taiqiang Wu and Zenan Xu and Bo Zhou and Ngai Wong},
  year={2026},
  url={https://arxiv.org/pdf/2602.20945}
}
Model size: 0.8B params (Safetensors)
Tensor type: BF16

Model tree for taki555/Qwen3-0.6B-Art

Base model: Qwen/Qwen3-0.6B
Finetuned: this model
