How to use from SGLang
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "dnotitia/Smoothie-Qwen3-32B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dnotitia/Smoothie-Qwen3-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "dnotitia/Smoothie-Qwen3-32B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dnotitia/Smoothie-Qwen3-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
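
The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is reachable at localhost:30000 and that the response follows the usual OpenAI-compatible shape (choices[0].message.content). The helper names build_chat_request and ask are illustrative and not part of SGLang.

```python
import json
import urllib.request

# Values taken from the curl example; adjust if the server runs elsewhere.
BASE_URL = "http://localhost:30000"
MODEL = "dnotitia/Smoothie-Qwen3-32B"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL):
    """Build a POST request for the chat completions endpoint (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the prompt and return the reply text.

    Assumes an OpenAI-compatible response body; requires a running server.
    """
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, ask("What is the capital of France?") should return the model's reply as a string.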

Smoothie Qwen

Smoothie Qwen is a lightweight adjustment tool that smooths token probabilities in Qwen and similar models to promote more balanced multilingual generation. For details, see https://github.com/dnotitia/smoothie-qwen.

Configuration

  • Base model: Qwen/Qwen3-32B
  • Minimum scale factor: 0.5
  • Smoothness: 10.0
  • Sample size: 1000
  • Window size: 4
  • N-gram weights: [0.5, 0.3, 0.2]

Unicode Ranges

  • Range 1: 0x4e00 - 0x9fff
  • Range 2: 0x3400 - 0x4dbf
  • Range 3: 0x20000 - 0x2a6df
  • Range 4: 0xf900 - 0xfaff
  • Range 5: 0x2e80 - 0x2eff
  • Range 6: 0x2f00 - 0x2fdf
  • Range 7: 0x2ff0 - 0x2fff
  • Range 8: 0x3000 - 0x303f
  • Range 9: 0x31c0 - 0x31ef
  • Range 10: 0x3200 - 0x32ff
  • Range 11: 0x3300 - 0x33ff
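
To make the ranges above concrete, here is a small sketch that checks whether a character's code point falls inside any of them. It only illustrates the listed bounds (treated as inclusive); it is not the token-selection logic of smoothie-qwen itself, and the names TARGET_RANGES, in_target_ranges, and target_fraction are hypothetical.

```python
# Code-point ranges copied from the list above (bounds treated as inclusive).
TARGET_RANGES = [
    (0x4E00, 0x9FFF),
    (0x3400, 0x4DBF),
    (0x20000, 0x2A6DF),
    (0xF900, 0xFAFF),
    (0x2E80, 0x2EFF),
    (0x2F00, 0x2FDF),
    (0x2FF0, 0x2FFF),
    (0x3000, 0x303F),
    (0x31C0, 0x31EF),
    (0x3200, 0x32FF),
    (0x3300, 0x33FF),
]


def in_target_ranges(ch):
    """Return True if the single character ch lies in one of the listed ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in TARGET_RANGES)


def target_fraction(text):
    """Return the share of characters in text that fall in the listed ranges."""
    if not text:
        return 0.0
    return sum(in_target_ranges(c) for c in text) / len(text)
```

A check like this can be used to estimate how much of a generated reply falls in the targeted ranges, for example by calling target_fraction on the model output.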

Statistics

  • Target tokens: 26,153
  • Broken tokens: 1,457
  • Modified tokens: 27,564

Model size: 33B params (Safetensors, tensor type BF16)