Qwen3.5-9B Regulated-Verse (格律诗) Model (Gelv Poet)
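A generation model for classical Chinese regulated verse (格律诗), fine-tuned from Qwen3.5-9B. It strictly follows the level/oblique (平仄) tonal rules and can compose five- and seven-character quatrains (绝句) and regulated verse (律诗).
Model features
- Strictly follows the 「二四六分明」 tonal rule (the tones of the 2nd, 4th, and 6th characters of each line must be clearly level or oblique as the pattern requires)
- Supports five-character quatrains (五言绝句), seven-character quatrains (七言绝句), five-character regulated verse (五言律诗), and seven-character regulated verse (七言律诗)
- In regulated verse, the second couplet (颔联) and third couplet (颈联) are written in strict parallelism
- Even-numbered lines rhyme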
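As a rough illustration of the 「二四六分明」 rule, the sketch below checks one necessary condition found in the standard tonal patterns: within a single line, the tones at positions 2, 4, and (for seven-character lines) 6 alternate between level and oblique. This is a simplified assumption, not the model's own evaluation code. The function name `check_246` and the `"P"`/`"Z"` labels are hypothetical, and the caller must supply the tone of each character, since no pronunciation dictionary is included.

```python
# Hypothetical helper: "P" = level tone (平), "Z" = oblique tone (仄).
# The caller supplies one label per character; no tone dictionary is bundled.

def check_246(tones: str) -> bool:
    """Return True if the tones at positions 2, 4 (and 6) alternate."""
    if len(tones) not in (5, 7) or set(tones) - {"P", "Z"}:
        raise ValueError("expected 5 or 7 tone labels drawn from 'P' and 'Z'")
    # 1-based positions 2, 4, 6 correspond to 0-based indices 1, 3, 5.
    key = [tones[i] for i in (1, 3, 5) if i < len(tones)]
    # Each key position must differ from the one before it.
    return all(a != b for a, b in zip(key, key[1:]))


print(check_246("PPZZPPZ"))  # 平平仄仄平平仄
print(check_246("ZZPPZ"))    # 仄仄平平仄
print(check_246("PPZPZ"))    # positions 2 and 4 are both level
```

A full checker would also need the rules that relate one line to the next, plus rhyme and parallelism, none of which are covered here.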
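Training details
- Base model: Qwen/Qwen3.5-9B
- Method: Unsloth + QLoRA (4-bit, LoRA r=64, alpha=64)
- Data: 24,525 strictly rule-compliant regulated-verse poems filtered from chinese-poetry, expanded into 110,060 training samples
- Hardware: RTX 4090 24GB
- Training length: 3 epochs (10,320 steps)
- Final eval_loss: 0.094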
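The step and sample counts above allow a quick back-of-the-envelope estimate of the effective batch size. The card does not state the batch size or how many samples were held out for evaluation, so the sketch assumes all 110,060 samples were seen in every epoch. Treat the result as an estimate, not a reported hyperparameter.

```python
import math

total_steps = 10_320   # from the card
epochs = 3             # from the card
samples = 110_060      # from the card; assumed to be the full training split

steps_per_epoch = total_steps // epochs
implied_batch = samples / steps_per_epoch

print(steps_per_epoch, round(implied_batch, 2))  # 3440 31.99

# Consistency check: an effective batch size of 32 would give this many steps.
print(math.ceil(samples / 32))  # 3440
```

Under that assumption the numbers are consistent with an effective batch size of about 32 (for example, a small per-device batch with gradient accumulation).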
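Evaluation results (structure score)
| Verse form | Score |
|---|---|
| Five-character quatrain (五言绝句) | 0.90 |
| Seven-character quatrain (七言绝句) | 1.00 |
| Five-character regulated verse (五言律诗) | 0.97 |
| Seven-character regulated verse (七言律诗) | 1.00 |
| Overall | 0.97 |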
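Usage
Python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "wnwu/Qwen3.5-9B-gelv-poet"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True,
)
model.eval()

# System prompt, kept in Chinese because the model was fine-tuned on Chinese prompts.
# Gist: "You are a poet who has mastered the tonal rules of classical Chinese verse,
# composes regulated verse and quatrains, and knows the 二四六分明 rule, parallelism, and rhyme."
SYSTEM_PROMPT = (
    "你是一位精通中国古典诗词格律的诗人。你严格遵循平仄格律规则,"
    "擅长创作五言律诗、七言律诗、五言绝句、七言绝句等格律诗。"
    "你熟知「二四六分明」的平仄规则,懂得对仗、押韵的要求。"
)

# User prompt gist: "Write a strictly rule-compliant seven-character regulated verse
# titled 「秋夜」 (Autumn Night). Output only the poem."
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "请以「秋夜」为题,写一首严格符合格律的七言律诗。\n\n请直接输出诗句。"},
]

# Render the chat template as text, with thinking mode disabled.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sample a completion without tracking gradients.
with torch.no_grad():
    outputs = model.generate(
        **inputs, max_new_tokens=256, temperature=0.7,
        top_p=0.9, top_k=50, do_sample=True, repetition_penalty=1.15,
    )

# Decode only the newly generated tokens, skipping the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))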
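To request other titles or verse forms, the prompt from the usage example can be wrapped in a small helper. The sketch below is an assumption layered on the card's example: `build_messages` and `FORMS` are hypothetical names, and the user-prompt wording is copied from the example on the assumption that the same template works for all four supported forms.

```python
# System prompt copied from the usage example in this card.
SYSTEM_PROMPT = (
    "你是一位精通中国古典诗词格律的诗人。你严格遵循平仄格律规则,"
    "擅长创作五言律诗、七言律诗、五言绝句、七言绝句等格律诗。"
    "你熟知「二四六分明」的平仄规则,懂得对仗、押韵的要求。"
)

# The four verse forms the card lists as supported.
FORMS = ("五言绝句", "七言绝句", "五言律诗", "七言律诗")


def build_messages(title: str, form: str) -> list[dict]:
    """Build chat messages asking for a poem with the given title and form."""
    if form not in FORMS:
        raise ValueError(f"unsupported form: {form!r}")
    user = f"请以「{title}」为题,写一首严格符合格律的{form}。\n\n请直接输出诗句。"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]


messages = build_messages("雪夜", "五言绝句")
print(messages[1]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages`.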
Interactive inference
python inference.py --model wnwu/Qwen3.5-9B-gelv-poet
Ollama (GGUF)
A GGUF Q8_0 quantized version can be downloaded from the gguf branch of this repository.
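Generated examples
Seven-character quatrain, 「塞下曲」 (Frontier Song):
寒塞無因見落梅,胡人吹入內宮來。 君恩如水東流去,應與春風豈復迴。
Five-character regulated verse, 「山居」 (Mountain Dwelling):
欲出還中止,微陰却快晴。 檻花栽盡活,籠鳥教初鳴。 身寄江湖久,心知富貴輕。 還家雖有日,遠宦尚餘生。
Five-character quatrain, 「雪夜」 (Snowy Night):
酒力欺寒淺,心清睡較遲。 梅花擎雪影,和月度疏籬。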
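A basic structural check on generated output can be sketched as follows: split the poem on punctuation, then confirm that the number of lines and the characters per line match one of the four forms. This covers line structure only, not tones, rhyme, or parallelism, and it is not the scoring code behind the evaluation table. The names `split_lines` and `detect_form` are hypothetical.

```python
import re

# (number of lines, characters per line) -> verse form
FORMS = {
    (4, 5): "五言绝句",
    (4, 7): "七言绝句",
    (8, 5): "五言律诗",
    (8, 7): "七言律诗",
}


def split_lines(poem: str) -> list[str]:
    """Split a poem into lines on Chinese punctuation and whitespace."""
    parts = re.split(r"[,。!?;\s]+", poem)
    return [p for p in parts if p]


def detect_form(poem: str):
    """Return the verse form name, or None if the structure fits no form."""
    lines = split_lines(poem)
    lengths = {len(line) for line in lines}
    if len(lengths) != 1:  # empty poem, or lines of unequal length
        return None
    return FORMS.get((len(lines), lengths.pop()))


print(detect_form("寒塞無因見落梅,胡人吹入內宮來。 君恩如水東流去,應與春風豈復迴。"))
print(detect_form("酒力欺寒淺,心清睡較遲。 梅花擎雪影,和月度疏籬。"))
```

Run on the three examples above, this identifies a seven-character quatrain, a five-character regulated verse, and a five-character quatrain respectively.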