Image-Text-to-Text
GGUF
English
Chinese
Korean
qwen3_5
unsloth
qwen
qwen3.5
reasoning
chain-of-thought
lora
competitive-programming
conversational
Instructions to use diogoxiang/Qwopus3.5-9B-v3-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use diogoxiang/Qwopus3.5-9B-v3-GGUF with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="diogoxiang/Qwopus3.5-9B-v3-GGUF",
    filename="Qwen3.5-9B.BF16.gguf",
)

llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                    },
                },
            ],
        }
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use diogoxiang/Qwopus3.5-9B-v3-GGUF with llama.cpp:
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
Use Docker
docker model run hf.co/diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use diogoxiang/Qwopus3.5-9B-v3-GGUF with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "diogoxiang/Qwopus3.5-9B-v3-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "diogoxiang/Qwopus3.5-9B-v3-GGUF",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image in one sentence." },
          { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
        ]
      }
    ]
  }'
Use Docker
docker model run hf.co/diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
- Ollama
How to use diogoxiang/Qwopus3.5-9B-v3-GGUF with Ollama:
ollama run hf.co/diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
- Unsloth Studio
How to use diogoxiang/Qwopus3.5-9B-v3-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for diogoxiang/Qwopus3.5-9B-v3-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for diogoxiang/Qwopus3.5-9B-v3-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for diogoxiang/Qwopus3.5-9B-v3-GGUF to start chatting
- Pi
How to use diogoxiang/Qwopus3.5-9B-v3-GGUF with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory: pi
- Hermes Agent
How to use diogoxiang/Qwopus3.5-9B-v3-GGUF with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
Run Hermes
hermes
- Atomic Chat
- Docker Model Runner
How to use diogoxiang/Qwopus3.5-9B-v3-GGUF with Docker Model Runner:
docker model run hf.co/diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
- Lemonade
How to use diogoxiang/Qwopus3.5-9B-v3-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull diogoxiang/Qwopus3.5-9B-v3-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.Qwopus3.5-9B-v3-GGUF-Q4_K_M
List all available models
lemonade list
Commit ·
e28cb40
0
Parent(s):
Duplicate from Jackrong/Qwopus3.5-9B-v3-GGUF
Co-authored-by: JIRONG <Jackrong@users.noreply.huggingface.co>
- .gitattributes +42 -0
- Qwen3.5-9B.BF16.gguf +3 -0
- Qwen3.5-9B.Q4_K_M.gguf +3 -0
- Qwen3.5-9B.Q5_K_S.gguf +3 -0
- Qwen3.5-9B.Q6_K.gguf +3 -0
- Qwen3.5-9B.Q8_0.gguf +3 -0
- README.md +178 -0
- config.json +114 -0
- mmproj.gguf +3 -0
.gitattributes
ADDED
|
@@ -0,0 +1,42 @@
| 1 |
+
*.7z filter=lfs diff=lfs merge=lfs -text
|
| 2 |
+
*.arrow filter=lfs diff=lfs merge=lfs -text
|
| 3 |
+
*.bin filter=lfs diff=lfs merge=lfs -text
|
| 4 |
+
*.bz2 filter=lfs diff=lfs merge=lfs -text
|
| 5 |
+
*.ckpt filter=lfs diff=lfs merge=lfs -text
|
| 6 |
+
*.ftz filter=lfs diff=lfs merge=lfs -text
|
| 7 |
+
*.gz filter=lfs diff=lfs merge=lfs -text
|
| 8 |
+
*.h5 filter=lfs diff=lfs merge=lfs -text
|
| 9 |
+
*.joblib filter=lfs diff=lfs merge=lfs -text
|
| 10 |
+
*.lfs.* filter=lfs diff=lfs merge=lfs -text
|
| 11 |
+
*.mlmodel filter=lfs diff=lfs merge=lfs -text
|
| 12 |
+
*.model filter=lfs diff=lfs merge=lfs -text
|
| 13 |
+
*.msgpack filter=lfs diff=lfs merge=lfs -text
|
| 14 |
+
*.npy filter=lfs diff=lfs merge=lfs -text
|
| 15 |
+
*.npz filter=lfs diff=lfs merge=lfs -text
|
| 16 |
+
*.onnx filter=lfs diff=lfs merge=lfs -text
|
| 17 |
+
*.ot filter=lfs diff=lfs merge=lfs -text
|
| 18 |
+
*.parquet filter=lfs diff=lfs merge=lfs -text
|
| 19 |
+
*.pb filter=lfs diff=lfs merge=lfs -text
|
| 20 |
+
*.pickle filter=lfs diff=lfs merge=lfs -text
|
| 21 |
+
*.pkl filter=lfs diff=lfs merge=lfs -text
|
| 22 |
+
*.pt filter=lfs diff=lfs merge=lfs -text
|
| 23 |
+
*.pth filter=lfs diff=lfs merge=lfs -text
|
| 24 |
+
*.rar filter=lfs diff=lfs merge=lfs -text
|
| 25 |
+
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
| 26 |
+
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
| 27 |
+
*.tar.* filter=lfs diff=lfs merge=lfs -text
|
| 28 |
+
*.tar filter=lfs diff=lfs merge=lfs -text
|
| 29 |
+
*.tflite filter=lfs diff=lfs merge=lfs -text
|
| 30 |
+
*.tgz filter=lfs diff=lfs merge=lfs -text
|
| 31 |
+
*.wasm filter=lfs diff=lfs merge=lfs -text
|
| 32 |
+
*.xz filter=lfs diff=lfs merge=lfs -text
|
| 33 |
+
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
+
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
+
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
+
Qwen3.5-9B.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
| 37 |
+
Qwen3.5-9B.BF16-mmproj.gguf filter=lfs diff=lfs merge=lfs -text
|
| 38 |
+
Qwen3.5-9B.Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
|
| 39 |
+
Qwen3.5-9B.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
|
| 40 |
+
Qwen3.5-9B.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
|
| 41 |
+
Qwen3.5-9B.BF16.gguf filter=lfs diff=lfs merge=lfs -text
|
| 42 |
+
mmproj.gguf filter=lfs diff=lfs merge=lfs -text
|
Qwen3.5-9B.BF16.gguf
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:491df77894117d41dacb86844a4b8f102063b269b53cdd0aae15791fb1931df2
|
| 3 |
+
size 17920693120
|
Qwen3.5-9B.Q4_K_M.gguf
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:19d52ddc3343880a3999451c159c3564ba83e96ec24659c0041a83fe4e9c37ec
|
| 3 |
+
size 5629105024
|
Qwen3.5-9B.Q5_K_S.gguf
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:edd985f4a8e13639af7ed52c062e27413a5c311a55deb77b7acf76346323fd84
|
| 3 |
+
size 6305305472
|
Qwen3.5-9B.Q6_K.gguf
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:6677a225f82501c54093581fad891a5f17440cc5cc12dd0609f0f5cf4ef729c8
|
| 3 |
+
size 7359255424
|
Qwen3.5-9B.Q8_0.gguf
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:3705c24f9de8b851bcc7a78f2e757e28937ebe612372cf8d06383d90a3a67252
|
| 3 |
+
size 9527497600
|
README.md
ADDED
|
@@ -0,0 +1,178 @@
| 1 |
+
---
|
| 2 |
+
language:
|
| 3 |
+
- en
|
| 4 |
+
- zh
|
| 5 |
+
- ko
|
| 6 |
+
license: apache-2.0
|
| 7 |
+
base_model: unsloth/Qwen3.5-9B
|
| 8 |
+
tags:
|
| 9 |
+
- unsloth
|
| 10 |
+
- qwen
|
| 11 |
+
- qwen3.5
|
| 12 |
+
- reasoning
|
| 13 |
+
- chain-of-thought
|
| 14 |
+
- lora
|
| 15 |
+
- competitive-programming
|
| 16 |
+
pipeline_tag: image-text-to-text
|
| 17 |
+
---
|
| 18 |
+
|
| 19 |
+
# 🌟 Qwopus3.5-9B-v3
|
| 20 |
+
|
| 21 |
+
## 💡 Model Introduction
|
| 22 |
+
**Qwopus3.5-9B-v3** is a reasoning-enhanced model based on Qwen3.5-9B. Its core objective is to simultaneously improve reasoning stability and correctness while optimizing inference efficiency, ultimately achieving stronger cross-task generalization capabilities—particularly in programming.
|
| 23 |
+
|
| 24 |
+
By continuing to optimize the fundamental structure of its reasoning process, alongside high-quality reasoning distillation and structural alignment, the model reaches higher accuracy through shorter, more stable reasoning paths.
|
| 25 |
+
|
| 26 |
+
---
|
| 27 |
+
|
| 28 |
+
### 🍎 Qwopus3.5-9B-v3: HumanEval Benchmark Evaluation
|
| 29 |
+
> Inference for all models was run in the Unsloth runtime environment using **bfloat16 (BF16)** precision, which provides a balance of numerical range and memory efficiency well-suited to 9B-scale inference. Answer verification, partial chain-of-thought adjudication, and statistical analysis were cross-validated using **GPT-4.5-Pro (Thinking)** and **Claude Opus 4.6 (Thinking)** to ensure accuracy and reproducibility of the evaluation outcomes.
|
| 30 |
+
|
| 31 |
+
**HumanEval**
|
| 32 |
+
I evaluated three 9B-scale Qwen-family models on the full 164-task HumanEval benchmark under a task-level adjudication protocol that uses the raw generations to resolve code-extraction pollution, answer/code separation issues, and clearly inferable truncated outputs. Under this strict evaluation setting, **Qwopus3.5-9B-v3 achieves the best base pass@1 of 87.80% (144/164)**, outperforming both **Qwen3.5-9B** (82.93%, 136/164) and **Claude-Distilled-v2** (82.32%, 135/164). On the stricter **plus pass@1** evaluation, Qwopus3.5-9B-v3 widens its lead over the official baseline, scoring **82.93% (136/164)** versus **77.44% (127/164)** for Qwen3.5-9B (+5.49 pp) and **78.66% (129/164)** for the distilled variant.
|
| 33 |
+
|
| 34 |
+
| Model | Base pass@1 | Plus pass@1 | Rescues (From GPT) | Improvement vs Qwen3.5-9B |
|
| 35 |
+
|---|---|---|---|---|
|
| 36 |
+
| **Qwopus3.5-9B-v3** | **87.80% (144/164)** | **82.93% (136/164)** | 1 | 📈 **Base: +4.87 pp / Plus: +5.49 pp** |
|
| 37 |
+
| Qwen3.5-9B | 82.93% (136/164) | 77.44% (127/164) | 2 | Baseline |
|
| 38 |
+
| Claude-Distilled-v2 | 82.32% (135/164) | 78.66% (129/164) | 0 | 📉 Base: -0.61 pp / 📈 Plus: +1.22 pp vs Qwen3.5-9B |
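
The percentages and percentage-point deltas in the table follow directly from the pass counts. A minimal sketch that recomputes them is shown here; the dictionary layout and function names are illustrative and are not taken from the evaluation harness.

```python
# Pass counts out of 164 HumanEval tasks, copied from the table above.
TOTAL_TASKS = 164
counts = {
    "Qwopus3.5-9B-v3": {"base": 144, "plus": 136},
    "Qwen3.5-9B": {"base": 136, "plus": 127},
    "Claude-Distilled-v2": {"base": 135, "plus": 129},
}


def pass_at_1(passed: int, total: int = TOTAL_TASKS) -> float:
    """Return pass@1 as a percentage rounded to two decimals."""
    return round(100.0 * passed / total, 2)


def delta_pp(model: str, split: str, baseline: str = "Qwen3.5-9B") -> float:
    """Percentage-point difference between a model and the baseline."""
    return round(
        pass_at_1(counts[model][split]) - pass_at_1(counts[baseline][split]), 2
    )


for name, c in counts.items():
    print(f"{name}: base {pass_at_1(c['base'])}%, plus {pass_at_1(c['plus'])}%")
```

Calling `delta_pp("Qwopus3.5-9B-v3", "base")` and `delta_pp("Qwopus3.5-9B-v3", "plus")` reproduces the two improvement figures in the first row.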
|
| 39 |
+
|
| 40 |
+
|
| 41 |
+

|
| 42 |
+
|
| 43 |
+
|
| 44 |
+

|
| 45 |
+
|
| 46 |
+
|
| 47 |
+
|
| 48 |
+
|
| 49 |
+
> **Note:** The test results presented here differ from the scores on the 9B-v2 model card because the context length was increased for this evaluation. Consequently, the number of tasks affected by context window truncation has changed for each model, leading to different final scores. Please ensure comparisons are made under the same variable settings.
|
| 50 |
+
|
| 51 |
+
|
| 52 |
+
All post-evaluation standard result files will be uploaded to this repository for transparency and reproducibility. These include:
|
| 53 |
+
- `Jackrong_Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2_humaneval_all_evalonly_eval_results`
|
| 54 |
+
- `Jackrong_Qwopus3.5-9B-v3-test1_humaneval_all_evalonly_eval_results`
|
| 55 |
+
- `qwen_Qwen3.5-9B_humaneval_all_evalonly_eval_results`
|
| 56 |
+
|
| 57 |
+
⚠️ **Note on evaluation artifacts.**
|
| 58 |
+
The released result files are based on **raw model generations**, which may contain formatting issues (e.g., Markdown wrappers, answer/code mixing), truncation, or minor token-level corruption.
|
| 59 |
+
|
| 60 |
+
---
|
| 61 |
+
|
| 62 |
+
### 🏃 Qwopus3.5-9B-v3: MMLU-Pro Benchmark Evaluation
|
| 63 |
+
I evaluated on **280 MMLU-Pro questions** across the following domains: Biology, Chemistry, Computer Science, Health, Mathematics, Physics, and Other Sciences.
|
| 64 |
+
|
| 65 |
+
All question IDs are identical across both model runs.
|
| 66 |
+
|
| 67 |
+
### Accuracy
|
| 68 |
+
|
| 69 |
+
| Model | Correct | Total | Accuracy |
|
| 70 |
+
|------------------|--------|-------|----------|
|
| 71 |
+
| Qwen3.5-9B | 225 | 280 | 80.36% |
|
| 72 |
+
| Qwopus3.5-9B-v3 | 229 | 280 | **81.79%** |
|
| 73 |
+
|
| 74 |
+
**Result:**
|
| 75 |
+
Qwopus3.5-9B-v3 leads by **+1.43 pp**
|
| 76 |
+
|
| 77 |
+
---
|
| 78 |
+
|
| 79 |
+
## Reasoning Efficiency
|
| 80 |
+
|
| 81 |
+
| Metric | Qwen3.5-9B | Qwopus3.5-9B-v3 |
|
| 82 |
+
|--------|------------|--------------|
|
| 83 |
+
| Avg think length | 7116 chars | **5313 chars** |
|
| 84 |
+
| Passes / 10k chars | 1.26 | **1.66** |
|
| 85 |
+
| Chars / correct pass | 7938 | **6032** |
|
| 86 |
+
|
| 87 |
+
### Reasoning Efficiency Improvements
|
| 88 |
+
|
| 89 |
+
- **−25.3%** shorter reasoning
|
| 90 |
+
- **+31.7%** higher efficiency
|
| 91 |
+
- **−24.0%** lower cost per correct answer
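
These three percentages can be rederived from the table values. The sketch below does so; the key names are made up for illustration.

```python
# Figures copied from the "Reasoning Efficiency" table.
baseline = {"avg_think_chars": 7116, "passes_per_10k": 1.26, "chars_per_pass": 7938}
qwopus = {"avg_think_chars": 5313, "passes_per_10k": 1.66, "chars_per_pass": 6032}


def relative_change(new: float, old: float) -> float:
    """Relative change of `new` versus `old`, in percent, to one decimal."""
    return round(100.0 * (new - old) / old, 1)


changes = {key: relative_change(qwopus[key], baseline[key]) for key in baseline}
print(changes)
```

The two lower rows of the table appear to be reciprocals of each other: 10,000 divided by "Chars / correct pass" gives approximately the "Passes / 10k chars" value for both models.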
|
| 92 |
+
|
| 93 |
+

|
| 94 |
+
|
| 95 |
+
|
| 96 |
+
|
| 97 |
+
### Evaluation Summary
|
| 98 |
+
While the overall accuracy margin (+1.43 pp) is modest, Qwopus3.5-9B-v3 improves the **accuracy-cost trade-off**: it reaches higher accuracy while spending significantly less reasoning budget. With a 25.3% reduction in mean think length and a 24.0% lower character cost per correct answer, this iteration is better suited to settings constrained by latency, token budget, or context pressure.
|
| 99 |
+
|
| 100 |
+
Furthermore, across the mixed domain profile, Qwopus3.5-9B-v3 offsets Qwen3.5-9B's slight edge in biology, CS, and math by doing better in physics and chemistry and by significantly lowering its unfinished-output rate. Its final rank benefits as much from raw correctness as from an improved ability to finish its analysis cleanly and reliably.
|
| 101 |
+
|
| 102 |
+
|
| 103 |
+
## 🗺️ Training Pipeline Overview
|
| 104 |
+
|
| 105 |
+
```text
|
| 106 |
+
Base Model (Qwen3.5-9B)
|
| 107 |
+
│
|
| 108 |
+
▼
|
| 109 |
+
Qwen3.5-9B fine-tuned with Unsloth
|
| 110 |
+
│
|
| 111 |
+
▼
|
| 112 |
+
Supervised Fine-Tuning (SFT) + LoRA
|
| 113 |
+
(Response-Only Training masked on "<|im_start|>assistant\n<think>")
|
| 114 |
+
│
|
| 115 |
+
▼
|
| 116 |
+
Qwopus3.5-9B-v3
|
| 117 |
+
```
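
To illustrate what "response-only training" means in the pipeline above, here is a minimal sketch of label masking at the token level. It is not the Unsloth implementation. It is a single-turn simplification, the token IDs are toy values, and `mask_prompt_labels` is a hypothetical helper. The only convention it relies on is that PyTorch's cross-entropy loss ignores the label value -100 by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def mask_prompt_labels(token_ids: list[int], marker_ids: list[int]) -> list[int]:
    """Copy token_ids into labels, ignoring everything up to and including
    the last occurrence of marker_ids (the start of the assistant response)."""
    labels = list(token_ids)
    response_start = 0
    # Scan for the last position where the marker sequence appears.
    for i in range(len(token_ids) - len(marker_ids) + 1):
        if token_ids[i:i + len(marker_ids)] == marker_ids:
            response_start = i + len(marker_ids)
    if response_start == 0:
        # Marker not found: nothing to learn from, so ignore every position.
        return [IGNORE_INDEX] * len(token_ids)
    for i in range(response_start):
        labels[i] = IGNORE_INDEX
    return labels


# Toy IDs: 1-3 stand in for the user turn, [90, 91] for the tokenized
# response marker, and 7-9 for the assistant response.
print(mask_prompt_labels([1, 2, 3, 90, 91, 7, 8, 9], [90, 91]))
```

With this masking, the loss is computed only on the response tokens, so the model is not trained to reproduce the prompt.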
|
| 118 |
+
|
| 119 |
+
### 🧠 Example of Learned Reasoning Scaffold
|
| 120 |
+
|
| 121 |
+
The model includes targeted optimizations addressing Qwen3.5's tendency toward excessive or repetitive reasoning on simple queries. By distilling the structured reasoning habits of top-tier models like Claude Opus, Qwopus3.5-9B-v3 adopts a highly organized, step-by-step cognitive layout.
|
| 122 |
+
|
| 123 |
+
|
| 124 |
+
```text
|
| 125 |
+
Example: The user is asking about [Topic A] and how it differs from [Topic B]. This is a [Task type] question. Let me break this down:
|
| 126 |
+
|
| 127 |
+
1. What is [Topic A]?
|
| 128 |
+
- [Fact/Mechanism 1]
|
| 129 |
+
- [Fact/Mechanism 2]
|
| 130 |
+
2. What is [Topic B]?
|
| 131 |
+
- [Fact/Mechanism 1]
|
| 132 |
+
3. Key differences:
|
| 133 |
+
- [Comparison Point 1]
|
| 134 |
+
- [Comparison Point 2]
|
| 135 |
+
|
| 136 |
+
Let me make sure to be accurate: [...]
|
| 137 |
+
Actually, I should double-check: is [Fact] used before [Fact]? Yes, typically...
|
| 138 |
+
Let me provide a clear, well-structured answer:
|
| 139 |
+
```
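
When consuming output that follows this scaffold, it can be useful to separate the reasoning from the final answer, for example to measure think length in characters. The sketch below assumes the reasoning is terminated by a `</think>` tag. The opening `<think>` tag appears in the training pipeline above, but the closing tag and the `split_reasoning` helper are assumptions, not something this README specifies.

```python
def split_reasoning(text: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split a generation into (reasoning, answer) at the first closing tag."""
    reasoning, sep, answer = text.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole text as the answer.
        return "", text.strip()
    # Drop an opening tag if the runtime echoed it back.
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, answer.strip()


reasoning, answer = split_reasoning("<think>Compare A and B.</think>\nA is faster.")
print(len(reasoning), answer)
```

A generation with no closing tag comes back with empty reasoning. A caller could count such cases, although that may not match how unfinished outputs were counted in the evaluation.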
|
| 140 |
+
|
| 141 |
+
|
| 142 |
+
### 📚 Training Data
|
| 143 |
+
The model was fine-tuned on a **high-fidelity reasoning dataset** curated from a blend of open-source datasets on Hugging Face. This dataset is the result of a rigorous **mixing and cleaning process**, specifically designed to filter out low-quality responses and ensure consistently strong logical performance across diverse analytical domains.
|
| 144 |
+
|
| 145 |
+
> *(Rest assured, the entire process is strictly by-the-book and 100% compliant with all terms and open-source licenses!)*
|
| 146 |
+
|
| 147 |
+
|
| 148 |
+
## ⚠️ Limitations & Intended Use
|
| 149 |
+
- **Hallucination Risk:** While reasoning is strong, the model remains an autoregressive LLM; facts stated during the thinking sequence may occasionally be hallucinated, particularly when the model tries to verify real-world events.
|
| 150 |
+
- **Intended Scenario:** Best suited for offline analytical tasks, coding, math, and heavy logic-dependent prompting where the user needs to transparently follow the AI's internal logic.
|
| 151 |
+
- This model is a test version intended solely for learning, demonstration, academic research, and technical exploration.
|
| 152 |
+
|
| 153 |
+
## 🙏 Acknowledgements
|
| 154 |
+
Significant thanks to the [Unsloth AI](https://unsloth.ai/) team for making rapid fine-tuning of LLMs accessible. Additionally, we acknowledge the Qwen team and the open-source community developers producing exceptional distilled datasets.
|
| 155 |
+
|
| 156 |
+
This qwen3_5 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
|
| 157 |
+
|
| 158 |
+
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
|
| 159 |
+
|
| 160 |
+
|
| 161 |
+
|
| 162 |
+
## 📖 Citation
|
| 163 |
+
|
| 164 |
+
If you use this model in your research or projects, please cite:
|
| 165 |
+
|
| 166 |
+
```bibtex
|
| 167 |
+
@misc{jackrong_qwen35_9b_v3,
|
| 168 |
+
title = {Jackrong/Qwopus3.5-9B-v3},
|
| 169 |
+
author = {Jackrong},
|
| 170 |
+
year = {2026},
|
| 171 |
+
publisher = {Hugging Face},
|
| 172 |
+
howpublished = {\url{https://huggingface.co/Jackrong/Qwopus3.5-9B-v3-GGUF}}
|
| 173 |
+
}
|
| 174 |
+
```
|
| 175 |
+
|
| 176 |
+
|
| 177 |
+
|
| 178 |
+
|
config.json
ADDED
|
@@ -0,0 +1,114 @@
| 1 |
+
{
|
| 2 |
+
"architectures": [
|
| 3 |
+
"Qwen3_5ForConditionalGeneration"
|
| 4 |
+
],
|
| 5 |
+
"torch_dtype": "bfloat16",
|
| 6 |
+
"eos_token_id": 248046,
|
| 7 |
+
"image_token_id": 248056,
|
| 8 |
+
"model_name": "unsloth/Qwen3.5-9B",
|
| 9 |
+
"model_type": "qwen3_5",
|
| 10 |
+
"pad_token_id": 248055,
|
| 11 |
+
"text_config": {
|
| 12 |
+
"attention_bias": false,
|
| 13 |
+
"attention_dropout": 0.0,
|
| 14 |
+
"attn_output_gate": true,
|
| 15 |
+
"bos_token_id": null,
|
| 16 |
+
"torch_dtype": "bfloat16",
|
| 17 |
+
"eos_token_id": 248044,
|
| 18 |
+
"full_attention_interval": 4,
|
| 19 |
+
"head_dim": 256,
|
| 20 |
+
"hidden_act": "silu",
|
| 21 |
+
"hidden_size": 4096,
|
| 22 |
+
"initializer_range": 0.02,
|
| 23 |
+
"intermediate_size": 12288,
|
| 24 |
+
"layer_types": [
|
| 25 |
+
"linear_attention",
|
| 26 |
+
"linear_attention",
|
| 27 |
+
"linear_attention",
|
| 28 |
+
"full_attention",
|
| 29 |
+
"linear_attention",
|
| 30 |
+
"linear_attention",
|
| 31 |
+
"linear_attention",
|
| 32 |
+
"full_attention",
|
| 33 |
+
"linear_attention",
|
| 34 |
+
"linear_attention",
|
| 35 |
+
"linear_attention",
|
| 36 |
+
"full_attention",
|
| 37 |
+
"linear_attention",
|
| 38 |
+
"linear_attention",
|
| 39 |
+
"linear_attention",
|
| 40 |
+
"full_attention",
|
| 41 |
+
"linear_attention",
|
| 42 |
+
"linear_attention",
|
| 43 |
+
"linear_attention",
|
| 44 |
+
"full_attention",
|
| 45 |
+
"linear_attention",
|
| 46 |
+
"linear_attention",
|
| 47 |
+
"linear_attention",
|
| 48 |
+
"full_attention",
|
| 49 |
+
"linear_attention",
|
| 50 |
+
"linear_attention",
|
| 51 |
+
"linear_attention",
|
| 52 |
+
"full_attention",
|
| 53 |
+
"linear_attention",
|
| 54 |
+
"linear_attention",
|
| 55 |
+
"linear_attention",
|
| 56 |
+
"full_attention"
|
| 57 |
+
],
|
| 58 |
+
"linear_conv_kernel_dim": 4,
|
| 59 |
+
"linear_key_head_dim": 128,
|
| 60 |
+
"linear_num_key_heads": 16,
|
| 61 |
+
"linear_num_value_heads": 32,
|
| 62 |
+
"linear_value_head_dim": 128,
|
| 63 |
+
"mamba_ssm_dtype": "float32",
|
| 64 |
+
"max_position_embeddings": 262144,
|
| 65 |
+
"mlp_only_layers": [],
|
| 66 |
+
"model_type": "qwen3_5_text",
|
| 67 |
+
"mtp_num_hidden_layers": 1,
|
| 68 |
+
"mtp_use_dedicated_embeddings": false,
|
| 69 |
+
"num_attention_heads": 16,
|
| 70 |
+
"num_hidden_layers": 32,
|
| 71 |
+
"num_key_value_heads": 4,
|
| 72 |
+
"pad_token_id": null,
|
| 73 |
+
"partial_rotary_factor": 0.25,
|
| 74 |
+
"rms_norm_eps": 1e-06,
|
| 75 |
+
"rope_parameters": {
|
| 76 |
+
"mrope_interleaved": true,
|
| 77 |
+
"mrope_section": [
|
| 78 |
+
11,
|
| 79 |
+
11,
|
| 80 |
+
10
|
| 81 |
+
],
|
| 82 |
+
"partial_rotary_factor": 0.25,
|
| 83 |
+
"rope_theta": 10000000,
|
| 84 |
+
"rope_type": "default"
|
| 85 |
+
},
|
| 86 |
+
"tie_word_embeddings": false,
|
| 87 |
+
"use_cache": true,
|
| 88 |
+
"vocab_size": 248320
|
| 89 |
+
},
|
| 90 |
+
"tie_word_embeddings": false,
|
| 91 |
+
"unsloth_fixed": true,
|
| 92 |
+
"unsloth_version": "2026.3.17",
|
| 93 |
+
"use_cache": false,
|
| 94 |
+
"video_token_id": 248057,
|
| 95 |
+
"vision_config": {
|
| 96 |
+
"deepstack_visual_indexes": [],
|
| 97 |
+
"depth": 27,
|
| 98 |
+
"torch_dtype": "bfloat16",
|
| 99 |
+
"hidden_act": "gelu_pytorch_tanh",
|
| 100 |
+
"hidden_size": 1152,
|
| 101 |
+
"in_channels": 3,
|
| 102 |
+
"initializer_range": 0.02,
|
| 103 |
+
"intermediate_size": 4304,
|
| 104 |
+
"model_type": "qwen3_5",
|
| 105 |
+
"num_heads": 16,
|
| 106 |
+
"num_position_embeddings": 2304,
|
| 107 |
+
"out_hidden_size": 4096,
|
| 108 |
+
"patch_size": 16,
|
| 109 |
+
"spatial_merge_size": 2,
|
| 110 |
+
"temporal_patch_size": 2
|
| 111 |
+
},
|
| 112 |
+
"vision_end_token_id": 248054,
|
| 113 |
+
"vision_start_token_id": 248053
|
| 114 |
+
}
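
Two things can be read off the `text_config` values above: the `layer_types` list follows `full_attention_interval`, and the per-token key/value cache can be roughly estimated. The sketch below makes two assumptions that are not stated in the config: only the `full_attention` layers keep a per-token K/V cache (the `linear_attention` layers hold a fixed-size state that is not counted here), and the cache is stored in BF16 at 2 bytes per element. Actual runtimes may differ, for example by quantizing the cache.

```python
# Values copied from text_config in config.json.
num_hidden_layers = 32
full_attention_interval = 4
num_key_value_heads = 4
head_dim = 256
max_position_embeddings = 262144

# Every full_attention_interval-th layer uses full attention.
layer_types = [
    "full_attention" if (i + 1) % full_attention_interval == 0 else "linear_attention"
    for i in range(num_hidden_layers)
]
full_layers = layer_types.count("full_attention")

# ASSUMPTION: only full-attention layers cache K and V per token, in BF16.
BYTES_PER_ELEMENT = 2
kv_bytes_per_token = full_layers * 2 * num_key_value_heads * head_dim * BYTES_PER_ELEMENT
kv_bytes_at_max_context = kv_bytes_per_token * max_position_embeddings

print(full_layers, kv_bytes_per_token, kv_bytes_at_max_context / 2**30)
```

Under these assumptions the estimate is 32 KiB per token, or 8 GiB at the full 262,144-token context, in addition to the model weights.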
|
mmproj.gguf
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:039eb1b8f7d64e5daf9a2bcc830a97e75bc4b344f286ec9d6580bc134cde2976
|
| 3 |
+
size 921704576
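
Each Git LFS pointer in this commit records the SHA-256 digest and byte size of the real file, so a downloaded GGUF can be checked against it. The sketch below uses only the Python standard library. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is up to the reader.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(pointer_text: str) -> dict[str, str]:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in pointer_text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer: dict[str, str], chunk_size: int = 1 << 20) -> bool:
    """Check a local file's size and SHA-256 digest against a parsed pointer."""
    if path.stat().st_size != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so multi-gigabyte files are not loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return pointer["oid"] == f"sha256:{digest.hexdigest()}"


# Pointer contents for mmproj.gguf, copied from the diff above.
pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:039eb1b8f7d64e5daf9a2bcc830a97e75bc4b344f286ec9d6580bc134cde2976
size 921704576
""")
print(pointer["oid"], int(pointer["size"]))
```

Calling `matches_pointer(Path("mmproj.gguf"), pointer)` on a downloaded copy returns `True` only if both the size and the digest agree.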
|