docs: complete Chinese translation
Browse files- README_zh.md +60 -60
README_zh.md
CHANGED
|
@@ -28,35 +28,35 @@ license: apache-2.0
|
|
| 28 |
<span style="background: #007aff; color: white; font-size: 11px; font-weight: 600; padding: 5px 14px; border-radius: 20px;">MTP</span><span style="background: #af52de; color: white; font-size: 11px; font-weight: 600; padding: 5px 14px; border-radius: 20px;">GGUF</span>
|
| 29 |
</div>
|
| 30 |
<h1 style="margin: 0 0 8px 0; font-size: 32px; font-weight: 700; color: #1d1d1f; letter-spacing: -0.5px; border: none; position: relative; z-index: 1;">QwenPaw-Flash-9B-heretic-MTP</h1>
|
| 31 |
-
<p style="margin: 8px 0 0 0; font-size: 14px; position: relative; z-index: 1;"><a href="https://huggingface.co/SC117/QwenPaw-Flash-9B-heretic-MTP-GGUF/blob/main/README.md" style="color: #007aff; text-decoration: none;">📖 English</a> | <span style="color: #86868b;">中文</span></p>
|
| 32 |
</div>
|
| 33 |
</div>
|
| 34 |
|
| 35 |
<div style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; display: flex; flex-direction: column; gap: 20px; margin-bottom: 30px;">
|
| 36 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 37 |
<div style="padding: 16px;">
|
| 38 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">QwenPaw-Flash-9B-heretic
|
| 39 |
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><p align="center"></p>
|
| 40 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>🏆 BenchLocal 总
|
| 41 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><i>无审查 · 已消融 · Agent 优化 · 1.7-4.1× 加速
|
| 42 |
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"></p></p>
|
| 43 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">
|
| 44 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">
|
| 45 |
</div>
|
| 46 |
</div>
|
| 47 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 48 |
-
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>📊</span> 🏆 BenchLocal 基准测试
|
| 49 |
<div style="padding: 16px;">
|
| 50 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #64748b; font-style: italic; border-left: 3px solid #ffedd5; padding-left: 12px;"><b>测试环境</b>
|
| 51 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #64748b; font-style: italic; border-left: 3px solid #ffedd5; padding-left: 12px;"><b>测试框架</b>
|
| 52 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #64748b; font-style: italic; border-left: 3px solid #ffedd5; padding-left: 12px;"><b>测试方法</b>
|
| 53 |
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);">
|
| 54 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">基准测试</th>
|
| 55 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">得分</th>
|
| 56 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">准确率</th>
|
| 57 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">结果</th>
|
| 58 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">耗时</th>
|
| 59 |
-
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">
|
| 60 |
</tr></thead><tbody>
|
| 61 |
<tr>
|
| 62 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>ToolCall-15</b> 🛠️</td>
|
|
@@ -83,7 +83,7 @@ license: apache-2.0
|
|
| 83 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>4.1×</b> faster</td>
|
| 84 |
</tr>
|
| 85 |
<tr>
|
| 86 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>
|
| 87 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>4035/5000</b></td>
|
| 88 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>80.7%</b></td>
|
| 89 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">36✅ 3⚠️ 11❌</td>
|
|
@@ -91,11 +91,11 @@ license: apache-2.0
|
|
| 91 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>1.9×</b> faster</td>
|
| 92 |
</tr>
|
| 93 |
</tbody></table>
|
| 94 |
-
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">
|
| 95 |
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);">
|
| 96 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">基准测试</th>
|
| 97 |
-
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">
|
| 98 |
-
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">
|
| 99 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">Δ Score</th>
|
| 100 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">Δ Speed</th>
|
| 101 |
</tr></thead><tbody>
|
|
@@ -121,51 +121,51 @@ license: apache-2.0
|
|
| 121 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>4.1×</b></td>
|
| 122 |
</tr>
|
| 123 |
<tr>
|
| 124 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>
|
| 125 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>3873/5000 (77.5%)</b></td>
|
| 126 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>4035/5000 (80.7%)</b></td>
|
| 127 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>+162 pts</b></td>
|
| 128 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>1.9×</b></td>
|
| 129 |
</tr>
|
| 130 |
<tr>
|
| 131 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>总
|
| 132 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>14.7 min</b></td>
|
| 133 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>7.8 min</b></td>
|
| 134 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">—</td>
|
| 135 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>1.9×</b></td>
|
| 136 |
</tr>
|
| 137 |
</tbody></table>
|
| 138 |
-
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">🛠️ ToolCall-15 — 工具调用稳定性 (100%, +6.7
|
| 139 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">MTP
|
| 140 |
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);">
|
| 141 |
-
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">
|
| 142 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">结果</th>
|
| 143 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">场景</th>
|
| 144 |
</tr></thead><tbody>
|
| 145 |
<tr>
|
| 146 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">TC-01–TC-04</td>
|
| 147 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">✅</td>
|
| 148 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">
|
| 149 |
</tr>
|
| 150 |
<tr>
|
| 151 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">TC-05</td>
|
| 152 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">✅</td>
|
| 153 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">相对日期/时间解析 ← <b>
|
| 154 |
</tr>
|
| 155 |
<tr>
|
| 156 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">TC-06–TC-15</td>
|
| 157 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">✅</td>
|
| 158 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">
|
| 159 |
</tr>
|
| 160 |
</tbody></table>
|
| 161 |
-
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">🤖 HermesAgent-20 — 复杂 Agent 任务 (75.3%, −1.9
|
| 162 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">MTP
|
| 163 |
-
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">🐛 BugFind-15 —
|
| 164 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">
|
| 165 |
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);">
|
| 166 |
-
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">
|
| 167 |
-
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">
|
| 168 |
-
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">
|
| 169 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">Δ</th>
|
| 170 |
</tr></thead><tbody>
|
| 171 |
<tr>
|
|
@@ -262,23 +262,23 @@ license: apache-2.0
|
|
| 262 |
</div>
|
| 263 |
</div>
|
| 264 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 265 |
-
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>⚡</span> MTP
|
| 266 |
<div style="padding: 16px;">
|
| 267 |
-
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">
|
| 268 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">Multi-Token Prediction(多 token
|
| 269 |
-
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">
|
| 270 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">
|
| 271 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>
|
| 272 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 273 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 274 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 275 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 276 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 277 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 278 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>总
|
| 279 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>MTP
|
| 280 |
-
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">
|
| 281 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">
|
| 282 |
</div>
|
| 283 |
</div>
|
| 284 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
|
@@ -286,11 +286,11 @@ license: apache-2.0
|
|
| 286 |
<div style="padding: 16px;">
|
| 287 |
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);">
|
| 288 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">模型</th>
|
| 289 |
-
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">
|
| 290 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">ToolCall-15</th>
|
| 291 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">HermesAgent-20</th>
|
| 292 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">BugFind-15</th>
|
| 293 |
-
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">总
|
| 294 |
</tr></thead><tbody>
|
| 295 |
<tr>
|
| 296 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>🐾 QwenPaw MTP 9B</b></td>
|
|
@@ -301,7 +301,7 @@ license: apache-2.0
|
|
| 301 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>7.8min</b> 🥇</td>
|
| 302 |
</tr>
|
| 303 |
<tr>
|
| 304 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">🐾 QwenPaw 9B
|
| 305 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">3873</td>
|
| 306 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">93.3%</td>
|
| 307 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>77.2%</b> 🥇</td>
|
|
@@ -317,7 +317,7 @@ license: apache-2.0
|
|
| 317 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">21.3min ⚠️</td>
|
| 318 |
</tr>
|
| 319 |
<tr>
|
| 320 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">🧠 Qwen 35B 思考模式
|
| 321 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">1445 (HA only)</td>
|
| 322 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">—</td>
|
| 323 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">72.3%</td>
|
|
@@ -325,7 +325,7 @@ license: apache-2.0
|
|
| 325 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">7.0min</td>
|
| 326 |
</tr>
|
| 327 |
<tr>
|
| 328 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">⚡ Qwen 35B 思考模式
|
| 329 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">1370 (HA only)</td>
|
| 330 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">—</td>
|
| 331 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">68.5%</td>
|
|
@@ -341,13 +341,13 @@ license: apache-2.0
|
|
| 341 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">18.6min</td>
|
| 342 |
</tr>
|
| 343 |
</tbody></table>
|
| 344 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>QwenPaw MTP
|
| 345 |
</div>
|
| 346 |
</div>
|
| 347 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 348 |
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>🧠</span> Model Description</div>
|
| 349 |
<div style="padding: 16px;">
|
| 350 |
-
<ul style="margin: 0 0 12px 0; padding-left: 20px;"><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">
|
| 351 |
</div>
|
| 352 |
</div>
|
| 353 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
|
@@ -367,7 +367,7 @@ mlp.down_proj.min_weight_distance = 17.47</p>
|
|
| 367 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 368 |
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>🏗️</span> Architecture</div>
|
| 369 |
<div style="padding: 16px;">
|
| 370 |
-
<ul style="margin: 0 0 12px 0; padding-left: 20px;"><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">
|
| 371 |
</div>
|
| 372 |
</div>
|
| 373 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
|
@@ -381,22 +381,22 @@ mlp.down_proj.min_weight_distance = 17.47</p>
|
|
| 381 |
<tr>
|
| 382 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">QwenPaw-Flash-9B-heretic-MTP-Q8_0.gguf</code></td>
|
| 383 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">~9.2GB</td>
|
| 384 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">
|
| 385 |
</tr>
|
| 386 |
<tr>
|
| 387 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">QwenPaw-Flash-9B-heretic-MTP-Q6_K.gguf</code></td>
|
| 388 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">~7.1GB</td>
|
| 389 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">✅ <b>
|
| 390 |
</tr>
|
| 391 |
<tr>
|
| 392 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">QwenPaw-Flash-9B-heretic-MTP-Q4_K_M.gguf</code></td>
|
| 393 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">~5.4GB</td>
|
| 394 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">
|
| 395 |
</tr>
|
| 396 |
<tr>
|
| 397 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">mmproj-BF16</code></td>
|
| 398 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">~880MB</td>
|
| 399 |
-
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">
|
| 400 |
</tr>
|
| 401 |
</tbody></table>
|
| 402 |
</div>
|
|
@@ -406,37 +406,37 @@ mlp.down_proj.min_weight_distance = 17.47</p>
|
|
| 406 |
<div style="padding: 16px;">
|
| 407 |
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">--spec-type draft-mtp</p>
|
| 408 |
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">--spec-draft-n-max 2</p>
|
| 409 |
-
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">llama.cpp (
|
| 410 |
-
<p style="margin: 0; font-family: monospace; background: #f8fafc; padding: 10px 14px; border-radius: 6px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b; white-space: pre-wrap;"># Start server
|
| 411 |
llama-server -m QwenPaw-Flash-9B-heretic-MTP-Q6_K.gguf \
|
| 412 |
-ngl 99 -fa on -c 8192 \
|
| 413 |
--spec-type draft-mtp --spec-draft-n-max 2 \
|
| 414 |
--host 0.0.0.0 --port 8088
|
| 415 |
|
| 416 |
-
# Or
|
| 417 |
llama-cli -m QwenPaw-Flash-9B-heretic-MTP-Q6_K.gguf \
|
| 418 |
-ngl 99 -fa on -c 8192 \
|
| 419 |
--spec-type draft-mtp --spec-draft-n-max 2 \
|
| 420 |
-p "Write a Python script to..."</p>
|
| 421 |
-
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">llama.cpp (
|
| 422 |
-
<p style="margin: 0; font-family: monospace; background: #f8fafc; padding: 10px 14px; border-radius: 6px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b; white-space: pre-wrap;">#
|
| 423 |
llama-server -m QwenPaw-Flash-9B-heretic-MTP-Q6_K.gguf \
|
| 424 |
-ngl 99 -fa on -c 8192 \
|
| 425 |
--host 0.0.0.0 --port 8088</p>
|
| 426 |
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">LM Studio</h3>
|
| 427 |
-
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">
|
| 428 |
</div>
|
| 429 |
</div>
|
| 430 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 431 |
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>📝</span> Notes</div>
|
| 432 |
<div style="padding: 16px;">
|
| 433 |
-
<ol style="margin: 0 0 12px 0; padding-left: 20px;"><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">
|
| 434 |
</div>
|
| 435 |
</div>
|
| 436 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 437 |
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>🙏</span> Acknowledgements</div>
|
| 438 |
<div style="padding: 16px;">
|
| 439 |
-
<ul style="margin: 0 0 12px 0; padding-left: 20px;"><li style="margin-bottom: 4px; font-size: 13px; color: #334155;"><a href="https://github.com/p-e-w/heretic" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">Heretic</a> —
|
| 440 |
</div>
|
| 441 |
</div>
|
| 442 |
</div>
|
|
|
|
| 28 |
<span style="background: #007aff; color: white; font-size: 11px; font-weight: 600; padding: 5px 14px; border-radius: 20px;">MTP</span><span style="background: #af52de; color: white; font-size: 11px; font-weight: 600; padding: 5px 14px; border-radius: 20px;">GGUF</span>
|
| 29 |
</div>
|
| 30 |
<h1 style="margin: 0 0 8px 0; font-size: 32px; font-weight: 700; color: #1d1d1f; letter-spacing: -0.5px; border: none; position: relative; z-index: 1;">QwenPaw-Flash-9B-heretic-MTP</h1>
|
| 31 |
+
<p style="margin: 8px 0 0 0; font-size: 14px; position: relative; z-index: 1;"><a href="https://huggingface.co/SC117/QwenPaw-Flash-9B-heretic-MTP-GGUF/blob/main/README.md" style="color: #007aff; text-decoration: none;">📖 English</a> | <span style="color: #86868b;">中文</span> style="color: #007aff; text-decoration: none;">📖 中文文档</a></p>
|
| 32 |
</div>
|
| 33 |
</div>
|
| 34 |
|
| 35 |
<div style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; display: flex; flex-direction: column; gap: 20px; margin-bottom: 30px;">
|
| 36 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 37 |
<div style="padding: 16px;">
|
| 38 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">QwenPaw-Flash-9B-heretic 非 MTP 版本: <a href="https://huggingface.co/SC117/QwenPaw-Flash-9B-heretic-GGUF" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">QwenPaw-Flash-9B-heretic-GGUF</a></p>
|
| 39 |
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><p align="center"></p>
|
| 40 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>🏆 BenchLocal 总分: 4035/5000 (80.7%) — MTP 投机解码已注入</b><br></p>
|
| 41 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><i>无审查 · 已消融 · Agent 优化 · 1.7-4.1× 加速</i></p>
|
| 42 |
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"></p></p>
|
| 43 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>QwenPaw-Flash-9B</b> 的无审查版本,使用 <a href="https://github.com/p-e-w/heretic" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">Heretic</a> v1.3.0 消融处理,并从原始 Qwen3.5-9B 基座模型注入了 <b>MTP(Multi-Token Prediction)头</b>权重。</p>
|
| 44 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">通过重建在 QwenPaw 微调过程中被剥离的 MTP 投机解码头,本模型在真实 Agent 基准测试中实现了<b>最高 4.1× 推理加速</b>,同时保持或提升了准确率。</p>
|
| 45 |
</div>
|
| 46 |
</div>
|
| 47 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 48 |
+
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>📊</span> 🏆 BenchLocal 基准测试(��用 MTP)</div>
|
| 49 |
<div style="padding: 16px;">
|
| 50 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #64748b; font-style: italic; border-left: 3px solid #ffedd5; padding-left: 12px;"><b>测试环境</b>: NVIDIA RTX 5070 Ti (16GB) · llama.cpp (turboquant build, <code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">--spec-type draft-mtp</code>) · Q6_K quant</p>
|
| 51 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #64748b; font-style: italic; border-left: 3px solid #ffedd5; padding-left: 12px;"><b>测试框架</b>: <a href="https://github.com/stevibe/benchlocal" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">BenchLocal</a> — 本地模型 Agent 评估套件</p>
|
| 52 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #64748b; font-style: italic; border-left: 3px solid #ffedd5; padding-left: 12px;"><b>测试方法</b>:每个场景运行<b>一次</b>,无重试,无二次尝试</p>
|
| 53 |
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);">
|
| 54 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">基准测试</th>
|
| 55 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">得分</th>
|
| 56 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">准确率</th>
|
| 57 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">结果</th>
|
| 58 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">耗时</th>
|
| 59 |
+
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">对比无 MTP</th>
|
| 60 |
</tr></thead><tbody>
|
| 61 |
<tr>
|
| 62 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>ToolCall-15</b> 🛠️</td>
|
|
|
|
| 83 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>4.1×</b> faster</td>
|
| 84 |
</tr>
|
| 85 |
<tr>
|
| 86 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>Total</b></td>
|
| 87 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>4035/5000</b></td>
|
| 88 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>80.7%</b></td>
|
| 89 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">36✅ 3⚠️ 11❌</td>
|
|
|
|
| 91 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>1.9×</b> faster</td>
|
| 92 |
</tr>
|
| 93 |
</tbody></table>
|
| 94 |
+
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">对比:有 MTP vs 无 MTP</h3>
|
| 95 |
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);">
|
| 96 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">基准测试</th>
|
| 97 |
+
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">无 MTP</th>
|
| 98 |
+
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">有 MTP</th>
|
| 99 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">Δ Score</th>
|
| 100 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">Δ Speed</th>
|
| 101 |
</tr></thead><tbody>
|
|
|
|
| 121 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>4.1×</b></td>
|
| 122 |
</tr>
|
| 123 |
<tr>
|
| 124 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>Total</b></td>
|
| 125 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>3873/5000 (77.5%)</b></td>
|
| 126 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>4035/5000 (80.7%)</b></td>
|
| 127 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>+162 pts</b></td>
|
| 128 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>1.9×</b></td>
|
| 129 |
</tr>
|
| 130 |
<tr>
|
| 131 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>总耗时</b></td>
|
| 132 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>14.7 min</b></td>
|
| 133 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>7.8 min</b></td>
|
| 134 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">—</td>
|
| 135 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>1.9×</b></td>
|
| 136 |
</tr>
|
| 137 |
</tbody></table>
|
| 138 |
+
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">🛠️ ToolCall-15 — 工具调用稳定性 (100%, +6.7 分)</h3>
|
| 139 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">MTP 投机解码消除了唯一的失败项(TC-05:相对日期/时间解析,之前得分为 0)。全部 15 个场景现在完美通过。</p>
|
| 140 |
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);">
|
| 141 |
+
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">编号</th>
|
| 142 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">结果</th>
|
| 143 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">场景</th>
|
| 144 |
</tr></thead><tbody>
|
| 145 |
<tr>
|
| 146 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">TC-01–TC-04</td>
|
| 147 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">✅</td>
|
| 148 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">简单 / 多参数 / 嵌套 / 类型转换</td>
|
| 149 |
</tr>
|
| 150 |
<tr>
|
| 151 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">TC-05</td>
|
| 152 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">✅</td>
|
| 153 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">相对日期/时间解析 ← <b>已被 MTP 修复</b></td>
|
| 154 |
</tr>
|
| 155 |
<tr>
|
| 156 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">TC-06–TC-15</td>
|
| 157 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">✅</td>
|
| 158 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">所有剩余场景</td>
|
| 159 |
</tr>
|
| 160 |
</tbody></table>
|
| 161 |
+
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">🤖 HermesAgent-20 — 复杂 Agent 任务 (75.3%, −1.9 分)</h3>
|
| 162 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">MTP 解码在长链推理场景中引入了轻微噪声(约 40 分下降),可能是因为 draft token 偶尔在多步规划任务中偏离生成路径。不过,速度提升(1.17×)以及下降幅度在噪声范围内(Qwopus MTP 单次运行方差为 255 分)使得这是一个可接受的权衡。</p>
|
| 163 |
+
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">🐛 BugFind-15 — 代码调试 (68.7%, +6.8 分)</h3>
|
| 164 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">显著提升 — MTP 更快的解码有效防止了超时失败(BF-12 之前触发 300 秒超时,现在按时完成),draft 上下文有助于保持调试焦点。</p>
|
| 165 |
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);">
|
| 166 |
+
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">编号</th>
|
| 167 |
+
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">无 MTP</th>
|
| 168 |
+
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">有 MTP</th>
|
| 169 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">Δ</th>
|
| 170 |
</tr></thead><tbody>
|
| 171 |
<tr>
|
|
|
|
| 262 |
</div>
|
| 263 |
</div>
|
| 264 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 265 |
+
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>⚡</span> MTP Speculative Decoding</div>
|
| 266 |
<div style="padding: 16px;">
|
| 267 |
+
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">什么是 MTP?</h3>
|
| 268 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">Multi-Token Prediction(MTP)是一种投机解码技术,通过一个小型「draft 头」并行预测多个未来 token。主模型随后在单次前向传播中验证这些预测,接受正确的预测以实现 2-4× 的实际加速。</p>
|
| 269 |
+
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">注入方法</h3>
|
| 270 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">原始 Qwen3.5-9B 基座模型在架构配置中附带了一个 4 层 MTP 头(约 243M 参数)。在 QwenPaw 微调过程中,MTP 头权重被剥离(仅保留了配置占位符 <code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">mtp_num_hidden_layers: 1</code> ,但 safetensors 中没有实际张量)。</p>
|
| 271 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>恢复过程:</b></p>
|
| 272 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 273 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 274 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 275 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 276 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 277 |
<ol style="margin: 0 0 12px 0; padding-left: 20px;"></ol>
|
| 278 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>注入参数总量:</b>243.3M(主模型的 2.7%)</p>
|
| 279 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>MTP 接受率 (draft-n-max=2):</b>约 50%(所有基准测试中 1083 次接受 / 2166 次生成)</p>
|
| 280 |
+
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">为什么有效</h3>
|
| 281 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">MTP 头是一个轻量级 4 层 MLP 解码器,将主模型的最后隐藏状态映射到未来 token 的 logits。它完全位于投机解码空间中 — 主模型权重不变,因此无需微调或重新训练。MTP 头只需以兼容的维度存在,llama.cpp 的 <code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">--spec-type draft-mtp</code> 即可激活。</p>
|
| 282 |
</div>
|
| 283 |
</div>
|
| 284 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
|
|
|
| 286 |
<div style="padding: 16px;">
|
| 287 |
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);">
|
| 288 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">模型</th>
|
| 289 |
+
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">Total</th>
|
| 290 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">ToolCall-15</th>
|
| 291 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">HermesAgent-20</th>
|
| 292 |
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">BugFind-15</th>
|
| 293 |
+
<th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">总耗时</th>
|
| 294 |
</tr></thead><tbody>
|
| 295 |
<tr>
|
| 296 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>🐾 QwenPaw MTP 9B</b></td>
|
|
|
|
| 301 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>7.8min</b> 🥇</td>
|
| 302 |
</tr>
|
| 303 |
<tr>
|
| 304 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">🐾 QwenPaw 9B(无 MTP)</td>
|
| 305 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">3873</td>
|
| 306 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">93.3%</td>
|
| 307 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><b>77.2%</b> 🥇</td>
|
|
|
|
| 317 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">21.3min ⚠️</td>
|
| 318 |
</tr>
|
| 319 |
<tr>
|
| 320 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">🧠 Qwen 35B 思考模式开</td>
|
| 321 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">1445 (HA only)</td>
|
| 322 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">—</td>
|
| 323 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">72.3%</td>
|
|
|
|
| 325 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">7.0min</td>
|
| 326 |
</tr>
|
| 327 |
<tr>
|
| 328 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">⚡ Qwen 35B 思考模式关</td>
|
| 329 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">1370 (HA only)</td>
|
| 330 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">—</td>
|
| 331 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">68.5%</td>
|
|
|
|
| 341 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">18.6min</td>
|
| 342 |
</tr>
|
| 343 |
</tbody></table>
|
| 344 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;"><b>QwenPaw MTP 胜出</b>2/3 基准测试 + 总分 + 总耗时。唯一输掉的基准是 BugFind-15(输给 Qwopus MTP),但 Qwopus 存在严重不稳定性(HermesAgent-20 方差 255 分,最差情况 6.2 分钟超时)。</p>
|
| 345 |
</div>
|
| 346 |
</div>
|
| 347 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 348 |
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>🧠</span> Model Description</div>
|
| 349 |
<div style="padding: 16px;">
|
| 350 |
+
<ul style="margin: 0 0 12px 0; padding-left: 20px;"><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">基座模型**: <a href="https://huggingface.co/agentscope-ai/QwenPaw-Flash-9B" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">QwenPaw-Flash-9B</a> (Qwen3.5-9B 针对自主 Agent 场景微调)</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">MTP 头来源**: <a href="https://huggingface.co/Qwen/Qwen3.5-9B" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">Qwen/Qwen3.5-9B</a> (原始基座模型,第 32 层 MTP 头)</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">工具**:Heretic v1.3.0(自动定向消融)</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">最佳试验**:#194 / 230 次试验(消融)</li></ul>
|
| 351 |
</div>
|
| 352 |
</div>
|
| 353 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
|
|
|
| 367 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 368 |
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>🏗️</span> Architecture</div>
|
| 369 |
<div style="padding: 16px;">
|
| 370 |
+
<ul style="margin: 0 0 12px 0; padding-left: 20px;"><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">类型**:Qwen3_5ForConditionalGeneration(多模态,含视觉编码器)+ MTP 投机解码头</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">主模型参数量**:~9B</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">MTP 头参数量**:~243M(2.7% 额外开销)</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">层数**:32(混合:Gated DeltaNet + Gated Attention)+ 4 层 MTP 解码器</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">上下文长度**:262,144 tokens</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">投机解码**: <code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">--spec-type draft-mtp</code> 配合 <code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">--spec-draft-n-max 2</code></li></ul>
|
| 371 |
</div>
|
| 372 |
</div>
|
| 373 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
|
|
|
| 381 |
<tr>
|
| 382 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">QwenPaw-Flash-9B-heretic-MTP-Q8_0.gguf</code></td>
|
| 383 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">~9.2GB</td>
|
| 384 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">高质量,近乎无损</td>
|
| 385 |
</tr>
|
| 386 |
<tr>
|
| 387 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">QwenPaw-Flash-9B-heretic-MTP-Q6_K.gguf</code></td>
|
| 388 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">~7.1GB</td>
|
| 389 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">✅ <b>推荐,最佳性价比</b></td>
|
| 390 |
</tr>
|
| 391 |
<tr>
|
| 392 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">QwenPaw-Flash-9B-heretic-MTP-Q4_K_M.gguf</code></td>
|
| 393 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">~5.4GB</td>
|
| 394 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">紧凑</td>
|
| 395 |
</tr>
|
| 396 |
<tr>
|
| 397 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;"><code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">mmproj-BF16</code></td>
|
| 398 |
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">~880MB</td>
|
| 399 |
+
<td style="padding: 7px 10px; border-bottom: 1px solid rgba(128,128,128,0.15); color: #334155; text-align: left;">视觉编码器(多模态)— 与非 MTP 版本相同</td>
|
| 400 |
</tr>
|
| 401 |
</tbody></table>
|
| 402 |
</div>
|
|
|
|
| 406 |
<div style="padding: 16px;">
|
| 407 |
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">--spec-type draft-mtp</p>
|
| 408 |
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">--spec-draft-n-max 2</p>
|
| 409 |
+
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">llama.cpp (配合 MTP speculative decoding)</h3>
|
| 410 |
+
<p style="margin: 0; font-family: monospace; background: #f8fafc; padding: 10px 14px; border-radius: 6px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b; white-space: pre-wrap;"># Start server 配合 MTP enabled
|
| 411 |
llama-server -m QwenPaw-Flash-9B-heretic-MTP-Q6_K.gguf \
|
| 412 |
-ngl 99 -fa on -c 8192 \
|
| 413 |
--spec-type draft-mtp --spec-draft-n-max 2 \
|
| 414 |
--host 0.0.0.0 --port 8088
|
| 415 |
|
| 416 |
+
# Or 配合 CLI
|
| 417 |
llama-cli -m QwenPaw-Flash-9B-heretic-MTP-Q6_K.gguf \
|
| 418 |
-ngl 99 -fa on -c 8192 \
|
| 419 |
--spec-type draft-mtp --spec-draft-n-max 2 \
|
| 420 |
-p "Write a Python script to..."</p>
|
| 421 |
+
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">llama.cpp (配合out MTP, fallback)</h3>
|
| 422 |
+
<p style="margin: 0; font-family: monospace; background: #f8fafc; padding: 10px 14px; border-radius: 6px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b; white-space: pre-wrap;"># 模型也可作为普通 GGUF 使用 — 只需省略投机解码参数
|
| 423 |
llama-server -m QwenPaw-Flash-9B-heretic-MTP-Q6_K.gguf \
|
| 424 |
-ngl 99 -fa on -c 8192 \
|
| 425 |
--host 0.0.0.0 --port 8088</p>
|
| 426 |
<h3 style="margin: 16px 0 8px 0; font-size: 14px; color: #1e293b; font-weight: 700;">LM Studio</h3>
|
| 427 |
+
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">直接加载 GGUF 文件。如需 MTP 投机解码,LM Studio 需要支持 <code style="background: #f8fafc; padding: 2px 6px; border-radius: 4px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b;">--spec-type</code> — 如果不支持,模型将作为标准 9B 模型运行。</p>
|
| 428 |
</div>
|
| 429 |
</div>
|
| 430 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 431 |
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>📝</span> Notes</div>
|
| 432 |
<div style="padding: 16px;">
|
| 433 |
+
<ol style="margin: 0 0 12px 0; padding-left: 20px;"><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">安全过滤器已通过消融显著降低</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">KL 散度仅为 0.0225 — 对模型智能影响极小</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">原始模型支持多模态(视觉);GGUF 版本需要非 MTP 版本的 mmproj 文件</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">BenchLocal 分数在 <b>Q6_K</b> on RTX 5070 Ti 16GB 配合 llama.cpp (turboquant). Each scenario was run <b>once 配合 no retries</b></li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">MTP 在 draft-n-max=2 下约 50% 的接受率意味着短提示约 25-40% 的实际加速,长生成任务(调试、代码编写)最高可达 4×</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">BugFind-15 提升最大(4.1×),因为调试任务是生成密集型 — token 更多,接受的 draft 更多</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">MTP 头是从原始 Qwen3.5-9B 的<b>无损拷贝</b> — 不涉及训练,仅是权重注入</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">Agent 密集型场景(HermesAgent-20)从 MTP 获益最少,因为短轮次交互没有给 draft 头足够的发挥空间</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;">请负责任地使用</li></ol>
|
| 434 |
</div>
|
| 435 |
</div>
|
| 436 |
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
|
| 437 |
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>🙏</span> Acknowledgements</div>
|
| 438 |
<div style="padding: 16px;">
|
| 439 |
+
<ul style="margin: 0 0 12px 0; padding-left: 20px;"><li style="margin-bottom: 4px; font-size: 13px; color: #334155;"><a href="https://github.com/p-e-w/heretic" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">Heretic</a> — 自动化审查移除</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;"><a href="https://huggingface.co/agentscope-ai/QwenPaw-Flash-9B" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">agentscope-ai/QwenPaw-Flash-9B</a> — 基座模型</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;"><a href="https://huggingface.co/Qwen/Qwen3.5-9B" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">Qwen/Qwen3.5-9B</a> — MTP 头来源</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;"><a href="https://github.com/ggml-org/llama.cpp" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">llama.cpp</a> — GGUF 量化与推理</li><li style="margin-bottom: 4px; font-size: 13px; color: #334155;"><a href="https://github.com/stevibe/benchlocal" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">BenchLocal</a> — 本地模型 Agent 评估套件</li></ul>
|
| 440 |
</div>
|
| 441 |
</div>
|
| 442 |
</div>
|