How to use from
llama.cpp
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF:UD-Q4_K_M
# Run inference directly in the terminal:
llama cli -hf xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF:UD-Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF:UD-Q4_K_M
# Run inference directly in the terminal:
llama cli -hf xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF:UD-Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF:UD-Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF:UD-Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF:UD-Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF:UD-Q4_K_M
Use Docker
docker model run hf.co/xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF:UD-Q4_K_M
Quick Links

Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF

云碩科技 · xCloudinfo · 系列:無審查 · Abliterated / Uncensored

⚠️ 經 abliteration(aggressive) 處理,會比原版更少拒答。請僅在你被授權的範圍內使用。

xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloudGGUF(llama.cpp / Ollama) 量化版本。

檔案 量化 大小
…-Q6_K.gguf Q6_K(接近無損) ≈ 28 GB
…-Q4_K_M.gguf Q4_K_M(部署首選) ≈ 21 GB

純文字 GGUF(不含 mmproj)。需要圖文請用 merged safetensors 版。

用法

llama-server -m Qwen3.6-35B-A3B-Uncensored-xCloud-Q4_K_M.gguf -c 8192 -ngl 99

原理

依 Arditi et al. (2024),以權重正交化移除殘差流拒絕方向(91 矩陣含 256-expert MoE、strength 1.0),不重訓。詳見 merged 版模型卡。

用途與責任

  • 設計用途:降低過度拒絕,供授權範圍內研究與應用。
  • 已移除安全拒絕傾向,使用者須自行加上安全防護與輸出審查,並對用途與後果負完全責任;不得用於有害或違法用途。

授權與來源


由 云碩科技 xCloudinfo 於自有 AI 算力資源池製作。

Downloads last month
708
GGUF
Model size
35B params
Architecture
qwen35moe
Hardware compatibility
Log In to add your hardware

4-bit

6-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF

Quantized
(1)
this model

Collection including xCloudinfo/Qwen3.6-35B-A3B-Uncensored-xCloud-GGUF