How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF:Q4_K_M
Use Docker
docker model run hf.co/Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF:Q4_K_M
Quick Links

Qwen 3.6 27B Abliterated - Q4_K_M (GGUF)

This repository contains the dedicated Q4_K_M GGUF format quantized weights for the abliterated version of Qwen/Qwen3.6-27B.

These files are highly optimized for use with llama.cpp and compatible local inference engines like LM Studio, text-generation-webui, and KoboldCPP.

🔗 Main Repository (Full Quantization Suite)

Looking for a different size? This repository strictly hosts the Q4_K_M model for fast, targeted downloading. If you need smaller or larger quantizations (such as Q3_K_M, Q5_K_M, Q6_K, or Q8_0), please visit the main repository here: 👉 Abiray/Huihui-Qwen3.6-27B-abliterated-GGUF

Abliteration Notes

This model has been processed to remove inherent safety filters and refusal mechanisms. It is highly compliant and will generate responses to complex, edge-case, or typically restricted prompts directly from its base weights. No specialized system prompts or catalyst pre-fills are required to bypass refusals.

Usage with llama.cpp

You can run this model via the command line using standard llama-cli commands. Since the model is abliterated, you do not need to wrap prompts in heavy system instructions.

# Basic inference
./llama-cli -m qwen3.6-27b-abliterated-Q4_K_M.gguf -p "Your prompt here" -n 512 -c 4096

# Interactive conversation mode
./llama-cli -m qwen3.6-27b-abliterated-Q4_K_M.gguf -i -cnv -c 8192
Downloads last month
263
GGUF
Model size
27B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF

Base model

Qwen/Qwen3.6-27B
Quantized
(475)
this model

Collection including Abiray/Huihui-Qwen3.6-27B-abliterated-Q4_K_M-GGUF