---
base_model: huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated
tags:
- gguf
- abliterated
- uncensored
- conversational
- qwen3_5_moe
- llama-cpp
library_name: gguf
license: apache-2.0
---

> [!WARNING]
> **A significantly improved version of this model is available.**
>
> This repo quantizes [huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated) — the standard abliteration.
> A newer fine-tune of the same architecture, trained in the style of Claude 4.6 Opus, has since been released and produces noticeably richer, more expressive outputs.
>
> **➡️ Recommended upgrade: [Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF](https://huggingface.co/cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF)**
>
> Same architecture, same Q4_K_M quantization, same VRAM footprint — just a better fine-tune. This repo will remain available for reference.


# Huihui-Qwen3.5-35B-A3B-abliterated — Q4_K_M GGUF

This is a Q4_K_M GGUF quantization of [huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated).

Refer to the [original model card](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated) for full details, usage warnings, and licensing information.

## Details

| Property | Value |
|---|---|
| Source model | [huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated) |
| Architecture | qwen35moe (35B total params, ~3B active; 256 experts, 8 active per token) |
| Quantization | Q4_K_M (~4.8 BPW) |
| File size | ~21 GB |
| Quantized with | [llama.cpp](https://github.com/ggerganov/llama.cpp) |

## Usage with llama.cpp

```bash
llama-cli \
  --hf-repo cesarsal1nas/Huihui-Qwen3.5-35B-A3B-abliterated-Q4_K_M-GGUF \
  --hf-file huihui-qwen3.5-35b-a3b-abliterated-Q4_K_M.gguf \
  -p "Tell me about the universe"
```

```bash
llama-server \
  --hf-repo cesarsal1nas/Huihui-Qwen3.5-35B-A3B-abliterated-Q4_K_M-GGUF \
  --hf-file huihui-qwen3.5-35b-a3b-abliterated-Q4_K_M.gguf \
  -c 8192
```

## Usage with Ollama

Requires Ollama with qwen35moe support. See [PR #14506](https://github.com/ollama/ollama/pull/14506) for the architecture patch.

```bash
ollama run hf.co/cesarsal1nas/Huihui-Qwen3.5-35B-A3B-abliterated-Q4_K_M-GGUF
```

## Credits

- **Abliteration** by [huihui-ai](https://huggingface.co/huihui-ai) — [Huihui-Qwen3.5-35B-A3B-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated)
- **Base model** by [Qwen](https://huggingface.co/Qwen) — [Qwen3.5-35B-A3B](https://huggingface.co/Qwen/Qwen3.5-35B-A3B)
- **Quantization** by [cesarsal1nas](https://huggingface.co/cesarsal1nas)