---
base_model: huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated
tags:
- gguf
- abliterated
- uncensored
- conversational
- qwen3_5_moe
- llama-cpp
library_name: gguf
license: apache-2.0
---

> [!WARNING]
> **A significantly improved version of this model is available.**
>
> This repo quantizes [huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated), the standard abliteration.
> A newer fine-tune of the same architecture, trained in the style of Claude 4.6 Opus, has since been released and produces noticeably richer, more expressive outputs.
>
> **➡️ Recommended upgrade: [Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF](https://huggingface.co/cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF)**
>
> Same architecture, same Q4_K_M quantization, same VRAM footprint; just a better fine-tune. This repo will remain available for reference.

# Huihui-Qwen3.5-35B-A3B-abliterated: Q4_K_M GGUF

This is a Q4_K_M GGUF quantization of [huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated).

Refer to the [original model card](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated) for full details, usage warnings, and licensing information.

## Details

| Property | Value |
|---|---|
| Source model | [huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated) |
| Architecture | qwen35moe (35B total params, ~3B active; 256 experts, 8 active per token) |
| Quantization | Q4_K_M (~4.8 BPW) |
| File size | ~21 GB |
| Quantized with | [llama.cpp](https://github.com/ggerganov/llama.cpp) |

## Usage with llama.cpp

```bash
# Run a single prompt from the command line; the GGUF file is fetched from the Hub.
llama-cli \
  --hf-repo cesarsal1nas/Huihui-Qwen3.5-35B-A3B-abliterated-Q4_K_M-GGUF \
  --hf-file huihui-qwen3.5-35b-a3b-abliterated-Q4_K_M.gguf \
  -p "Tell me about the universe"
```

```bash
# Serve the model over HTTP with an 8192-token context window.
llama-server \
  --hf-repo cesarsal1nas/Huihui-Qwen3.5-35B-A3B-abliterated-Q4_K_M-GGUF \
  --hf-file huihui-qwen3.5-35b-a3b-abliterated-Q4_K_M.gguf \
  -c 8192
```

## Usage with Ollama

Requires an Ollama build with qwen35moe support. See [PR #14506](https://github.com/ollama/ollama/pull/14506) for the architecture patch.

```bash
ollama run hf.co/cesarsal1nas/Huihui-Qwen3.5-35B-A3B-abliterated-Q4_K_M-GGUF
```

## Credits

- **Abliteration** by [huihui-ai](https://huggingface.co/huihui-ai): [Huihui-Qwen3.5-35B-A3B-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated)
- **Base model** by [Qwen](https://huggingface.co/Qwen): [Qwen3.5-35B-A3B](https://huggingface.co/Qwen/Qwen3.5-35B-A3B)
- **Quantization** by [cesarsal1nas](https://huggingface.co/cesarsal1nas)
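
## Size estimate

As a sanity check on the figures in the Details table, the ~21 GB file size is roughly what you get by multiplying the parameter count by the bits per weight. The sketch below assumes exactly 35 billion parameters and a uniform 4.8 bits per weight; real GGUF files mix tensor types and carry metadata, so treat the result as an approximation only. The `estimate_gb` helper is a hypothetical name used for illustration and is not part of llama.cpp.

```bash
# Rough GGUF size estimate (illustrative helper, not a llama.cpp tool):
#   size in GB ~= parameters (in billions) * bits per weight / 8 bits per byte
estimate_gb() {
  awk -v params_b="$1" -v bpw="$2" 'BEGIN { printf "%.1f\n", params_b * bpw / 8 }'
}

# 35B parameters at ~4.8 BPW (values taken from the Details table)
estimate_gb 35 4.8   # prints 21.0
```

The same helper can be used to compare other quantization levels by changing the second argument, provided you know their approximate bits per weight.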