---
license: apache-2.0
base_model: huihui-ai/Huihui-Qwen3.6-35B-A3B-abliterated
tags:
- qwen3.6
- gguf
- llama.cpp
- abliterated
- uncensored
- roleplay
---

# Huihui Qwen3.6-35B A3B Abliterated (GGUF)

This repository provides GGUF quantizations of the [huihui-ai/Huihui-Qwen3.6-35B-A3B-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.6-35B-A3B-abliterated) model.

Because this model has been "abliterated" to bypass alignment and safety refusals, it is well suited to unrestricted creative writing, dynamic storytelling, and immersive roleplay.

## Available Quantizations

| File | Bit Size | Description |
|------|----------|-------------|
| `huihui-35B-Q8_0.gguf` | 8-bit | Highest-quality quant, virtually indistinguishable from F16. |
| `huihui-35B-Q6_K.gguf` | 6-bit | Excellent quality with a noticeably reduced memory footprint. |
| `huihui-35B-Q5_K_M.gguf` | 5-bit | Good balance between reasoning performance and RAM usage. |
| `huihui-35B-Q4_K_M.gguf` | 4-bit | **Recommended.** The sweet spot for speed and quality. |
| `huihui-35B-Q4_K_S.gguf` | 4-bit | Slightly smaller than Q4_K_M, allowing faster inference on constrained setups. |
| `huihui-35B-Q3_K_M.gguf` | 3-bit | Lowest resource requirement, though the quality loss (higher perplexity) becomes more noticeable. |

## Quick Start (llama.cpp)

These models are designed to be run directly with `llama.cpp`. The commands below are standard for local Linux environments (such as Linux Mint or Ubuntu).

**1. Clone and compile via CMake:**

```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release