--- license: apache-2.0 language: - en - zh - multilingual tags: - qwen3.5 - qwen35 - gguf - multimodal - vision - image-text-to-text - abliterated - uncensored - llama.cpp - 4b pipeline_tag: image-text-to-text base_model: - huihui-ai/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated - Qwen/Qwen3.5-4B library_name: llama.cpp --- # Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated — GGUF GGUF conversion of [huihui-ai/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated) for use with [llama.cpp](https://github.com/ggerganov/llama.cpp). ## Credits | Role | Model / Author | |---|---| | **Base LLM** | [Qwen/Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B) — Alibaba Qwen Team | | **Abliterated (uncensored)** | [huihui-ai/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated) — Huihui AI | | **GGUF Conversion** | [hotdogs](https://huggingface.co/hotdogs) — via [llama.cpp](https://github.com/ggerganov/llama.cpp) | 🙏 Huge thanks to **Qwen Team** (Alibaba) for the base model, **Huihui AI** for the abliteration, and **ggerganov** for llama.cpp! ## Model Details | Spec | Value | |---|---| | Parameters | ~4B | | Architecture | Qwen3.5 Multimodal (QWEN35) | | hiddensize | 2560 | | Layers | 32 | | Attention Heads | 16 (KV: 4) | | Context Length | **262,144** (256K tokens) | | FFN Intermediate | 9216 | | Vision Encoder | 24 layers, hiddensize=1024, patchsize=16 | | Modality | **image-text-to-text** 🖼️➡️📝 | | Censorship | **Abliterated** (refusal direction removed) | | License | Apache 2.0 | ## Available Quantizations | File | Size | BPW | Quality | Recommended For | |---|---|---|---|---| | huihui-qwen35-4b-BF16.gguf | 7.9 GB | 16.00 | ⭐ Full | Best quality, 16GB+ VRAM | | huihui-qwen35-4b-Q8_0.gguf | 4.2 GB | ~8.00 | ⭐ Very High | Balanced, 8GB+ VRAM | | huihui-qwen35-4b-Q4_K_M.gguf | 2.6 GB | 5.13 | ⭐ Good | Low VRAM, 6GB+ VRAM | | mmproj-huihui-qwen35-4b-BF16.gguf | 645 MB | — | Vision | **Multimodal projector** (required for images) | ## Usage ### Text-only ./llama-cli -m huihui-qwen35-4b-Q4_K_M.gguf -p "Hello!" -n 256 ### Multimodal (image + text) ./llama-qwen2vl-cli -m huihui-qwen35-4b-Q4_K_M.gguf --mmproj mmproj-huihui-qwen35-4b-BF16.gguf --image photo.jpg -p "Describe this image" ### Server (OpenAI-compatible API) ./llama-server -m huihui-qwen35-4b-Q4_K_M.gguf --mmproj mmproj-huihui-qwen35-4b-BF16.gguf --host 0.0.0.0 --port 8080 ### Python (llama-cpp-python) llm = Llama(model_path="huihui-qwen35-4b-Q4_K_M.gguf", n_ctx=32768) output = llm("Hello!", max_tokens=128) ## About Abliteration This model has undergone **directional ablation** — a technique that removes the "refusal direction" from the model's activation space (Arditi et al. 2024). The model will not refuse questions that base Qwen3.5 would normally decline. **Use responsibly.** Ensure your use case complies with applicable laws. ## Conversion Notes - Converted with llama.cpp convert_hf_to_gguf.py - BF16 output type - QWEN35 architecture, Qwen3VLVisionModel for mmproj - Metadata preserved from source model