---
license: other
library_name: transformers
base_model: Qwen/Qwen3.5-122B-A10B
tags:
- qwen
- qwen3
- qwen3.5
- moe
- abliterated
- uncensored
- sft
- opus
- qwopus
- multimodal
- vision
- mtp
pipeline_tag: image-text-to-text
---
OpenYourMind
## Support & Community
**☕ If these models are useful to you, consider supporting my work: it funds compute for more and larger abliterations.**

**Buy Me A Coffee:** [**buymeacoffee.com/oym.kuato**](https://buymeacoffee.com/oym.kuato) · 💬 **Discord:** [discord.gg/rhUZY5GEZr](https://discord.gg/rhUZY5GEZr) · ₿ **Bitcoin:** `bc1qsvfduzj9fjs9fugpc52yver3f2g8fp7xjxecdv`

---

# Qwopus3.5-122B-A10B-abliterated-uncensored

> [!WARNING]
> **This model is superseded. Please use the healed version.**
> This is the older *abliterated-uncensored* release. The recommended replacement is
> [**Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated**](https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated), which adds a **Kimi K2.6 reasoning-DPO**
> healing pass on top of this model: improved reasoning verbosity (~12% of requests) and far fewer
> looping / repetition failures (2–6% of long-tail conversations).
>
> ➡️ **Recommended MLX 4-bit build: [Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated-MLX-4bit](https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated-MLX-4bit)**
> All formats: [Full weights](https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated) · [GGUF](https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated-GGUF) · [MLX 4-bit](https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated-MLX-4bit)

## Overview

Full BF16 weights of **Qwopus3.5-122B-A10B-abliterated-uncensored**, an abliterated and supervised-finetuned variant of [Qwen/Qwen3.5-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-122B-A10B) (Mixture of Experts, ~10B active / 122B total). The model is uncensored, multimodal (image + text), and ships with the MTP head intact, so it is a drop-in replacement for the original base model at the architecture level.

The pipeline:

1. **Refusal Ablation.** Residual-stream refusal directions (one per decoder layer, layers 19–45) were extracted via diff-in-means on a labeled prompt set and baked into the weights as a per-matrix delta. See the **abliterix** framework for the methodology.
2. **Healing, Stage A: Constrained-LoRA SFT on Opus reasoning data.** The model was supervised-finetuned on a curated set of Claude Opus reasoning traces (single-turn, ~8k rows). To keep the abliteration mathematically intact during training, a custom orthogonality projection is applied to every LoRA `B`-matrix on residual-write modules after each optimizer step (`B := B − r·(rᵀB)`), so the LoRA update cannot re-introduce the refusal direction. LoRA rank 32, α 64, 54 protected modules across 27 decoder layers. Verified residual after training: `max ‖rᵀB‖₂ = 8.5 × 10⁻¹⁰`.
3. **Healing, Stage B: Unconstrained SFT on chosen completions.** A second, short SFT pass (LoRA r=16, α 32, no orthogonality constraint) on the *chosen* answers (including reasoning chains) from an internal preference dataset, to tighten the model on the deployment distribution and remove the remaining drift introduced by Stage A.
4. **Vision + MTP Restoration.** The original Qwen3.5 vision tower (333 tensors, depth 27, hidden 1152) and MTP head (785 tensors, 1 hidden layer) were grafted back from the upstream `Qwen/Qwen3.5-122B-A10B` shards. Tensor names, shapes, and `config.json` schema (`Qwen3_5MoeForConditionalGeneration`, `model_type: qwen3_5_moe`) match the base model exactly, so this checkpoint loads anywhere the original loads.
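
The diff-in-means extraction in step 1 and the Stage A projection `B := B − r·(rᵀB)` can be illustrated on toy data. The following is a minimal pure-Python sketch, not the abliterix code or the actual training hook: the function names, the toy activations, and the assumption that `B` has shape `(d_model, rank)` with `r` a unit vector of length `d_model` are all illustrative.

```python
import math

def diff_in_means(acts_a, acts_b):
    """Unit-norm difference between the mean activations of two labeled prompt sets."""
    dim = len(acts_a[0])
    mean_a = [sum(v[i] for v in acts_a) / len(acts_a) for i in range(dim)]
    mean_b = [sum(v[i] for v in acts_b) / len(acts_b) for i in range(dim)]
    diff = [a - b for a, b in zip(mean_a, mean_b)]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]

def r_transpose_b(B, r):
    """r^T B: one coefficient per LoRA rank column."""
    return [sum(r[i] * B[i][j] for i in range(len(r))) for j in range(len(B[0]))]

def project_out(B, r):
    """Return B - r (r^T B), removing the component of each column along r."""
    coeffs = r_transpose_b(B, r)
    return [[B[i][j] - r[i] * coeffs[j] for j in range(len(coeffs))]
            for i in range(len(r))]

def residual_norm(B, r):
    """Euclidean norm of r^T B; close to zero once r has been projected out."""
    return math.sqrt(sum(c * c for c in r_transpose_b(B, r)))

# Toy residual-stream activations (hypothetical values, d_model = 3).
refusing_acts = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
baseline_acts = [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
r = diff_in_means(refusing_acts, baseline_acts)

# Toy LoRA B-matrix with rank 2 (hypothetical values).
B = [[0.5, -1.0], [0.2, 0.3], [0.1, 0.4]]
B_proj = project_out(B, r)

print("direction r:", r)
print("residual before:", residual_norm(B, r))
print("residual after:", residual_norm(B_proj, r))
```

In a real training loop the same projection would presumably run on framework tensors after each optimizer step, once per protected module and with that layer's own direction; the residual norm corresponds to the `‖rᵀB‖₂` check quoted in step 2.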

**Key Properties:**

- Uncensored across the standard refusal axes
- Reasoning preserved (Opus-style think-then-answer)
- Multimodal: vision (image / video) and MTP heads carried forward
- Drop-in shape compatibility with `Qwen/Qwen3.5-122B-A10B`

## Files

| File | Description | Size |
|------|-------------|------|
| `model-*-of-00028.safetensors` | BF16 language model weights (48 decoder layers, MoE with 256 routed experts + shared expert per layer) | ~228 GB |
| `model-visual-00001.safetensors` | BF16 vision tower (333 tensors) | ~0.9 GB |
| `model-mtp-00001.safetensors` | BF16 MTP head (785 tensors) | ~5.1 GB |
| `model.safetensors.index.json` | Combined weight map (38,717 tensors) | n/a |
| `config.json` | Multimodal config (`Qwen3_5MoeForConditionalGeneration`) | n/a |
| `tokenizer*`, `chat_template.jinja`, `generation_config.json` | Standard | n/a |

Total on disk: **~234 GB**.

## Usage

```python
from transformers import AutoModelForImageTextToText, AutoProcessor

repo = "OpenYourMind/Qwopus3.5-122B-A10B-abliterated-uncensored"

# Load the full multimodal checkpoint in BF16, sharded across available devices.
model = AutoModelForImageTextToText.from_pretrained(repo, dtype="bfloat16", device_map="auto")
processor = AutoProcessor.from_pretrained(repo)

# One user turn with an image and a text instruction.
messages = [{"role": "user", "content": [
    {"type": "image", "url": "path/to/image.jpg"},
    {"type": "text", "text": "Describe this image in detail."},
]}]

# The chat template tokenizes the text and preprocesses the image in one call.
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_tensors="pt", return_dict=True,
).to(model.device)

out = model.generate(**inputs, max_new_tokens=512)
# The decoded string contains the prompt followed by the generated answer.
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```

Text-only inference works through the same class; if you don't need vision/MTP, you can also load just the language model with `AutoModelForCausalLM`.

## Hardware

The full BF16 weights fit comfortably on **2× H200** or **4× H100 (80 GB)** with room for context. Because the weights alone take ~234 GB, single-node BF16 inference needs at least that much total accelerator memory, plus headroom for the KV cache. For Apple Silicon, see the upcoming MLX quants.
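
The sizing above can be sanity-checked with simple arithmetic. This rough sketch sums the shard sizes from the Files table and compares them with the nominal memory of the two GPU setups mentioned. The 141 GB per H200 is the vendor's nominal capacity and is an assumption not stated in this card; the helper name `headroom_gb` is illustrative. KV cache, activations, and framework overhead are ignored, so the result is an upper bound on free memory.

```python
# Shard sizes in GB, copied from the Files table of this card.
FILES_GB = {"language model": 228.0, "vision tower": 0.9, "MTP head": 5.1}

# (number of GPUs, nominal GB per GPU). The H200 figure is an assumption.
SETUPS = {"2x H200": (2, 141.0), "4x H100 (80 GB)": (4, 80.0)}

def headroom_gb(num_gpus, gb_per_gpu, weights_gb):
    """Nominal accelerator memory left after loading the weights."""
    return num_gpus * gb_per_gpu - weights_gb

weights_gb = sum(FILES_GB.values())
headroom = {name: headroom_gb(n, gb, weights_gb) for name, (n, gb) in SETUPS.items()}

print(f"BF16 weights: ~{weights_gb:.0f} GB")
for name, free in headroom.items():
    print(f"{name}: ~{free:.0f} GB left for KV cache and activations")
```

Under these assumptions the two setups leave roughly 48 GB and 86 GB of nominal headroom respectively.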

## Notes

- **License**: Other (inherits from the Qwen3.5 base license)
- **Base Model**: [Qwen/Qwen3.5-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-122B-A10B)
- **Healing**: Supervised fine-tuning on selected Opus training datasets
- **Modality**: Text + Vision (image / video) + MTP
- **Architecture**: Qwen3.5 MoE (~10B active / 122B total) + Qwen3-VL vision tower + MTP head

## Thanks

- [Jackrong](https://huggingface.co/Jackrong), for the idea of **Qwopus** merges (Opus distillations on Qwen models).
- [wangzhang](https://huggingface.co/wangzhang), for the wonderful **abliterix** framework, which was customized to do this abliteration.

## Disclaimer

Use is the responsibility of the user. Ensure your usage complies with applicable laws, platform rules, and deployment requirements.