---
license: other
library_name: transformers
base_model: Qwen/Qwen3.5-122B-A10B
tags:
- qwen
- qwen3
- qwen3.5
- moe
- abliterated
- uncensored
- sft
- opus
- qwopus
- multimodal
- vision
- mtp
pipeline_tag: image-text-to-text
---
## Support & Community
**☕ If these models are useful to you, consider supporting my work — it funds compute for more & larger abliterations.**

[**buymeacoffee.com/oym.kuato**](https://buymeacoffee.com/oym.kuato)
💬 **Discord:** [discord.gg/rhUZY5GEZr](https://discord.gg/rhUZY5GEZr) · ₿ **Bitcoin:** `bc1qsvfduzj9fjs9fugpc52yver3f2g8fp7xjxecdv`

---
# Qwopus3.5-122B-A10B-abliterated-uncensored
> [!WARNING]
> **This model is superseded — please use the healed version.**
> This is the older *abliterated-uncensored* release. The recommended replacement is
> [**Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated**](https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated), which adds a **Kimi K2.6 reasoning-DPO**
> healing pass on top of this model: improved reasoning verbosity (~12% of requests) and far fewer
> looping / repetition failures (2–6% of long-tail conversations).
>
> ➡️ **Recommended MLX 4-bit build: [Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated-MLX-4bit](https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated-MLX-4bit)**
> All formats: [Full weights](https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated) · [GGUF](https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated-GGUF) · [MLX 4-bit](https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated-MLX-4bit)
## Overview
Full BF16 weights of **Qwopus3.5-122B-A10B-abliterated-uncensored** — an abliterated and supervised-finetuned variant of [Qwen/Qwen3.5-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-122B-A10B) (Mixture of Experts, ~10B active / 122B total). The model is uncensored, multimodal (image + text), and ships with the MTP head intact so it is a drop-in replacement for the original base model at the architecture level.
The pipeline:
1. **Refusal Ablation** — Residual-stream refusal directions (one per decoder layer, layers 19–45) were extracted via diff-in-means on a labeled prompt set and baked into the weights as a per-matrix delta — see the **abliterix** framework for the methodology.
2. **Healing — Stage A: Constrained-LoRA SFT on Opus reasoning data** — Supervised finetuned on a curated set of Claude Opus reasoning traces (single-turn, ~8k rows). To keep the abliteration mathematically intact during training, a custom orthogonality projection is applied to every LoRA `B`-matrix on residual-write modules after each optimizer step (`B := B − r·(rᵀB)`), so the LoRA update is forbidden from re-introducing the refusal direction. LoRA rank 32, α 64, 54 protected modules across 27 decoder layers. Verified residual after training: `max ‖rᵀB‖₂ = 8.5 × 10⁻¹⁰`.
3. **Healing — Stage B: Unconstrained SFT on chosen completions** — A second short SFT pass (LoRA r=16, α 32, no orthogonality constraint) on the *chosen* answers (including reasoning chains) from an internal preference dataset, to tighten on the deployment distribution and remove the last bits of drift introduced by Stage A.
4. **Vision + MTP Restoration** — The original Qwen3.5 vision tower (333 tensors, depth 27, hidden 1152) and MTP head (785 tensors, 1 hidden layer) were grafted back from the upstream `Qwen/Qwen3.5-122B-A10B` shards. Tensor names, shapes, and `config.json` schema (`Qwen3_5MoeForConditionalGeneration`, `model_type: qwen3_5_moe`) match the base model exactly — so this checkpoint loads anywhere the original loads.
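
The diff-in-means extraction in step 1 and the orthogonality projection `B := B − r·(rᵀB)` in step 2 can be illustrated on toy data. This is a minimal pure-Python sketch, not the abliterix implementation: the function names, the `harmful` / `harmless` labels, and the tiny 3-dimensional vectors are all hypothetical, and it assumes `r` is unit-norm and that `B` has shape `(d_out, rank)`.

```python
import math

def diff_in_means(harmful, harmless):
    """Unit-norm direction: mean(harmful activations) - mean(harmless activations)."""
    dim = len(harmful[0])
    mean_h = [sum(v[i] for v in harmful) / len(harmful) for i in range(dim)]
    mean_b = [sum(v[i] for v in harmless) / len(harmless) for i in range(dim)]
    d = [a - b for a, b in zip(mean_h, mean_b)]
    norm = math.sqrt(sum(x * x for x in d))
    return [x / norm for x in d]

def r_dot_B(B, r):
    """Row vector r^T B, one entry per LoRA rank column."""
    return [sum(r[i] * B[i][j] for i in range(len(r))) for j in range(len(B[0]))]

def project_out(B, r):
    """Return B - r (r^T B), removing the component of each column along r."""
    rTB = r_dot_B(B, r)
    return [[B[i][j] - r[i] * rTB[j] for j in range(len(rTB))] for i in range(len(r))]

def residual(B, r):
    """L2 norm of r^T B; zero means B cannot write along r."""
    return math.sqrt(sum(x * x for x in r_dot_B(B, r)))

# Hypothetical residual-stream activations (hidden size 3) for two prompt classes.
harmful = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
harmless = [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
r = diff_in_means(harmful, harmless)

# Hypothetical LoRA B-matrix (d_out=3, rank=2) after an optimizer step.
B = [[0.5, -1.0], [2.0, 3.0], [1.0, 0.0]]
B_proj = project_out(B, r)

print("direction:", r)
print("residual before:", residual(B, r))
print("residual after: ", residual(B_proj, r))
```

In a real training loop the same projection would be applied to every protected `B`-matrix after each optimizer step, with the layer's own direction as `r`; the residual check corresponds to the `max ‖rᵀB‖₂` figure quoted in step 2.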
**Key Properties:**
- Uncensored across the standard refusal axes
- Reasoning preserved (Opus-style think-then-answer)
- Multimodal: vision (image / video) and MTP heads carried forward
- Drop-in shape compatibility with `Qwen/Qwen3.5-122B-A10B`
## Files
| File | Description | Size |
|------|-------------|------|
| `model-*-of-00028.safetensors` | BF16 language model weights (48 decoder layers, MoE with 256 routed experts + shared expert per layer) | ~228 GB |
| `model-visual-00001.safetensors` | BF16 vision tower (333 tensors) | ~0.9 GB |
| `model-mtp-00001.safetensors` | BF16 MTP head (785 tensors) | ~5.1 GB |
| `model.safetensors.index.json` | Combined weight map (38,717 tensors) | — |
| `config.json` | Multimodal config (`Qwen3_5MoeForConditionalGeneration`) | — |
| `tokenizer*`, `chat_template.jinja`, `generation_config.json` | Standard | — |
Total on disk: **~234 GB**.
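
To check how tensors are distributed across the shards listed above, the `weight_map` entry of `model.safetensors.index.json` can be grouped by file name. The sketch below assumes the standard sharded-safetensors index layout (a `weight_map` dict from tensor name to shard file); the tensor names in the toy index are hypothetical placeholders, not names from this checkpoint.

```python
import json
from collections import Counter

def tensors_per_shard(index):
    """Count how many tensors each shard file holds, given a parsed index dict."""
    return Counter(index["weight_map"].values())

# Toy index with hypothetical tensor names, mirroring the file layout in the table.
toy_index = {
    "metadata": {"total_size": 0},
    "weight_map": {
        "a.weight": "model-00001-of-00028.safetensors",
        "b.weight": "model-00001-of-00028.safetensors",
        "c.weight": "model-visual-00001.safetensors",
        "d.weight": "model-mtp-00001.safetensors",
    },
}

counts = tensors_per_shard(toy_index)
for shard, n in sorted(counts.items()):
    print(f"{shard}: {n} tensors")

# Against a local checkout, load the real index instead:
# with open("model.safetensors.index.json") as f:
#     counts = tensors_per_shard(json.load(f))
```

Summing the counts over a real index should reproduce the total tensor count given in the table.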
## Usage
```python
from transformers import AutoModelForImageTextToText, AutoProcessor

repo = "OpenYourMind/Qwopus3.5-122B-A10B-abliterated-uncensored"

# Load the full multimodal checkpoint in BF16, sharded across the available GPUs.
model = AutoModelForImageTextToText.from_pretrained(repo, dtype="bfloat16", device_map="auto")
processor = AutoProcessor.from_pretrained(repo)

# One user turn containing an image and a text instruction.
messages = [{"role": "user", "content": [
    {"type": "image", "url": "path/to/image.jpg"},
    {"type": "text", "text": "Describe this image in detail."},
]}]

# Apply the chat template and tokenize in a single step.
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_tensors="pt", return_dict=True,
).to(model.device)

out = model.generate(**inputs, max_new_tokens=512)
# The decoded string contains the prompt followed by the completion.
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```
Text-only inference works through the same class; if you don't need vision/MTP, you can also load just the language model with `AutoModelForCausalLM`.
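
For text-only prompts through the same class, the only change is that the `content` list has no image entry. The helper below is a hypothetical convenience function (`build_messages` is not part of `transformers`); it builds the message structure used in the usage example, with or without an image.

```python
def build_messages(text, image=None):
    """Build a single-turn chat in the format used by the usage example.

    `image` is an optional path or URL; when omitted, the turn is text-only.
    """
    content = []
    if image is not None:
        content.append({"type": "image", "url": image})
    content.append({"type": "text", "text": text})
    return [{"role": "user", "content": content}]

text_only = build_messages("Explain what an MTP head is in two sentences.")
with_image = build_messages("Describe this image in detail.", image="path/to/image.jpg")

print(text_only)
print(with_image)
```

Either result can be passed as `messages` to `processor.apply_chat_template(...)` exactly as in the usage example.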
## Hardware
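The full BF16 weights fit comfortably on **2× H200** or **4× H100 (80 GB)** with room for context. Single-node inference without offloading needs total accelerator memory above the ~234 GB weight footprint; a **≥ 130 GB** target applies only with quantized weights or CPU offload. For Apple Silicon, see the upcoming MLX quants.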
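
As a rough sanity check on these configurations, the memory left over for KV cache and activations can be estimated by subtracting the ~234 GB on-disk total from the aggregate accelerator memory. This is a back-of-the-envelope sketch: the 141 GB per-H200 capacity is an assumed nominal figure not stated in this card, and the estimate ignores framework overhead and the GB/GiB distinction.

```python
def headroom_gb(num_gpus, gb_per_gpu, weights_gb=234.0):
    """Aggregate accelerator memory minus the weight footprint, in GB."""
    return num_gpus * gb_per_gpu - weights_gb

# (number of GPUs, nominal memory per GPU in GB); the H200 capacity is an assumption.
configs = {
    "2x H200 (141 GB)": (2, 141.0),
    "4x H100 (80 GB)": (4, 80.0),
}

for name, (n, gb) in configs.items():
    print(f"{name}: ~{headroom_gb(n, gb):.0f} GB left for KV cache and activations")
```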
## Notes
- **License**: Other (inherits from the Qwen3.5 base license)
- **Base Model**: [Qwen/Qwen3.5-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-122B-A10B)
- **Healing**: Supervised finetuned on selected Opus training datasets
- **Modality**: Text + Vision (image / video) + MTP
- **Architecture**: Qwen3.5 MoE (`qwen3_5_moe`, ~10B active / 122B total) + Qwen3-VL vision tower + MTP head
## Thanks
- [Jackrong](https://huggingface.co/Jackrong) — for the idea of **Qwopus** merges (Opus distillations on Qwen models).
- [wangzhang](https://huggingface.co/wangzhang) — for the wonderful **abliterix** framework, which was customized to do this abliteration.
## Disclaimer
Use is the responsibility of the user. Ensure your usage complies with applicable laws, platform rules, and deployment requirements.