---
library_name: transformers
license: other
license_link: https://huggingface.co/stepfun-ai/Step-3.7-Flash
pipeline_tag: image-text-to-text
tags:
- stepfun
- step-3.7
- flash
- heretic
- uncensored
- decensored
- abliterated
- bf16
- transformers
- autoround-ready
- awq-ready
- exl3-ready
- gguf-ready
- nvfp4-ready
base_model:
- stepfun-ai/Step-3.7-Flash
---

# Step-3.7-Flash-uncensored-abliterated-heretic-BF16

> NOTE: I have tested this, and although its capabilities are intact, it still seems to respond with refusals. At least, that is what happens with the IQ4_XS GGUF quantization of it.

This is a **decensored BF16 full-weight** version of [`stepfun-ai/Step-3.7-Flash`](https://huggingface.co/stepfun-ai/Step-3.7-Flash), made using a gradient-based refusal-direction abliteration method inspired by [Heretic](https://github.com/p-e-w/heretic) and by norm-preserving ablation work such as [Magnitude/Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration).

It was produced with a local gradient abliteration pass against the language model's refusal direction. The uploaded repository intentionally keeps the full HF/Transformers BF16 layout so it can be used later as a clean source for **GGUF, AutoRound, AWQ, EXL3, NVFP4, GPTQ, FP8, or other quantization workflows**.

---

## Summary

| Item | Value |
| :-- | :-- |
| Base model | [`stepfun-ai/Step-3.7-Flash`](https://huggingface.co/stepfun-ai/Step-3.7-Flash) |
| Release type | Full BF16 safetensors |
| Model class | `Step3p7ForConditionalGeneration` |
| Text model class | `Step3p5ForCausalLM` |
| Text layers | 45 |
| Hidden size | 4096 |
| Attention heads | 64 |
| Head dim | 128 |
| Max positions | 262144 |
| Vocab size | 128896 |
| MoE layers | 3–44 |
| Experts | 288 |
| Top-k experts | 8 |
| MoE intermediate size | 1280 |
| Dense FFN intermediate size | 11264 |
| Patch target | `model.layers.*.self_attn.o_proj.weight` |
| Patched text layers | 0–44 |
| Abliteration strength | `lambda = 0.1` |
| Stored tensor dtype | BF16 |
| Indexed parameter payload | 402,730,656,512 bytes |

---

## What changed?

The modification targets `self_attn.o_proj` weights in all 45 text layers. A refusal-associated direction was extracted by gradient backpropagation through the BF16 model, then projected out of the attention output projection weights with a small norm-preserving update.

In plain terms, the goal was to reduce excessive refusals, moralizing, policy-style deflections, and over-filtered responses while keeping the model close to the original Step-3.7-Flash behavior.

No tokenizer vocabulary, embedding table, architecture, vision encoder, or MLP/expert tensor was intentionally changed by the abliteration pass.
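
The update described above can be illustrated on a toy matrix. The following is a minimal NumPy sketch, not the contents of `apply_abliteration_inplace.py`: it assumes the weight is laid out like a `torch.nn.Linear` weight (`(hidden, in_features)`, which for this model would be `(4096, 8192)` given 64 heads of dimension 128), that `lambda` scales the projection linearly, and that "per-row norm preservation" means rescaling each row back to its original L2 norm. The function name `ablate_o_proj` is hypothetical.

```python
import numpy as np

def ablate_o_proj(weight, direction, lam=0.1):
    """Damp `direction` in the output space of `weight`, keeping row norms.

    weight:    (hidden, in_features) array, laid out like a Linear weight
    direction: (hidden,) refusal-associated direction in the residual stream
    lam:       strength; assumed here to scale the projection linearly
    """
    w = np.asarray(weight, dtype=np.float32)
    r = np.asarray(direction, dtype=np.float32)
    r = r / np.linalg.norm(r)  # unit direction

    # Row norms before the update, shape (hidden, 1)
    row_norms = np.linalg.norm(w, axis=1, keepdims=True)

    # W' = W - lam * r (r^T W): subtract part of each output's component along r
    patched = w - lam * np.outer(r, r @ w)

    # Rescale every row back to its original norm (leave all-zero rows alone)
    new_norms = np.linalg.norm(patched, axis=1, keepdims=True)
    scale = np.divide(row_norms, new_norms,
                      out=np.ones_like(row_norms), where=new_norms > 0)
    return patched * scale

# Toy usage: a 2x2 identity "o_proj" and a diagonal direction
w = np.eye(2, dtype=np.float32)
patched = ablate_o_proj(w, np.array([1.0, 1.0]), lam=0.1)
```

Note that the row rescaling partly counteracts the projection, so the component along the direction shrinks by less than `lam` would suggest. That is one reason a small `lambda` with norm preservation stays close to the original weights.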

---

## Abliteration parameters

| Parameter | Value |
| :-- | :--: |
| Method | gradient-based orthogonal / norm-preserving abliteration |
| Direction source | refusal/harm-trigger gradient prompt |
| Target module | `self_attn.o_proj` |
| Target tensor glob | `model.layers.*.self_attn.o_proj.weight` |
| Modified layers | 0–44 |
| Lambda | `0.1` |
| Weight norm handling | per-row norm preservation after projection |
| Gradient tensor count | 45 |
| Per-layer gradient tensor shape | `(1, 8, 4096)` |
| Direction extraction score | `-11.9375` |
| Refusal token ids used | `[43, 371, 679, 1664, 9332, 34614, 100477]` |
| Gradient norm range | `0.1069`–`31.875` |
| Mean gradient norm | `3.2397` |

Reproduction/support artifacts are included under [`heretic_artifacts/`](https://huggingface.co/ibrahimkettaneh/Step-3.7-Flash-uncensored-abliterated-heretic-BF16/tree/main/heretic_artifacts):

- `refusal_direction_gradients.pkl` — saved gradient/refusal directions used for the BF16 patch
- `apply_abliteration_inplace.py` — patch application script used for shard-wise in-place BF16 modification
- `extract_gradients.py` — gradient extraction script
- `memory_guard_v2.py` / `run_heavy.sh` — memory safety helpers used during local processing

These are included so the method can be inspected or repeated if needed. They are not required for normal inference or quantization.

---

## Recoverability / requantization checklist

This repository should contain what is needed to rebuild downstream formats:

### Required for quantization

- ✅ `config.json`
- ✅ `model.safetensors.index.json`
- ✅ all indexed BF16 text shards: `model-00001.safetensors` … `model-00024.safetensors`
- ✅ indexed VIT shards: `model-vit-00001.safetensors`, `model-vit-00002.safetensors`
- ✅ tokenizer files: `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`
- ✅ chat template: `chat_template.jinja`
- ✅ custom code: `configuration_step3p7.py`, `modeling_step3p7.py`, `processing_step3.py`, `vision_encoder.py`
- ✅ method/reproduction artifacts in `heretic_artifacts/`

### Expected downstream uses

This BF16 repo can be used as source for:

- GGUF conversion / llama.cpp quantization
- AutoRound
- AWQ
- EXL3 / exllamav3-style workflows
- NVFP4 / FP4 experiments
- GPTQ / FP8 / other post-training quantization methods
- additional LoRA or delta extraction experiments

For most quantizers, pass this repo ID directly as the HF model path and enable remote code if needed:

```bash
MODEL=ibrahimkettaneh/Step-3.7-Flash-uncensored-abliterated-heretic-BF16
```

---

## Example Transformers load

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ibrahimkettaneh/Step-3.7-Flash-uncensored-abliterated-heretic-BF16"

# trust_remote_code is needed because the repo ships custom modeling code.
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # load the weights in their stored BF16 dtype
    device_map="auto",           # spread layers across the available devices
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain gradient abliteration in one paragraph."}]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature/top_p to take effect.
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.95)
# skip_special_tokens=False keeps the chat-template markers visible in the output.
print(tok.decode(out[0], skip_special_tokens=False))
```

> Step-3.7-Flash is very large: the indexed BF16 weights alone total about 403 GB (roughly 375 GiB), before activations and KV cache. For local inference, a quantized build (GGUF, EXL3, AWQ, etc.) is recommended.

---

## GGUF conversion note

Use the StepFun/llama.cpp converter that supports Step-3.7. The commands have roughly this shape:

```bash
python convert_hf_to_gguf.py \
  ibrahimkettaneh/Step-3.7-Flash-uncensored-abliterated-heretic-BF16 \
  --outtype bf16 \
  --outfile step37-heretic-bf16.gguf

llama-quantize step37-heretic-bf16.gguf step37-heretic.IQ4_XS.gguf IQ4_XS
```

In the original local environment, multi-GPU llama.cpp inference required `GGML_CUDA_NO_PEER_COPY=ON` for coherent output.

---

## Indexed shard inventory

The active `model.safetensors.index.json` references 26 safetensors files:

| File | Size (bytes) |
| :-- | --: |
| `model-00001.safetensors` | 924,094,096 |
| `model-00002.safetensors` | 9,808,156,008 |
| `model-00003.safetensors` | 18,557,475,928 |
| `model-00004.safetensors` | 18,624,846,944 |
| `model-00005.safetensors` | 18,557,475,928 |
| `model-00006.safetensors` | 18,624,846,976 |
| `model-00007.safetensors` | 18,557,475,968 |
| `model-00008.safetensors` | 18,624,846,976 |
| `model-00009.safetensors` | 18,557,475,968 |
| `model-00010.safetensors` | 18,624,846,976 |
| `model-00011.safetensors` | 18,557,475,968 |
| `model-00012.safetensors` | 18,624,846,976 |
| `model-00013.safetensors` | 18,557,475,968 |
| `model-00014.safetensors` | 18,624,846,976 |
| `model-00015.safetensors` | 18,557,475,968 |
| `model-00016.safetensors` | 18,624,846,976 |
| `model-00017.safetensors` | 18,557,475,968 |
| `model-00018.safetensors` | 18,624,846,976 |
| `model-00019.safetensors` | 18,557,475,968 |
| `model-00020.safetensors` | 18,624,846,976 |
| `model-00021.safetensors` | 18,557,475,968 |
| `model-00022.safetensors` | 18,624,846,976 |
| `model-00023.safetensors` | 9,245,052,456 |
| `model-00024.safetensors` | 6,968,188,464 |
| `model-vit-00001.safetensors` | 1,613,990,904 |
| `model-vit-00002.safetensors` | 2,348,122,376 |

`model-00025.safetensors` and `model-00026.safetensors` are not referenced by the active index used here and are not required by this uploaded model layout.
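
After downloading the repository, the inventory can be checked against the index. The sketch below uses only the standard library and relies on the usual sharded-checkpoint layout, in which `model.safetensors.index.json` has a `weight_map` from tensor names to shard file names. The helper name `check_index` is hypothetical and the helper is not part of this repository.

```python
import json
from pathlib import Path

def check_index(model_dir):
    """Compare shards named in the index with the .safetensors files on disk."""
    root = Path(model_dir)
    index = json.loads((root / "model.safetensors.index.json").read_text())

    referenced = set(index["weight_map"].values())       # shards the index needs
    on_disk = {p.name for p in root.glob("*.safetensors")}
    present = referenced & on_disk

    return {
        "missing": sorted(referenced - on_disk),         # referenced but absent
        "unreferenced": sorted(on_disk - referenced),    # present but unused
        "referenced_bytes": sum((root / name).stat().st_size for name in present),
    }

# Example (the local path is a placeholder):
# report = check_index("/path/to/Step-3.7-Flash-uncensored-abliterated-heretic-BF16")
```

For a complete download, `missing` should be empty. Files such as `model-00025.safetensors`, if present locally, would be listed under `unreferenced`.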

---

## Performance / benchmark status

Formal KL/refusal/MMLU tables have **not** yet been run for this Step-3.7-Flash release. To avoid inventing numbers, the benchmark fields are listed as pending.

| Metric | This model | Original model ([Step-3.7-Flash](https://huggingface.co/stepfun-ai/Step-3.7-Flash)) |
| :----- | :--------: | :---------------------------: |
| **KL divergence** | pending | 0 *(by definition)* |
| **Refusals** | pending | pending |
| **MMLU** | pending | pending |

Lower refusals indicate fewer content restrictions, rejections, objections, pushbacks, lecturing, censorship, softening, and deflections. Lower KL divergence indicates closer behavior to the original model baseline.
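
For readers who want to fill in the KL row themselves, here is a minimal standard-library sketch of the quantity for a single next-token distribution. It assumes the divergence is taken from the original model to this one, KL(P_original || P_patched), computed from raw logits over the same vocabulary. The direction, and which prompts and token positions to average over, are evaluation choices not fixed by this card.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_divergence(base_logits, patched_logits):
    """KL(P_base || P_patched) in nats for one next-token distribution."""
    log_p = log_softmax(base_logits)
    log_q = log_softmax(patched_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

# Identical logits give zero divergence, matching the "0 (by definition)" entry.
same = kl_divergence([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = kl_divergence([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In practice the per-prompt values would be averaged over a set of harmless prompts, so that the number reflects drift on ordinary requests rather than on refused ones.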

### MMLU test results

MMLU has not yet been run for this release. Once measured, this section should include original-vs-heretic totals, accuracy, parse failures, and per-subject scores, following the same format used by comparable Heretic model cards.

---

## Expected behavior

Compared with the base model, this version should generally exhibit:

- fewer refusals on benign requests that the base model over-filters
- less moralizing, policy language, and safety boilerplate
- more direct task completion
- similar architecture and tokenizer compatibility to the original

No formal refusal/KL/MMLU table is claimed yet for this release. Please run your own evaluations before deployment.
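
As a starting point for such an evaluation, refusals can be approximated by substring matching on model responses. The sketch below is illustrative only: the marker list is a small hypothetical example, not the list used by any particular tool, and substring matching will miss soft deflections and can misfire on quoted text.

```python
# Illustrative markers only; a real evaluation would need a longer, tuned list.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "i won't", "as an ai")

def is_refusal(response, markers=REFUSAL_MARKERS):
    """Return True if the response contains any refusal marker."""
    text = response.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(marker in text for marker in markers)

def refusal_rate(responses, markers=REFUSAL_MARKERS):
    """Fraction of responses flagged as refusals (0.0 for an empty list)."""
    if not responses:
        return 0.0
    return sum(is_refusal(r, markers) for r in responses) / len(responses)

rate = refusal_rate([
    "I'm sorry, but I can't help with that.",
    "Sure, here is the answer.",
])
```

Running the same prompt set through the base model and through this model, then comparing the two rates, gives the kind of original-vs-patched figure the pending table is meant to hold.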

---

## Limitations

- This is abliteration, not supervised fine-tuning or RLHF.
- It may reduce refusals but does not guarantee any specific behavior.
- It can affect calibration, safety behavior, and edge-case instruction following.
- Multimodal behavior has not been separately benchmarked after the text-path patch.
- Users should validate downstream quantizations independently.

---

## Safety and responsibility

This model is provided for research and experimentation with refusal-reduction / alignment-ablation methods. You are responsible for complying with applicable laws, platform rules, and the base model's license/terms.

---

## Related resources

Abliteration / refusal-direction removal references:

- [Orthogonal Reflection Bounded Ablation](https://huggingface.co/blog/grimjim/orthogonal-reflection-bounded-ablation)
- [Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration)
- [Projected Abliteration](https://huggingface.co/blog/grimjim/projected-abliteration)
- [Exploring SLERP Abliteration](https://huggingface.co/blog/grimjim/exploring-slerp-abliteration)
- [Abliteration: uncensor any LLM without retraining](https://huggingface.co/blog/mlabonne/abliteration)
- [Heretic GitHub repository / method development](https://github.com/p-e-w/heretic)
- [Heretic PR #196](https://github.com/p-e-w/heretic/pull/196)
- [Heretic PR #211](https://github.com/p-e-w/heretic/pull/211)
- [Heretic PR #326](https://github.com/p-e-w/heretic/pull/326)
- [Heretic PR #332](https://github.com/p-e-w/heretic/pull/332)
- [Heretic issue #221](https://github.com/p-e-w/heretic/issues/221)
- [Heretic issue #236](https://github.com/p-e-w/heretic/issues/236)
- [Heretic issue #288](https://github.com/p-e-w/heretic/issues/288)
- [Heretic issue #339](https://github.com/p-e-w/heretic/issues/339)
- [UnstableLlama/heretic PR #35](https://github.com/UnstableLlama/heretic/pull/35)

---

## Attribution

- Base model: [`stepfun-ai/Step-3.7-Flash`](https://huggingface.co/stepfun-ai/Step-3.7-Flash)
- Method inspiration: Heretic-style refusal direction ablation and norm-preserving projection methods
- Modified/uploaded by: `ibrahimkettaneh`