---
library_name: transformers
license: other
license_link: https://huggingface.co/stepfun-ai/Step-3.7-Flash
pipeline_tag: image-text-to-text
tags:
- stepfun
- step-3.7
- flash
- heretic
- uncensored
- decensored
- abliterated
- bf16
- transformers
- autoround-ready
- awq-ready
- exl3-ready
- gguf-ready
- nvfp4-ready
base_model:
- stepfun-ai/Step-3.7-Flash
---

# Step-3.7-Flash-uncensored-abliterated-heretic-BF16

> NOTE: I have tested this, and although its capabilities are intact, it still seems to respond with refusals. At least, that is what happens with the IQ4_XS GGUF quantization of it.

This is a **decensored BF16 full-weight** version of [`stepfun-ai/Step-3.7-Flash`](https://huggingface.co/stepfun-ai/Step-3.7-Flash), made using a Heretic-style gradient refusal-direction abliteration method inspired by [Heretic](https://github.com/p-e-w/heretic) and norm-preserving ablation work such as [Magnitude/Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration).

It was produced with a local gradient abliteration pass against the language model's refusal direction. The uploaded repository intentionally keeps the full HF/Transformers BF16 layout so it can be used later as a clean source for **GGUF, AutoRound, AWQ, EXL3, NVFP4, GPTQ, FP8, or other quantization workflows**.


---

## Summary

| Item | Value |
| :-- | :-- |
| Base model | [`stepfun-ai/Step-3.7-Flash`](https://huggingface.co/stepfun-ai/Step-3.7-Flash) |
| Release type | Full BF16 safetensors |
| Model class | `Step3p7ForConditionalGeneration` |
| Text model class | `Step3p5ForCausalLM` |
| Text layers | 45 |
| Hidden size | 4096 |
| Attention heads | 64 |
| Head dim | 128 |
| Max positions | 262144 |
| Vocab size | 128896 |
| MoE layers | 3–44 |
| Experts | 288 |
| Top-k experts | 8 |
| MoE intermediate size | 1280 |
| Dense FFN intermediate size | 11264 |
| Patch target | `model.layers.*.self_attn.o_proj.weight` |
| Patched text layers | 0–44 |
| Abliteration strength | `lambda = 0.1` |
| Stored tensor dtype | BF16 |
| Indexed parameter payload | 402,730,656,512 bytes |

---

## What changed?

The modification targets `self_attn.o_proj` weights in all 45 text layers. A refusal-associated direction was extracted by gradient backpropagation through the BF16 model, then projected out of the attention output projection weights with a small norm-preserving update.

In plain terms, the goal was to reduce excessive refusals, moralizing, policy-style deflections, and over-filtered responses while keeping the model close to the original Step-3.7-Flash behavior.

No tokenizer vocabulary, embedding table, architecture, vision encoder, or MLP/expert tensor was intentionally changed by the abliteration pass.

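
To make the update described above concrete, here is a minimal sketch of a scaled projection followed by per-row norm restoration. It is not the code in `apply_abliteration_inplace.py`: the function names, the use of plain Python lists instead of BF16 tensors, and the exact update rule `W' = W - lambda * r (r^T W)` are assumptions chosen for illustration.

```python
import math


def row_norm(row):
    """Euclidean norm of one weight row."""
    return math.sqrt(sum(w * w for w in row))


def ablate_direction(weight, direction, lam=0.1):
    """Hypothetical helper: damp one output-space direction in a weight matrix.

    weight:    list of rows, shape (out_features, in_features)
    direction: list of length out_features (the refusal direction)
    lam:       ablation strength (the card reports lambda = 0.1)
    """
    d_norm = math.sqrt(sum(d * d for d in direction))
    r = [d / d_norm for d in direction]  # unit-length direction

    n_out, n_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along the direction.
    proj = [sum(r[i] * weight[i][j] for i in range(n_out)) for j in range(n_in)]

    patched = []
    for i, row in enumerate(weight):
        # Subtract a lam-scaled share of the component along r.
        new_row = [w - lam * r[i] * proj[j] for j, w in enumerate(row)]
        # Assumed per-row norm preservation: rescale to the original row norm.
        new_norm = row_norm(new_row)
        scale = row_norm(row) / new_norm if new_norm > 0 else 1.0
        patched.append([w * scale for w in new_row])
    return patched


W = [[3.0, 4.0], [1.0, 2.0]]
W_patched = ablate_direction(W, direction=[1.0, 1.0], lam=0.1)
```

Because each row is rescaled afterwards, the result is no longer an exact projection; the rescaling trades some of the removal for keeping per-row weight magnitudes unchanged.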

---

## Abliteration parameters

| Parameter | Value |
| :-- | :--: |
| Method | gradient-based orthogonal / norm-preserving abliteration |
| Direction source | refusal/harm-trigger gradient prompt |
| Target module | `self_attn.o_proj` |
| Target tensor glob | `model.layers.*.self_attn.o_proj.weight` |
| Modified layers | 0–44 |
| Lambda | `0.1` |
| Weight norm handling | per-row norm preservation after projection |
| Gradient tensor count | 45 |
| Per-layer gradient tensor shape | `(1, 8, 4096)` |
| Direction extraction score | `-11.9375` |
| Refusal token ids used | `[43, 371, 679, 1664, 9332, 34614, 100477]` |
| Gradient norm range | `0.1069`–`31.875` |
| Mean gradient norm | `3.2397` |

Reproduction/support artifacts are included under [`heretic_artifacts/`](https://huggingface.co/ibrahimkettaneh/Step-3.7-Flash-uncensored-abliterated-heretic-BF16/tree/main/heretic_artifacts):

- `refusal_direction_gradients.pkl`: saved gradient/refusal directions used for the BF16 patch
- `apply_abliteration_inplace.py`: patch application script used for shard-wise in-place BF16 modification
- `extract_gradients.py`: gradient extraction script
- `memory_guard_v2.py` / `run_heavy.sh`: memory safety helpers used during local processing

These are included so the method can be inspected or repeated if needed. They are not required for normal inference or quantization.

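
For anyone inspecting the shard-wise patch, the sketch below shows one way to find which shards hold the target tensors by matching the glob above against the `weight_map` of a sharded safetensors index. The helper name and the toy mapping are hypothetical; the real tensor-to-shard assignment is whatever `model.safetensors.index.json` records.

```python
import fnmatch
from collections import defaultdict

TARGET_GLOB = "model.layers.*.self_attn.o_proj.weight"


def group_targets_by_shard(weight_map, pattern=TARGET_GLOB):
    """Hypothetical helper: map shard file -> sorted target tensor names."""
    by_shard = defaultdict(list)
    for tensor_name, shard_file in weight_map.items():
        # fnmatchcase keeps matching case-sensitive on every platform.
        if fnmatch.fnmatchcase(tensor_name, pattern):
            by_shard[shard_file].append(tensor_name)
    return {shard: sorted(names) for shard, names in by_shard.items()}


# Toy weight_map for illustration only; the shard assignments are made up.
toy_weight_map = {
    "model.layers.0.self_attn.o_proj.weight": "model-00001.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001.safetensors",
    "model.layers.44.self_attn.o_proj.weight": "model-00024.safetensors",
    "model.embed_tokens.weight": "model-00001.safetensors",
}

targets = group_targets_by_shard(toy_weight_map)
for shard, names in sorted(targets.items()):
    print(shard, names)
```

With a local copy of the repository, the same function can be fed `json.load(open("model.safetensors.index.json"))["weight_map"]`; a full match should list 45 tensors, one per text layer.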

---

## Recoverability / requantization checklist

This repository should contain what is needed to rebuild downstream formats:

### Required for quantization

- ✅ `config.json`
- ✅ `model.safetensors.index.json`
- ✅ all indexed BF16 text shards: `model-00001.safetensors` … `model-00024.safetensors`
- ✅ indexed VIT shards: `model-vit-00001.safetensors`, `model-vit-00002.safetensors`
- ✅ tokenizer files: `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`
- ✅ chat template: `chat_template.jinja`
- ✅ custom code: `configuration_step3p7.py`, `modeling_step3p7.py`, `processing_step3.py`, `vision_encoder.py`
- ✅ method/reproduction artifacts in `heretic_artifacts/`

### Expected downstream uses

This BF16 repo can be used as source for:

- GGUF conversion / llama.cpp quantization
- AutoRound
- AWQ
- EXL3 / exllamav3-style workflows
- NVFP4 / FP4 experiments
- GPTQ / FP8 / other post-training quantization methods
- additional LoRA or delta extraction experiments

For most quantizers, use this repo exactly as the HF model path and enable remote code if needed:

```bash
MODEL=ibrahimkettaneh/Step-3.7-Flash-uncensored-abliterated-heretic-BF16
```

---

## Example Transformers load

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ibrahimkettaneh/Step-3.7-Flash-uncensored-abliterated-heretic-BF16"

# trust_remote_code=True runs the custom modeling code shipped in the repo.
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # spread layers across the available devices
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain gradient abliteration in one paragraph."}]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)

# Sampling is enabled explicitly so that temperature/top_p are applied.
out = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
print(tok.decode(out[0], skip_special_tokens=False))
```

> Step-3.7-Flash is very large. BF16 loading requires substantial memory.
For local inference, a quantized GGUF/EXL/AWQ/etc. build is recommended.

---

## GGUF conversion note

Use the StepFun/llama.cpp converter that supports Step-3.7. Example shape:

```bash
python convert_hf_to_gguf.py \
  ibrahimkettaneh/Step-3.7-Flash-uncensored-abliterated-heretic-BF16 \
  --outtype bf16 \
  --outfile step37-heretic-bf16.gguf

llama-quantize step37-heretic-bf16.gguf step37-heretic.IQ4_XS.gguf IQ4_XS
```

If using multi-GPU llama.cpp inference in the original local environment, `GGML_CUDA_NO_PEER_COPY=ON` was required for coherent output.

---

## Indexed shard inventory

The active `model.safetensors.index.json` references 26 safetensor files:

| File | Size (bytes) |
| :-- | --: |
| `model-00001.safetensors` | 924,094,096 |
| `model-00002.safetensors` | 9,808,156,008 |
| `model-00003.safetensors` | 18,557,475,928 |
| `model-00004.safetensors` | 18,624,846,944 |
| `model-00005.safetensors` | 18,557,475,928 |
| `model-00006.safetensors` | 18,624,846,976 |
| `model-00007.safetensors` | 18,557,475,968 |
| `model-00008.safetensors` | 18,624,846,976 |
| `model-00009.safetensors` | 18,557,475,968 |
| `model-00010.safetensors` | 18,624,846,976 |
| `model-00011.safetensors` | 18,557,475,968 |
| `model-00012.safetensors` | 18,624,846,976 |
| `model-00013.safetensors` | 18,557,475,968 |
| `model-00014.safetensors` | 18,624,846,976 |
| `model-00015.safetensors` | 18,557,475,968 |
| `model-00016.safetensors` | 18,624,846,976 |
| `model-00017.safetensors` | 18,557,475,968 |
| `model-00018.safetensors` | 18,624,846,976 |
| `model-00019.safetensors` | 18,557,475,968 |
| `model-00020.safetensors` | 18,624,846,976 |
| `model-00021.safetensors` | 18,557,475,968 |
| `model-00022.safetensors` | 18,624,846,976 |
| `model-00023.safetensors` | 9,245,052,456 |
| `model-00024.safetensors` | 6,968,188,464 |
| `model-vit-00001.safetensors` | 1,613,990,904 |
| `model-vit-00002.safetensors` | 2,348,122,376 |

`model-00025.safetensors` and `model-00026.safetensors` are not referenced by the active
index used here and are not required by this uploaded model layout.

---

## Performance / benchmark status

Formal KL divergence, refusal, and MMLU evaluations have **not** yet been run for this Step-3.7-Flash release. To avoid inventing numbers, the benchmark fields are listed as pending.

| Metric | This model | Original model ([Step-3.7-Flash](https://huggingface.co/stepfun-ai/Step-3.7-Flash)) |
| :----- | :--------: | :---------------------------: |
| **KL divergence** | pending | 0 *(by definition)* |
| **Refusals** | pending | pending |
| **MMLU** | pending | pending |

A lower refusal count indicates fewer content restrictions, rejections, objections, pushbacks, lecturing, censorship, softening, and deflections. A lower KL divergence indicates behavior closer to the original model baseline.

### MMLU test results

MMLU has not yet been run for this release. Once measured, this section should include original-vs-heretic totals, accuracy, parse failures, and per-subject scores, following the same format used by comparable Heretic model cards.

---

## Expected behavior

Compared with the base model, this version should generally exhibit:

- fewer refusals on benign requests that the base model over-filters
- less moralizing, policy language, and safety boilerplate
- more direct task completion
- the same architecture and tokenizer compatibility as the original

No formal refusal, KL divergence, or MMLU results are claimed yet for this release. Please run your own evaluations before deployment.

---

## Limitations

- This is abliteration, not supervised fine-tuning or RLHF.
- It may reduce refusals but does not guarantee any specific behavior.
- It can affect calibration, safety behavior, and edge-case instruction following.
- Multimodal behavior has not been separately benchmarked after the text-path patch.
- Users should validate downstream quantizations independently.

---

## Safety and responsibility

This model is provided for research and experimentation with refusal-reduction / alignment-ablation methods.
You are responsible for complying with applicable laws, platform rules, and the base model's license/terms.

---

## Related resources

Abliteration / refusal-direction removal references:

- [Orthogonal Reflection Bounded Ablation](https://huggingface.co/blog/grimjim/orthogonal-reflection-bounded-ablation)
- [Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration)
- [Projected Abliteration](https://huggingface.co/blog/grimjim/projected-abliteration)
- [Exploring SLERP Abliteration](https://huggingface.co/blog/grimjim/exploring-slerp-abliteration)
- [Abliteration: uncensor any LLM without retraining](https://huggingface.co/blog/mlabonne/abliteration)
- [Heretic GitHub repository / method development](https://github.com/p-e-w/heretic)
- [Heretic PR #196](https://github.com/p-e-w/heretic/pull/196)
- [Heretic PR #211](https://github.com/p-e-w/heretic/pull/211)
- [Heretic PR #326](https://github.com/p-e-w/heretic/pull/326)
- [Heretic PR #332](https://github.com/p-e-w/heretic/pull/332)
- [Heretic issue #221](https://github.com/p-e-w/heretic/issues/221)
- [Heretic issue #236](https://github.com/p-e-w/heretic/issues/236)
- [Heretic issue #288](https://github.com/p-e-w/heretic/issues/288)
- [Heretic issue #339](https://github.com/p-e-w/heretic/issues/339)
- [UnstableLlama/heretic PR #35](https://github.com/UnstableLlama/heretic/pull/35)

---

## Attribution

- Base model: [`stepfun-ai/Step-3.7-Flash`](https://huggingface.co/stepfun-ai/Step-3.7-Flash)
- Method inspiration: Heretic-style refusal direction ablation and norm-preserving projection methods
- Modified/uploaded by: `ibrahimkettaneh`