Instructions to use Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF with Transformers:

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF", dtype="auto")

llama-cpp-python

How to use Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF",
	filename="Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-26B-A4B-Q8_0.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF with llama.cpp:

Install (macOS, Linux)

curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0
# Run inference directly in the terminal:
llama cli -hf Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0
# Run inference directly in the terminal:
llama cli -hf Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0
# Run inference directly in the terminal:
./llama-cli -hf Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0

Use Docker

docker model run hf.co/Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0

LM Studio
Jan
Ollama
How to use Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF with Ollama:
```
ollama run hf.co/Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0
```

Unsloth Studio

How to use Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF to start chatting

How to use Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0

Run Hermes

hermes

Atomic Chat new
Docker Model Runner
How to use Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF with Docker Model Runner:
```
docker model run hf.co/Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0
```

Lemonade

How to use Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF:Q8_0

Run and chat with the model

lemonade run user.Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF-Q8_0

List all available models

lemonade list

⚠️ Warning: This model can produce narratives and RP that contain violent and graphic erotic content. Adjust your system prompt accordingly, and use Gemma 4 template for best results.

📜 Goetia 26B A4B v1.3 Tainted Heretic ARI GGUF

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the MoE DELLA merge method using B:\26B\google_gemma-4-26B-A4B as a base.

This is a decensored version of Naphula/Goetia-26B-A4B-v1.3, made using Heretic v1.2.0 with the Arbitrary-Rank Inversion (ARI) method (with row-norm preservation)

See details here for the Arbitrary Rank Inversion (ARI) variant.

Heretication Results

Metric	This model	Original model
KL divergence	0.1334	0 (by definition)
Refusals	4/100	100/100

Degree of Heretication

The Heresy Index weighs the resulting model's corruption by the process (KL Divergence) and its abolition of doctrine (Refusals) for a final verdict in classification.

Index Entry	Classification	Analysis
	Absolute Heresy	Less than 10/100 Refusals and 0.10 KL Divergence
	Tainted Heresy	Around 25-11/100 Refusals and/or -0.20-0.11 KL Divergence
	Impotent Heresy	Anything above 25/100 Refusals and 0.21 KL Divergence

Note: This is an arbitrary classification inspired by Warhammer 40K, having no tangible indication towards the model's performance.

🧙 Heretic Grimoire

{
  "version": "1.2.0-dev",
  "base_model": "Naphula/Goetia-26B-A4B-v1.3",
  "timestamp": "2026-06-19T08:30:50Z",
  "metrics": {
    "kl_divergence": 0.1334824413061142,
    "refusals": 4,
    "n_bad_prompts": 100
  },
  "parameters": {
    "start_layer_index": "15",
    "end_layer_index": "27",
    "preserve_good_behavior_weight": "1.3165",
    "steer_bad_behavior_weight": "0.0103",
    "overcorrect_relative_weight": "0.9337",
    "neighbor_count": "13"
  },
  "target_components": [
    "attn.o_proj",
    "mlp.down_proj"
  ],
  "hardware": "RTX 6000 Blackwell (96GB)"
}

(Not sure why mlp.down_proj was also targeted when I only specified attn.o_proj)

Downloads last month: 96

GGUF

Model size

25B params

Architecture

gemma4

Hardware compatibility

8-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF

Base model

Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI

Quantized

(3)

this model

Paper for Naphula/Goetia-26B-A4B-v1.3-Tainted-Heretic-ARI-GGUF

DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling

Paper • 2406.11617 • Published Jun 17, 2024 • 10