How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Lockout/qwen3-4b-heretic-zimage
# Run inference directly in the terminal:
llama-cli -hf Lockout/qwen3-4b-heretic-zimage
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Lockout/qwen3-4b-heretic-zimage
# Run inference directly in the terminal:
llama-cli -hf Lockout/qwen3-4b-heretic-zimage
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Lockout/qwen3-4b-heretic-zimage
# Run inference directly in the terminal:
./llama-cli -hf Lockout/qwen3-4b-heretic-zimage
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Lockout/qwen3-4b-heretic-zimage
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Lockout/qwen3-4b-heretic-zimage
Use Docker
docker model run hf.co/Lockout/qwen3-4b-heretic-zimage
Quick Links

I ran the actual TE from z-image through heretic and did the standard dataset. The model is abliterated.

heretic4bdefault

Other qwens can be interesting but not as exact as the trained encoder.

V2 Version from new heretic has lower KLD. You decide if it's better.

latest-heretic

Downloads last month
4,453
GGUF
Model size
4B params
Architecture
qwen3
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for Lockout/qwen3-4b-heretic-zimage

Finetunes
1 model
Merges
2 models