Instructions to use mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF with Transformers:

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF", dtype="auto")

llama-cpp-python

How to use mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF",
	filename="Qwen2.5-14B-Base-Heretic.i1-IQ1_M.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M

Use Docker

docker model run hf.co/mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M

LM Studio
Jan
Ollama
How to use mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF with Ollama:
```
ollama run hf.co/mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M
```

Unsloth Studio

How to use mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF to start chatting

How to use mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M

Run Hermes

hermes

Atomic Chat new
Docker Model Runner
How to use mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF with Docker Model Runner:
```
docker model run hf.co/mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M
```

Lemonade

How to use mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull mradermacher/Qwen2.5-14B-Base-Heretic-i1-GGUF:Q4_K_M

Run and chat with the model

lemonade run user.Qwen2.5-14B-Base-Heretic-i1-GGUF-Q4_K_M

List all available models

lemonade list

can i use it?

by NoNameed - opened May 14

Discussion

NoNameed

May 14

hallo! i have already used many modes before and there was always something puting me off how they are, not the reasoning and stuff but the underlying carackter of the ai, i saw this ai here and i want to make my own version of it with training it. is it still close to the original model without much charackterisation with just obliteranting the set filters? it will take a long time i think but i just want a pretty good base model which i can use to make my own ai, i would apreciate an answewr and maybe a bit of guidance. until then i wish you a good week/ holliday today. greeting from germany,
no named

simonko912

May 14

First, we quant, it aint our model, its this persons https://huggingface.co/TitleOS/Qwen2.5-14B-Base-Heretic and second, things like this are usualy allowed since hugging face is opensourced, but i can try help anyways!

nicoboss

May 14

Booth TitleOS/Qwen2.5-14B-Base-Heretic and Qwen/Qwen2.5-14B from which it is based are Apache 2.0 licensed and so you should legally be able to use them for further finetuning. You obviously can't finetune the GGUFs but you can finetune TitleOS/Qwen2.5-14B-Base-Heretic` with any finetuning framework you like. I personally recommend axolotl. I wish you a great holiday as well and greetings from Switzerland.

yano2mch

May 14

hallo! i have already used many modes before and there was always something puting me off how they are, the underlying carackter of the ai

Are you using any custom definitions or the default of whatever program you're using?

The default tends to be 'you are a great assistant and are helpful to the user' or something. There's other definitions. You can even tell it to be a lazy catgirl maid who meows a lot, and the personality and output completely changes.

NoNameed

May 14

thank you for the answers, i will probably not upload the model on here because i dont want to get smited with mca takedowns or something similar, so just for personal use. i'm still looking for a good framework that will work with this model, if you say axolotl will work them i will probably use that one, are there recomendations on th eparameters for training and the min/max size for the finetuning data?

NoNameed

May 14

Are you using any custom definitions or the default of whatever program you're using?

The default tends to be 'you are a great assistant and are helpful to the user' or something. There's other definitions. You can even tell it to be a lazy catgirl maid who meows a lot, and the personality and output completely changes.
i'm just using oobabooga and tried a few things just with instruct and then chat, with chat i use my own template of a chartackter i have made and it works fine, i just want to start getting into making my own loras and something that is a bit tailored towards myself

NoNameed

May 14

hallo! i have already used many modes before and there was always something puting me off how they are, the underlying carackter of the ai

Are you using any custom definitions or the default of whatever program you're using?

The default tends to be 'you are a great assistant and are helpful to the user' or something. There's other definitions. You can even tell it to be a lazy catgirl maid who meows a lot, and the personality and output completely changes.

NoNameed changed discussion status to closed May 14

NoNameed changed discussion status to open May 14

NoNameed

May 14

the closing was unwanted

yano2mch

May 14

•

edited May 14

i'm just using oobabooga and tried a few things just with instruct and then chat, with chat i use my own template of a chartackter i have made and it works fine, i just want to start getting into making my own loras and something that is a bit tailored towards myself

Mhmm... Yeah that's where i started too. Moved to KoboldCPP which i think is a bit better, along with SillyTavern. Even KoboldCPP's default interface does okay if you don't care much about logs.

Within OobaBooga there's a tab to handle character profiles, which accepts SillyTavern png files. Save the image (it contains a json definition for a bot, you can of course also just get the raw json file too) and import it. There's plenty of places that have character/RP cards, but basic changed assistants can't hurt.

source: https://characterhub.org/characters/ShyChu/aster-a-i-assistant-c91a400eaeb5

You can also with an LLM tell it the tone and feeling and what you want, and then have it help you write a definition too. Though some models tend to be way too terse, don't follow instructions or like huihui models tend to try and be too poetic and flowery in their short 10-20 word 10 line responses. I've found Gemma4 models to be very impressive.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment