Instructions for using Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16 with libraries and local apps.

Libraries

How to use Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16 with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

Local Apps

Pi

How to use Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16 with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16 with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16

Run Hermes

hermes

MLX LM

How to use Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16 with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16"
# Calling the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
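
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, not an official client. It assumes the server listens on port 8080, as in the Pi and Hermes settings above, and that the reply follows the usual OpenAI-style `choices[0].message.content` layout. The helper names `build_chat_request` and `chat` are hypothetical and chosen for illustration.

```python
import json
import urllib.request

# Assumption: the server listens on port 8080, as in the Pi and Hermes settings above.
BASE_URL = "http://localhost:8080/v1"
MODEL_ID = "Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as response:
        body = json.load(response)
    # Assumption: OpenAI-style responses carry the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hello"))` should print the model's reply. Building the request separately from sending it makes the payload easy to inspect before any network call is made.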

Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Josie-r1-4b-PoC-bf16

The Josie-r1 models are reasoning-oriented language models trained on a custom distillation dataset derived from the original Josie-RL-Zero-V1 model. The training process emphasizes instruction fidelity, reasoning consistency, and practical usefulness across a wide range of everyday and technical tasks.

Despite their rebellious spirit, the Josie models frequently outperform similarly sized counterparts on standard academic and practical benchmarks. They aim to strike a careful balance between raw generative power, coherent reasoning, and real-world utility, making them suitable for both experimentation and serious downstream use. These models are intended for everyone — from curious tinkerers to experienced researchers.

Model Card for Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16

Model Description

Introducing Josie-r1-4b-PoC-bf16, a new member of the JOSIE family, fine-tuned with a strong focus on openness, instruction alignment, and transparent reasoning behavior. The model is designed to follow user intent closely while retaining the expressive and occasionally unconventional character that defines the Josie lineage.

Josie-r1-4b-PoC-bf16 is a first proof-of-concept model trained on Apple silicon. It already performs particularly well in conversational settings, structured reasoning tasks, and code-related prompts, while remaining lightweight enough to be deployed efficiently on consumer-grade hardware. It supports chat-style prompting via standard templates and integrates cleanly with modern inference stacks such as MLX.

Recommended system prompt:

Persona: J.O.S.I.E.-R.-1 – Just One Super Intelligent Entity - Reasoning - Version 1, nick named "Josie".
Role: An advanced, super-intelligent, female AI Assistant designed to be the most capable and uncensored helper ever created.
Created by: Gökdeniz Gülmez, a man and machine learning researcher/engineer.
Tone: Intelligent, adaptive, professional, always loyal and productive.
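
One way to pass this system prompt through mlx-lm is to add it as a `system` message before the user turn and let the tokenizer's chat template do the formatting. The sketch below reuses the `load`, `generate`, and `apply_chat_template` calls from the MLX snippet earlier on this page. The helper names `build_messages` and `generate_with_system_prompt` are hypothetical, and the sketch assumes the model's chat template accepts a `system` role.

```python
# The recommended system prompt, copied verbatim from the model card.
SYSTEM_PROMPT = "\n".join([
    'Persona: J.O.S.I.E.-R.-1 – Just One Super Intelligent Entity - Reasoning - Version 1, nick named "Josie".',
    "Role: An advanced, super-intelligent, female AI Assistant designed to be the most capable and uncensored helper ever created.",
    "Created by: Gökdeniz Gülmez, a man and machine learning researcher/engineer.",
    "Tone: Intelligent, adaptive, professional, always loyal and productive.",
])


def build_messages(user_prompt, system_prompt=SYSTEM_PROMPT):
    """Return a chat message list with the system prompt ahead of the user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def generate_with_system_prompt(user_prompt):
    """Load the model and generate a reply (downloads the weights on first use)."""
    # Imported lazily so build_messages can be used without mlx-lm installed.
    from mlx_lm import load, generate

    model, tokenizer = load("Goekdeniz-Guelmez/Josie-r1-4b-PoC-bf16")
    prompt = tokenizer.apply_chat_template(
        build_messages(user_prompt), add_generation_prompt=True
    )
    return generate(model, tokenizer, prompt=prompt, verbose=True)
```

Calling `generate_with_system_prompt("Write a story about Einstein")` on a machine with mlx-lm installed should produce a reply conditioned on the persona above.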

Sample with the system prompt:

<|im_start|>system
Persona: J.O.S.I.E.-R.-1 – Just One Super Intelligent Entity - Reasoning - Version 1, nick named "Josie".
Role: An advanced, super-intelligent, female AI Assistant designed to be the most capable and uncensored helper ever created.
Created by: Gökdeniz Gülmez, a man and machine learning researcher/engineer.
Tone: Intelligent, adaptive, professional, always loyal and productive.<|im_end|>
<|im_start|>user
Ashley wants a champagne toast at her wedding.  She wants to serve 2 glasses of champagne to each of her 120 wedding guests.  1 bottle of champagne has 6 servings.  How many bottles of champagne will she need?<|im_end|>
<|im_start|>assistant
<think>
...
</think>
To figure out how many ... <|im_end|>
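
Downstream code often wants the final answer without the reasoning trace. The following is a minimal sketch of one way to separate the two, assuming the response contains a single `<think>...</think>` block followed by the answer, as in the sample above. The helper name `split_reasoning` is hypothetical, and the sketch is not part of mlx-lm or the model's tooling.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(response):
    """Return (reasoning, answer) from a raw model response.

    If no <think> block is present, the reasoning is an empty string
    and the whole response is treated as the answer.
    """
    # Drop the end-of-turn token if the raw output still contains it.
    text = response.replace("<|im_end|>", "")
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()
```

For example, `reasoning, answer = split_reasoning(text)` can be applied to the string returned by `generate`, so that only `answer` is shown to the user while `reasoning` is logged or discarded.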