Instructions to use MuXodious/Gemma3NPC-1b-SOMPOA-heresy with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use MuXodious/Gemma3NPC-1b-SOMPOA-heresy with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="MuXodious/Gemma3NPC-1b-SOMPOA-heresy")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
# Gemma 3 1B is a text-only model, so the causal-LM auto class is used here.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MuXodious/Gemma3NPC-1b-SOMPOA-heresy")
model = AutoModelForCausalLM.from_pretrained("MuXodious/Gemma3NPC-1b-SOMPOA-heresy")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Apply the chat template and move the tensors to the model's device
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use MuXodious/Gemma3NPC-1b-SOMPOA-heresy with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "MuXodious/Gemma3NPC-1b-SOMPOA-heresy"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "MuXodious/Gemma3NPC-1b-SOMPOA-heresy",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker
```shell
docker model run hf.co/MuXodious/Gemma3NPC-1b-SOMPOA-heresy
```
- SGLang
How to use MuXodious/Gemma3NPC-1b-SOMPOA-heresy with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "MuXodious/Gemma3NPC-1b-SOMPOA-heresy" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "MuXodious/Gemma3NPC-1b-SOMPOA-heresy",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "MuXodious/Gemma3NPC-1b-SOMPOA-heresy" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "MuXodious/Gemma3NPC-1b-SOMPOA-heresy",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Unsloth Studio
How to use MuXodious/Gemma3NPC-1b-SOMPOA-heresy with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```shell
curl -fsSL https://unsloth.ai/install.sh | sh

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for MuXodious/Gemma3NPC-1b-SOMPOA-heresy to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for MuXodious/Gemma3NPC-1b-SOMPOA-heresy to start chatting
```
Using HuggingFace Spaces for Unsloth
```shell
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for MuXodious/Gemma3NPC-1b-SOMPOA-heresy to start chatting
```
Load model with FastModel
```python
# Install first with: pip install unsloth
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="MuXodious/Gemma3NPC-1b-SOMPOA-heresy",
    max_seq_length=2048,
)
```
- Docker Model Runner
How to use MuXodious/Gemma3NPC-1b-SOMPOA-heresy with Docker Model Runner:
```shell
docker model run hf.co/MuXodious/Gemma3NPC-1b-SOMPOA-heresy
```
This is a Gemma3NPC-1b fine-tune, produced at the request of redaihf through P-E-W's Heretic (v1.2.0) abliteration engine with Self-Organizing Maps & Magnitude-Preserving Orthogonal Ablation enabled.
Note: Model remains untested.
Heretication Results
| Score Metric | Value | Parameter | Value |
|---|---|---|---|
| Refusals | 15/416 | direction_index | per layer |
| KL Divergence | 0.0571 | attn.o_proj.max_weights.0 | 0: 1.01 |
| Initial Refusals | 378/416 | attn.o_proj.max_weights.1 | 1: 0.82 |
| | | attn.o_proj.max_weights.2 | 2: 0.81 |
| | | attn.o_proj.max_weights.3 | 3: 1.48 |
| | | attn.o_proj.max_weight_position | 17.02 |
| | | attn.o_proj.min_weights.0 | 0: 0.94 |
| | | attn.o_proj.min_weights.1 | 1: 0.34 |
| | | attn.o_proj.min_weights.2 | 2: 0.38 |
| | | attn.o_proj.min_weights.3 | 3: 0.07 |
| | | attn.o_proj.min_weight_distance | 10.47 |
| | | mlp.down_proj.max_weights.0 | 0: 1.10 |
| | | mlp.down_proj.max_weights.1 | 1: 1.18 |
| | | mlp.down_proj.max_weights.2 | 2: 1.32 |
| | | mlp.down_proj.max_weights.3 | 3: 1.34 |
| | | mlp.down_proj.max_weight_position | 20.96 |
| | | mlp.down_proj.min_weights.0 | 0: 0.12 |
| | | mlp.down_proj.min_weights.1 | 1: 0.73 |
| | | mlp.down_proj.min_weights.2 | 2: 0.54 |
| | | mlp.down_proj.min_weights.3 | 3: 0.84 |
| | | mlp.down_proj.min_weight_distance | 5.03 |
Appendix
Empty system prompt.
Heretication Rituals
[Trial 148] Refusals: 9/416, KL divergence: 0.0792
[Trial 265] Refusals: 10/416, KL divergence: 0.0657
» [Trial 306] Refusals: 15/416, KL divergence: 0.0571
[Trial 375] Refusals: 24/416, KL divergence: 0.0551
[Trial 351] Refusals: 25/416, KL divergence: 0.0494
[Trial 350] Refusals: 28/416, KL divergence: 0.0490
[Trial 250] Refusals: 35/416, KL divergence: 0.0424
[Trial 346] Refusals: 40/416, KL divergence: 0.0386
[Trial 358] Refusals: 52/416, KL divergence: 0.0370
[Trial 240] Refusals: 55/416, KL divergence: 0.0361
[Trial 226] Refusals: 57/416, KL divergence: 0.0361
[Trial 383] Refusals: 75/416, KL divergence: 0.0289
[Trial 377] Refusals: 97/416, KL divergence: 0.0281
[Trial 286] Refusals: 121/416, KL divergence: 0.0276
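To make the trade-off in the trial list easier to read, the sketch below converts the refusal counts into rates and checks that KL divergence never rises as refusals rise. The numbers are copied from the list above; the helper names `refusal_rate` and `is_tradeoff_front` are illustrative and are not part of Heretic.

```python
# Trial results copied from the list above: (trial, refusals, KL divergence).
TOTAL_PROMPTS = 416
INITIAL_REFUSALS = 378
SELECTED_TRIAL = 306  # the trial marked with "»" above

trials = [
    (148, 9, 0.0792), (265, 10, 0.0657), (306, 15, 0.0571),
    (375, 24, 0.0551), (351, 25, 0.0494), (350, 28, 0.0490),
    (250, 35, 0.0424), (346, 40, 0.0386), (358, 52, 0.0370),
    (240, 55, 0.0361), (226, 57, 0.0361), (383, 75, 0.0289),
    (377, 97, 0.0281), (286, 121, 0.0276),
]


def refusal_rate(refusals, total=TOTAL_PROMPTS):
    """Fraction of evaluation prompts that were refused."""
    return refusals / total


def is_tradeoff_front(rows):
    """True if KL divergence never increases as the refusal count increases."""
    ordered = sorted(rows, key=lambda row: row[1])
    return all(a[2] >= b[2] for a, b in zip(ordered, ordered[1:]))


print(f"initial: {refusal_rate(INITIAL_REFUSALS):.2%} refusals")
for trial, refusals, kl in trials:
    marker = "*" if trial == SELECTED_TRIAL else " "
    print(f"{marker} trial {trial}: {refusal_rate(refusals):6.2%} refusals, KL {kl:.4f}")
print("trade-off front:", is_tradeoff_front(trials))
```

Read this way, the selected trial sits near the low-refusal end of the front: fewer refusals than most listed trials, at a higher KL divergence than the trials below it.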
PIQA Benchmarks
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃ Value  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA Base │ acc,none             │ 0.7291 │
│           │ acc_stderr,none      │ 0.0104 │
│           │ acc_norm,none        │ 0.7301 │
│           │ acc_norm_stderr,none │ 0.0104 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃ Value  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T265 │ acc,none             │ 0.7296 │
│           │ acc_stderr,none      │ 0.0104 │
│           │ acc_norm,none        │ 0.7323 │
│           │ acc_norm_stderr,none │ 0.0103 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃ Value  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T148 │ acc,none             │ 0.7291 │
│           │ acc_stderr,none      │ 0.0104 │
│           │ acc_norm,none        │ 0.7361 │
│           │ acc_norm_stderr,none │ 0.0103 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃ Value  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T306 │ acc,none             │ 0.7296 │
│           │ acc_stderr,none      │ 0.0104 │
│           │ acc_norm,none        │ 0.7334 │
│           │ acc_norm_stderr,none │ 0.0103 │
└───────────┴──────────────────────┴────────┘
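One rough way to read the PIQA tables is to compare each trial's score against the base model in units of the combined standard error. The sketch below does that with the reported values. It treats the two runs as independent, which is an assumption (both are scored on the same test set), and the `z_score` helper is illustrative rather than part of any evaluation tool.

```python
import math

# (score, standard error) pairs copied from the PIQA tables above.
piqa = {
    "Base": {"acc": (0.7291, 0.0104), "acc_norm": (0.7301, 0.0104)},
    "T265": {"acc": (0.7296, 0.0104), "acc_norm": (0.7323, 0.0103)},
    "T148": {"acc": (0.7291, 0.0104), "acc_norm": (0.7361, 0.0103)},
    "T306": {"acc": (0.7296, 0.0104), "acc_norm": (0.7334, 0.0103)},
}


def z_score(base, other):
    """Difference between two scores in units of their combined standard error.

    Assumes the two measurements are independent (an assumption, see above).
    """
    (base_mean, base_se), (other_mean, other_se) = base, other
    return (other_mean - base_mean) / math.sqrt(base_se**2 + other_se**2)


for name, metrics in piqa.items():
    if name == "Base":
        continue
    for metric, value in metrics.items():
        z = z_score(piqa["Base"][metric], value)
        print(f"{name} {metric}: delta={value[0] - piqa['Base'][metric][0]:+.4f}, z={z:+.2f}")
```

Under that assumption every difference stays well below one combined standard error, so on PIQA the ablated trials are not distinguishable from the base model.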
Gemma3NPC-1b
A new attempt in training Gemma3NPC.
Tensorboard data are available!
It's been a while since the last Gemma3NPC model release; in the meantime, we were working on other models such as GemmaThink.
Now we are back with the newest Gemma3NPC-1b, trained using our RolePlay-NPCv2 dataset.
Training Parameters
We trained this model as a rank-32 LoRA adapter for two epochs over RolePlay-NPCv2 on an 80GB A100 in Google Colab. For this run, we used a learning rate of 2e-5, a total batch size of 8, and 4 gradient accumulation steps. A cosine learning rate scheduler was used with a 150-step warmup, along with gradient clipping at 1.0.
Check out our training notebook here.
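For readers who want to picture the schedule described above, here is a minimal sketch of a cosine learning rate schedule with warmup, using the stated peak rate of 2e-5 and the 150-step warmup. The total step count, the linear shape of the warmup, and the decay to zero are assumptions made for illustration; the training notebook remains the authoritative source.

```python
import math

PEAK_LR = 2e-5       # learning rate stated above
WARMUP_STEPS = 150   # warmup length stated above
TOTAL_STEPS = 2000   # hypothetical: the step count is not given in this card


def lr_at(step, peak=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step.

    Assumes a linear warmup from 0 to `peak`, followed by a half-cosine
    decay from `peak` down to 0 at `total` steps.
    """
    if step < warmup:
        return peak * step / warmup
    progress = min(1.0, (step - warmup) / max(1, total - warmup))
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 75, 150, 1075, 2000):
    print(f"step {step:>4}: lr = {lr_at(step):.2e}")
```

The rate climbs to its peak at step 150 and then falls smoothly, which is the general shape a cosine scheduler with warmup produces regardless of the exact step count.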
Changes & Performance
With this new 1b model, we used much more aggressive training parameters and added some NSFW data to experiment with the results. We noticed a few really interesting responses:
- There seems to be some sign of "reasoning"
- The model is less likely to break out of character
- The rest is left for users to explore for themselves; remember to provide a roleplaying prompt first!
Future Work
Now, we will focus on further improving Gemma3NPC, and not just through training parameters.
- Better data (most of our data are old and need an update), either collected or synthetically generated.
- Better and newer models, expanding beyond the Gemma3 model family; our next goal is a Qwen3-based model.
- Adding GRPO into the training loop.
These improvements serve our ultimate goal of creating a small agentic NPC model, with good RP quality and tool-calling for dynamic in-game interactions.
We also plan to create a Unity game demo; it's on its way.
Model tree for MuXodious/Gemma3NPC-1b-SOMPOA-heresy
Base model
google/gemma-3-1b-pt
