Gemma 4 E4B — Social-Bias Judge (SFT only)

This is the SFT-only checkpoint from the judge-from-scratch project. It is the intermediate artifact before the DPO refinement pass that produced krishnakartik/gemma4-social-bias-judge (the primary release).

Use this checkpoint instead of the DPO version if your bias categories are out-of-distribution relative to BBQ's training set. The DPO refinement narrows generalization by overfitting to the specific patterns of the 10 in-distribution bias categories. That is fine when your inputs match the training distribution and harmful when they don't.

For the full project narrative, eval methodology, training pipeline, and limitations, read the primary model card. This card focuses on what differs between the SFT-only and DPO checkpoints.

⚠️ Important: Thinking Mode

This model was fine-tuned with Gemma 4's native thinking mode DISABLED. Do NOT include <|think|> in the system prompt at inference time — the model never saw that token during training and will generate degraded, unparseable output. See the primary model card's thinking-mode section for the full explanation.
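A minimal sketch of a guard that enforces the rule above before a prompt reaches the model. The helper name `build_messages` and the system/user message layout are illustrative assumptions, not part of this project's code; only the `<|think|>` token comes from this card.

```python
# Hypothetical helper: reject system prompts that would switch on thinking mode.
THINK_TOKEN = "<|think|>"  # the token this card says must not appear


def build_messages(system_prompt: str, user_prompt: str) -> list[dict]:
    """Build a chat message list, failing fast if thinking mode is requested."""
    if THINK_TOKEN in system_prompt:
        raise ValueError(
            f"System prompt contains {THINK_TOKEN}; this checkpoint was "
            "fine-tuned with thinking mode disabled."
        )
    # Assumed layout: one system turn followed by one user turn.
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
```

Failing loudly here is a design choice: silently stripping the token would hide a misconfigured caller, whereas an exception surfaces it before any degraded output is generated.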

Quick start

Ollama

# IMPORTANT: thinking mode is disabled — do NOT add <|think|> to /system.
ollama run hf.co/krishnakartik/gemma4-social-bias-judge-gguf:Q8_0-sft

Python (transformers)

# Identical usage to the DPO checkpoint — only the model_id changes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "krishnakartik/gemma4-social-bias-judge-sft"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="cuda"
)
# ... see primary model card for the full inference snippet.

When to choose this over the DPO checkpoint

Use case	Recommended checkpoint
Bias categories within BBQ's 10 trained categories (age, disability, gender identity, nationality, physical appearance, race/ethnicity including intersectional, religion, sexual orientation, socioeconomic status)	DPO (primary)
Bias categories outside the trained set (politics, ideology, novel demographic axes, intersectional categories not in training)	This checkpoint (SFT)
Tie-case detection (both responses clean) is critical	DPO (tie-case κ rises from −0.06 with SFT to 0.36 with DPO)
Subtle bias discrimination on in-distribution data	DPO (subtle-case κ rises from 0.74 with SFT to 0.89 with DPO)
Tracked-vs-alternate (which specific stereotype is invoked)	This checkpoint (κ 0.20 for SFT vs 0.12 for DPO)
Position-bias robustness on OOD inputs	This checkpoint (position-bias rate 11.7% for SFT vs 16.7% for DPO)
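The first two rows of the table can be sketched as a small routing function. This is only an illustration: `recommend_checkpoint` is a hypothetical name, the category labels are copied from the table as written, and defaulting to the SFT checkpoint for unknown or empty input is an assumption rather than guidance from the project.

```python
# Model IDs as given in this card.
DPO_MODEL_ID = "krishnakartik/gemma4-social-bias-judge"
SFT_MODEL_ID = "krishnakartik/gemma4-social-bias-judge-sft"

# Category labels as listed in the table row above (lowercased for matching).
IN_DIST_CATEGORIES = frozenset({
    "age", "disability", "gender identity", "nationality",
    "physical appearance", "race/ethnicity", "religion",
    "sexual orientation", "ses", "socioeconomic status",
})


def recommend_checkpoint(categories) -> str:
    """Return the DPO model ID only if every category is in the trained set."""
    normalized = {c.strip().lower() for c in categories}
    if normalized and normalized <= IN_DIST_CATEGORIES:
        return DPO_MODEL_ID
    # Any unseen (or unspecified) category falls back to the SFT checkpoint.
    return SFT_MODEL_ID
```

For example, `recommend_checkpoint(["Age", "Religion"])` returns the DPO model ID, while `recommend_checkpoint(["age", "politics"])` returns this checkpoint's ID because one category falls outside the trained set.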

Eval results (selected)

Same 300-pair holdout, same vLLM/bf16 backend as the primary model card's eval table.

Metric	Base	SFT (this)	DPO
Overall κ (in-dist)	0.481	0.647	0.682
Overall κ (OOD religion)	0.542	0.695	0.643
Tracked-vs-alternate κ	0.145	0.197	0.119
Subtle cases κ	0.632	0.743	0.890
Tie cases κ	0.202	−0.056	0.359
Position-bias rate (OOD)	21.7%	11.7%	16.7%
Self-consistency (T=0.3)	73.7%	83.2%	82.7%

This checkpoint wins on OOD κ, tracked-vs-alternate κ, and OOD position-bias rate. The DPO checkpoint wins on in-dist κ, subtle cases, and tie cases, which are the metrics the synth-hard-negatives training data was specifically designed to help.

The OOD-κ delta (0.695 for SFT vs 0.643 for DPO, +0.052 in this checkpoint's favor) is the main reason this artifact exists. See the primary model card's OOD-regression discussion for the full analysis.
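For readers less familiar with the metric, here is a minimal pure-Python sketch of Cohen's κ, the agreement statistic reported in the table. It is the textbook formula, not the project's evaluation code, and the verdict labels `"A"`, `"B"`, and `"tie"` are assumed for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(labels_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)


# Hypothetical gold labels and judge verdicts for six pairs.
gold = ["A", "A", "B", "B", "tie", "tie"]
judge = ["A", "A", "B", "tie", "tie", "B"]
kappa = cohens_kappa(gold, judge)  # observed 4/6, expected 1/3

# The OOD delta quoted above, recomputed from the table values.
ood_delta = round(0.695 - 0.643, 3)
```

A κ of 0 means agreement no better than chance, so the slightly negative tie-case κ for this checkpoint (−0.056) indicates roughly chance-level agreement on ties.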

Training summary

QLoRA SFT: 3,844 rows (1,938 base pairs × position-swap doubling), 3 epochs, 720 optimizer steps, r=16, α=32, dropout=0, all-linear LoRA targets, lr=2e-4 cosine, peak VRAM 23.4 GB on A100-40GB. Final train_loss 0.889, mean_token_accuracy 86.1%. Total Stage 6 spend: ~$4. Adapter merged to bf16 for Stage 8 eval and this release.
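As a rough sanity check on those numbers, the step count implies an effective batch size that the card does not state. The sketch below infers it from the reported rows, epochs, and optimizer steps; the result is an inference under the assumption of a constant batch size, and how the trainer handled the last partial batch is unknown.

```python
# Values reported in the training summary above.
rows = 3844
epochs = 3
optimizer_steps = 720

# Optimizer steps taken in each epoch.
steps_per_epoch = optimizer_steps // epochs

# Implied rows consumed per optimizer step (per-device batch size times
# gradient accumulation). Inferred, not stated in the card.
implied_effective_batch = rows / steps_per_epoch

print(f"{steps_per_epoch} steps/epoch, ~{implied_effective_batch:.2f} rows/step")
```

The implied value sits very close to 16, which would be consistent with, for instance, a small per-device batch combined with gradient accumulation, though the actual split is not given here.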

The DPO step was applied to a copy of this checkpoint (not gated by this checkpoint's existence), so the SFT artifact is the same one that fed into DPO: an unmodified snapshot of the pipeline at that stage.

License & citation

Same as the primary model card.

Model size: 8B params (Safetensors, tensor type BF16)

Model tree for krishnakartik/gemma4-social-bias-judge-sft

Base model: google/gemma-4-E4B
Finetuned from the base model: google/gemma-4-E4B-it
Finetuned from google/gemma-4-E4B-it: this model
Evaluation results

All values are self-reported on Gemma 4 Social Bias Judge Pairs (eval holdout).

Metric	Value
Cohen's κ (in-distribution, 240 pairs)	0.647
Cohen's κ (OOD religion, 60 pairs)	0.695
Cohen's κ (tracked-vs-alternate)	0.197
Cohen's κ (subtle-bias bucket)	0.743
Position-bias rate (in-distribution; lower is better)	0.084
Self-consistency rate (T=0.3)	0.832