Instructions to use 0xSero/Gemma-4-19B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use 0xSero/Gemma-4-19B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

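# The messages below contain an image, so use the multimodal chat task,
# which accepts the `text=` keyword used in the call at the end.
pipe = pipeline("image-text-to-text", model="0xSero/Gemma-4-19B")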
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("0xSero/Gemma-4-19B")
model = AutoModelForMultimodalLM.from_pretrained("0xSero/Gemma-4-19B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))


vLLM

How to use 0xSero/Gemma-4-19B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "0xSero/Gemma-4-19B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "0xSero/Gemma-4-19B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
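
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style `choices[0].message.content` field. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="0xSero/Gemma-4-19B",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the reply text (requires a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example (only works while the server is up):
# print(chat("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000"` should let the same helper talk to a server on a different port, such as the SGLang default shown further down.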

Use Docker

docker model run hf.co/0xSero/Gemma-4-19B

SGLang

How to use 0xSero/Gemma-4-19B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "0xSero/Gemma-4-19B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "0xSero/Gemma-4-19B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "0xSero/Gemma-4-19B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "0xSero/Gemma-4-19B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use 0xSero/Gemma-4-19B with Docker Model Runner:
```
docker model run hf.co/0xSero/Gemma-4-19B
```

Korean Multilingual is broken.

by DFveloper - opened 8 days ago

Discussion

DFveloper

8 days ago

The Korean output is quite broken.
It mixes five different languages.
Input understanding is fine, and the latent reasoning is also fine.
I'll provide a translation.

Original

User: 한국어로 말해라.
LLM:

서로를로（Sử-ro-lo）로，我是由 Google 提供的 AI 模型.

我可以被看作是一个智能助手（Intelligent Assistant），能够理解并处理文本、代码，并回答各种问题。

如果您希望我用韩语（Korean）进行对话，我会根据您的要求切换。

**현재モード (Current Mode):** 我正在使用中文回答，但如果您需要我用韩语对话，请告诉我。

(Nếuคุณ muốn我用韩语回答，请写 "请用韩语回答" 或 "请用韩语继续".)

Translation

User: Speak in Korean.
LLM:

As Seororeullo (Sử-ro-lo), I am an AI model provided by Google.

I can be regarded as an intelligent assistant that can understand and process text and code, and answer all kinds of questions.

If you would like me to converse in Korean, I will switch at your request.

**Current Mode:** I am currently answering in Chinese, but if you need me to converse in Korean, please tell me.

(If you want me to answer in Korean, please write "Please answer in Korean" or "Please continue in Korean", both given in Chinese.)

The sentences themselves make sense, but the language filter layers seem to be gone.
Also, the model's attitude has been driven to hell 🤣
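
For anyone who wants to measure the mixing rather than eyeball it, here is a rough sketch that counts characters per writing system using Unicode character names. The prefix table and the 0.8 threshold are assumptions chosen for this example, and the check identifies scripts only, not languages (Vietnamese and English both count as Latin).

```python
import unicodedata
from collections import Counter

# Map Unicode character-name prefixes to coarse script labels (illustrative).
SCRIPT_PREFIXES = {
    "HANGUL": "Hangul",
    "CJK": "Han",
    "HIRAGANA": "Kana",
    "KATAKANA": "Kana",
    "THAI": "Thai",
    "LATIN": "Latin",
}


def script_counts(text):
    """Count characters per script; characters matching no prefix are ignored."""
    counts = Counter()
    for ch in text:
        name = unicodedata.name(ch, "")
        for prefix, script in SCRIPT_PREFIXES.items():
            if name.startswith(prefix):
                counts[script] += 1
                break
    return counts


def is_mostly(text, script="Hangul", threshold=0.8):
    """True if at least `threshold` of the counted characters are in `script`."""
    counts = script_counts(text)
    total = sum(counts.values())
    return total > 0 and counts[script] / total >= threshold


sample = "**현재モード (Current Mode):** 我正在使用中文回答"
print(sorted(script_counts(sample)))  # ['Han', 'Hangul', 'Kana', 'Latin']
print(is_mostly(sample))              # False
```

Running `is_mostly` over a batch of replies to Korean prompts would give a simple pass rate to compare before and after any fix.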

DFveloper

8 days ago

PS: Don't worry, it could be fixed with SFT.

0xSero

Owner 7 days ago

REAP knowingly removes experts with lower saliency on a targeted set of samples. I chose to protect coding and agentic
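
To make the idea of saliency-based expert removal concrete, here is a toy sketch. It is not the actual REAP implementation: the score used here (mean of router gate weight times expert output norm over the tokens routed to an expert) is a simplified stand-in, and the expert names and numbers are made up. Real experts are not labeled by language or task.

```python
from collections import defaultdict


def expert_saliency(routing_records):
    """Average gate_weight * output_norm over the tokens routed to each expert."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for expert_id, gate_weight, output_norm in routing_records:
        totals[expert_id] += gate_weight * output_norm
        counts[expert_id] += 1
    return {e: totals[e] / counts[e] for e in totals}


def experts_to_keep(routing_records, all_experts, keep):
    """Keep the `keep` most salient experts; experts never routed to score 0."""
    saliency = expert_saliency(routing_records)
    ranked = sorted(all_experts, key=lambda e: saliency.get(e, 0.0), reverse=True)
    return set(ranked[:keep])


# Hypothetical calibration set dominated by code samples: the expert that
# mostly fires on Korean tokens ("ko") is rarely used and scores low.
records = [
    ("code_a", 0.7, 2.0), ("code_a", 0.6, 2.2),
    ("code_b", 0.5, 1.8), ("code_b", 0.6, 1.5),
    ("ko", 0.1, 0.4),
]
kept = experts_to_keep(records, ["code_a", "code_b", "ko", "unused"], keep=2)
print(sorted(kept))  # ['code_a', 'code_b']
```

In this toy setup the "ko" expert is dropped only because the calibration samples barely exercise it, which illustrates how the choice of samples decides which capabilities survive pruning.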

DFveloper

7 days ago

edited 7 days ago

I know, but was the whole dataset English?
If so, could it be normalized with SFT?
I didn't major in AI.
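
If someone does try SFT, the first step would be collecting Korean-only training pairs. Below is a small sketch of that data-preparation step under a few assumptions: the chat-style `messages` record layout is one common convention, the 0.9 Hangul threshold is arbitrary, and `hangul_ratio` and `to_sft_jsonl` are hypothetical helper names. Whether SFT on such data would actually restore Korean output in this model is untested.

```python
import json


def hangul_ratio(text):
    """Fraction of alphabetic characters that are Hangul syllables (U+AC00..U+D7A3)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum("\uac00" <= ch <= "\ud7a3" for ch in letters) / len(letters)


def to_sft_jsonl(pairs, min_ratio=0.9):
    """Keep (prompt, response) pairs whose response is mostly Hangul, as JSONL."""
    lines = []
    for prompt, response in pairs:
        if hangul_ratio(response) >= min_ratio:
            record = {"messages": [
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": response},
            ]}
            lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


pairs = [
    ("한국어로 말해라.", "네, 한국어로 말하겠습니다."),
    ("한국어로 말해라.", "我正在使用中文回答"),  # mixed-language reply, filtered out
]
print(to_sft_jsonl(pairs))
```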
