Instructions for using KotshinZ/gpt-oss-120b-rys-0_19-17_35 with libraries and local apps.

Libraries

How to use KotshinZ/gpt-oss-120b-rys-0_19-17_35 with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="KotshinZ/gpt-oss-120b-rys-0_19-17_35")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("KotshinZ/gpt-oss-120b-rys-0_19-17_35")
model = AutoModelForCausalLM.from_pretrained("KotshinZ/gpt-oss-120b-rys-0_19-17_35")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Render the chat template and move the tokenized inputs to the model's device
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

Local Apps

vLLM

How to use KotshinZ/gpt-oss-120b-rys-0_19-17_35 with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "KotshinZ/gpt-oss-120b-rys-0_19-17_35"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "KotshinZ/gpt-oss-120b-rys-0_19-17_35",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
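The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-compatible response shape. The helper names `build_chat_request` and `chat` are illustrative, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "KotshinZ/gpt-oss-120b-rys-0_19-17_35"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Requires a running server; raises URLError if nothing is listening.
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("What is the capital of France?")` sends the same payload as the curl command. For a server on another port, pass a different `base_url`, for example `base_url="http://localhost:30000"`.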

Use Docker

```
docker model run hf.co/KotshinZ/gpt-oss-120b-rys-0_19-17_35
```

SGLang

How to use KotshinZ/gpt-oss-120b-rys-0_19-17_35 with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "KotshinZ/gpt-oss-120b-rys-0_19-17_35" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "KotshinZ/gpt-oss-120b-rys-0_19-17_35",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "KotshinZ/gpt-oss-120b-rys-0_19-17_35" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "KotshinZ/gpt-oss-120b-rys-0_19-17_35",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Docker Model Runner
How to use KotshinZ/gpt-oss-120b-rys-0_19-17_35 with Docker Model Runner:
```
docker model run hf.co/KotshinZ/gpt-oss-120b-rys-0_19-17_35
```

gpt-oss-120b RYS `0..19,17..35`

This repository is a layer-routed RYS variant of openai/gpt-oss-120b.

Base model revision: b5c939de8f754692c1647ca79fbf85e8c1e70f8a
Requested path: 0..19,17..35
Resolved path: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35
Original layers: 36
Output layers: 39
Repeated source layers: 17,18,19
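The relationship between the requested path and the resolved path can be checked with a few lines of Python. This is a minimal sketch that assumes `a..b` denotes an inclusive range of source layer indices, which is consistent with the resolved path listed above; the `expand_path` helper is illustrative and not taken from the tool that built this repo.

```python
from collections import Counter


def expand_path(spec: str) -> list[int]:
    """Expand a spec such as "0..19,17..35" into a flat list of layer indices."""
    layers: list[int] = []
    for part in spec.split(","):
        if ".." in part:
            start, end = part.split("..")
            # Assumption: both endpoints are inclusive
            layers.extend(range(int(start), int(end) + 1))
        else:
            layers.append(int(part))
    return layers


path = expand_path("0..19,17..35")
# Source layers that appear more than once in the execution path
repeated = sorted(idx for idx, count in Counter(path).items() if count > 1)

print(len(path))   # 39
print(repeated)    # [17, 18, 19]
```

The first segment contributes 20 layers (0 through 19) and the second contributes 19 layers (17 through 35), giving the 39 output layers, with 17, 18, and 19 executed twice.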

No additional quantization was applied while building this repo. The tensor bytes are copied directly from the source checkpoint and only re-indexed into a new layer execution path.

The model config has been updated so model.layers follows the path above. Tokenizer and chat template files are copied from the base repository unchanged.
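To illustrate what re-indexing tensors into a new layer path can look like, here is a hedged sketch that maps each output tensor name to the source tensor it would be copied from. It assumes checkpoint keys follow the `model.layers.<index>.<suffix>` pattern; the `reindex_keys` helper and the toy key names are hypothetical, and the actual build script for this repo is not shown here.

```python
import re

# Assumed key pattern for per-layer tensors
LAYER_KEY = re.compile(r"^model\.layers\.(\d+)\.(.+)$")


def reindex_keys(source_keys, path):
    """Return {output_key: source_key} for a given layer execution path."""
    suffixes_by_layer = {}
    mapping = {}
    for key in source_keys:
        match = LAYER_KEY.match(key)
        if match:
            suffixes_by_layer.setdefault(int(match.group(1)), []).append(match.group(2))
        else:
            # Non-layer tensors (embeddings, final norm, ...) keep their names
            mapping[key] = key
    for new_idx, src_idx in enumerate(path):
        for suffix in suffixes_by_layer[src_idx]:
            mapping[f"model.layers.{new_idx}.{suffix}"] = f"model.layers.{src_idx}.{suffix}"
    return mapping


# Toy checkpoint: one tensor per layer plus an embedding (names are illustrative)
toy_keys = ["model.embed_tokens.weight"] + [
    f"model.layers.{i}.mlp.weight" for i in range(36)
]
layer_path = list(range(0, 20)) + list(range(17, 36))  # 0..19,17..35
mapping = reindex_keys(toy_keys, layer_path)

print(mapping["model.layers.20.mlp.weight"])  # model.layers.17.mlp.weight
print(mapping["model.layers.38.mlp.weight"])  # model.layers.35.mlp.weight
```

Output layers 0 through 19 map to themselves, output layer 20 points back at source layer 17, and every later output layer is offset by 3 from its source index. Only the names change; the tensor values are copied as-is.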


Safetensors
Model size: 130B params
Tensor type: BF16
