
Llamacpp Quantizations of gpt-oss-120b

Original model: F16 GGUF adopted from unsloth/gpt-oss-120b-GGUF.

The MXFP4_MOE quant was made with the update from llama.cpp PR #15091.

MXFP4_MOE : 59.02 GiB (4.34 BPW)
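
As a sanity check on the figures above, the file size and the bits-per-weight (BPW) value together imply a parameter count. The sketch below assumes that BPW is simply total file bits divided by the number of weights, ignoring GGUF metadata overhead; the function name `estimated_params` is illustrative and not part of any library.

```python
GIB = 2**30  # bytes per GiB


def estimated_params(size_gib: float, bpw: float) -> float:
    """Estimate the weight count from file size (GiB) and bits per weight.

    Assumes the whole file is weight data, so the result is approximate.
    """
    total_bits = size_gib * GIB * 8
    return total_bits / bpw


# Figures quoted above for the MXFP4_MOE quant.
params = estimated_params(59.02, 4.34)
print(f"~{params / 1e9:.1f}B parameters")
```

Under these assumptions the estimate lands close to the 117B parameters reported for gpt-oss-120b, which suggests the quoted size and BPW are mutually consistent.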


Download (Example)

# !pip install huggingface_hub hf_transfer
import os

# Enable the hf_transfer backend; set this before importing huggingface_hub.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import snapshot_download

# Download only the MXFP4_MOE files from the repository.
snapshot_download(
    repo_id="bobchenyx/gpt-oss-120b-GGUF",
    local_dir="bobchenyx/gpt-oss-120b-GGUF",
    allow_patterns=["*MXFP4_MOE*"],
)
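
After the download finishes, it can help to confirm which GGUF files actually landed in `local_dir`. The following is a minimal sketch, assuming the quant may be split into shards that follow the `-00001-of-0000N.gguf` naming used by llama.cpp's split tool; the helper names `find_gguf_files` and `first_shard` are hypothetical and not part of `huggingface_hub` or llama.cpp.

```python
import re
from pathlib import Path

# Assumed shard suffix convention: <name>-00001-of-00003.gguf
SHARD_RE = re.compile(r"-(\d{5})-of-(\d{5})\.gguf$")


def find_gguf_files(local_dir, pattern="*MXFP4_MOE*"):
    """Return all .gguf files under local_dir whose names match pattern, sorted."""
    root = Path(local_dir)
    return sorted(
        p for p in root.rglob(pattern) if p.is_file() and p.suffix == ".gguf"
    )


def first_shard(files):
    """Pick the first shard of a split model, or the only file if it is unsplit."""
    for p in files:
        m = SHARD_RE.search(p.name)
        if m and int(m.group(1)) == 1:
            return p
    return files[0] if files else None


if __name__ == "__main__":
    files = find_gguf_files("bobchenyx/gpt-oss-120b-GGUF")
    for f in files:
        print(f, f"{f.stat().st_size / 2**30:.2f} GiB")
    print("Load this file:", first_shard(files))
```

For a split model, llama.cpp is typically pointed at the first shard and picks up the remaining ones from the same directory, which is why the sketch singles that file out.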
Format: GGUF
Model size: 117B params
Architecture: gpt-oss
Quantization: 4-bit