Dataset: nvidia/Nemotron-ClimbMix
This repo is a vLLM compatibility conversion of atrost/climbmix-baseline-small-asymmetric-94m-1p2b-h100.
The source checkpoint is a compact asymmetric draft model: its entry layer uses the large 1280-wide embedding space, while later layers and the LM head operate in the 160-wide nested space. vLLM's generic Transformers causal wrapper assumes a single hidden size for the base model and LM head, so this repo expands the checkpoint into a full-width StairFormer-shaped model with masked/zeroed suffix weights.
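The idea of zeroed suffix weights can be illustrated with a toy example. The following is a minimal sketch, not the conversion script used for this repo: the widths 2 and 4 stand in for 160 and 1280, and the helper names `pad_vector`, `pad_matrix`, and `matvec` are hypothetical. It only covers a single bias-free linear map; normalization layers and other width-dependent operations would need their own masking, which this sketch does not attempt.

```python
# Toy illustration (assumed, not the actual conversion code): embedding a
# narrow linear layer into a wider one with zeroed suffix weights.

def matvec(weight, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]

def pad_vector(vec, width):
    """Append zeros so the vector has the requested width."""
    return list(vec) + [0.0] * (width - len(vec))

def pad_matrix(weight, width):
    """Place a small square matrix in the top-left of a width x width zero matrix."""
    rows = [pad_vector(row, width) for row in weight]
    rows += [[0.0] * width for _ in range(width - len(weight))]
    return rows

NARROW, WIDE = 2, 4  # toy stand-ins for the 160-wide and 1280-wide spaces

narrow_weight = [[0.5, -1.0], [2.0, 0.25]]
narrow_hidden = [3.0, 4.0]

wide_weight = pad_matrix(narrow_weight, WIDE)
wide_hidden = pad_vector(narrow_hidden, WIDE)

narrow_out = matvec(narrow_weight, narrow_hidden)
wide_out = matvec(wide_weight, wide_hidden)

# The first NARROW entries match the narrow layer; the suffix stays zero.
print(narrow_out)
print(wide_out)
```

Because the suffix rows and columns are zero, the wide layer does the same useful work as the narrow one while carrying extra parameters, which is consistent with the converted model being larger than the 94M source.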
The converted model preserves the source logits up to floating-point differences, but it is a compatibility artifact rather than an optimized 94M draft runtime.
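One way to check "preserved up to floating-point differences" is to compare the two models' logits for the same input and look at the largest absolute gap. The snippet below is a hedged sketch of that check on plain Python lists: the function name `max_abs_diff`, the tolerance, and the logit values are all made up for illustration and are not measurements from either checkpoint.

```python
# Illustrative check (assumed workflow): largest absolute difference
# between two logit vectors produced for the same input.

def max_abs_diff(a, b):
    """Return the largest element-wise absolute difference of two sequences."""
    if len(a) != len(b):
        raise ValueError("logit vectors must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))

# Toy values only; real logits would come from the source and converted models.
source_logits = [1.25, -0.5, 3.0]
converted_logits = [1.25, -0.5, 3.000002]

TOLERANCE = 1e-5  # hypothetical threshold, pick one suited to the dtype in use

diff = max_abs_diff(source_logits, converted_logits)
matches = diff <= TOLERANCE
print(diff, matches)
```

In practice the comparison would run over full logit tensors for several prompts, and the acceptable tolerance depends on whether the models run in float32 or a lower-precision dtype.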
Source checkpoint: atrost/climbmix-baseline-small-asymmetric-94m-1p2b-h100
Source revision: 0c84f2c2476ac7f45ca7796d5793490efd013135
Logit difference (as reported): 2.861e-06

from transformers import AutoModelForCausalLM, AutoTokenizer
repo_id = "atrost/climbmix-baseline-small-asymmetric-94m-1p2b-h100-vllm"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype="auto",
)

# Load the same converted checkpoint in vLLM through its Transformers backend.
from vllm import LLM

llm = LLM(
    model="atrost/climbmix-baseline-small-asymmetric-94m-1p2b-h100-vllm",
    trust_remote_code=True,
    model_impl="transformers",
)