atrost/climbmix-baseline-small-asymmetric-94m-1p2b-h100-vllm

vLLM compatibility conversion of atrost/climbmix-baseline-small-asymmetric-94m-1p2b-h100.

The source checkpoint is a compact asymmetric draft model: its entry layer uses the large 1280-wide embedding space, while the later layers and the LM head operate in the 160-wide nested space. vLLM's generic Transformers causal-LM wrapper assumes a single hidden size for the base model and the LM head, so this repo expands the checkpoint into a full-width, StairFormer-shaped model whose suffix weights are masked or zeroed.
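
To illustrate why zeroed suffix weights can leave the outputs unchanged, here is a minimal NumPy sketch for a single linear layer. It is not this repo's conversion script: the names `w_narrow`, `w_full`, `x_narrow`, and `x_full` are hypothetical, the weights are random, and a real conversion also has to handle attention, normalization, and the LM head.

```python
import numpy as np

NARROW, FULL = 160, 1280  # the two widths named in the paragraph above

rng = np.random.default_rng(0)
w_narrow = rng.standard_normal((NARROW, NARROW)).astype(np.float32)
x_narrow = rng.standard_normal(NARROW).astype(np.float32)

# Place the narrow weight in the top-left corner of a full-width matrix.
# Every row and column past index NARROW (the "suffix") stays zero.
w_full = np.zeros((FULL, FULL), dtype=np.float32)
w_full[:NARROW, :NARROW] = w_narrow

# Fill the input suffix with arbitrary values to show that it is ignored.
x_full = rng.standard_normal(FULL).astype(np.float32)
x_full[:NARROW] = x_narrow

y_narrow = w_narrow @ x_narrow
y_full = w_full @ x_full

# The prefix of the wide output matches the narrow output up to rounding,
# and the suffix of the wide output is zero.
prefix_matches = np.allclose(y_full[:NARROW], y_narrow, atol=1e-3)
suffix_is_zero = not y_full[NARROW:].any()
max_abs_diff = float(np.abs(y_full[:NARROW] - y_narrow).max())
print(prefix_matches, suffix_is_zero, max_abs_diff)
```

The zero columns discard whatever sits in the input suffix, and the zero rows keep the output suffix empty, so a stack of such layers stays inside the 160-wide subspace while presenting a single 1280-wide hidden size to the wrapper.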

The converted model preserves the source logits up to floating-point differences, but it is a compatibility artifact rather than an optimized 94M draft runtime.

  • Source checkpoint: atrost/climbmix-baseline-small-asymmetric-94m-1p2b-h100
  • Source revision: 0c84f2c2476ac7f45ca7796d5793490efd013135
  • Local parity max logits diff: 2.861e-06
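
A parity figure like the one above could be computed as follows. This is a sketch under the assumption that "max logits diff" means the largest absolute element-wise difference between the two models' logits on the same input; the helper name `max_logits_diff` and the toy arrays are hypothetical, and the script that produced the reported number is not shown here.

```python
import numpy as np


def max_logits_diff(source_logits, converted_logits):
    """Return the largest absolute element-wise difference between two logit arrays."""
    source = np.asarray(source_logits, dtype=np.float64)
    converted = np.asarray(converted_logits, dtype=np.float64)
    if source.shape != converted.shape:
        raise ValueError(
            f"shape mismatch: {source.shape} vs {converted.shape}"
        )
    return float(np.abs(source - converted).max())


# Toy stand-ins for the logits of the source and converted checkpoints.
toy_source = np.array([[0.0, 1.0], [2.0, 3.0]], dtype=np.float32)
toy_converted = np.array([[0.0, 1.5], [2.0, 2.75]], dtype=np.float32)
print(max_logits_diff(toy_source, toy_converted))
```

In practice the two arguments would be the logits produced by the source and converted checkpoints for identical token IDs.
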

Load with Transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "atrost/climbmix-baseline-small-asymmetric-94m-1p2b-h100-vllm"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype="auto",
)

Load with vLLM:

from vllm import LLM

llm = LLM(
    model="atrost/climbmix-baseline-small-asymmetric-94m-1p2b-h100-vllm",
    trust_remote_code=True,
    model_impl="transformers",
)
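
Once the engine is constructed, text can be generated through `LLM.generate`. The sketch below is illustrative only: the helper `generate_texts`, the `main` function, the prompt, and the sampling values are assumptions and not part of this repo.

```python
def generate_texts(llm, prompts, sampling_params=None):
    """Run a batch of prompts and return the first completion for each one."""
    # LLM.generate returns one RequestOutput per prompt, in prompt order.
    outputs = llm.generate(prompts, sampling_params)
    return [request_output.outputs[0].text for request_output in outputs]


def main():
    # Imported here so the helper above can be imported without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="atrost/climbmix-baseline-small-asymmetric-94m-1p2b-h100-vllm",
        trust_remote_code=True,
        model_impl="transformers",
    )
    # Greedy decoding with a short budget; both values are illustrative.
    params = SamplingParams(temperature=0.0, max_tokens=32)
    for text in generate_texts(llm, ["The capital of France is"], params):
        print(text)
```

Calling `main()` downloads the checkpoint and needs an environment where vLLM can run; `generate_texts` itself only relies on the `generate` method and the shape of its return value.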