atrost/nanochat-d24-matformer-ffnlog-fullattn-sft-s
This is a standalone nanochat-native checkpoint extracted from
atrost/nanochat-d24-matformer-ffnlog-fullattn-sft/d24_matformer_ffnlog_fullattn_sft at step 000485.
- Family: matformer
- Submodel: S
- Config: {"attn_alpha_init_value": 1.0, "dec_alpha_init_value": 1.0, "ffn_alpha_init_value": 1.0, "head_dim": 80, "matformer_ffn_hidden_dims": [640], "matformer_head_counts": [16], "n_embd": 1280, "n_head": 16, "n_kv_head": 16, "n_layer": 24, "norm_type": "rmsnorm", "sequence_len": 2048, "vocab_size": 32768, "window_pattern": "L"}
Load it into the local nanochat checkpoint layout before evaluation:
from nanochat.hf_utils import stage_hf_checkpoint

stage_hf_checkpoint(
    repo_id="atrost/nanochat-d24-matformer-ffnlog-fullattn-sft-s",
    path_in_repo="d24_matformer_ffnlog_fullattn_sft_s",
    model_tag="d24_matformer_ffnlog_fullattn_sft_s",
    step=485,
    base_dir="/tmp/nanochat-eval",
)
Then run eval with:
NANOCHAT_BASE_DIR=/tmp/nanochat-eval python -m scripts.base_eval --model-tag d24_matformer_ffnlog_fullattn_sft_s --step 485
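If you would rather launch the eval from Python than from a shell, the same invocation can be assembled as an environment plus an argument list. This is a minimal sketch: `build_eval_command` is a hypothetical helper and not part of nanochat, and it only reuses the environment variable, module name, and flags shown in the command above.

```python
import os
import sys


def build_eval_command(base_dir, model_tag, step):
    """Hypothetical helper: mirror the shell command as (env, argv)."""
    # Copy the current environment and point nanochat at the staged checkpoint.
    env = dict(os.environ, NANOCHAT_BASE_DIR=base_dir)
    # Same module and flags as the shell command on this card.
    argv = [
        sys.executable, "-m", "scripts.base_eval",
        "--model-tag", model_tag,
        "--step", str(step),
    ]
    return env, argv


env, argv = build_eval_command(
    base_dir="/tmp/nanochat-eval",
    model_tag="d24_matformer_ffnlog_fullattn_sft_s",
    step=485,
)
print(" ".join(argv[1:]))
```

Passing the result to `subprocess.run(argv, env=env, check=True)` from the root of a nanochat checkout should behave like the shell command, assuming `scripts.base_eval` is importable from that directory.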