GPTMoE (custom, Japanese)

A pretrained language model exported from a custom GPTMoE implementation (PyTorch). It cannot be loaded with the stock Transformers classes, so load the weights with the same GPTMoE implementation.

Files

model.safetensors
config.json
tokenizer/ja_unigram32k_v15m.model
tokenizer/ja_unigram32k_v15m.vocab
tokenizer/tokenizer_config.json

Usage (minimal)

import json
from safetensors.torch import load_file as load_safetensors
# from your_code.gptmoe import GPTMoE  # <- import your own implementation (required: GPTMoE is not defined otherwise)

# Read the hyperparameters the model was exported with.
with open("config.json", "r", encoding="utf-8") as f:
    cfg = json.load(f)
moe = cfg["moe"]
# Rebuild the model with the same architecture, then switch to eval mode.
model = GPTMoE(
    cfg["vocab_size"], cfg["d_model"], cfg["n_heads"], cfg["n_layers"], cfg["ffn_mult"],
    dict(
        num_experts=moe["num_experts"], k=moe["top_k"],
        capacity_factor=moe["capacity_factor"], eval_capacity_factor=moe["eval_capacity_factor"],
        min_capacity=0, noisy_gate_policy=moe["noisy_gate_policy"], use_residual=moe["use_residual"],
    ),
).eval()
# Load the exported weights; strict=True raises on any missing or unexpected key.
state = load_safetensors("model.safetensors")
model.load_state_dict(state, strict=True)
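
Because the model is rebuilt by hand from config.json, a missing key only surfaces as a bare KeyError in the middle of the constructor call. The following is a minimal sketch of checking the keys first. The helper name `build_model_args` is hypothetical and not part of the original implementation; the key names and the fixed `min_capacity=0` are taken from the snippet above, and the argument order of `GPTMoE` is assumed to match it.

```python
# Keys that the minimal usage snippet reads from config.json.
TOP_LEVEL_KEYS = ("vocab_size", "d_model", "n_heads", "n_layers", "ffn_mult")
MOE_KEYS = ("num_experts", "top_k", "capacity_factor",
            "eval_capacity_factor", "noisy_gate_policy", "use_residual")


def build_model_args(cfg):
    """Return (positional_args, moe_kwargs) for GPTMoE; raise KeyError if keys are missing.

    Hypothetical helper: it only mirrors the constructor call shown above.
    """
    missing = [k for k in TOP_LEVEL_KEYS + ("moe",) if k not in cfg]
    if missing:
        raise KeyError(f"config.json is missing keys: {missing}")
    moe = cfg["moe"]
    missing = [k for k in MOE_KEYS if k not in moe]
    if missing:
        raise KeyError(f"config.json['moe'] is missing keys: {missing}")

    positional = tuple(cfg[k] for k in TOP_LEVEL_KEYS)
    moe_kwargs = dict(
        num_experts=moe["num_experts"],
        k=moe["top_k"],  # config.json calls it top_k, the constructor dict calls it k
        capacity_factor=moe["capacity_factor"],
        eval_capacity_factor=moe["eval_capacity_factor"],
        min_capacity=0,  # fixed value, as in the usage snippet
        noisy_gate_policy=moe["noisy_gate_policy"],
        use_residual=moe["use_residual"],
    )
    return positional, moe_kwargs
```

With this helper, the constructor call would become `args, moe_kwargs = build_model_args(cfg)` followed by `model = GPTMoE(*args, moe_kwargs).eval()`.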

Tokenizer

import sentencepiece as spm
sp = spm.SentencePieceProcessor(model_file="tokenizer/ja_unigram32k_v15m.model")
ids = sp.encode("こんにちは", out_type=int)

Notes

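This is a plain language model (pretraining only). Instruction following is weak, so few-shot prompting and tuning of the generation parameters (temperature / top-p / top-k / repetition penalty) are recommended.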
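
The custom implementation's generation API is not documented here, so the following is only a sketch of how the four parameters named above are commonly combined when sampling one token. The function `sample_next_token` and its default values are illustrative assumptions, not the author's settings. It assumes you already have the logits for the last position as a plain list of floats (one per vocabulary entry) and that `temperature` is greater than zero.

```python
import math
import random


def sample_next_token(logits, generated_ids, temperature=0.8, top_k=50,
                      top_p=0.95, repetition_penalty=1.1, rng=random):
    """Sample one token id from raw logits (hypothetical helper, pure Python)."""
    logits = list(logits)

    # Repetition penalty: push down the score of every token already generated.
    for tid in set(generated_ids):
        if logits[tid] > 0:
            logits[tid] /= repetition_penalty
        else:
            logits[tid] *= repetition_penalty

    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring token ids (0 disables the filter).
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]

    # Softmax over the surviving ids (subtract the max for numerical stability).
    peak = scaled[order[0]]
    weights = [math.exp(scaled[i] - peak) for i in order]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = 0, 0.0
    for p in probs:
        kept += 1
        cumulative += p
        if cumulative >= top_p:
            break

    # random.choices renormalizes the remaining weights itself.
    return rng.choices(order[:kept], weights=probs[:kept], k=1)[0]
```

In a generation loop you would call this once per step, append the returned id to `generated_ids`, and decode the ids with the SentencePiece processor at the end. Lowering `temperature` or `top_p` tends to make a base model's output more conservative, while a larger `repetition_penalty` reduces loops.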

Safetensors

Model size: 0.4B params
Tensor type: BF16
Repository: iori-ltn/jp-gptmoe