LickyArc/tinystories-nanogpt

Text Generation


nanoGPT — TinyStories

A ~30M-parameter GPT trained from scratch on the TinyStories dataset.

Tokenizer: GPT-2 BPE (tiktoken)
Architecture: 6 layers, 6 heads, 384 embedding dim, context 256
Best validation loss: 1.7052
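As a rough sanity check on the ~30M figure, the parameter count can be estimated from the architecture listed above. This is a sketch under assumptions that the card does not state: a GPT-2-style block layout, tied input and output embeddings, bias terms enabled, and a vocabulary of 50257 tokens (the size of the GPT-2 BPE vocabulary). The function name `gpt_param_count` is made up for illustration. The number of heads does not affect the count, since the heads split the same 384-dimensional projection.

```python
def gpt_param_count(n_layer, n_embd, block_size, vocab_size, bias=True):
    """Approximate parameter count for a GPT-2-style decoder with tied embeddings."""
    d = n_embd
    b = 1 if bias else 0
    token_emb = vocab_size * d   # assumed shared with the output head (weight tying)
    pos_emb = block_size * d     # learned position embeddings
    layer_norm = d + b * d       # scale plus optional bias
    # Fused QKV projection followed by the attention output projection.
    attention = (d * 3 * d + b * 3 * d) + (d * d + b * d)
    # MLP expands to 4*d and projects back to d.
    mlp = (d * 4 * d + b * 4 * d) + (4 * d * d + b * d)
    block = 2 * layer_norm + attention + mlp
    # Final layer norm is applied once after the last block.
    return token_emb + pos_emb + n_layer * block + layer_norm


total = gpt_param_count(n_layer=6, n_embd=384, block_size=256, vocab_size=50257)
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the estimate lands at about 30M, most of it in the token embedding table. A padded vocabulary (nanoGPT often uses 50304) or disabled biases would shift the total only slightly.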

Load

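import json

import tiktoken
import torch

# Load the model hyperparameters stored next to the checkpoint.
with open('model_config.json') as f:
    config = json.load(f)

# map_location='cpu' lets the checkpoint load on machines without a GPU.
ckpt = torch.load('ckpt.pt', map_location='cpu')
# Re-instantiate GPT with the same config, then load the state dict.
enc = tiktoken.get_encoding('gpt2')  # GPT-2 BPE tokenizer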
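The re-instantiation step is left as a comment above. One detail that often matters when loading nanoGPT checkpoints: if the model was wrapped with `torch.compile` during training, the state-dict keys carry an `_orig_mod.` prefix that has to be removed before `load_state_dict` will accept them. The helper below is a minimal sketch of that cleanup. Whether this checkpoint has the prefix, whether its weights sit under a `'model'` key, and whether the keys in `model_config.json` match nanoGPT's `GPTConfig` fields are all assumptions, and `strip_compile_prefix` is a hypothetical name.

```python
def strip_compile_prefix(state_dict, prefix="_orig_mod."):
    """Return a copy of state_dict with a leading torch.compile prefix removed from keys."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Hypothetical usage, assuming nanoGPT's model.py is importable and the
# checkpoint follows nanoGPT's usual layout (weights under ckpt['model']):
#
#   from model import GPT, GPTConfig
#   model = GPT(GPTConfig(**config))
#   model.load_state_dict(strip_compile_prefix(ckpt['model']))
#   model.eval()
```

The helper only renames keys and leaves the tensors untouched, so it is safe to apply even when no prefix is present.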
