This is a 124M-parameter language model (GPT-2 Small architecture) pre-trained from scratch on the FineWeb-Edu dataset (HuggingFaceFW/fineweb-edu).
It is the base model for LiteGPT-Instruct.
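The 124M figure can be sanity-checked with a little arithmetic. The sketch below assumes the standard GPT-2 Small hyperparameters (50,257-token vocabulary, 1,024 positions, 768-dimensional embeddings, 12 layers) and an output head tied to the token embeddings; this card does not list the exact configuration, so treat those values as assumptions. The function name `gpt2_param_count` is made up for illustration.

```python
def gpt2_param_count(vocab_size=50257, n_ctx=1024, n_embd=768, n_layer=12):
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings.

    Defaults are the standard GPT-2 Small values (assumed, not taken from this card).
    """
    # Token embedding table plus learned position embedding table.
    embeddings = vocab_size * n_embd + n_ctx * n_embd
    # Each layer norm has a scale and a bias vector.
    layer_norm = 2 * n_embd
    # Attention: fused QKV projection, then output projection (weights + biases).
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4x width, then project back (weights + biases).
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Two layer norms per block, plus one final layer norm after the last block.
    per_layer = 2 * layer_norm + attention + mlp
    return embeddings + n_layer * per_layer + layer_norm


total = gpt2_param_count()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
# prints "124,439,808 parameters (~124M)"
```

Under these assumptions, roughly 39M of the parameters sit in the embedding tables and about 85M in the 12 transformer blocks.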
This is a completion model. It predicts the next tokens based on the input text. It is NOT an instruction-following model (chatbot).
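from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the pre-trained weights; the model uses the standard GPT-2 tokenizer.
model = GPT2LMHeadModel.from_pretrained("koganrath/LiteGPT-Base")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Encode a prompt and let the model continue it.
text = "Once upon a time in a digital world,"
inputs = tokenizer(text, return_tensors="pt")
# GPT-2 defines no pad token, so reuse EOS to avoid a warning during generation.
outputs = model.generate(**inputs, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))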
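Because this is a completion model, a request usually works better when written as text to be continued rather than as an instruction. The sketch below shows one common way to do that: a few worked question/answer pairs followed by an open "Answer:" for the model to fill in. The helper `build_completion_prompt` and the "Question:/Answer:" layout are illustrative assumptions, not something this model was specifically trained on, and output quality from a 124M model is not guaranteed.

```python
def build_completion_prompt(examples, query):
    """Format Q/A pairs plus an open question as one text for the model to continue."""
    # Each solved example shows the pattern the model should imitate.
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    # The final block ends at "Answer:" so the next tokens are the answer itself.
    blocks.append(f"Question: {query}\nAnswer:")
    return "\n\n".join(blocks)


examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]
prompt = build_completion_prompt(examples, "What is the capital of Italy?")
print(prompt)
```

The resulting string can be passed as `text` in the usage snippet above, with a small `max_new_tokens` so generation stops soon after the answer.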
Trained by koganrath as part of the LiteGPT Project.