Brought to you by Romarchive
ToyLlama 50M
Third version of the ToyLlama model. See more ToyLlamas on my profile.
(P.S. This one was intended to be 30M v2, but because the amount of training data grew, I chose to use the pretrained openai-community/gpt2 tokenizer.)
Example generations
All examples were generated with the test.py interactive CLI (usage: python3 test.py).
>>> Enter prompt: Once upon a time, the cow walked to the park
------------------------------------------------------------
Once upon a time, the cow walked to the park in the west of the town of Pippa. It was the first of the town's most popular and popular sports. The town was renamed a "Woo" in January 2019.
The town's name was changed to "Doves of the Year" in May 2021.
References
External links
1952 births
Living people
American male football players
American football pitchers
21st-century American football players
21st-century American football players
People from Bambisha, New York
Canadian men's footballers
New York City footballers
Basketball players from New York (state)
Players of American football from New York (state)
People from Santa Cruz, New York (state)
Australian
------------------------------------------------------------
>>> Enter prompt: Abraham Lincoln
------------------------------------------------------------
Abraham Lincoln, the first of the first students of the National Academy of Sciences, the first of the first students of the National Academy of Sciences.
The university was founded in 2005 by the National Academy of Sciences in 2006.
The first student school of the university was named for the first student school in the University of Chicago in 2006. The college is named after the college.
The college is located in the north-central corner of the campus. The school is located in the south-west corner of the campus.
The school is located in the centre of the city, and is now part of the University of Minnesota.
See also
List of the School of Arts and Sciences
References
External links
University of Wisconsin at University
------------------------------------------------------------
(As you can see, its output became slightly Wikipedia-like.)
Training information
Trained for 26 hours on one RX 6600 (8GB VRAM).
| Parameter | Value |
|---|---|
| Loss | 2.5885 |
| Epochs | 1 |
| grad_norm | 0.5695 |
| Learning rate | 5e-4 |
| Batch size | 8 |
| Gradient accumulation steps | 4 |
| Training tokens | 13763555964 (~13.8B) |
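With a batch size of 8 and 4 gradient accumulation steps, each optimizer step sees 8 × 4 = 32 sequences. The sketch below illustrates why the two are equivalent, using a hypothetical one-parameter linear model (not the actual training script): averaging the gradients of 4 micro-batches of 8 gives the same gradient as one batch of 32. Only `BATCH_SIZE` and `ACCUM_STEPS` come from the table; the model, the data, and the function name `mse_grad` are made up for illustration.

```python
import random

BATCH_SIZE = 8   # micro-batch size, from the training table
ACCUM_STEPS = 4  # gradient accumulation steps, from the training table


def mse_grad(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w (toy model, illustration only)."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


random.seed(0)
effective_batch = BATCH_SIZE * ACCUM_STEPS  # sequences per optimizer step
xs = [random.uniform(-1, 1) for _ in range(effective_batch)]
ys = [3.0 * x + random.gauss(0, 0.1) for x in xs]
w = 0.0

# Accumulate scaled micro-batch gradients, as a training loop would do
# before calling the optimizer once.
accumulated = 0.0
for step in range(ACCUM_STEPS):
    start, end = step * BATCH_SIZE, (step + 1) * BATCH_SIZE
    accumulated += mse_grad(w, xs[start:end], ys[start:end]) / ACCUM_STEPS

# Gradient computed on all 32 samples at once, for comparison.
full_batch = mse_grad(w, xs, ys)

print(f"effective batch size: {effective_batch}")
print(f"accumulated gradient: {accumulated:.6f}")
print(f"full-batch gradient:  {full_batch:.6f}")
```

The two gradients agree up to floating-point error, which is what lets a small batch fit in 8GB of VRAM while still behaving like a larger one.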
Training data
The training data came from Hugging Face datasets plus several other sites (2262401 files total):
$ du -sh ./data/*
12K ./data/cows.info.gf
472K ./data/github.com
5,8M ./data/gutenberg.org
88K ./data/habr.com
9,0G ./data/huggingface.co
474M ./data/textfiles.com
12K ./data/www.reddit.com
$ du -sh ./data/
9,5G ./data/
$ du -sh ./data/huggingface.co/*/*
34M ./data/huggingface.co/cornell-movie-review-data/rotten_tomatoes
21M ./data/huggingface.co/EleutherAI/lambada_openai
5,8M ./data/huggingface.co/gaianet/bible_bot
92K ./data/huggingface.co/gaianet/paris
344K ./data/huggingface.co/gaianet/trumpVSharris
16K ./data/huggingface.co/krinal/fifa_2022
680M ./data/huggingface.co/prayslaks/wikimedia_wikipedia_100K
8,2G ./data/huggingface.co/roneneldan/TinyStories
68M ./data/huggingface.co/stas/openwebtext-10k
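To get a rough feel for how the data is distributed, the sizes above can be summed in a few lines. This is only a sketch: the helper name `parse_du_size` is made up, the sizes are copied by hand from the listing, and `du -h` rounds its output, so the result is approximate. Note the comma used as the decimal separator in the listing.

```python
# Binary units as used by `du -h`.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_du_size(text):
    """Convert a `du -h` size such as '8,2G' (comma decimal) to bytes."""
    number, unit = text[:-1], text[-1]
    return float(number.replace(",", ".")) * UNITS[unit]


# Top-level sizes copied from the first `du -sh ./data/*` listing.
sources = {
    "cows.info.gf": "12K",
    "github.com": "472K",
    "gutenberg.org": "5,8M",
    "habr.com": "88K",
    "huggingface.co": "9,0G",
    "textfiles.com": "474M",
    "www.reddit.com": "12K",
}

total_bytes = sum(parse_du_size(size) for size in sources.values())
total_gib = total_bytes / UNITS["G"]

# roneneldan/TinyStories is listed at 8,2G in the per-dataset breakdown.
tinystories_share = parse_du_size("8,2G") / total_bytes

print(f"total: {total_gib:.2f} GiB")
print(f"TinyStories share: {tinystories_share:.1%}")
```

The total lands just under the 9,5G that `du` reports for the whole directory, and TinyStories accounts for well over four fifths of the data by size.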
Advanced model information
Detailed information about this exact model.
| Parameter | Value |
|---|---|
| Hidden size | 512 |
| Intermediate size | 1536 |
| Hidden layers | 8 |
| Attention heads | 8 |
| Key value heads | 4 |
| Total parameters | 50906112 |
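The total can be cross-checked from the other rows. The sketch below assumes a standard Llama-style layout (bias-free q/k/v/o projections, a gate/up/down MLP, two norms per layer plus a final norm), a vocabulary of 50257 tokens (the size of the GPT-2 tokenizer), and input and output embeddings that share one weight matrix. None of these three assumptions is stated in the table, but under them the sum comes out to exactly the listed value.

```python
# Values from the table above.
hidden_size = 512
intermediate_size = 1536
num_layers = 8
num_heads = 8
num_kv_heads = 4

# Assumption: GPT-2 tokenizer vocabulary size (not listed in the table).
vocab_size = 50257

head_dim = hidden_size // num_heads   # 64
kv_dim = num_kv_heads * head_dim      # 256, smaller than hidden_size (grouped-query attention)

# Attention: q_proj and o_proj are hidden x hidden, k_proj and v_proj are hidden x kv_dim.
attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
# MLP: gate_proj, up_proj, down_proj.
mlp = 3 * hidden_size * intermediate_size
# Two norm weight vectors per layer.
norms = 2 * hidden_size
per_layer = attention + mlp + norms

# Assumption: the output head reuses the embedding matrix (tied weights).
embeddings = vocab_size * hidden_size
final_norm = hidden_size

total = num_layers * per_layer + embeddings + final_norm
print(f"per layer:  {per_layer}")
print(f"embeddings: {embeddings}")
print(f"total:      {total}")  # 50906112
```

About half of the parameters sit in the embedding matrix, which is typical for a model this small with a 50k-token vocabulary.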
Base model: sapbot/toyllama-50m