Special tokens support?
Hello,
Thanks for this awesome model.
Does this iteration support special tokens?
<|system|>
<|assistant|>
<|prefix_begin|>
<|prefix_end|>
<|prompter|>
If so, can we have some examples of how to use them?
Yes, it does seem to support those special tokens. According to MESSAGE_AND_TOKEN_FORMAT, it doesn't support all of them. However, I have used them in my projects so far and they have been working well. This is what I've done:
<|system|> - Used to let the model know what it is supposed to do. I don't know if you are meant to include the <|endoftext|> token afterwards, but it seems to be working well for me so far.
Ex: <|system|>You are an AI that summarizes a conversation in as few words as possible.<|endoftext|>
<|assistant|> - The basic token letting the model know it needs to start acting like the assistant.
Ex: <|prompter|>What is a meme, and what's the history behind this word?<|endoftext|><|assistant|>
<|prefix_begin|> & <|prefix_end|> - According to the GitHub repo, these aren't implemented yet. I've been using them anyway and they seem to be working well for me so far.
Ex: <|prefix_begin|>"Rice is a common grain found in various cuisines worldwide." "There exist multiple types of rices including long grain, medium grain, etcetera." "There are numerous recipes one could make from rice depending on their preferences."<|prefix_end|><|prompter|>What is the subject of the text in 1 word?<|endoftext|><|assistant|> -> Food and drink<|endoftext|>
<|prompter|> - I think you can infer the usage based on previous examples.
Ex: <|prompter|>What is a meme, and what's the history behind this word?<|endoftext|><|assistant|>
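If it helps, the examples above can be put together with a small helper. This is only a sketch of how I would assemble the prompt string: `build_prompt` and its parameters are names I made up, and putting the system message before the prefix block is my own assumption, since the examples above never combine the two.

```python
from typing import List, Optional, Tuple

EOS = "<|endoftext|>"  # end-of-turn token used in the examples above


def build_prompt(
    turns: List[Tuple[str, str]],
    system: Optional[str] = None,
    prefix: Optional[str] = None,
) -> str:
    """Assemble a prompt string from (role, text) turns.

    Hypothetical helper, not part of any library. Roles must be
    'prompter' or 'assistant'. The returned string ends with
    <|assistant|> so the model continues as the assistant.
    """
    parts = []
    if system is not None:
        # Assumption: the system message is closed with <|endoftext|>.
        parts.append(f"<|system|>{system}{EOS}")
    if prefix is not None:
        # Assumption: the prefix block follows the system message.
        parts.append(f"<|prefix_begin|>{prefix}<|prefix_end|>")
    for role, text in turns:
        if role not in ("prompter", "assistant"):
            raise ValueError(f"unknown role: {role!r}")
        parts.append(f"<|{role}|>{text}{EOS}")
    # Open an assistant turn for the model to complete.
    parts.append("<|assistant|>")
    return "".join(parts)


prompt = build_prompt(
    [("prompter", "What is a meme, and what's the history behind this word?")]
)
print(prompt)
```

The printed string matches the <|prompter|> example above and should be usable as the input text for whatever generation call you already have.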