Instructions to use OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5")
model = AutoModelForCausalLM.from_pretrained("OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5
- SGLang
How to use OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 with Docker Model Runner:
docker model run hf.co/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5
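
The curl calls above send a bare story prompt. As far as the upstream model card describes it, this checkpoint was fine-tuned on turns wrapped in `<|prompter|>` and `<|assistant|>` markers, each turn ending with `<|endoftext|>`, so a chat-style request may behave better with that wrapping. The following is a minimal sketch in Python, using only the standard library, of the same `/v1/completions` request with the prompt wrapped that way. The helper names (`build_prompt`, `build_payload`, `complete`) are hypothetical, and the default base URL assumes the vLLM server on port 8000 shown above; for SGLang, port 30000 would be used instead.

```python
import json
import urllib.request

MODEL_ID = "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5"


def build_prompt(user_message: str) -> str:
    """Wrap a single user turn in the prompter/assistant markers.

    Assumption: the marker format follows the upstream model card,
    not anything stated on this page.
    """
    return f"<|prompter|>{user_message}<|endoftext|><|assistant|>"


def build_payload(user_message: str, max_tokens: int = 512,
                  temperature: float = 0.5) -> dict:
    """Build the JSON body for an OpenAI-compatible /v1/completions call."""
    return {
        "model": MODEL_ID,
        "prompt": build_prompt(user_message),
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(user_message: str, base_url: str = "http://localhost:8000") -> str:
    """POST the payload to a locally running server and return the text.

    Requires a server started as shown above; nothing is sent until
    this function is called.
    """
    body = json.dumps(build_payload(user_message)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # OpenAI-style completion responses carry the text in choices[0].text
    return result["choices"][0]["text"]
```

With a server running, `complete("What is a meme?")` would return the generated reply as a string; passing `base_url="http://localhost:30000"` targets the SGLang server instead.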
- Interview request: genAI evaluation & documentation (#34, opened almost 2 years ago by evatang)
- Service unavailable (#33, opened about 2 years ago by VladSolyankin; 1 reaction, 4 comments)
- Adding `safetensors` variant of this model (#32, opened over 2 years ago by SFconvertbot)
- Adding Evaluation Results (#30, opened over 2 years ago by leaderboard-pr-bot)
- Adding Evaluation Results (#29, opened over 2 years ago by leaderboard-pr-bot)
- Load in pythia-12b (#28, opened over 2 years ago by ghidav; 1 comment)
- Adding `safetensors` variant of this model (#27, opened almost 3 years ago by lixue233)
- Adding `safetensors` variant of this model (#26, opened almost 3 years ago by lixue233)
- few shot learning (#25, opened almost 3 years ago by project-test)
- Vercel AI SDK (#24, opened about 3 years ago by julianouxui; 3 comments)
- How to load this model in 4bit? (#23, opened about 3 years ago by banank1989; 1 reaction)
- Giving Partial answers (#22, opened about 3 years ago by deepakkaura26; 3 comments)
- URGENT (Can someone help me?) (#21, opened about 3 years ago by sancassino; 1 comment)
- Reproducing the fine tuning gets stuck with 100% CPU on one process (#20, opened about 3 years ago by felipemv)
- How to reduce batch size in order to solve CUDA out of memory error? (#19, opened about 3 years ago by samyar03; 3 comments)
- Probability tensor contains either `inf`, `nan` or element < 0 (#18, opened about 3 years ago by rociomcomin; 1 comment)
- OpenAssistant API (#16, opened about 3 years ago by DeeJaye; 5 comments)
- How can I use Sagemaker's inference recommender for this model for question-answering task? (#15, opened about 3 years ago by monasbhar09)
- Facing error while inferencing (#14, opened about 3 years ago by sauravm8; 2 comments)
- How to use the special token `<|system|>` effectively? (#13, opened about 3 years ago by AayushShah; 1 reaction, 1 comment)
- Can the model be used for commercial purposes? (#11, opened about 3 years ago by AayushShah; 2 reactions, 5 comments)
- CUDA out of memory (#10, opened about 3 years ago by Blue-Devil; 3 comments)
- gptneox.cpp - Fork of llama.cpp (ggml) for OpenAssistant usage (#9, opened about 3 years ago by byroneverson)
- epoch-8 version (#8, opened about 3 years ago by gsaivinay)
- Slow inference time for new version of transformers (#7, opened about 3 years ago by russellsparadox; 11 comments)
- wandb link does not seem to work (#6, opened about 3 years ago by mingaflo; 1 comment)
- Embeddings? (#5, opened about 3 years ago by hydronab; 7 reactions)
- 4bit gglm version? (#4, opened about 3 years ago by jbollenbacher; 6 reactions, 1 comment)
- Some Technical and Usage Information (#3, opened about 3 years ago by gsaivinay; 1 reaction)
- How to run that? (#2, opened about 3 years ago by Guilherme34; 23 comments)
- Alpaca in dataset (#1, opened about 3 years ago by morganpie; 1 reaction, 27 comments)