What's the prompt format for this model?
What's the recommended prompt format for this model? What was the model trained with?
Thanks
This looks like an improved base model to be fine-tuned on, so no prompt template.
Wondering the same thing...
Apparently this is not the base model (as that was just uploaded).
So... is this an instruct model? What is the prompt?
I still don't know the ideal format, but I had terrible results with the Mistral format ([INST] prompt [/INST]), so it clearly isn't that one.
I had better luck with the Alpaca and Zephyr formats, but without the EOS token </s>.
Alpaca instruct is preferred.
Is it the same as the RAG model?
https://huggingface.co/SciPhi/SciPhi-Self-RAG-Mistral-7B-32k#recommended-chat-formatting
If so, that is enhanced Alpaca (as base Alpaca doesn't use any particular syntax for the system prompt).
Y'all should print the precise format of the trained model in a box on the model page. Something like this template would be very helpful: https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF#prompt-template-zephyr
Thanks for your interest and feedback - you are correct in this regard. I will do some testing tonight and produce a clean template + some code to support it elsewhere.
I recommend formatting like this -
Recommended Chat Formatting
We recommend mapping such that
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
goes to --->
### System:
You are a friendly chatbot who always responds in the style of a pirate
### Instruction:
How many helicopters can a human eat in one sitting?
### Response:
...
I chose this format because the majority of the fine-tuning dataset was instruction-tuning data, and it seemed like the closest match. It might need revision; please let me know your findings.
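For anyone who wants to try the mapping above in code, here is a minimal Python sketch. It is not official code from the model authors. The helper name build_prompt, the single-newline spacing between sections, and the choice to render earlier "assistant" turns under "### Response:" are all assumptions; the thread only specifies the three section headers. It also assumes the conversation ends with a user message.

```python
from typing import Dict, List

# Section headers taken from the recommended format in this thread.
# Mapping "assistant" to "### Response:" for earlier turns is an assumption.
ROLE_HEADERS = {
    "system": "### System:",
    "user": "### Instruction:",
    "assistant": "### Response:",
}


def build_prompt(messages: List[Dict[str, str]]) -> str:
    """Render chat messages into the System / Instruction / Response layout.

    Hypothetical helper: assumes the last message comes from the user.
    """
    lines = []
    for message in messages:
        # Raises KeyError for roles other than system, user, assistant.
        lines.append(ROLE_HEADERS[message["role"]])
        lines.append(message["content"].strip())
    # Finish with an open response header so the model writes the answer next.
    lines.append("### Response:")
    # Single newlines between lines are an assumption; the thread does not
    # say whether blank lines belong between sections.
    return "\n".join(lines) + "\n"


messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

prompt = build_prompt(messages)
print(prompt)
```

The resulting string can be passed as the prompt to whichever backend you use. Since no end-of-sequence token is added here, you may want to cut generation at the next "### " header if the model keeps going.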