Text Generation
Transformers
Safetensors
mistral
Not-For-All-Audiences
nsfw
text-generation-inference
Instructions to use Undi95/Toppy-M-7B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Undi95/Toppy-M-7B with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="Undi95/Toppy-M-7B")# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("Undi95/Toppy-M-7B") model = AutoModelForCausalLM.from_pretrained("Undi95/Toppy-M-7B") - Inference
- Local Apps Settings
- vLLM
How to use Undi95/Toppy-M-7B with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "Undi95/Toppy-M-7B" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Undi95/Toppy-M-7B", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/Undi95/Toppy-M-7B
- SGLang
How to use Undi95/Toppy-M-7B with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "Undi95/Toppy-M-7B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Undi95/Toppy-M-7B", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "Undi95/Toppy-M-7B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Undi95/Toppy-M-7B", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use Undi95/Toppy-M-7B with Docker Model Runner:
docker model run hf.co/Undi95/Toppy-M-7B
| license: cc-by-nc-4.0 | |
| tags: | |
| - not-for-all-audiences | |
| - nsfw | |
| <!-- description start --> | |
| ## Description | |
| This repo contains fp16 files of Toppy-M-7B, a merge I have done with the new task_arithmetic merge method from mergekit. | |
| This project was a request from [BlueNipples](https://huggingface.co/BlueNipples) : [link](https://huggingface.co/Undi95/Utopia-13B/discussions/1) | |
| <!-- description end --> | |
| <!-- description start --> | |
| ## Models and loras used | |
| - [openchat/openchat_3.5](https://huggingface.co/openchat/openchat_3.5) | |
| - [NousResearch/Nous-Capybara-7B-V1.9](https://huggingface.co/NousResearch/Nous-Capybara-7B-V1.9) | |
| - [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) | |
| - [lemonilia/AshhLimaRP-Mistral-7B](lemonilia/AshhLimaRP-Mistral-7B) | |
| - [Vulkane/120-Days-of-Sodom-LoRA-Mistral-7b](https://huggingface.co/Vulkane/120-Days-of-Sodom-LoRA-Mistral-7b) | |
| - [Undi95/Mistral-pippa-sharegpt-7b-qlora](Undi95/Mistral-pippa-sharegpt-7b-qlora) | |
| <!-- description end --> | |
| ## The sauce | |
| ``` | |
| openchat/openchat_3.5 | |
| lemonilia/AshhLimaRP-Mistral-7B (LoRA) x 0.38 | |
| NousResearch/Nous-Capybara-7B-V1.9 | |
| Vulkane/120-Days-of-Sodom-LoRA-Mistral-7b x 0.27 | |
| HuggingFaceH4/zephyr-7b-beta | |
| Undi95/Mistral-pippa-sharegpt-7b-qlora x 0.38 | |
| merge_method: task_arithmetic | |
| base_model: mistralai/Mistral-7B-v0.1 | |
| models: | |
| - model: mistralai/Mistral-7B-v0.1 | |
| - model: Undi95/zephyr-7b-beta-pippa-sharegpt | |
| parameters: | |
| weight: 0.42 | |
| - model: Undi95/Nous-Capybara-7B-V1.9-120-Days | |
| parameters: | |
| weight: 0.29 | |
| - model: Undi95/openchat_3.5-LimaRP-13B | |
| parameters: | |
| weight: 0.48 | |
| dtype: bfloat16 | |
| ``` | |
| <!-- prompt-template start --> | |
| ## Prompt template: Alpaca | |
| ``` | |
| Below is an instruction that describes a task. Write a response that appropriately completes the request. | |
| ### Instruction: | |
| {prompt} | |
| ### Response: | |
| ``` | |
| If you want to support me, you can [here](https://ko-fi.com/undiai). |