Update README.md
README.md
CHANGED
@@ -1,19 +1,48 @@
 ---
-base_model:
 library_name: transformers
 tags:
 - mergekit
 - merge
-
 ---



 # Honey-Yuzu-13B

 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

-## Merge Details
 ### Merge Method

 This model was merged using the passthrough merge method.
@@ -21,25 +50,91 @@ This model was merged using the passthrough merge method.
 ### Models Merged

 The following models were included in the merge:
-*
-*
-*

-###

 The following YAML configuration was used to produce this model:

 ```yaml
 dtype: float32
 merge_method: passthrough
-
-
-
-
--
-
-
--
-
-
 ```
 ---
+base_model:
+- mistralai/Mistral-7B-v0.1
 library_name: transformers
 tags:
 - mergekit
 - merge
+- roleplay
+- text-generation-inference
+license: cc-by-4.0
 ---



 # Honey-Yuzu-13B

+Meet Honey-Yuzu, a sweet lemony tea brewed by yours truly! A bit of [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B) here for its great flavor, with a dash of [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) there to add some depth. I'm really proud of how it turned out, and I hope you like it too!
+
+It's not as verbose as Chaifighter, but it still writes very well. It boasts fantastic coherence and character understanding (in my opinion) for a 13B, and it's been my daily driver for a little bit. It's a solid RP model that should generally play nice with just about anything.
+
+## Prompt Template: Alpaca
+
+```
+Below is an instruction that describes a task. Write a response that appropriately completes the request.
+
+### Instruction:
+{prompt}
+
+### Response:
+```
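
For anyone wiring the Alpaca template above into a script, here is a minimal sketch of filling it in. The names `ALPACA_TEMPLATE` and `format_alpaca` are made up for illustration, and both the whitespace stripping and the trailing newline after `### Response:` are assumptions rather than something the model card specifies.

```python
# Hypothetical helper: the template text comes from the README,
# the names and whitespace handling are illustrative assumptions.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)


def format_alpaca(prompt: str) -> str:
    """Fill the Alpaca template with a single user instruction."""
    # Assumption: stray whitespace around the instruction is not wanted.
    return ALPACA_TEMPLATE.format(prompt=prompt.strip())


if __name__ == "__main__":
    print(format_alpaca("Describe a cup of honey yuzu tea."))
```

The returned string is what you would pass as the raw `prompt` to a text-completion endpoint or tokenizer; frontends with a built-in Alpaca preset already do this for you.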
+
+## Recommended Settings: Universal-Light
+
+Here are some settings ranges that tend to work for me. They aren't strict values, and there's a bit of leeway in them. Feel free to experiment a bit!
+
+* Temperature: **1.25** ish
+* Min-P: **0.05** to **0.1**
+* Repetition Penalty: **1.05** *to* **1.1** (high values aren't needed and usually degrade output)
+* Rep. Penalty Range: **256** *or* **512**
+* *(all other samplers disabled)*
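
To make the Temperature and Min-P settings above a little more concrete, here is a rough pure-Python sketch of what those two samplers do to a set of logits. It is an illustration only: the function names are hypothetical, the order (temperature first, then Min-P) is an assumption that differs between backends, and repetition penalty is left out entirely.

```python
import math
import random


def apply_temperature(logits, temperature=1.25):
    """Scale logits by the temperature and return softmax probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p=0.05):
    """Keep tokens whose probability is at least min_p times the top one.

    Returns (index, renormalized probability) pairs.
    """
    cutoff = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= cutoff]
    kept_total = sum(p for _, p in kept)
    return [(i, p / kept_total) for i, p in kept]


def sample_token(logits, temperature=1.25, min_p=0.05, rng=random):
    """Sample one token index using temperature followed by Min-P."""
    kept = min_p_filter(apply_temperature(logits, temperature), min_p)
    indices = [i for i, _ in kept]
    weights = [p for _, p in kept]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, -5.0]  # made-up values for three tokens
    print(min_p_filter(apply_temperature(toy_logits), min_p=0.1))
    print(sample_token(toy_logits))
```

The point of the sketch is that Min-P scales its cutoff with the top token's probability, which is why a fairly high temperature like 1.25 can stay coherent: unlikely tokens are still pruned relative to the best candidate.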
+
+## The Deets
+
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

 ### Merge Method

 This model was merged using the passthrough merge method.

 ### Models Merged

 The following models were included in the merge:
+* [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B)
+* [SanjiWatsuki/Kunoichi-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-7B)
+* [SanjiWatsuki/Silicon-Maid-7B](https://huggingface.co/SanjiWatsuki/Silicon-Maid-7B)
+* [KatyTheCutie/LemonadeRP-4.5.3](https://huggingface.co/KatyTheCutie/LemonadeRP-4.5.3)
+* [Fimbulvetr-11B-v2.1-16K](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K)
+* [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+* [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)

+### The Special Sauce

 The following YAML configuration was used to produce this model:

 ```yaml
+slices: # this is a quick float32 restack of BLC using the OG recipe
+  - sources:
+    - model: SanjiWatsuki/Kunoichi-7B
+      layer_range: [0, 24]
+  - sources:
+    - model: SanjiWatsuki/Silicon-Maid-7B
+      layer_range: [8, 24]
+  - sources:
+    - model: KatyTheCutie/LemonadeRP-4.5.3
+      layer_range: [24, 32]
+merge_method: passthrough
+dtype: float32
+name: Big-Lemon-Cookie-11B
+---
+models: # this is a remake of CLC with the newer Fimbul v2.1 version
+  - model: Big-Lemon-Cookie-11B
+    parameters:
+      weight: 0.85
+  - model: Sao10K/Fimbulvetr-11B-v2.1-16K
+    parameters:
+      weight: 0.15
+merge_method: linear
 dtype: float32
+name: Chunky-Lemon-Cookie-11B
+---
+slices: # 8 layers of WL for the splice
+  - sources:
+    - model: senseable/WestLake-7B-v2
+      layer_range: [8, 16]
+merge_method: passthrough
+dtype: float32
+name: WL-splice
+---
+slices: # 8 layers of CLC for the splice
+  - sources:
+    - model: Chunky-Lemon-Cookie-11B
+      layer_range: [8, 16]
 merge_method: passthrough
+dtype: float32
+name: CLC-splice
+---
+models: # this is the splice, a gradient merge meant to gradually and smoothly interpolate between stacks of different models
+  - model: WL-splice
+    parameters:
+      weight: [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0] # 0.125 / 0.875 values removed here - "math gets screwy"
+  - model: CLC-splice
+    parameters:
+      weight: [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1] # 0.125 / 0.875 values removed here - "math gets screwy"
+merge_method: dare_linear # according to some paper, "DARE is all you need"
+base_model: WL-splice
+dtype: float32
+name: splice
+---
+slices: # putting it all together
+  - sources:
+    - model: senseable/WestLake-7B-v2
+      layer_range: [0, 16]
+  - sources:
+    - model: splice
+      layer_range: [0, 8]
+  - sources:
+    - model: Chunky-Lemon-Cookie-11B
+      layer_range: [16, 48]
+merge_method: passthrough
+dtype: float32
+name: Honey-Yuzu-13B
 ```
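
The two `weight` lists in the splice step of the config above are mirror images of each other, so the WL and CLC contributions always add up to 1. The sketch below checks that and shows one plausible way such a list could be spread over the 8 spliced layers. The `interpolate` helper and its even-spacing scheme are assumptions for illustration, not mergekit's confirmed behavior, and the random drop-and-rescale part of DARE is not modeled at all.

```python
# Weight lists copied from the splice step of the config.
WL_WEIGHTS = [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0]
CLC_WEIGHTS = [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1]


def interpolate(anchors, num_layers):
    """Linearly interpolate anchor values across num_layers layers.

    Assumption: anchors are evenly spaced from the first layer to the last.
    """
    if num_layers == 1:
        return [float(anchors[0])]
    values = []
    for layer in range(num_layers):
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(anchors) - 1)
        frac = pos - lo
        values.append(anchors[lo] * (1 - frac) + anchors[hi] * frac)
    return values


wl = interpolate(WL_WEIGHTS, 8)
clc = interpolate(CLC_WEIGHTS, 8)

if __name__ == "__main__":
    for layer, (a, b) in enumerate(zip(wl, clc)):
        print(f"layer {layer}: WL={a:.3f}  CLC={b:.3f}  sum={a + b:.3f}")
```

Because both lists are interpolated the same way and their anchors sum to 1, the per-layer weights sum to 1 under any linear scheme: the splice starts as pure WestLake and ends as pure Chunky-Lemon-Cookie.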
+
+### The Thought Process
+
+This was meant to be a simple RP-focused merge. I chose 2 well-performing RP models - [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B) by [FallenMerick](https://huggingface.co/FallenMerick) and [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) by [senseable](https://huggingface.co/senseable) - and merged them using a more conventional configuration (okay, okay, a 56-layer 12.5B Mistral isn't that conventional, but still) rather than trying something wild or crazy and pushing the limits. I was very pleased with the results, but I wanted to see what would happen if I remade CLC with [Fimbulvetr-11B-v2.1-16K](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K) by [Sao10K](https://huggingface.co/Sao10K). This resulted in equally nice (if not slightly better) outputs but a greatly improved native context length.
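
As a sanity check on the "56 layer 12.5B" figure, the layer ranges from the config can simply be added up, and the parameter count estimated from the stock Mistral-7B-v0.1 dimensions. Those dimensions are assumed here rather than read from the merged checkpoints (a component model with extra vocabulary tokens would shift the total slightly), and the helper names are hypothetical.

```python
# Layer ranges copied from the merge config (end index is exclusive).
BIG_LEMON_COOKIE = [(0, 24), (8, 24), (24, 32)]
HONEY_YUZU = [(0, 16), (0, 8), (16, 48)]

# Assumed Mistral-7B-v0.1 dimensions, not read from the merged model.
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024  # 8 key/value heads * 128 head dim
VOCAB = 32000


def count_layers(layer_ranges):
    """Total layers in a passthrough stack of [start, end) slices."""
    return sum(end - start for start, end in layer_ranges)


def estimate_params(num_layers):
    """Rough parameter count for a Mistral-style decoder with num_layers."""
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q, o + k, v
    mlp = 3 * HIDDEN * INTERMEDIATE  # gate, up, down projections
    norms = 2 * HIDDEN  # two RMSNorm weights per layer
    per_layer = attention + mlp + norms
    embeddings = 2 * VOCAB * HIDDEN  # input embeddings + untied output head
    return num_layers * per_layer + embeddings + HIDDEN  # + final norm


if __name__ == "__main__":
    layers = count_layers(HONEY_YUZU)
    print("Big-Lemon-Cookie layers:", count_layers(BIG_LEMON_COOKIE))
    print("Honey-Yuzu layers:", layers)
    print(f"Estimated parameters: {estimate_params(layers) / 1e9:.2f}B")
```

Under these assumptions the stack comes out to 48 layers for the 11B intermediate and 56 layers for the final model, with an estimate of roughly 12.5B parameters, which lines up with the description above.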
+
+
+
+Have feedback? Comments? Questions? Don't hesitate to let me know! As always, have a wonderful day, and please be nice to yourself! :)