matchaaaaa committed on Jul 23, 2024

Commit 054d033 (verified) · 1 Parent(s): 97eb3b9

Update README.md

Files changed (1)

README.md +112 -17

@@ -1,19 +1,48 @@
 ---
-base_model: []
 library_name: transformers
 tags:
 - mergekit
 - merge
 ---
 ![cute](https://huggingface.co/matchaaaaa/Honey-Yuzu-13B/resolve/main/honey-yuzu-cute.png)
 # Honey-Yuzu-13B
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-## Merge Details
 ### Merge Method
 This model was merged using the passthrough merge method.
@@ -21,25 +50,91 @@ This model was merged using the passthrough merge method.
 ### Models Merged
 The following models were included in the merge:
-* mega\splice
-* D:/MLnonsense/models/senseable_WestLake-7B-v2
-* mega\Chunky-Lemon-Cookie-11B
-### Configuration
 The following YAML configuration was used to produce this model:
 ```yaml
 dtype: float32
 merge_method: passthrough
-slices:
-- sources:
-  - layer_range: [0, 16]
-    model: D:/MLnonsense/models/senseable_WestLake-7B-v2
-- sources:
-  - layer_range: [0, 8]
-    model: mega\splice
-- sources:
-  - layer_range: [16, 48]
-    model: mega\Chunky-Lemon-Cookie-11B
 ```

 ---
+base_model:
+- mistralai/Mistral-7B-v0.1
 library_name: transformers
 tags:
 - mergekit
 - merge
+- roleplay
+- text-generation-inference
+license: cc-by-4.0
 ---
 ![cute](https://huggingface.co/matchaaaaa/Honey-Yuzu-13B/resolve/main/honey-yuzu-cute.png)
 # Honey-Yuzu-13B
+Meet Honey-Yuzu, a sweet lemony tea brewed by yours truly! A bit of [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B) here for its great flavor, with a dash of [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) there to add some depth. I'm really proud of how it turned out, and I hope you like it too!
+It's not as verbose as Chaifighter, but it still writes very well. It boasts fantastic coherence and character understanding (in my opinion) for a 13B, and it's been my daily driver for a little bit. It's a solid RP model that should generally play nice with just about anything.
+## Prompt Template: Alpaca
+```
+Below is an instruction that describes a task. Write a response that appropriately completes the request.
+### Instruction:
+{prompt}
+### Response:
+```
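A minimal sketch of filling in the Alpaca template above from Python. The helper name `build_alpaca_prompt` is hypothetical, and the blank lines between the sections follow the common Alpaca layout, which is an assumption since the template as shown has none.

```python
# Alpaca-style template. The blank lines between sections are an assumption
# (the usual Alpaca layout); adjust if your frontend expects something else.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Return the full prompt string for a single instruction (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(prompt=instruction.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Describe a cup of honey yuzu tea in two sentences."))
```

The prompt ends right after `### Response:` so the model's continuation becomes the reply.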
+## Recommended Settings: Universal-Light
+Here are some settings ranges that tend to work for me. They aren't strict values, and there's a bit of leeway in them. Feel free to experiment a bit!
+* Temperature:        **1.25** ish
+* Min-P:              **0.05** to **0.1**
+* Repetition Penalty: **1.05** *to* **1.1** (high values aren't needed and usually degrade output)
+* Rep. Penalty Range: **256** *or* **512**
+* *(all other samplers disabled)*
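As a rough illustration, the settings above could be collected into a request body like this. The key names `min_p` and `repetition_penalty` are assumptions that match what several OpenAI-compatible backends accept; check your own backend's documentation. The repetition penalty range is left out because not every backend exposes it, and the helper name `build_completion_request` is hypothetical.

```python
import json

# Values picked from the recommended ranges above.
RECOMMENDED_SETTINGS = {
    "temperature": 1.25,
    "min_p": 0.05,               # anywhere from 0.05 to 0.1
    "repetition_penalty": 1.05,  # anywhere from 1.05 to 1.1
}


def build_completion_request(prompt, max_tokens=512, **overrides):
    """Build a completion request body (hypothetical helper).

    Keyword overrides replace the recommended values, e.g. min_p=0.1.
    """
    settings = {**RECOMMENDED_SETTINGS, **overrides}
    return {
        "model": "matchaaaaa/Honey-Yuzu-13B",
        "prompt": prompt,
        "max_tokens": max_tokens,
        **settings,
    }


if __name__ == "__main__":
    body = build_completion_request("Once upon a time,", min_p=0.1)
    print(json.dumps(body, indent=2))
```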
+## The Deets
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 ### Merge Method
 This model was merged using the passthrough merge method.
 ### Models Merged
 The following models were included in the merge:
+* [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B)
+  * [SanjiWatsuki/Kunoichi-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-7B)
+  * [SanjiWatsuki/Silicon-Maid-7B](https://huggingface.co/SanjiWatsuki/Silicon-Maid-7B)
+  * [KatyTheCutie/LemonadeRP-4.5.3](https://huggingface.co/KatyTheCutie/LemonadeRP-4.5.3)
+  * [Fimbulvetr-11B-v2.1-16K](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K)
+  * [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+* [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)
+### The Special Sauce
 The following YAML configuration was used to produce this model:
 ```yaml
+slices: # this is a quick float32 restack of BLC using the OG recipe
+  - sources:
+    - model: SanjiWatsuki/Kunoichi-7B
+      layer_range: [0, 24]
+  - sources:
+    - model: SanjiWatsuki/Silicon-Maid-7B
+      layer_range: [8, 24]
+  - sources:
+    - model: KatyTheCutie/LemonadeRP-4.5.3
+      layer_range: [24, 32]
+merge_method: passthrough
+dtype: float32
+name: Big-Lemon-Cookie-11B
+---
+models: # this is a remake of CLC with the newer Fimbul v2.1 version
+  - model: Big-Lemon-Cookie-11B
+    parameters:
+      weight: 0.85
+  - model: Sao10K/Fimbulvetr-11B-v2.1-16K
+    parameters:
+      weight: 0.15
+merge_method: linear
 dtype: float32
+name: Chunky-Lemon-Cookie-11B
+---
+slices: # 8 layers of WL for the splice
+  - sources:
+    - model: senseable/WestLake-7B-v2
+      layer_range: [8, 16]
+merge_method: passthrough
+dtype: float32
+name: WL-splice
+---
+slices: # 8 layers of CLC for the splice
+  - sources:
+    - model: Chunky-Lemon-Cookie-11B
+      layer_range: [8, 16]
 merge_method: passthrough
+dtype: float32
+name: CLC-splice
+---
+models: # this is the splice, a gradient merge meant to gradually and smoothly interpolate between stacks of different models
+  - model: WL-splice
+    parameters:
+      weight: [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0] # 0.125 / 0.875 values removed here - "math gets screwy"
+  - model: CLC-splice
+    parameters:
+      weight: [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1] # 0.125 / 0.875 values removed here - "math gets screwy"
+merge_method: dare_linear # according to some paper, "DARE is all you need"
+base_model: WL-splice
+dtype: float32
+name: splice
+---
+slices: # putting it all together
+  - sources:
+    - model: senseable/WestLake-7B-v2
+      layer_range: [0, 16]
+  - sources:
+    - model: splice
+      layer_range: [0, 8]
+  - sources:
+    - model: Chunky-Lemon-Cookie-11B
+      layer_range: [16, 48]
+merge_method: passthrough
+dtype: float32
+name: Honey-Yuzu-13B
 ```
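The two weight lists in the `splice` step look complementary: wherever WL-splice is weighted down, CLC-splice is weighted up by the same amount. A quick check, with the values copied from the config above. This only inspects the lists themselves; how mergekit maps the nine values onto the eight layers is not covered here.

```python
# Gradient weights copied from the `splice` step of the config.
wl_weights = [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0]
clc_weights = [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1]

# At every position the two weights should add up to 1.
sums = [wl + clc for wl, clc in zip(wl_weights, clc_weights)]

# The WestLake weight should never increase along the gradient,
# and the CLC list should be the WestLake list reversed.
wl_non_increasing = all(a >= b for a, b in zip(wl_weights, wl_weights[1:]))
clc_mirrors_wl = clc_weights == wl_weights[::-1]

if __name__ == "__main__":
    print("sums:", sums)
    print("WL weights non-increasing:", wl_non_increasing)
    print("CLC mirrors WL:", clc_mirrors_wl)
```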
+### The Thought Process
+This was meant to be a simple RP-focused merge. I chose 2 well-performing RP models - [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B) by [FallenMerick](https://huggingface.co/FallenMerick) and [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) by [senseable](https://huggingface.co/senseable) - and merged them using a more conventional configuration (okay, okay, a 56-layer 12.5B Mistral isn't that conventional, but still) rather than trying something wild or crazy and pushing the limits. I was very pleased with the results, but I wanted to see what would happen if I remade CLC with [Fimbulvetr-11B-v2.1-16K](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K) by [Sao10K](https://huggingface.co/Sao10K). This resulted in equally nice (if not slightly better) outputs but a greatly improved native context length.
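The 56-layer, roughly 12.5B figure can be sanity-checked from the slice ranges in the config. This sketch assumes each `layer_range` is half-open (`[start, end)`), which is consistent with the 48-layer Chunky-Lemon-Cookie stack being sliced as `[16, 48]`. The parameter estimate assumes the standard Mistral-7B-v0.1 dimensions (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, vocabulary 32000) and untied input/output embeddings; none of those numbers appear in the README itself.

```python
def count_layers(slices):
    """Total layers in a passthrough stack, treating ranges as [start, end)."""
    return sum(end - start for start, end in slices)


# Slice ranges copied from the config.
big_lemon_cookie = [(0, 24), (8, 24), (24, 32)]  # Kunoichi, Silicon-Maid, LemonadeRP
honey_yuzu = [(0, 16), (0, 8), (16, 48)]         # WestLake, splice, Chunky-Lemon-Cookie

# Assumed Mistral-7B-v0.1 dimensions (not stated in the README).
HIDDEN, MLP, KV_DIM, VOCAB = 4096, 14336, 8 * 128, 32000


def estimate_params(num_layers):
    """Rough parameter count for a Mistral-style decoder with num_layers layers."""
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q/o plus k/v projections
    mlp = 3 * HIDDEN * MLP                                 # gate, up and down projections
    norms = 2 * HIDDEN                                     # two RMSNorm weights per layer
    per_layer = attention + mlp + norms
    embeddings = 2 * VOCAB * HIDDEN                        # input embeddings plus output head
    return num_layers * per_layer + embeddings + HIDDEN    # plus the final norm


if __name__ == "__main__":
    print("Big-Lemon-Cookie layers:", count_layers(big_lemon_cookie))
    layers = count_layers(honey_yuzu)
    print("Honey-Yuzu layers:", layers)
    print("Estimated parameters: %.2fB" % (estimate_params(layers) / 1e9))
```

Under these assumptions the stack comes to 56 layers and a little under 12.5 billion parameters, in line with the figures quoted above.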
+Have feedback? Comments? Questions? Don't hesitate to let me know! As always, have a wonderful day, and please be nice to yourself! :)