matchaaaaa committed on
Commit 054d033 · verified · 1 Parent(s): 97eb3b9

Update README.md

Files changed (1)
  1. README.md +112 -17
README.md CHANGED
@@ -1,19 +1,48 @@
1
  ---
2
- base_model: []
 
3
  library_name: transformers
4
  tags:
5
  - mergekit
6
  - merge
7
-
 
 
8
  ---
9
 
10
  ![cute](https://huggingface.co/matchaaaaa/Honey-Yuzu-13B/resolve/main/honey-yuzu-cute.png)
11
 
12
  # Honey-Yuzu-13B
13
 
14
  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
15
 
16
- ## Merge Details
17
  ### Merge Method
18
 
19
  This model was merged using the passthrough merge method.
@@ -21,25 +50,91 @@ This model was merged using the passthrough merge method.
21
  ### Models Merged
22
 
23
  The following models were included in the merge:
24
- * mega\splice
25
- * D:/MLnonsense/models/senseable_WestLake-7B-v2
26
- * mega\Chunky-Lemon-Cookie-11B
27
 
28
- ### Configuration
29
 
30
  The following YAML configuration was used to produce this model:
31
 
32
  ```yaml
33
  dtype: float32
34
  merge_method: passthrough
35
- slices:
36
- - sources:
37
-   - layer_range: [0, 16]
38
-     model: D:/MLnonsense/models/senseable_WestLake-7B-v2
39
- - sources:
40
-   - layer_range: [0, 8]
41
-     model: mega\splice
42
- - sources:
43
-   - layer_range: [16, 48]
44
-     model: mega\Chunky-Lemon-Cookie-11B
45
  ```
 
1
  ---
2
+ base_model:
3
+ - mistralai/Mistral-7B-v0.1
4
  library_name: transformers
5
  tags:
6
  - mergekit
7
  - merge
8
+ - roleplay
9
+ - text-generation-inference
10
+ license: cc-by-4.0
11
  ---
12
 
13
  ![cute](https://huggingface.co/matchaaaaa/Honey-Yuzu-13B/resolve/main/honey-yuzu-cute.png)
14
 
15
  # Honey-Yuzu-13B
16
 
17
+ Meet Honey-Yuzu, a sweet lemony tea brewed by yours truly! A bit of [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B) here for its great flavor, with a dash of [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) there to add some depth. I'm really proud of how it turned out, and I hope you like it too!
18
+
19
+ It's not as verbose as Chaifighter, but it still writes very well. It boasts fantastic coherence and character understanding (in my opinion) for a 13B, and it's been my daily driver for a little bit. It's a solid RP model that should generally play nice with just about anything.
20
+
21
+ ## Prompt Template: Alpaca
22
+
23
+ ```
24
+ Below is an instruction that describes a task. Write a response that appropriately completes the request.
25
+
26
+ ### Instruction:
27
+ {prompt}
28
+
29
+ ### Response:
30
+ ```
31
+
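If you're calling the model from a script rather than a frontend, the template above can be filled in with a few lines of Python. This is just a minimal sketch: `ALPACA_TEMPLATE` and `build_alpaca_prompt` are hypothetical names, and the trailing newline after `### Response:` is an assumption, not something the template specifies.

```python
# Hypothetical helper: fills the Alpaca template from this README.
# The trailing "\n" after "### Response:" is an assumption.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(prompt: str) -> str:
    """Return the full Alpaca-style prompt for one instruction."""
    return ALPACA_TEMPLATE.format(prompt=prompt.strip())


print(build_alpaca_prompt("Describe a cozy tea shop in two sentences."))
```

The returned string is what you would pass to your tokenizer or inference backend as the raw prompt.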
32
+ ## Recommended Settings: Universal-Light
33
+
34
+ Here are some settings ranges that tend to work for me. They aren't strict values, and there's a bit of leeway in them. Feel free to experiment a bit!
35
+
36
+ * Temperature: **1.25** ish
37
+ * Min-P: **0.05** *to* **0.1**
38
+ * Repetition Penalty: **1.05** *to* **1.1** (high values aren't needed and usually degrade output)
39
+ * Rep. Penalty Range: **256** *or* **512**
40
+ * *(all other samplers disabled)*
41
+
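To give a feel for what Temperature and Min-P do together, here is a rough pure-Python sketch. It assumes temperature is applied before the Min-P cutoff (sampler order differs between backends), it leaves out repetition penalty entirely, and `min_p_filter` and `sample_token` are made-up names for illustration, not any backend's API.

```python
import math
import random


def min_p_filter(logits, temperature=1.25, min_p=0.05):
    """Return {token_index: probability} after temperature scaling and Min-P filtering."""
    # Assumption: temperature is applied first; some backends order samplers differently.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract the peak for numerical stability
    total = sum(weights)
    probs = [w / total for w in weights]
    # Min-P keeps tokens whose probability is at least min_p times the top token's probability.
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}
    norm = sum(kept.values())
    return {i: p / norm for i, p in kept.items()}


def sample_token(logits, temperature=1.25, min_p=0.05, rng=None):
    """Draw one token index from the filtered, renormalized distribution."""
    rng = rng or random.Random()
    kept = min_p_filter(logits, temperature, min_p)
    return rng.choices(list(kept), weights=list(kept.values()), k=1)[0]


toy_logits = [10.0, 9.0, 0.0]  # toy values, not real model output
print(sorted(min_p_filter(toy_logits)))  # indices that survive the cutoff
print(sample_token(toy_logits, rng=random.Random(0)))
```

With the toy logits, the very unlikely third token falls below the cutoff, so only the first two can ever be sampled.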
42
+ ## The Deets
43
+
44
  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
45
 
 
46
  ### Merge Method
47
 
48
  This model was merged using the passthrough merge method.
 
50
  ### Models Merged
51
 
52
  The following models were included in the merge:
53
+ * [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B)
54
+ * [SanjiWatsuki/Kunoichi-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-7B)
55
+ * [SanjiWatsuki/Silicon-Maid-7B](https://huggingface.co/SanjiWatsuki/Silicon-Maid-7B)
56
+ * [KatyTheCutie/LemonadeRP-4.5.3](https://huggingface.co/KatyTheCutie/LemonadeRP-4.5.3)
57
+ * [Fimbulvetr-11B-v2.1-16K](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K)
58
+ * [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
59
+ * [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)
60
 
61
+ ### The Special Sauce
62
 
63
  The following YAML configuration was used to produce this model:
64
 
65
  ```yaml
66
+ slices: # this is a quick float32 restack of BLC using the OG recipe
67
+   - sources:
68
+       - model: SanjiWatsuki/Kunoichi-7B
69
+         layer_range: [0, 24]
70
+   - sources:
71
+       - model: SanjiWatsuki/Silicon-Maid-7B
72
+         layer_range: [8, 24]
73
+   - sources:
74
+       - model: KatyTheCutie/LemonadeRP-4.5.3
75
+         layer_range: [24, 32]
76
+ merge_method: passthrough
77
+ dtype: float32
78
+ name: Big-Lemon-Cookie-11B
79
+ ---
80
+ models: # this is a remake of CLC with the newer Fimbul v2.1 version
81
+   - model: Big-Lemon-Cookie-11B
82
+     parameters:
83
+       weight: 0.85
84
+   - model: Sao10K/Fimbulvetr-11B-v2.1-16K
85
+     parameters:
86
+       weight: 0.15
87
+ merge_method: linear
88
  dtype: float32
89
+ name: Chunky-Lemon-Cookie-11B
90
+ ---
91
+ slices: # 8 layers of WL for the splice
92
+   - sources:
93
+       - model: senseable/WestLake-7B-v2
94
+         layer_range: [8, 16]
95
+ merge_method: passthrough
96
+ dtype: float32
97
+ name: WL-splice
98
+ ---
99
+ slices: # 8 layers of CLC for the splice
100
+   - sources:
101
+       - model: Chunky-Lemon-Cookie-11B
102
+         layer_range: [8, 16]
103
  merge_method: passthrough
104
+ dtype: float32
105
+ name: CLC-splice
106
+ ---
107
+ models: # this is the splice, a gradient merge meant to gradually and smoothly interpolate between stacks of different models
108
+   - model: WL-splice
109
+     parameters:
110
+       weight: [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0] # 0.125 / 0.875 values removed here - "math gets screwy"
111
+   - model: CLC-splice
112
+     parameters:
113
+       weight: [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1] # 0.125 / 0.875 values removed here - "math gets screwy"
114
+ merge_method: dare_linear # according to some paper, "DARE is all you need"
115
+ base_model: WL-splice
116
+ dtype: float32
117
+ name: splice
118
+ ---
119
+ slices: # putting it all together
120
+   - sources:
121
+       - model: senseable/WestLake-7B-v2
122
+         layer_range: [0, 16]
123
+   - sources:
124
+       - model: splice
125
+         layer_range: [0, 8]
126
+   - sources:
127
+       - model: Chunky-Lemon-Cookie-11B
128
+         layer_range: [16, 48]
129
+ merge_method: passthrough
130
+ dtype: float32
131
+ name: Honey-Yuzu-13B
132
  ```
133
+
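As a quick sanity check on the recipe above, the layer counts and splice weights can be tallied in plain Python. This is only a sketch: it assumes each `layer_range` is half-open (`[start, end)`), and `count_layers` is a hypothetical helper, not part of mergekit.

```python
def count_layers(slices):
    """Sum the layers contributed by each (model, (start, end)) slice.

    Assumes layer_range is half-open: [start, end).
    """
    return sum(end - start for _, (start, end) in slices)


# Layer ranges copied from the YAML configuration above.
big_lemon_cookie = [
    ("SanjiWatsuki/Kunoichi-7B", (0, 24)),
    ("SanjiWatsuki/Silicon-Maid-7B", (8, 24)),
    ("KatyTheCutie/LemonadeRP-4.5.3", (24, 32)),
]
honey_yuzu = [
    ("senseable/WestLake-7B-v2", (0, 16)),
    ("splice", (0, 8)),
    ("Chunky-Lemon-Cookie-11B", (16, 48)),
]

# Gradient weights from the splice step; each pair should add up to 1.
wl_weights = [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0]
clc_weights = [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1]

print(count_layers(big_lemon_cookie))  # 48
print(count_layers(honey_yuzu))  # 56
print(all(a + b == 1 for a, b in zip(wl_weights, clc_weights)))  # True
```

Under that assumption, the 11B stack comes out to 48 layers (which is why `layer_range: [16, 48]` is valid in the last step), and the final stack comes out to 16 + 8 + 32 = 56 layers.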
134
+ ### The Thought Process
135
+
136
+ This was meant to be a simple RP-focused merge. I chose 2 well-performing RP models - [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B) by [FallenMerick](https://huggingface.co/FallenMerick) and [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) by [senseable](https://huggingface.co/senseable) - and merged them using a more conventional configuration (okay, okay, a 56-layer 12.5B Mistral isn't that conventional, but still) rather than trying something wild or crazy and pushing the limits. I was very pleased with the results, but I wanted to see what would happen if I remade CLC with [Fimbulvetr-11B-v2.1-16K](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K) by [Sao10K](https://huggingface.co/Sao10K). This resulted in equally nice (if not slightly better) outputs and a greatly improved native context length.
137
+
138
+
139
+
140
+ Have feedback? Comments? Questions? Don't hesitate to let me know! As always, have a wonderful day, and please be nice to yourself! :)