Thireus committed
Commit 85b8ba8 · 1 Parent(s): 28d5e91

Update README.md and tensors.map(.sig) files

Files changed (3)
  1. README.md +15 -1
  2. tensors.map +0 -0
  3. tensors.map.sig +0 -0
README.md CHANGED
@@ -22,7 +22,7 @@ cd ~
  # Make sure to install all ik_llama.cpp compilation dependencies...
  apt install python3-dev python3-pip python3-venv python3-wheel python3-setuptools git acl netcat-openbsd cmake # pipx
 
- # Obtain ik_llama's Thireus version - Windows builds available at https://github.com/Thireus/ik_llama.cpp/releases
+ # Obtain ik_llama's Thireus version - Windows/macOS/Linux builds available at https://github.com/Thireus/ik_llama.cpp/releases
  git clone https://github.com/Thireus/ik_llama.cpp
  cd ik_llama.cpp
  git pull
@@ -130,4 +130,18 @@ cd kitchen
  ../quant_downloader.sh bf16.recipe
  ```
 
+ You can also quantize individual BF16 tensors without the need to download every BF16 .gguf shard:
+
+ BF16 model shards can also be individually quantized using a special version of ik_llama.cpp's `llama-quantize` utility which comes with the `--individual-tensors` option.
+
+ - Source code: https://github.com/Thireus/ik_llama.cpp/tree/th/quantize_individual_tensors
+ - Builds (macOS, Windows and Linux): https://github.com/Thireus/ik_llama.cpp/releases/tag/th-quantize_individual_tensors-b4210-7a44805
+
+ Usage example:
+ ```
+ ./llama-quantize --keep-split --imatrix imatrix_ubergarm.dat --individual-tensors 2,3,1094 Kimi-K2-Thinking-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01097.gguf my_new_shards.gguf iq3_s 12
+ ```
+
+ For more information about how to use it: https://github.com/Thireus/GGUF-Tool-Suite/issues/45
+
  Enjoy optimized quantization! 🎉
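
The `llama-quantize` usage example added to README.md packs several arguments into one long line. As a reading aid, the sketch below rebuilds that same command from named parts and prints it instead of running it. The flags and their order are copied from the README example; the function name `quantize_cmd` and its variable names are hypothetical. Reading the trailing `12` as a thread count follows the upstream `llama-quantize` convention and is an assumption for this special build. The exact meaning of the `--individual-tensors` list is described in the linked issue.

```shell
#!/usr/bin/env bash
# Sketch only: prints the llama-quantize command line instead of executing it.
# Flags and argument order are taken from the README usage example;
# the function name and variable names are hypothetical.

quantize_cmd() {
  local imatrix="$1"   # importance matrix file, e.g. imatrix_ubergarm.dat
  local tensors="$2"   # value for --individual-tensors, e.g. 2,3,1094
  local input="$3"     # first BF16 shard of the split model
  local output="$4"    # name for the quantized output
  local qtype="$5"     # quantization type, e.g. iq3_s
  local threads="$6"   # trailing number; assumed to be the thread count
  echo "./llama-quantize --keep-split --imatrix $imatrix --individual-tensors $tensors $input $output $qtype $threads"
}

# Reproduces the command from the README example.
quantize_cmd imatrix_ubergarm.dat 2,3,1094 \
  Kimi-K2-Thinking-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01097.gguf \
  my_new_shards.gguf iq3_s 12
```

Printing the command first makes it easy to check the argument order before starting a long quantization run; to actually run it, paste the printed line into a shell in the directory that contains the `llama-quantize` build.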
tensors.map CHANGED
The diff for this file is too large to render. See raw diff
 
tensors.map.sig CHANGED
Binary files a/tensors.map.sig and b/tensors.map.sig differ