steampunque committed
Commit 8863c6a · verified · 1 Parent(s): c36f589

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -10,12 +10,12 @@ tags:
  - 6-bit
  ---
 
- ## Llama.cpp ultravox-v0_5-llama-3_1-8b by fixie-ai
+ ## Mixed Precision GGUF ultravox-v0_5-llama-3_1-8b by fixie-ai
 
  Original model: https://huggingface.co/fixie-ai/ultravox-v0_5-llama-3_1-8b
 
  This is a F16 mmproj file intented to be used in conjunction with Llama-3.1-8B-Instruct. A
- high performance hybrid quant of Llama-3.1-8B-Instruct is available here: https://huggingface.co/steampunque/Llama-3.1-8B-Instruct-Hybrid-GGUF
+ high performance hybrid quant of Llama-3.1-8B-Instruct is available here: https://huggingface.co/steampunque/Llama-3.1-8B-Instruct-MP-GGUF
 
  Usage:
 
@@ -46,8 +46,8 @@ Audio benchmarks for the model will eventually be given here: https://huggingfac
  ## Download the file from below:
  | Link | Type | Size/e9 B | Notes |
  |------|------|-----------|-------|
- | [Llama-3.1-8B-Instruct.Q6_K_H.gguf](https://huggingface.co/steampunque/Llama-3.1-8B-Instruct-Hybrid-GGUF/resolve/main/Llama-3.1-8B-Instruct.Q6_K_H.gguf) | Q6_K_H | 6e9 B | 0.6B smaller than Q6_K |
- | [ultravox-v0_5-llama-3_1-8b.mmproj.gguf](https://huggingface.co/steampunque/ultravox-v0_5-llama-3_1-8b-Hybrid-GGUF/resolve/main/ultravox-v0_5-llama-3_1-8b.mmproj.gguf) | mmproj | 1.38e9 B | multimedia projector |
+ | [Llama-3.1-8B-Instruct.Q6_K_H.gguf](https://huggingface.co/steampunque/Llama-3.1-8B-Instruct-MP-GGUF/resolve/main/Llama-3.1-8B-Instruct.Q6_K_H.gguf) | Q6_K_H | 6e9 B | 0.6B smaller than Q6_K |
+ | [ultravox-v0_5-llama-3_1-8b.mmproj.gguf](https://huggingface.co/steampunque/ultravox-v0_5-llama-3_1-8b-MP-GGUF/resolve/main/ultravox-v0_5-llama-3_1-8b.mmproj.gguf) | mmproj | 1.38e9 B | multimedia projector |
 
  A discussion thread about the hybrid layer quant approach can be found here on the llama.cpp git repository:
 