lemonyins committed on
Commit 5e1bd85 · verified · 1 Parent(s): 3ed6ff8

Update README.md

  1. README.md +16 -0
README.md CHANGED
@@ -102,6 +102,22 @@ llama-server.exe ^
  --host 0.0.0.0
  ```
 
+ ## 🧠 Intelligence (Perplexity) Comparison
+
+ Tested on a Chinese novel (lower perplexity is better):
+ | Model Version | Perplexity (PPL) | Quality Drop |
+ | :--- | :--- | :--- |
+ | Q4_K_M | 13.1909 +/- 0.06037 | Baseline |
+ | IQ4_XS | 13.2138 +/- 0.06054 | 0.17% |
+ | IQ4_XS-FFN-IQ3_S | 13.6056 +/- 0.06159 | 3.14% |
+
+ Tested on code:
+ | Model Version | Perplexity (PPL) | Quality Drop |
+ | :--- | :--- | :--- |
+ | IQ4_XS | 1.2217 +/- 0.00156 | Baseline |
+ | IQ4_XS-FFN-IQ3_S | 1.2324 +/- 0.00158 | 0.87% |
+
+
  ## Caveats
 
  - **TurboQuant is mandatory**: This model relies on TurboQuant KV cache for the listed memory figures. Standard llama.cpp builds without TurboQuant will consume significantly more VRAM.
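
The "Quality Drop" percentages in the perplexity tables added by this commit look like the relative PPL increase over each table's baseline row. The commit does not state the formula, so the sketch below is an assumption: it takes the drop to be `(ppl - baseline) / baseline * 100` and applies it to the rounded PPL values shown in the tables. The function name `relative_ppl_increase` is hypothetical and not part of any tool mentioned in the README.

```python
def relative_ppl_increase(baseline_ppl: float, ppl: float) -> float:
    """Return the percentage by which `ppl` exceeds `baseline_ppl`.

    Assumed definition of the "Quality Drop" column; the commit itself
    does not spell out how the column was computed.
    """
    if baseline_ppl <= 0:
        raise ValueError("baseline perplexity must be positive")
    return (ppl - baseline_ppl) / baseline_ppl * 100.0


# Rounded PPL values copied from the two tables in this commit.
novel = {"Q4_K_M": 13.1909, "IQ4_XS": 13.2138, "IQ4_XS-FFN-IQ3_S": 13.6056}
code = {"IQ4_XS": 1.2217, "IQ4_XS-FFN-IQ3_S": 1.2324}

for test_name, table, baseline in (
    ("novel", novel, "Q4_K_M"),
    ("code", code, "IQ4_XS"),
):
    for model, ppl in table.items():
        if model == baseline:
            continue  # the baseline row has no drop by definition
        drop = relative_ppl_increase(table[baseline], ppl)
        print(f"{test_name:5s} {model:18s} {drop:.2f}%")
```

Under this assumption the novel rows reproduce the listed 0.17% and 3.14%. The code row comes out at about 0.88% from the rounded inputs, against the listed 0.87%; that gap is small enough to be explained by rounding in the displayed PPL values.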