Commit a262d18 by Smoffyy · verified · 1 Parent(s): 432850b

Better wording

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -19,7 +19,7 @@ These are quantized versions of the official [Qwen 3.5 35B A3B](https://huggingf
 
 They are **completely unmodified**, no edits, just direct quantizations of the original weights. That's why they're called **pure GGUFs**.
 
-This model is a [Mixture-Of-Experts](https://huggingface.co/blog/moe) model, meaning that out of the total **35 billion parameters**, only **3 billion** are active during inference. *This, in theory, should increase token speed but still uses alot of VRAM.*
+This is a [Mixture-of-Experts](https://huggingface.co/blog/moe) (MoE) model: of its **35 billion total parameters**, only **~3 billion** are active per inference step. In practice, this means faster token generation at the cost of higher VRAM usage compared to a dense model of similar active size.
 
 > This uses **ALOT of VRAM**, so if you're looking for a lighter weight option please use **[Qwen 3.5 9B](https://huggingface.co/Smoffyy/Qwen3.5-9B-Instruct-Pure-GGUF)**.
 
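
As a rough illustration of the trade-off the reworded line describes, the sketch below estimates how much memory the weights alone would occupy at a few precisions. It is not part of the commit or the README. The 35 billion and 3 billion figures come from the changed line; the bits-per-weight values and the helper name `weight_memory_gib` are assumptions chosen for illustration, and the estimate ignores the KV cache, activations, and any per-block overhead a real quantization format adds.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


TOTAL_PARAMS = 35e9   # total parameters, as stated in the README
ACTIVE_PARAMS = 3e9   # approximate parameters active per token, as stated in the README

# Hypothetical bits-per-weight values, for illustration only.
for bits in (4.0, 8.0, 16.0):
    resident = weight_memory_gib(TOTAL_PARAMS, bits)   # all experts must be loaded
    touched = weight_memory_gib(ACTIVE_PARAMS, bits)   # weights actually used per token
    print(f"{bits:>4.1f} bits/weight: ~{resident:.1f} GiB resident, ~{touched:.1f} GiB used per token")
```

The point of the comparison: memory use follows the total parameter count, because every expert has to be loaded, while per-token compute follows the much smaller active count.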