caiovicentino1/Qwen3.5-9B-Neo-HLWQ-Q5

Text Generation

kv-cache-compression

Qwen3.5-9B-Neo-HLWQ-Q5

6.27 GB

1 contributor

History: 15 commits

Remove legacy polar_config.json

3355841 verified 2 months ago

.gitattributes

1.57 kB
Upload folder using huggingface_hub 2 months ago
README.md

4.13 kB
HLWQ rebrand: title, tags, notice, self-links 2 months ago
chat_template.jinja

4.05 kB
Upload folder using huggingface_hub 2 months ago
compression.png

58.6 kB
Upload compression.png with huggingface_hub 2 months ago
config.json

2.04 kB
fix: quant_method polar -> polarengine for vLLM compatibility 2 months ago
family.png

44.8 kB
Upload family.png with huggingface_hub 2 months ago
hlwq_config.json

261 Bytes
Add hlwq_config.json (rename from polar_config.json) 2 months ago
kv_speed.png

35.5 kB
Upload kv_speed.png with huggingface_hub 2 months ago
model_int4.pt
Detected Pickle imports (14)
- "torchao.quantization.quant_primitives.ZeroPointDomain",
- "torch.int32",
- "torch.IntStorage",
- "torch.bfloat16",
- "torch.serialization._get_layout",
- "torch.BFloat16Storage",
- "collections.OrderedDict",
- "torch.device",
- "torchao.dtypes.affine_quantized_tensor.AffineQuantizedTensor",
- "torchao.dtypes.uintx.tensor_core_tiled_layout.TensorCoreTiledLayout",
- "torch._utils._rebuild_tensor_v2",
- "torch._tensor._rebuild_from_type_v2",
- "torch._utils._rebuild_wrapper_subclass",
- "torchao.dtypes.uintx.tensor_core_tiled_layout.TensorCoreTiledAQTTensorImpl"
6.25 GB
xet

Upload model_int4.pt with huggingface_hub 2 months ago
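
The pickle imports listed for model_int4.pt are the globals the file would import when unpickled. As a rough illustration of how such a list can be derived without executing anything, the sketch below disassembles a pickle stream with the standard-library `pickletools` module and collects the targets of its `GLOBAL` and `STACK_GLOBAL` opcodes. The helper name `list_pickle_imports` is hypothetical, the string tracking is a simplifying heuristic, and this is not necessarily how the Hub's scanner works. It operates on raw pickle bytes; a checkpoint written by `torch.save` is usually a zip archive, so the embedded pickle member would have to be extracted first.

```python
import pickle
import pickletools
from collections import OrderedDict

# Opcodes that push a text string onto the pickle stack.
_STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE",
               "SHORT_BINSTRING", "BINSTRING", "STRING"}
_PUT_OPS = {"PUT", "BINPUT", "LONG_BINPUT"}
_GET_OPS = {"GET", "BINGET", "LONG_BINGET"}


def list_pickle_imports(data: bytes) -> set:
    """Return 'module.name' for each global the pickle stream references.

    Hypothetical helper: the stream is only disassembled, never executed.
    """
    found = set()
    strings = []  # strings pushed (or fetched from the memo) so far
    memo = {}     # memo slot -> string, or None for non-string objects
    top = None    # string currently on top of the stack, if any
    for op, arg, _pos in pickletools.genops(data):
        if op.name in _STRING_OPS:
            top = arg
            strings.append(arg)
        elif op.name == "MEMOIZE":
            memo[len(memo)] = top  # MEMOIZE stores the top item in the next slot
        elif op.name in _PUT_OPS:
            memo[arg] = top
        elif op.name in _GET_OPS:
            value = memo.get(arg)
            top = value if isinstance(value, str) else None
            if top is not None:
                strings.append(top)
        elif op.name == "GLOBAL":
            # Protocols 0-2: argument is "module name" in one string.
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
            top = None
        elif op.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are the two most recent strings.
            if len(strings) >= 2:
                found.add(f"{strings[-2]}.{strings[-1]}")
            top = None
        else:
            top = None
    return found


if __name__ == "__main__":
    blob = pickle.dumps(OrderedDict(a=1), protocol=4)
    print(sorted(list_pickle_imports(blob)))
```

Because the stream is only read opcode by opcode, inspecting an untrusted file this way does not run any of the code it references.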
tokenizer.json

20 MB
xet

Upload folder using huggingface_hub 2 months ago
tokenizer_config.json

1.17 kB
Upload folder using huggingface_hub 2 months ago