---
base_model:
- meta-llama/Llama-2-7b-hf
base_model_relation: quantized
license: llama2
---

# Model Card

- Base model: `meta-llama/Llama-2-7b-hf`
- Quantization method: Memory-constrained MSQ with Q-Palette
- Target bit-width: 3
- Backend kernel: Q-Palette kernel
- Calibration data: RedPajama ([Hessian](https://huggingface.co/relaxml/Hessians-Llama-2-7b-6144))

# How to run

- Follow the instructions at https://github.com/snu-mllab/Q-Palette.

# References

- [Model Paper](https://arxiv.org/abs/2509.20214)