---
library_name: llama.cpp
tags:
- security
- gguf
- model-file-vulnerability
license: mit
---

# llama.cpp CLIP mmproj GGUF array validation PoC

Tested revision: `4c6595503fe45d5a39f88d194e270f64c7424677` (llama.cpp, 2026-06-11)

Proof-of-concept for heap buffer overreads (out-of-bounds reads) in llama.cpp's multimodal vision loader (`tools/mtmd/clip.cpp`) when parsing GGUF metadata for a vision mmproj file.

## What this demonstrates

Two related but distinct validation failures in `load_hparams`:

**Primary (`poc_mmproj.gguf`) - unchecked element-type handling**

A successfully parsed GGUF v3 file declares `clip.vision.feature_layer` as `INT8[8]`. `get_arr_int()` uses `gguf_get_arr_n()` but does not verify the declared element type before casting to `int32_t`. It interprets the storage as eight `int32_t` elements, causing it to read 32 bytes from an 8-byte allocation. ASan: READ at `clip.cpp:3020`.

**Secondary (`poc_image_mean.gguf`, reproduced) - missing expected-length check**

Declares `clip.vision.image_mean` as a correctly typed `F32[1]`. `load_hparams` reads three floats unconditionally at `clip.cpp:1251` without requiring `gguf_get_arr_n() >= 3`. ASan: READ at `clip.cpp:1251`. This is a length-validation bug, not an element-type mismatch.

Release (primary, no sanitizers): out-of-bounds read without sanitizer abort; `load_hparams` completes until later tensor-load failure. Invalid reads only; no demonstrated writes or downstream impact.

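
The size arithmetic behind both cases can be sketched in a few lines of Python. This is only an illustration, not code from llama.cpp or from this repository: `overread_bytes` and `is_safe_to_read` are hypothetical helper names, and the sketch assumes an array's storage is exactly its element count times its element size, as the 8-byte figure for `INT8[8]` implies.

```python
# Byte widths of the GGUF element types mentioned in this README.
ELEMENT_SIZE = {"INT8": 1, "INT32": 4, "F32": 4}


def overread_bytes(declared_type, declared_count, read_type, read_count):
    """Bytes accessed past the end of the stored array (0 if in bounds)."""
    allocated = ELEMENT_SIZE[declared_type] * declared_count
    accessed = ELEMENT_SIZE[read_type] * read_count
    return max(0, accessed - allocated)


def is_safe_to_read(declared_type, declared_count, expected_type, min_count):
    """Hypothetical pre-read check: the type matches and enough elements exist."""
    return declared_type == expected_type and declared_count >= min_count


# Primary case: INT8[8] interpreted as eight int32_t values (32 bytes from 8).
primary = overread_bytes("INT8", 8, "INT32", 8)
# Secondary case: F32[1] read as three floats (12 bytes from 4).
secondary = overread_bytes("F32", 1, "F32", 3)

print(primary, secondary)                      # 24 8
print(is_safe_to_read("INT8", 8, "INT32", 1))  # False: element type differs
print(is_safe_to_read("F32", 1, "F32", 3))     # False: fewer than 3 elements
```

The two conditions in `is_safe_to_read` correspond to the two missing checks described above: the primary case fails the type comparison, the secondary case fails the minimum-length comparison.
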
## Files

- `poc_mmproj.gguf` - primary (feature_layer element-type handling)
- `poc_image_mean.gguf` - secondary (image_mean length validation)
- `gen_clip_poc.py`, `clip_poc.cpp`, `build_clip_poc.sh`, `REPORT.md`

## Reproduction (WSL / Linux)

```
python3 gen_clip_poc.py
bash build_clip_poc.sh
ASAN_OPTIONS=detect_leaks=0:abort_on_error=1 ./clip_poc poc_mmproj.gguf
python3 gen_clip_poc.py --variant image_mean -o poc_image_mean.gguf
ASAN_OPTIONS=detect_leaks=0:abort_on_error=1 ./clip_poc poc_image_mean.gguf
```

## Safety

Sanitizer-reported out-of-bounds reads under ASan. Isolated test environments only.