LocateAnything-3B: iMatrix GGUF

iMatrix GGUF quantizations of nvidia/LocateAnything-3B, the first GGUF available for this model.

LocateAnything-3B is NVIDIA's 3B visual grounding model: it locates and identifies objects in images given natural language descriptions. It is designed for on-device deployment and robotics applications.

These GGUFs use importance matrix (iMatrix) calibration on 2M tokens of wikitext-103. iMatrix runs calibration text through the model, records which weights see the largest activations, and steers quantization so that rounding error falls on the weights that matter least. The result is noticeably better coherence at Q2/Q3/Q4 with the same file size.


Quick Start

llama.cpp

llama-cli -hf liodon-ai/LocateAnything-3B-imatrix-GGUF:Q4_K_M

LM Studio / Jan

Search liodon-ai/LocateAnything-3B-imatrix-GGUF and pick your quant.


Available Quants

Quant     Size      VRAM     Notes
IQ2_M     1.28 GB   2 GB     ultra-tiny + iMatrix, better than standard Q2
IQ3_M     1.62 GB   2.5 GB   tiny + iMatrix, sharper than standard Q3
IQ4_XS    1.91 GB   3 GB     small + iMatrix, rivals Q5 at Q4 size
Q2_K      1.38 GB   2 GB     tiniest standard, runs almost anywhere, iMatrix-improved
Q3_K_M    1.73 GB   2.5 GB   great for 4GB VRAM, iMatrix-improved
Q4_K_M    2.11 GB   3 GB     sweet spot (recommended), iMatrix-improved
Q5_K_M    2.44 GB   4 GB     high quality, iMatrix-improved
Q6_K      2.80 GB   4 GB     near-lossless, iMatrix-improved
Q8_0      3.62 GB   5 GB     basically full quality
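
As a rough aid for reading the table, the sketch below picks a quant for a given VRAM budget. The sizes and VRAM figures are copied from the table above; the helper name pick_quant and the rule "largest file that fits is the best choice" are assumptions made for illustration, not a recommendation from the model authors.

```python
# (name, file size in GB, approximate VRAM in GB), copied from the table above.
QUANTS = [
    ("IQ2_M", 1.28, 2.0),
    ("Q2_K", 1.38, 2.0),
    ("IQ3_M", 1.62, 2.5),
    ("Q3_K_M", 1.73, 2.5),
    ("IQ4_XS", 1.91, 3.0),
    ("Q4_K_M", 2.11, 3.0),
    ("Q5_K_M", 2.44, 4.0),
    ("Q6_K", 2.80, 4.0),
    ("Q8_0", 3.62, 5.0),
]


def pick_quant(vram_gb):
    """Return the largest quant whose listed VRAM figure fits, or None.

    Assumption: a larger file is treated as a proxy for higher quality.
    """
    fitting = [q for q in QUANTS if q[2] <= vram_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q[1])[0]


if __name__ == "__main__":
    for budget in (2, 3, 4, 8):
        print(budget, "GB ->", pick_quant(budget))
```

The listed VRAM figures are approximate, so leave headroom for context length and for any other process sharing the GPU.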

What is iMatrix?

Standard quantization treats all weights as equally important when rounding. iMatrix:

  1. Runs calibration text through the full-precision model
  2. Measures which weights see the largest activations (the "importance matrix")
  3. Chooses quantized values so that rounding error is smallest on the important weights

Same file size. Better output. Most noticeable at Q2/Q3/Q4.
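
The idea can be sketched with a toy block quantizer. This is a simplified illustration and not llama.cpp's actual code: the function names, the single per-block scale, and the brute-force scale search are all assumptions. It shows the core mechanism of choosing the scale that minimises an importance-weighted squared error rather than a plain one.

```python
def weighted_error(weights, importance, scale, q):
    """Importance-weighted squared reconstruction error for one block."""
    return sum(i * (w - scale * qi) ** 2 for w, i, qi in zip(weights, importance, q))


def quantize_block(weights, importance, bits=4, steps=200):
    """Pick one scale for the block by minimising the weighted error.

    Returns (scale, list of integer codes). Toy model: symmetric integer
    codes in [-qmax, qmax] and a brute-force search over candidate scales.
    """
    qmax = 2 ** (bits - 1) - 1
    amax = max(abs(w) for w in weights)
    if amax == 0:
        return 0.0, [0] * len(weights)
    best = None
    for k in range(1, steps + 1):
        # Candidate scales from roughly 0.5x to 1.5x the naive amax/qmax scale.
        scale = (amax / qmax) * (0.5 + k / steps)
        q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
        err = weighted_error(weights, importance, scale, q)
        if best is None or err < best[0]:
            best = (err, scale, q)
    return best[1], best[2]


# Hypothetical block of weights; two of them are marked as far more important.
block = [0.9, -0.31, 0.12, 0.05, -0.77, 0.4, 0.02, -0.18]
uniform = [1.0] * len(block)
importance = [0.1, 8.0, 0.1, 0.1, 0.1, 6.0, 0.1, 0.1]

s_plain, q_plain = quantize_block(block, uniform, bits=3)
s_imat, q_imat = quantize_block(block, importance, bits=3)

# Judge both results by the error on the weights that matter.
err_plain = weighted_error(block, importance, s_plain, q_plain)
err_imat = weighted_error(block, importance, s_imat, q_imat)
print(err_imat <= err_plain)
```

Both runs use the same number of bits per weight, so the output size is identical. Only the choice of scale, and therefore where the rounding error lands, differs.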


Calibration

Importance matrix computed from 2M tokens of wikitext-103 (128 calibration chunks).
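
To give a feel for what "computing an importance matrix" involves, here is a minimal sketch. It assumes importance is the mean squared activation seen by each input column of a weight matrix during calibration. The function name and the tiny activation rows are hypothetical, and the real tool works per tensor over the whole calibration set.

```python
def accumulate_importance(activation_rows):
    """Mean squared activation per input column over all calibration rows.

    Each row is the activation vector fed into one weight matrix for one
    calibration token (hypothetical toy data, not real model activations).
    """
    n_cols = len(activation_rows[0])
    sums = [0.0] * n_cols
    for row in activation_rows:
        for j, a in enumerate(row):
            sums[j] += a * a
    return [s / len(activation_rows) for s in sums]


# Two calibration tokens, two input columns.
rows = [[1.0, 0.0], [3.0, 2.0]]
print(accumulate_importance(rows))  # [5.0, 2.0]
```

Columns that consistently receive large activations end up with large importance values, so errors in the weights that multiply them cost more during quantization.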


Source Model

  • Original: nvidia/LocateAnything-3B
  • Architecture: 3B visual grounding model
  • Strengths: Object localization, visual grounding, on-device robotics
  • License: NVIDIA Open Model License