Zen5 Nano 2B

Edge / mobile / on-device tier of the Zen5 family. Multimodal (image + text in → text out) dense model at the 2B-parameter scale, tuned for fast on-device inference on phones, laptops, and Raspberry Pi class hardware.

Part of the Zen5 nano sub-family — pick by hardware budget:

SKU	Params	Fit
`zen5-nano-0.8B`	0.9B	Raspberry Pi 5 / phone CPU / browser WASM
`zen5-nano-2B`	2B	low-end iGPU / 8 GB RAM laptop
`zen5-nano-4B`	5B	16 GB RAM laptop / mobile NPU
`zen5-nano-9B`	10B	24 GB+ unified RAM / consumer GPU

The larger tiers (zen5-flash, zen5, zen5-pro, zen5-max, zen5-coder) cover everything from agentic-default to frontier-quality multi-GPU.

Weights

GGUF mirror is staged. Until it lands here, the recommended path is the hosted zen5-nano-2B endpoint below, or any community 2B-class multimodal GGUF placed in a local model/ directory.

Run

Hosted via the Hanzo gateway (api.hanzo.ai) as zen5-nano-2B.

Local with llama.cpp or any GGUF runtime:

llama-cli -m model.gguf --mmproj mmproj.gguf -p "What is shown in this image?"

Downloads last month: 84

Safetensors

Model size

2B params

Tensor type

F32

BF16

Collection including zenlm/zen-5-nano-2B-gguf

Zen5 Nano (edge / on-device)

Collection

Multimodal edge tier, 0.8B / 2B / 4B / 9B. • 4 items • Updated 28 days ago