Zen5 Nano (edge / on-device)
Collection
Multimodal edge tier, 0.8B / 2B / 4B / 9B. • 4 items • Updated
Edge / mobile / on-device tier of the Zen5 family. Multimodal (image + text in → text out) dense model at the 2B-parameter scale, tuned for fast on-device inference on phones, laptops, and Raspberry Pi class hardware.
Part of the Zen5 nano sub-family — pick by hardware budget:
| SKU | Params | Fit |
|---|---|---|
zen5-nano-0.8B |
0.9B | Raspberry Pi 5 / phone CPU / browser WASM |
zen5-nano-2B |
2B | low-end iGPU / 8 GB RAM laptop |
zen5-nano-4B |
5B | 16 GB RAM laptop / mobile NPU |
zen5-nano-9B |
10B | 24 GB+ unified RAM / consumer GPU |
The larger tiers (zen5-flash, zen5, zen5-pro, zen5-max, zen5-coder) cover everything from agentic-default to frontier-quality multi-GPU.
GGUF mirror is staged. Until it lands here, the recommended path is the hosted zen5-nano-2B endpoint below, or any community 2B-class multimodal GGUF placed in a local model/ directory.
Hosted via the Hanzo gateway (api.hanzo.ai) as zen5-nano-2B.
Local with llama.cpp or any GGUF runtime:
llama-cli -m model.gguf --mmproj mmproj.gguf -p "What is shown in this image?"