How to use from
Hermes Agent
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf batiai/Qwen3.5-27B-GGUF:IQ4_XS
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default batiai/Qwen3.5-27B-GGUF:IQ4_XS
Run Hermes
hermes
Quick Links

Qwen 3.5 27B GGUF โ€” Quantized by BatiAI

BatiFlow Ollama

IQ4_XS quantization of Qwen/Qwen3.5-27B for on-device AI on Mac. Built and verified by BatiAI for BatiFlow.

Quick Start

ollama pull batiai/qwen3.5-27b:iq4

Available Quantizations

Quant Size VRAM M4 Max (128GB) Recommended For
IQ4_XS 14GB 28GB 17.0 t/s 32GB+ Mac

Benchmarks โ€” M4 Max (128GB)

Metric IQ4_XS
Token generation 17.0 t/s
Korean โœ…
Tool call JSON โœ…
VRAM 28 GB

vs Other Qwen 3.5 Models

Model Size VRAM Speed Min Mac
batiai/qwen3.5-9b:q4 5.2GB ~8GB 12.5 t/s 16GB
batiai/qwen3.5-27b:iq4 14GB 28GB 17.0 t/s 32GB
batiai/qwen3.5-35b:iq4 17GB 23GB 26.6 t/s 36GB

For 36GB+ Mac, consider batiai/qwen3.5-35b โ€” MoE architecture, faster and less VRAM.

Technical Details

  • Original Model: Qwen/Qwen3.5-27B
  • Architecture: Hybrid (Gated DeltaNet + GQA + MoE)
  • Context Window: 262K tokens
  • License: Apache 2.0
  • Quantized with: llama.cpp (build 400ac8e)

About BatiFlow

BatiFlow โ€” free, on-device AI automation for Mac. 5MB app, 100% local, unlimited.

License

Quantized from Qwen/Qwen3.5-27B. License: Apache 2.0.

Benchmarks

Machine Quant Cold start Prompt eval Token gen Tested
MacBook Pro M4 Max 128GB IQ4_XS 4.827s 82.37 t/s 11.71 t/s 2026-05-03
Downloads last month
20
GGUF
Model size
27B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for batiai/Qwen3.5-27B-GGUF

Base model

Qwen/Qwen3.5-27B
Quantized
(210)
this model

Collection including batiai/Qwen3.5-27B-GGUF