jedisct1/Qwen-AgentWorld-35B-A3B-oQ8-MLX

This is an oMLX MLX quantization of Qwen/Qwen-AgentWorld-35B-A3B.

The goal of this package is practical local agent use on Apple Silicon. Tool calling was treated as the release gate, so the chat template is kept both as chat_template.jinja and embedded in tokenizer_config.json, with tool_parser_type set to qwen3_coder.

Quantization

Quantization: oQ8 (8-bit affine, mixed precision)
Global bits: 8
Group size: 64
Mode: affine
Safetensor shards: 8
Tensor count: 1677
Total safetensor size: 36.81 GB

The upstream config declares an MTP layer, but the upstream checkpoint published for Qwen/Qwen-AgentWorld-35B-A3B does not include mtp.* tensors. These artifacts therefore publish a self-consistent non-MTP config instead of advertising a missing draft head.

Compatibility

Expected local targets:

oMLX 0.4.4 or newer
LM Studio with MLX model loading and OpenAI-compatible tool calls

Use greedy decoding for strict tool use and eval runs:

{"temperature": 0, "top_p": 1}

Tool-Calling Verification

Swival core tool suite: 5/5 passed on 2026-06-24 with deterministic greedy decoding.

Swival all-built-ins suite: 5/5 passed on 2026-06-24 with deterministic greedy decoding.

The Swival suites exercise real file, edit, command, planning, checklist, snapshot, grep, outline, URL fetch, batched reads, and shell-tool dispatch through an OpenAI-compatible server. A direct /v1/chat/completions smoke also checks that the server returns structured tool_calls, not just plain XML text.

Downloads last month: 414

Safetensors

Model size

10B params

Tensor type

BF16

U32

MLX

Hardware compatibility

8-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for jedisct1/Qwen-AgentWorld-35B-A3B-oQ8-MLX

Base model

Qwen/Qwen3.5-35B-A3B-Base

Finetuned

Qwen/Qwen-AgentWorld-35B-A3B

Quantized

(31)

this model

Collection including jedisct1/Qwen-AgentWorld-35B-A3B-oQ8-MLX

Qwen-Agentworld

Collection

4 items • Updated 3 days ago