---
license: apache-2.0
base_model:
  - Qwen/Qwen3.6-27B
tags:
  - qwen3.6
  - gguf
  - tq3_4s
  - turboquant
  - vision
  - multimodal
pipeline_tag: image-text-to-text
language:
  - en
  - zh
  - multilingual
---

# Qwen3.6-27B-TQ3_4S

<img width="400px" src="https://qianwen-res.oss-accelerate.aliyuncs.com/Qwen3.6/logo.png">

[![Qwen Chat](https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5)](https://chat.qwen.ai)

> [!Note]
> This repository contains model weights and configuration files for the post-trained model in GGUF format.
>
> These artifacts are intended for llama.cpp-style runtimes and other GGUF-compatible inference stacks.

Following the February release of the Qwen3.5 series, we're pleased to share a `TQ3_4S` GGUF release of Qwen3.6-27B. It is built on the upstream Qwen3.6-27B model, converted through the `unsloth/Qwen3.6-27B-GGUF` path, and aims for efficient local inference while preserving the stability and real-world coding utility of the base model.

## Qwen3.6 Highlights

This release delivers substantial upgrades, particularly in:

- **Agentic Coding:** The model now handles frontend workflows and repository-level reasoning with greater fluency and precision.
- **Thinking Preservation:** Qwen introduced the option to retain reasoning context from historical messages, reducing overhead during iterative work.

![Benchmark Results](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen3.6/Figures/qwen3.6_27b_score.png)

For the full upstream write-up, see the Qwen blog post: [Qwen3.6-27B](https://qwen.ai/blog?id=qwen3.6-27b).

## Model Overview

- Type: Causal Language Model with Vision Encoder
- Training Stage: Pre-training and Post-training
- Architecture: `qwen35`
- Parameters: `27B`
- Layers: `64`
- Embedding dimension: `5120`
- FFN dimension: `17408`
- Hidden layout: `16 × (3 × (Gated DeltaNet -> FFN) -> 1 × (Gated Attention -> FFN))`
- Gated DeltaNet heads: `48` for `V`, `16` for `QK`, head dim `128`
- Gated Attention heads: `24` for `Q`, `4` for `KV`, head dim `256`
- RoPE dim: `64`
- Native context: `262,144`
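
As a quick consistency check on the figures above, the hidden layout can be expanded into per-type layer counts. This is only arithmetic on the numbers listed in this section; the variable names are illustrative and do not correspond to GGUF metadata keys.

```bash
#!/usr/bin/env bash
# Expand "16 x (3 x DeltaNet + 1 x Attention)" into layer counts.
blocks=16             # repeated macro-blocks
deltanet_per_block=3  # Gated DeltaNet -> FFN layers per block
attn_per_block=1      # Gated Attention -> FFN layers per block

deltanet_layers=$(( blocks * deltanet_per_block ))
attn_layers=$(( blocks * attn_per_block ))
total_layers=$(( deltanet_layers + attn_layers ))

q_heads=24
kv_heads=4
gqa_ratio=$(( q_heads / kv_heads ))  # query heads sharing each KV head

echo "DeltaNet layers:     $deltanet_layers"
echo "Attention layers:    $attn_layers"
echo "Total layers:        $total_layers"
echo "Q heads per KV head: $gqa_ratio"
```

The total matches the `64` layers listed above, and only a quarter of the layers are Gated Attention layers.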

## Benchmark Results

For the full upstream benchmark tables, refer to the official Qwen model card:

- [Qwen/Qwen3.6-27B](https://huggingface.co/Qwen/Qwen3.6-27B)

Selected upstream headline results for the base model:

- `SWE-bench Verified`: `77.2`
- `Terminal-Bench 2.0`: `59.3`
- `SkillsBench Avg5`: `48.2`
- `GPQA Diamond`: `87.8`
- `AIME26`: `94.1`
- `MMMU`: `82.9`
- `AndroidWorld`: `70.3`

These are upstream base-model results, not local GGUF quant results.

## TQ3_4S Release

This repository packages the model as a TurboQuant `TQ3_4S` GGUF for local deployment.

## Files

| File | Quant | Size |
| --- | --- | ---: |
| `Qwen3.6-27B-TQ3_4S.gguf` | TQ3_4S | ~13.0 GB |
| `chat_template.jinja` | chat template | text |
| `thumbnail.png` | model card image | png |
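
For a rough sense of the compression level, the file size can be converted to an average number of bits per weight. The sketch below assumes that "GB" means 10^9 bytes and that the model has exactly 27e9 parameters; the file also carries metadata and may keep some tensors at higher precision, so treat the result as an approximation rather than a property of the `TQ3_4S` format.

```bash
#!/usr/bin/env bash
file_gb=13.0  # approximate size from the table (assumed to be 10^9-byte GB)
params_b=27   # nominal parameter count in billions (assumed exactly 27e9)

# bits per weight = (bytes * 8) / parameters; the 10^9 factors cancel out.
bpw=$(awk -v gb="$file_gb" -v p="$params_b" 'BEGIN { printf "%.2f", gb * 8 / p }')

echo "approx. bits per weight: $bpw"
```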

## Local Validation

Hardware:

- RTX 5060 Ti 16 GB

Perplexity run:

- `llama-perplexity --chunks 10 -c 2048`
- `PPL = 6.2452 +/- 0.16138`
- `prompt eval = 712.02 tok/s`
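
A run along these lines could be reproduced with a small wrapper such as the one below. The evaluation text used for the numbers above is not named in this card, so `EVAL_FILE` is a hypothetical placeholder, and `-ngl 99` is borrowed from the example later in this card rather than from the validation run itself.

```bash
#!/usr/bin/env bash
MODEL="Qwen3.6-27B-TQ3_4S.gguf"
EVAL_FILE="${EVAL_FILE:-eval.txt}"  # hypothetical: supply your own evaluation text

chunks=10
ctx=2048
cmd=(llama-perplexity -m "$MODEL" -f "$EVAL_FILE" --chunks "$chunks" -c "$ctx" -ngl 99)

# Each chunk is one context window, so this many tokens are scored in total.
tokens=$(( chunks * ctx ))

echo "command: ${cmd[*]}"
echo "tokens evaluated: $tokens"

# Run only when the binary and both input files are actually present.
if command -v llama-perplexity >/dev/null 2>&1 && [ -f "$MODEL" ] && [ -f "$EVAL_FILE" ]; then
  "${cmd[@]}"
fi
```

With only 10 chunks the sample is small, which is consistent with the fairly wide `+/-` interval reported above.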

## Runtime Notes

- Use a TurboQuant-capable llama.cpp build for best performance.
- The upstream family is multimodal-capable, but the public 27B repos used here do not currently expose a separate GGUF `mmproj` artifact.
- For llama.cpp chat usage, keep `--jinja` enabled so the bundled chat template is honored.
- Upstream guidance recommends keeping at least `128K` context when possible for reasoning-heavy workloads. On smaller local GPUs, reduce context as needed to fit memory.
- Upstream default sampling guidance differs between thinking and non-thinking mode; follow the official Qwen card if you are trying to reproduce base-model behavior.
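
To judge how far the context needs to be reduced, a back-of-the-envelope KV cache estimate can help. The sketch below assumes an f16 K/V cache, treats `128K` as 131,072 tokens, and assumes that only the 16 Gated Attention layers store per-token keys and values while the Gated DeltaNet layers keep a fixed-size state. Actual memory use depends on the runtime, its cache type settings, and compute buffers.

```bash
#!/usr/bin/env bash
attn_layers=16    # 16 blocks x 1 Gated Attention layer each
kv_heads=4
head_dim=256
bytes_per_elem=2  # assumption: f16 K and V cache

# Bytes of K plus V stored per token across all attention layers.
per_token=$(( attn_layers * 2 * kv_heads * head_dim * bytes_per_elem ))

kv_mib() {  # usage: kv_mib <context tokens>
  echo $(( per_token * $1 / 1024 / 1024 ))
}

for ctx in 4096 131072 262144; do
  echo "ctx=$ctx -> ~$(kv_mib "$ctx") MiB of KV cache"
done
```

Under these assumptions, a `128K` context needs roughly 8 GiB of cache on top of the ~13 GB of weights, which would likely not fit on a 16 GB card, while a `4096` context needs about 256 MiB.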

## Example

```bash
llama-cli \
  -m Qwen3.6-27B-TQ3_4S.gguf \
  --jinja \
  -ngl 99 \
  -c 4096
```

## Sources

- Upstream base model: [Qwen/Qwen3.6-27B](https://huggingface.co/Qwen/Qwen3.6-27B)
- Upstream GGUF source used for conversion: [unsloth/Qwen3.6-27B-GGUF](https://huggingface.co/unsloth/Qwen3.6-27B-GGUF)
- Upstream blog post: [Qwen3.6-27B](https://qwen.ai/blog?id=qwen3.6-27b)
- Upstream benchmark context: [Qwen3.6-27B model card](https://huggingface.co/Qwen/Qwen3.6-27B)