---
license: apache-2.0
base_model: Qwen/Qwen3.6-35B-A3B-FP8
pipeline_tag: image-text-to-text
library_name: transformers
tags:
  - qwen
  - qwen-3.6
  - moe
  - fp8
  - reference-card
---

# Qwen3.6-35B-A3B-FP8

## Summary

Reference wrapper around [Qwen/Qwen3.6-35B-A3B-FP8](https://huggingface.co/Qwen/Qwen3.6-35B-A3B-FP8) — the official FP8 release. **This repository carries no weights**; it exists only to anchor the FP8 variant inside the `majentik/*` family navigation.

## Why this variant

Pick this for Hopper / Ada / Blackwell GPUs, where FP8 is natively supported, when you want the closest-to-bf16 fidelity at roughly half the weight memory (1 byte per parameter instead of 2). For additional compression, pick one of the 4-bit variants listed under Family.

## Hardware compatibility

| Device | VRAM | Recommendation |
| --- | --- | --- |
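| H100 / H200 | 80–141 GB | native FP8; weights fit on a single GPU |
| RTX 4090 | 24 GB | FP8 weights (~35 GB) do not fit; use a 4-bit variant |
| RTX 5090 | 32 GB | native FP8 support, but ~35 GB of weights exceed 32 GB; use a 4-bit variant |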
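
The table can be sanity-checked with a weight-only estimate. The sketch below is a rough approximation, not a measurement: it assumes 35 billion total parameters (read off the model name), 2 / 1 / 0.5 bytes per parameter for bf16 / FP8 / 4-bit, and it ignores KV cache, activations, and any layers the upstream release keeps in higher precision. The helper names `weight_gb` and `fits` are illustrative only.

```python
# Weight-only footprint estimate. Real usage is higher (KV cache, activations,
# runtime overhead), so treat "fits" as a necessary condition, not a guarantee.
TOTAL_PARAMS_B = 35  # assumption: total parameters in billions, from the "35B" in the name

# Assumed storage cost per parameter for each format.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "4-bit": 0.5}

# VRAM per device in GB, taken from the hardware table.
DEVICES_GB = {"H100": 80, "H200": 141, "RTX 4090": 24, "RTX 5090": 32}


def weight_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param


def fits(device_gb: float, params_billion: float, bytes_per_param: float) -> bool:
    """True if the weights alone are smaller than the device's VRAM."""
    return weight_gb(params_billion, bytes_per_param) < device_gb


if __name__ == "__main__":
    for fmt, bpp in BYTES_PER_PARAM.items():
        size = weight_gb(TOTAL_PARAMS_B, bpp)
        ok = [name for name, gb in DEVICES_GB.items() if fits(gb, TOTAL_PARAMS_B, bpp)]
        print(f"{fmt}: ~{size:.1f} GB of weights; fit on: {', '.join(ok) or 'none'}")
```

Under these assumptions the FP8 weights come to about 35 GB, half the bf16 figure, which is why the 24 GB and 32 GB cards point toward the 4-bit variants.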

## Reproduce

```bash
# No re-quantization needed — use the upstream weights directly.
huggingface-cli download Qwen/Qwen3.6-35B-A3B-FP8
```

## Evaluation

_Benchmarks pending — populated after the eval-harness workstream lands._

## Family

- **bf16** — [Qwen/Qwen3.6-35B-A3B](https://huggingface.co/Qwen/Qwen3.6-35B-A3B)
- **FP8 (this)** — [Qwen/Qwen3.6-35B-A3B-FP8](https://huggingface.co/Qwen/Qwen3.6-35B-A3B-FP8)
- **RotorQuant family** — [majentik/Qwen3.6-35B-A3B-RotorQuant](https://huggingface.co/majentik/Qwen3.6-35B-A3B-RotorQuant)
- **TurboQuant family** — [majentik/Qwen3.6-35B-A3B-TurboQuant](https://huggingface.co/majentik/Qwen3.6-35B-A3B-TurboQuant)

## Provenance

Card-only. No weights stored.

## License

Released under `apache-2.0`. The upstream license of the base model applies.