---
title: Qwen-AgentWorld-35B-A3B
emoji: 🌍
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: "5.9.1"
app_file: app.py
python_version: "3.12"
pinned: false
license: apache-2.0
short_description: Free ZeroGPU demo of Qwen-AgentWorld-35B-A3B (4-bit)
---

# Qwen-AgentWorld-35B-A3B — ZeroGPU Space

Free GPU demo of [`Qwen/Qwen-AgentWorld-35B-A3B`](https://hf.co/Qwen/Qwen-AgentWorld-35B-A3B)
running on **Hugging Face ZeroGPU**. The 35B MoE is loaded in **4-bit (nf4)**
precision so that its weights fit in a ZeroGPU slot.
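
To see why 4-bit loading matters, here is a rough back-of-the-envelope estimate of weight storage at different precisions. It is only a sketch: the 35B total parameter count is read off the model name, and the estimate ignores quantization metadata, activations, and the KV cache. Note that for an MoE model all experts have to be resident in memory, even though only a fraction of the parameters is active per token.

```python
def weight_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB (weights only, no runtime overhead)."""
    return n_params * bits_per_param / 8 / 2**30


# Assumption: ~35B total parameters, taken from the "35B" in the model name.
N_PARAMS = 35e9

for label, bits in [("bf16", 16), ("int8", 8), ("nf4", 4)]:
    print(f"{label}: ~{weight_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions the weights take roughly 65 GiB in bf16 and roughly 16 GiB in 4-bit, which is the headroom the quantized load buys.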

## Why this is "free"
- ZeroGPU compute is not billed per use; an **HF Pro** account gets a **larger daily quota** than a free account.
- No always-on server and no per-hour billing (unlike Inference Endpoints).

## Deploy
1. Create a new Space → SDK **Gradio**.
2. In **Settings → Hardware**, select **ZeroGPU** (no extra charge with a Pro account).
3. Push `app.py`, `requirements.txt`, and this `README.md`.

Or push from the CLI (see `push_space.py` in this folder).
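
For orientation, a push script along these lines could be built on `huggingface_hub`. This is a minimal sketch and not the contents of `push_space.py`: the names `REQUIRED`, `missing_files`, and `push_space` are hypothetical, and it assumes you are already logged in (for example via `huggingface-cli login`). Hardware is still selected in **Settings → Hardware** as in step 2.

```python
from pathlib import Path

# The three files listed in the Deploy steps.
REQUIRED = ("app.py", "requirements.txt", "README.md")


def missing_files(folder: str) -> list[str]:
    """Return the required files that are absent from `folder`."""
    root = Path(folder)
    return [name for name in REQUIRED if not (root / name).is_file()]


def push_space(repo_id: str, folder: str = ".") -> None:
    """Create the Space if needed and upload the required files (hypothetical helper)."""
    missing = missing_files(folder)
    if missing:
        raise FileNotFoundError(f"missing required files: {missing}")

    # Imported lazily so the file check above works without the dependency.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(
        repo_id=repo_id, repo_type="space", space_sdk="gradio", exist_ok=True
    )
    api.upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type="space",
        allow_patterns=list(REQUIRED),
    )
```

Calling `push_space("your-username/your-space")` from the folder containing the three files would then create or update the Space; the repo id here is a placeholder.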

## Notes
- The `size` and `duration` settings are tuned in `app.py`. Lowering `max_new_tokens` shortens each call, so it uses less quota.
- ZeroGPU's backing GPU and per-slot VRAM change over time. If the 4-bit model ever
  stops fitting, switch `MODEL_ID` to a pre-quantized mirror.
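
As an illustration of the first note, the sketch below shows one way a GPU-decorated function might cap `max_new_tokens`. It is not the actual `app.py`: the 120-second `duration`, the cap of 1024, and the names `gpu`, `clamp_new_tokens`, and `generate` are assumptions, and the model call is replaced by a placeholder string. The fallback decorator only exists so the snippet also runs where the `spaces` package is not installed.

```python
try:
    import spaces  # available on Hugging Face Spaces

    # Assumption: a 120 s GPU budget per call; the real value lives in app.py.
    gpu = spaces.GPU(duration=120)
except ImportError:

    def gpu(fn):
        """No-op stand-in so the sketch runs outside a Space."""
        return fn


MAX_NEW_TOKENS_CAP = 1024  # hypothetical upper bound


def clamp_new_tokens(requested: int, cap: int = MAX_NEW_TOKENS_CAP) -> int:
    """Keep the requested token budget within [1, cap]."""
    return max(1, min(int(requested), cap))


@gpu
def generate(prompt: str, max_new_tokens: int = 256) -> str:
    n = clamp_new_tokens(max_new_tokens)
    # Placeholder: a real app would call model.generate(..., max_new_tokens=n) here.
    return f"[would generate up to {n} tokens for: {prompt!r}]"
```

A shorter token budget means less GPU time per request, which is what the quota is charged against.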