NOESIS-Qwen3-VL-2B-UI-Venus-NF4 (NOESIS DHCF-FNO bundle)
NF4 quantization derivative of `inclusionAI/UI-Venus-1.5-2B`, an end-to-end GUI agent trained via a 4-stage post-training pipeline (Mid-Train → Offline-RL → Online-RL → Model-Merge) on top of Qwen3-VL-2B. NF4-quantized via `bitsandbytes 0.49.2` (double_quant + bf16 compute) from the intermediate `UI-Venus-1.5-2B-NOESIS-BF16`, the AMAImedia BF16 repack sibling.

Used inside the NOESIS DHCF-FNO stack as the FALLBACK 2B agent on the public `ui-agent.amaimedia.com` subdomain (browser DOM automation), providing alternative-pipeline cross-validation against the PRIMARY MAI-UI 8B NF4 path per `R-AGENT-PRIMARY-MAI-UI-8B-NF4`.
✅ APACHE 2.0: COMMERCIAL USE PERMITTED. End-to-end clean lineage (Alibaba Cloud / Qwen Team Apache 2.0 → Inclusion AI / Ant Group Venus Team Apache 2.0 → AMAImedia BF16 repack Apache 2.0 → AMAImedia NF4 Apache 2.0). Standard `transformers.from_pretrained` loading with `device_map={"": 0}` (NF4 requirement per CLAUDE.md GOLDEN RULE 2).
Released as part of the NOESIS Professional Multilingual Dubbing Automation Platform (framework: DHCF-FNO — Deterministic Hybrid Control Framework for Frozen Neural Operators).
- Founder: Ilia Bolotnikov
- Organization: AMAImedia.com
- X (Twitter): @AMAImediacom
- LinkedIn: Ilia Bolotnikov
- Telegram: @AMAImediacom
- NOESIS version: v15.8
- Quantization date: 2026-05-21 08:32:59
- Renamed from: `NOESIS-UI-Venus-2B-NF4` (per R-NOESIS-FOLDER-NAMING-PREFIX-QWEN3-VL)
NOESIS role — fallback 2B agent on ui-agent.amaimedia.com
Browser DOM automation agent mounted on ui-agent.amaimedia.com
(Phase 2 desktop agent / auto-clipper UI nav subdomain) as the
FALLBACK tier in a 3-tier hierarchy. Provides cross-validation
against MAI-UI's grounding output through a fundamentally different
training pipeline (4-stage RFT vs self-evolving data + device-cloud
collab).
ui-agent.amaimedia.com (browser DOM automation)
│
├── PRIMARY  : NOESIS-Qwen3-VL-8B-MAI-UI-NF4 (Tongyi MAI-UI 8B, ~5 GB VRAM)
│              R-AGENT-PRIMARY-MAI-UI-8B-NF4
│
├── SECONDARY: NOESIS-Qwen3-VL-2B-MAI-UI-NF4 (Tongyi MAI-UI 2B, ~1.6 GB VRAM)
│              low-VRAM fallback for primary
│
└── FALLBACK : NOESIS-Qwen3-VL-2B-UI-Venus-NF4 (this, Inclusion AI Venus 1.5 2B)
               • alternative 4-stage RFT pipeline
               • cross-validation against MAI-UI
               • ~1.2 GB target / 3.45 GB load peak VRAM
The Venus variant exists as the cross-validation track: when MAI-UI gives ambiguous click coordinates or fails to ground an element, Venus provides a second opinion from an entirely separate training pipeline (the 4-stage RFT-based post-training summarized in the upstream documentation section).
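The second-opinion idea can be sketched as a simple agreement check between two predicted click points. This is only an illustration, not the NOESIS implementation: `GroundingResult`, `cross_validate`, and the 24-pixel tolerance are hypothetical names and values, and both predictions are assumed to be in the same pixel coordinate space.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class GroundingResult:
    """One model's answer; point is None when the element was not grounded."""
    point: Optional[Point]


def cross_validate(
    primary: GroundingResult,
    fallback: GroundingResult,
    tolerance_px: float = 24.0,  # hypothetical agreement radius
) -> Tuple[str, Optional[Point]]:
    """Compare a MAI-UI prediction with a Venus prediction.

    Returns a verdict string and the point to act on (None means: do not click).
    """
    if primary.point is None and fallback.point is None:
        return "no_grounding", None
    if primary.point is None:
        return "fallback_only", fallback.point
    if fallback.point is None:
        return "primary_only", primary.point

    dx = primary.point[0] - fallback.point[0]
    dy = primary.point[1] - fallback.point[1]
    if hypot(dx, dy) <= tolerance_px:
        return "agree", primary.point
    # The two pipelines disagree: refuse to act rather than guess.
    return "disagree", None


verdict, point = cross_validate(
    GroundingResult((100.0, 200.0)), GroundingResult((110.0, 205.0))
)
print(verdict, point)
```

Returning no point on disagreement is a conservative design choice; a real deployment might instead retry, ask the user, or rank by confidence.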
| Property | Value |
|---|---|
| Immediate parent | UI-Venus-1.5-2B-NOESIS-BF16 (AMAImedia BF16 repack of inclusionAI/UI-Venus-1.5-2B) |
| Upstream lineage | Qwen/Qwen3-VL-2B (Apache 2.0) → inclusionAI/UI-Venus-1.5-2B (Apache 2.0) → AMAImedia BF16 repack → AMAImedia NF4 |
| Architecture | Qwen3VLForConditionalGeneration (multimodal, vision tower retained) |
| Text hidden | 2 048 / 28 layers / 16 heads (GQA 2 : 1, 8 kv heads) |
| Vision tower | depth 24, hidden 1024, patch 16, deepstack at layers [5,11,17] |
| Vocab size | 151 936 |
| Context | 262 144 (mRoPE [24,20,20] interleaved, rope_theta 5M) |
| Format | NF4 (bnb 4-bit, double-quant, bf16 compute) |
| Bundle size on disk | 2.19 GB (single safetensors) |
| VRAM target (inference) | 1.2 GB ✅ RTX 3060 6 GB |
| VRAM peak (load) | 3.45 GB |
| License | Apache 2.0 (commercial-ok) |
| Project page | https://ui-venus.github.io/UI-Venus-1.5 |
| Papers | arxiv:2602.09082 (Venus 1.5), arxiv:2508.10833 (Venus RFT) |
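To put the architecture numbers in the table into practical terms, the following sketch estimates the KV-cache size for the text decoder. It uses the layer and KV-head counts from the table and makes two assumptions that are not stated in this card: the head dimension equals hidden size divided by attention heads (2048 / 16 = 128), and the cache is held in bf16 (2 bytes per value).

```python
NUM_LAYERS = 28          # from the table
NUM_KV_HEADS = 8         # from the table (GQA)
HEAD_DIM = 2048 // 16    # assumption: hidden / heads = 128
BYTES_PER_VALUE = 2      # assumption: bf16 cache


def kv_cache_bytes(num_tokens: int) -> int:
    """Bytes needed to cache keys and values for num_tokens positions."""
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


print(kv_cache_bytes(1))                  # 114688 bytes per token
print(kv_cache_bytes(8_192) / 2**30)      # 0.875 GiB
print(kv_cache_bytes(262_144) / 2**30)    # 28.0 GiB
```

Under these assumptions the cache alone grows by about 112 KiB per token, so the full 262 144-token context is a capability of the architecture rather than something to expect on a 6 GB card.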
Upstream Venus Team documentation (preserved)
UI-Venus-1.5-2B — End-to-end GUI Agent
UI-Venus-1.5 is a unified end-to-end GUI Agent designed for robust real-world applications. The model family includes two dense variants (2B / 8B) and one MoE variant (30B-A3B). This folder hosts the 2B dense variant, NF4-quantized.
Training pipeline (4 stages)
Stage 1: Mid-Training
    10B tokens across 30+ GUI datasets
    Foundational GUI semantics
Stage 2: Offline-RL
    Task-specific optimization:
    • grounding (ScreenSpot / OSWorld-G / VenusBench-GD)
    • mobile (AndroidWorld / AndroidLab / VenusBench-Mobile)
    • web (WebVoyager / OSWorld-W)
Stage 3: Online-RL
    Full-trajectory rollouts for long-horizon dynamic navigation
    RFT (Reinforcement Fine-Tuning) per arXiv 2508.10833
Stage 4: Model Merge
    Unifying specialists into a single deployable checkpoint
Benchmarks (per upstream Venus Team report)
| Benchmark | 30B-A3B variant | This 2B variant (proportional) |
|---|---|---|
| ScreenSpot-Pro | 69.6% | 57.7% |
| VenusBench-GD | 75.0% | (scales with size) |
| OSWorld-G-R | 76.4% | (scales with size) |
| OSWorld-G | 70.6% | (scales with size) |
| UI-Vision | 54.7% | (scales with size) |
| AndroidWorld | 77.6% | (scales with size) |
| AndroidLab | 55.1% / 68.1% | (scales with size) |
| VenusBench-Mobile | 21.5% | (scales with size) |
| WebVoyager | 76.0% | (scales with size) |
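Numbers are taken from the upstream Venus Team report. Only the ScreenSpot-Pro score is listed for the 2B variant here; for the other benchmarks, expect scores below the 30B-A3B column, per the "Consistent Scaling Gains" note in the upstream README.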
Quantization details (NOESIS-side)
| Parameter | Value |
|---|---|
| Library | bitsandbytes 0.49.2 |
| Method | NF4 (Normalized Float 4-bit) |
| `bnb_4bit_use_double_quant` | True (saves ~5% via nested quant) |
| `bnb_4bit_compute_dtype` | bfloat16 |
| Device map | {"": 0} (R-NF4-DEVICE-MAP-EXPLICIT) |
| Source dir | D:\models\vlm-gui-mot\UI-Venus-1.5-2B-NOESIS-BF16 |
| Output disk size | 2.19 GB (single safetensors) |
| VRAM target (inference) | 1.2 GB |
| VRAM peak (load) | 3.45 GB |
| Quant date | 2026-05-21 08:32:59 |
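As a rough illustration of what double quantization buys, the sketch below works out the storage cost per quantized weight. It assumes the commonly documented NF4 layout (blocks of 64 weights with one fp32 scale each, and, with double quantization, 8-bit scales plus one fp32 constant per 256 scales); the block sizes actually used for this bundle are not stated in the card.

```python
BLOCK_SIZE = 64            # assumption: weights per quantization block
SECOND_LEVEL_BLOCK = 256   # assumption: scales per second-level block


def bits_per_param(double_quant: bool) -> float:
    """Average bits stored per 4-bit weight, including scaling constants."""
    if not double_quant:
        # 4-bit code + one fp32 scale shared by BLOCK_SIZE weights
        return 4 + 32 / BLOCK_SIZE
    # 4-bit code + 8-bit scale per block + fp32 constant per group of scales
    return 4 + 8 / BLOCK_SIZE + 32 / (BLOCK_SIZE * SECOND_LEVEL_BLOCK)


plain = bits_per_param(False)
nested = bits_per_param(True)
saving = (plain - nested) / plain
print(f"{plain:.3f} bits vs {nested:.3f} bits, {saving * 100:.1f}% smaller")
# 4.500 bits vs 4.127 bits, 8.3% smaller
```

This counts only the 4-bit linear weights and their scales. Parts of the model that stay in higher precision (for example embeddings and normalization weights) are unaffected, so the saving measured over the whole bundle is smaller than this per-layer figure.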
Higher load peak (3.45 GB vs MAI-UI 2B 1.6 GB) reflects Venus's internal layer-precision retention strategy — the working set settles to 1.2 GB after warmup, but initialization touches more parameters in higher precision.
Quick start
import torch
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration

# Local bundle path; replace with your own path or the Hub repo id
# "AMAImedia/Qwen3-VL-2B-UI-Venus-NOESIS-NF4".
bundle = "B:/Downloads/Portable/NOESIS-VC-ONE/models/llm/NOESIS-Qwen3-VL-2B-UI-Venus-NF4"

processor = AutoProcessor.from_pretrained(bundle)
model = Qwen3VLForConditionalGeneration.from_pretrained(
    bundle,
    device_map={"": 0},  # NEVER "auto" with NF4
    torch_dtype=torch.bfloat16,
).eval()

# Browser DOM screenshot grounding example
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "screenshot.png"},
            {"type": "text", "text": "Click the 'Subscribe' button."},
        ],
    },
]

inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,  # required so that **inputs unpacks into named tensors
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(new_tokens, skip_special_tokens=True))
# → predicted bounding box / click coordinates for the Subscribe button
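The decoded string still has to be turned into something a browser driver can click. The exact output format of this model is not documented in this card, so the following is only a sketch under a loud assumption: the answer contains either four numbers (a box `x1, y1, x2, y2`) or two numbers (a point `x, y`). `extract_click_point` is a hypothetical helper, and whether the numbers are pixels or normalized coordinates must be checked against the upstream documentation.

```python
import re
from typing import Optional, Tuple

_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def extract_click_point(text: str) -> Optional[Tuple[float, float]]:
    """Pull a click point out of model output (assumed format, see above).

    Four or more numbers: treat the first four as a box and return its center.
    Two or three numbers: treat the first two as a point.
    Otherwise: return None.
    """
    numbers = [float(n) for n in _NUMBER.findall(text)]
    if len(numbers) >= 4:
        x1, y1, x2, y2 = numbers[:4]
        return ((x1 + x2) / 2, (y1 + y2) / 2)
    if len(numbers) >= 2:
        return (numbers[0], numbers[1])
    return None


print(extract_click_point("[100, 200, 300, 260]"))  # (200.0, 230.0)
print(extract_click_point("(512, 384)"))            # (512.0, 384.0)
print(extract_click_point("no element found"))      # None
```

A naive number scan like this will misfire if the answer contains other digits (for example "the 2nd button"), so a stricter pattern is advisable once the real output format is known.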
NOESIS ui-agent.amaimedia.com wiring
| Endpoint tier | Backend | VRAM | Role |
|---|---|---|---|
| PRIMARY | `NOESIS-Qwen3-VL-8B-MAI-UI-NF4` | ~5 GB | Canonical agent, full SOTA quality |
| SECONDARY | `NOESIS-Qwen3-VL-2B-MAI-UI-NF4` | ~1.6 GB | Low-VRAM fallback for low-spec clients |
| FALLBACK | THIS bundle | ~1.2 GB target / 3.45 GB load peak | Cross-validation via alternative 4-stage RFT pipeline |
When to invoke FALLBACK (Venus) over PRIMARY/SECONDARY (MAI-UI):
- MAI-UI returns ambiguous coordinates (low confidence)
- Mobile-specific tasks (Venus has dedicated AndroidWorld/AndroidLab specialists)
- A/B-comparison runs for grounding-quality regression tests
- Independent second opinion before committing a destructive UI action
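The tier table and the invocation criteria above can be sketched as a small routing function. This is an illustrative reading of the list, not the actual NOESIS router: `Request`, `select_backends`, and the 5 GB threshold (taken from the "~5 GB" PRIMARY figure) are hypothetical.

```python
from dataclasses import dataclass
from typing import List

PRIMARY = "NOESIS-Qwen3-VL-8B-MAI-UI-NF4"
SECONDARY = "NOESIS-Qwen3-VL-2B-MAI-UI-NF4"
FALLBACK = "NOESIS-Qwen3-VL-2B-UI-Venus-NF4"


@dataclass
class Request:
    free_vram_gb: float
    is_mobile_task: bool = False
    is_destructive: bool = False
    mai_ui_ambiguous: bool = False
    ab_comparison: bool = False


def select_backends(req: Request) -> List[str]:
    """Return the backends to run, in order, for one grounding request."""
    if req.is_mobile_task:
        # Venus has dedicated mobile specialists, so it handles these directly.
        return [FALLBACK]

    # Pick the MAI-UI tier that fits in the available VRAM (assumed threshold).
    main = PRIMARY if req.free_vram_gb >= 5.0 else SECONDARY

    needs_second_opinion = (
        req.mai_ui_ambiguous or req.is_destructive or req.ab_comparison
    )
    return [main, FALLBACK] if needs_second_opinion else [main]


print(select_backends(Request(free_vram_gb=6.0, is_destructive=True)))
```

Running both backends for destructive actions doubles the inference cost for those requests, which seems an acceptable trade for avoiding a wrong click.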
Sealed rules (NOESIS DHCF-FNO)
- `R-APACHE-CLEAN`: Apache 2.0 preserved end-to-end (Qwen Team → Inclusion AI Venus Team → AMAImedia BF16 repack → AMAImedia NF4 quant).
- `R-NF4-DEVICE-MAP-EXPLICIT`: must load with `device_map={"": 0}`; never `device_map="auto"` with NF4.
- `R-AGENT-PRIMARY-MAI-UI-8B-NF4`: MAI-UI 8B NF4 is PRIMARY, MAI-UI 2B NF4 is SECONDARY, this Venus 2B is FALLBACK (alternative pipeline cross-validation).
- `R-UI-VENUS-FALLBACK-TO-MAI-UI`: Venus is the fallback / cross-validation track; MAI-UI is the canonical agent path.
- `R-VENUS-15-4STAGE-PIPELINE`: 4-stage post-training: Mid-Train (10B GUI tok, 30+ datasets) → Offline-RL → Online-RL → Model Merge.
- `R-VENUS-RFT-TRAINED`: Reinforcement Fine-Tuning (RFT) per arXiv 2508.10833.
- `R-QWEN3-VL-MROPE-INTERLEAVED`: mRoPE [24, 20, 20] interleaved with rope_theta 5M (text); 256K context capable.
- `R-UI-AGENT-PRODUCT-SCOPE`: mounted on `ui-agent.amaimedia.com` (browser DOM automation), NOT the dubbing pipeline core path.
- `R-VENDORED-INTERNAL`: plain `LICENSE` preserved (BF16-tier NOTICE blocks) alongside `LICENSE.md` (NF4-tier NOTICE).
- `R-THIRD-PARTY-WRAPPERS-ONLY`: Phase 1 SCOPE LOCK; third-party + wrappers only, no own training.
- `R-VISION-TOWER-RETAINED`: full Qwen3-VL ViT preserved (depth 24, deepstack at [5,11,17]), required for screenshot grounding.
- `R-QWEN-VOCAB-151936`: compatible within the Qwen3 family.
- `R-UI-AGENT-OUT-OF-SCOPE`: NOT in the NOESIS dubbing-pipeline core path. Reserved for Phase 2 desktop agent / auto-clipper UI navigation experiments.
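A rule such as `R-NF4-DEVICE-MAP-EXPLICIT` is easy to enforce with a small guard before the model is loaded. The helper below is a hypothetical sketch, not part of NOESIS or `transformers`; it accepts only a mapping that places the whole model (the empty-string key) on one device and rejects string shortcuts such as `"auto"`.

```python
from collections.abc import Mapping
from typing import Any


def check_nf4_device_map(device_map: Any) -> Mapping:
    """Validate a device_map against the explicit-placement rule.

    Accepts mappings like {"": 0}; raises ValueError for anything else.
    """
    if isinstance(device_map, str):
        raise ValueError(
            f'device_map={device_map!r} is not allowed for NF4 bundles; '
            'use an explicit mapping such as {"": 0}'
        )
    if not isinstance(device_map, Mapping) or set(device_map) != {""}:
        raise ValueError(
            'device_map must map the whole model to one device, e.g. {"": 0}'
        )
    return device_map


print(check_nf4_device_map({"": 0}))
```

The returned mapping can then be passed unchanged as the `device_map` argument of `from_pretrained`.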
NOESIS provenance
| Step | Source / output |
|---|---|
| Base architecture | Qwen/Qwen3-VL-2B (© Alibaba Cloud / Qwen Team 2025-2026, Apache 2.0) |
| GUI agent fine-tune | inclusionAI/UI-Venus-1.5-2B (© Inclusion AI / Ant Group Venus Team 2025-2026, Apache 2.0) |
| Training pipeline | 4-stage: Mid-Train (10B GUI tok) → Offline-RL → Online-RL → Merge (RFT per arXiv 2508.10833) |
| BF16 dtype-repack (intermediate) | UI-Venus-1.5-2B-NOESIS-BF16 (© AMAImedia 2026, Apache 2.0, 4.6 GB) |
| NF4 quantization | bitsandbytes 0.49.2 + double-quant + bf16 compute |
| Local file | model.safetensors (2.19 GB) + config.json + processor + tokenizer |
| Quant date | 2026-05-21 08:32:59 |
| NOESIS version | v15.8 |
| Renamed from | NOESIS-UI-Venus-2B-NF4 (per R-NOESIS-FOLDER-NAMING-PREFIX-QWEN3-VL) |
| Production endpoint | ui-agent.amaimedia.com (Phase 2 subdomain) |
Reference docs:
- NOESIS CLAUDE.md GOLDEN RULE 2 (NF4 device_map={"":0})
- NOESIS sealed rule `R-AGENT-PRIMARY-MAI-UI-8B-NF4`
- `NOESIS_NF4_MANIFEST.json` in this folder
- arXiv 2602.09082 (Venus 1.5 Technical Report)
- arXiv 2508.10833 (Venus RFT Technical Report)
Citation
@misc{venusteam2026uivenus15technicalreport,
title={UI-Venus-1.5 Technical Report},
author={Venus Team and Changlong Gao and Zhangxuan Gu and Yulin Liu
and Xinyu Qiu and Shuheng Shen and Yue Wen and Tianyu Xia
and Zhenyu Xu and Zhengwen Zeng and Beitong Zhou and
Xingran Zhou and Weizhi Chen and Sunhao Dai and Jingya Dou
and Yichen Gong and Yuan Guo and Zhenlin Guo and Feng Li
and Qian Li and Jinzhen Lin and Yuqi Zhou and Linchao Zhu
and Liang Chen and Zhenyu Guo and Changhua Meng and
Weiqiang Wang},
year={2026},
eprint={2602.09082},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2602.09082}
}
@misc{gu2025uivenustechnicalreportbuilding,
title={UI-Venus Technical Report: Building High-performance UI Agents
with RFT},
author={Zhangxuan Gu and Zhengwen Zeng and Zhenyu Xu and Xingran Zhou
and Shuheng Shen and Yunfei Liu and Beitong Zhou and Changhua
Meng and Tianyu Xia and Weizhi Chen and Yue Wen and Jingya Dou
and Fei Tang and Jinzhen Lin and Yulin Liu and Zhenlin Guo
and Yichen Gong and Heng Jia and Changlong Gao and Yuan Guo
and Yong Deng and Zhenyu Guo and Liang Chen and Weiqiang Wang},
year={2025},
eprint={2508.10833},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2508.10833}
}
@misc{noesis2026qwen3vl2buivenusnf4,
title = {NOESIS DHCF-FNO :: Qwen3-VL-2B-UI-Venus NF4 — FALLBACK
agent on ui-agent.amaimedia.com},
author = {Bolotnikov, Ilia and AMAImedia},
year = {2026},
note = {NF4 (bitsandbytes) quantization derivative of
inclusionAI/UI-Venus-1.5-2B (via AMAImedia BF16 repack),
Apache 2.0. 2.19 GB on disk, 1.2 GB VRAM target on RTX 3060.},
url = {https://amaimedia.com}
}
License
Apache License 2.0. Qwen3-VL base architecture © Alibaba Cloud / Qwen Team. UI-Venus-1.5-2B 4-stage RFT fine-tune © Inclusion AI / Ant Group (Venus Team). BF16 dtype-repack + NF4 quantization + NOESIS bundling + sealed-rule wiring: © AMAImedia (NOESIS DHCF-FNO project) 2026.
Commercial use is permitted subject to the standard Apache 2.0
preservation requirements (copyright + LICENSE + NOTICE-equivalent
attribution must travel with redistributions). See LICENSE (plain
text, BF16-tier NOTICE blocks) and LICENSE.md (Markdown, NF4-tier
NOTICE) in this folder for the full attribution chain.
Author
- Founder: Ilia Bolotnikov
- Organization: AMAImedia.com
- X (Twitter): @AMAImediacom
- LinkedIn: Ilia Bolotnikov
- Telegram: @AMAImediacom
- NOESIS version: v15.8
- Quantization date: 2026-05-21 08:32:59
- Parent BF16 source: `UI-Venus-1.5-2B-NOESIS-BF16` (`D:\models\vlm-gui-mot`)
- Upstream: `inclusionAI/UI-Venus-1.5-2B`
- Vendored component: NOESIS-Qwen3-VL-2B-UI-Venus-NF4 (Apache 2.0)
Produced 2026-05-21 by NOESIS DHCF-FNO v15.8 — AMAImedia.com