ILSVRC/imagenet-1k
Viewer β’ Updated β’ 1.43M β’ 80.8k β’ 809
How to use humbleakh/qwen2.5-vl-3b-8bit-chain-of-zoom with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# 'pip install "transformers<5.0.0'
from transformers import pipeline
pipe = pipeline("image-to-text", model="humbleakh/qwen2.5-vl-3b-8bit-chain-of-zoom") # Load model directly
from transformers import AutoModelForImageTextToText
model = AutoModelForImageTextToText.from_pretrained("humbleakh/qwen2.5-vl-3b-8bit-chain-of-zoom", dtype="auto")Qwen2.5-VL-3B optimized with 8-bit quantization for Chain-of-Zoom super-resolution pipeline. Provides high-quality prompt generation for context-aware super-resolution.
This is a 8-bit quantized version of the VLM component for the Chain-of-Zoom super-resolution pipeline, specifically optimized for production deployment while maintaining exceptional quality.
Chain-of-Zoom achieves extreme super-resolution (8x-32x) through intelligent autoregressive scaling:
Input Image β VLM Analysis β Enhanced Prompts β Diffusion SR β Output Image
β β β β β
ββββ RAM Tags ββββ LoRA Adapt ββββ Scale Chain ββββ Iterate
# Install requirements
pip install transformers diffusers torch accelerate bitsandbytes
# Load VLM model
from transformers import AutoModel, BitsAndBytesConfig
import torch
# Configure quantization
quantization_config = BitsAndBytesConfig(
load_in_8bit=True,
llm_int8_threshold=6.0
)
# Load quantized model
model = AutoModel.from_pretrained(
"humbleakh/qwen2.5-vl-3b-8bit-chain-of-zoom",
quantization_config=quantization_config,
device_map="auto",
torch_dtype=torch.bfloat16
)
| Metric | Original | 8-bit Quantized | Improvement |
|---|---|---|---|
| Memory Usage | 6.0GB | 3.0GB | 50% reduction |
| Parameters | 3B (FP16) | 3B (8-bit) | Same functionality |
| Quality Score | 100% | 95%+ | Minimal degradation |
| Inference Speed | 1.0x | 2.5x | Faster processing |
| Colab Compatible | β (OOM) | β (T4 GPU) | Production ready |
# VLM Integration
from chain_of_zoom import ChainOfZoom8BitOptimal
# Initialize pipeline
pipeline = ChainOfZoom8BitOptimal()
# Load your image
from PIL import Image
image = Image.open("low_res_image.jpg")
# Run super-resolution
results = pipeline.chain_of_zoom(image, target_scale=8)
final_image = results[-1]['image']
final_image.save("super_resolved_8x.jpg")
torch>=2.0.0
transformers>=4.36.0
diffusers>=0.21.0
bitsandbytes>=0.46.0
accelerate>=0.20.0
pillow>=9.0.0
numpy>=1.21.0
Licensed under Apache 2.0. See LICENSE file for full terms.
@misc{chain_of_zoom_vlm_8_bit,
title={Chain-of-Zoom VLM 8-bit Quantized Model},
author={Chain-of-Zoom Team},
year={2024},
howpublished={\url{https://huggingface.co/humbleakh/qwen2.5-vl-3b-8bit-chain-of-zoom}},
note={Optimal quantization for super-resolution pipeline}
}
Base model
Qwen/Qwen2.5-VL-3B-Instruct