humbleakh committed on
Commit 7d33284 · verified · 1 Parent(s): a0a9a96

Upload 4-bit quantized Qwen2.5-VL-3B for Chain-of-Zoom

README.md CHANGED
@@ -11,14 +11,14 @@ base_model: Qwen/Qwen2.5-VL-3B-Instruct
  license: apache-2.0
  language:
  - en
- pipeline_tag: vision-language-understanding
+ pipeline_tag: image-text-to-text
  ---

  # Qwen2.5-VL-3B 4-bit Quantized for Chain-of-Zoom

  ## 📋 Model Description

- 4-bit quantized Vision-Language Model optimized for super-resolution prompt generation
+ 4-bit quantized Vision-Language Model optimized for Chain-of-Zoom super-resolution

  This model is part of the **Chain-of-Zoom 4-bit Quantized Pipeline** - a memory-optimized version of the original Chain-of-Zoom super-resolution framework.

@@ -68,11 +68,11 @@ bnb_config = BitsAndBytesConfig(

  ## 🔧 Technical Specifications

- - **Created**: 2025-06-08 16:28:34
+ - **Created**: 2025-06-08 17:10:40
  - **Quantization Library**: BitsAndBytes
  - **Framework**: PyTorch + Transformers
  - **Precision**: 4-bit NF4
- - **Model Size**: 2899.8801851272583 MB
+ - **Model Size**: 2899.8802061080933 MB

  ## 📝 Citation

config.json CHANGED
@@ -112,6 +112,7 @@
  "spatial_patch_size": 14,
  "temporal_patch_size": 2,
  "tokens_per_second": 2,
+ "torch_dtype": "bfloat16",
  "window_size": 112
  },
  "vision_end_token_id": 151653,
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5e79c18e6f20fd0e15d9e522b68ab5c9357933809a1e97df2abd0082171f0afe
+ oid sha256:de593e892f76e8b97f2344896ad5a1b8db248be9b92cdfde7bcd2d231dcee6a6
  size 3024861693
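The pointer above changes only the `oid`; the `size` stays at 3024861693 bytes. A minimal sketch of how a downloaded `model.safetensors` could be checked against such a pointer, assuming the standard three-line Git LFS pointer layout shown in this diff. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library.

```python
import hashlib
from pathlib import Path

# New-side pointer text, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:de593e892f76e8b97f2344896ad5a1b8db248be9b92cdfde7bcd2d231dcee6a6
size 3024861693
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest the pointer records."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first: a size mismatch rules the file out
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Hash in chunks so a multi-gigabyte file is never fully held in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer(Path("model.safetensors"), parse_lfs_pointer(NEW_POINTER))` would report whether a local copy corresponds to the new revision rather than the old one; the local path is an assumption.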
preprocessor_config.json CHANGED
@@ -18,7 +18,7 @@
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
- "processor_class": "Qwen2_5_VLProcessor",
+ "processor_class": "Qwen2VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
tokenizer_config.json CHANGED
@@ -201,7 +201,7 @@
  "extra_special_tokens": {},
  "model_max_length": 131072,
  "pad_token": "<|endoftext|>",
- "processor_class": "Qwen2_5_VLProcessor",
+ "processor_class": "Qwen2VLProcessor",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
video_preprocessor_config.json CHANGED
@@ -73,7 +73,7 @@
  "merge_size"
  ],
  "patch_size": 14,
- "processor_class": "Qwen2_5_VLProcessor",
+ "processor_class": "Qwen2VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
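This commit changes `processor_class` in three files at once (`preprocessor_config.json`, `tokenizer_config.json`, `video_preprocessor_config.json`). A minimal sketch of a check that the three files agree, assuming they sit together in one local checkout directory. The helper names `processor_classes` and `is_consistent` are hypothetical, and the sketch makes no claim about which class name is the right one for this model.

```python
import json
from pathlib import Path

# The three config files this commit touches.
CONFIG_FILES = (
    "preprocessor_config.json",
    "tokenizer_config.json",
    "video_preprocessor_config.json",
)


def processor_classes(repo_dir, filenames=CONFIG_FILES) -> dict:
    """Map each config file that exists in `repo_dir` to its top-level `processor_class`."""
    found = {}
    for name in filenames:
        path = Path(repo_dir) / name
        if path.is_file():
            with path.open(encoding="utf-8") as fh:
                # .get() yields None when the key is absent instead of raising.
                found[name] = json.load(fh).get("processor_class")
    return found


def is_consistent(classes: dict) -> bool:
    """True when every inspected file names the same processor class."""
    return len(set(classes.values())) <= 1
```

Running `is_consistent(processor_classes("."))` in a checkout of the repository would flag a partial rename, such as one file still carrying the old class name.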