Image-to-Video
Diffusers
Diffusion Single File
English
Chinese
i2v
video generation
comfyui
distillation
LoRA
quantization
nvfp4
Update README.md
Expand quantization section with layer assignment table, sensitivity analysis methodology, and convert_to_quant parameters. Revise overview, license, and acknowledgements.
README.md
CHANGED
@@ -27,9 +27,9 @@ base_model_relation: quantized
<p>

## Overview
-This is a **partial NVFP4 quantization** of [Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v](https://huggingface.co/lightx2v/Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v) by lightx2v.

-Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v is an

<div style="display: flex; align-items: center; gap: 16px;">
<img src="assets/wan21_input_cat.png" width="45%"/>
@@ -38,10 +38,30 @@ Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v is an advanced image-to-vide
</div>

## Quantization
-The model weights have been partially quantized to **NVFP4** (NVIDIA Floating Point 4-bit), a quantization format supported on NVIDIA Blackwell architecture GPUs.

## License Agreement
-

## Acknowledgements
-
<p>

## Overview
+This is a **partial NVFP4 quantization** of [Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v](https://huggingface.co/lightx2v/Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v) by lightx2v, produced using [convert_to_quant](https://github.com/silveroxides/convert_to_quant) by [silveroxides](https://huggingface.co/silveroxides).

+[Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v](https://huggingface.co/lightx2v/Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v) is an image-to-video generation model built on [Wan2.1-I2V-14B-480P](https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-480P). It applies step distillation and classifier-free guidance distillation to reduce inference to **4 steps** without CFG, cutting generation time substantially while preserving output quality.

<div style="display: flex; align-items: center; gap: 16px;">
<img src="assets/wan21_input_cat.png" width="45%"/>
</div>

## Quantization
+The model weights have been partially quantized to **NVFP4** (NVIDIA Floating Point 4-bit), a quantization format supported on NVIDIA Blackwell architecture GPUs. Out of the 480 layers eligible for quantization, only a subset has been quantized to NVFP4; the remaining eligible layers are quantized to **FP8** to preserve output quality.
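To make the 4-bit format more concrete, here is a minimal sketch of block-scaled FP4 rounding. It assumes the commonly described NVFP4 layout (E2M1 values sharing one scale per 16-element block) and simplifies it: the scale is kept as a Python float rather than an FP8 value, ties are broken naively, and the function names are hypothetical. It is an illustration only, not how `convert_to_quant` works internally.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (sign handled separately).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # assumed number of values sharing one scale


def quantize_block(block):
    """Return (grid values, scale) for one block of floats."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return [0.0] * len(block), 1.0
    # Map the largest magnitude in the block onto the largest grid value.
    scale = amax / E2M1_GRID[-1]
    quantized = []
    for x in block:
        nearest = min(E2M1_GRID, key=lambda g: abs(abs(x) / scale - g))
        quantized.append(math.copysign(nearest, x))
    return quantized, scale


def fake_quantize(values):
    """Quantize then dequantize, block by block, to expose the rounding error."""
    out = []
    for start in range(0, len(values), BLOCK_SIZE):
        quantized, scale = quantize_block(values[start:start + BLOCK_SIZE])
        out.extend(q * scale for q in quantized)
    return out


print(fake_quantize([6.0, -3.0, 0.4, 0.0]))
```

Because every value in a block is snapped to one of only eight magnitudes, a single outlier stretches the scale and coarsens all its neighbors, which is one intuition for why sensitive layers might be kept at a higher precision such as FP8.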
+
+The quantization format assigned to each layer is based on a sensitivity analysis performed with a custom script, which scores each weight tensor using excess kurtosis, dynamic range, and aspect ratio. Thresholds are derived automatically from the model's own score distribution.
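The custom script itself is not shown, so the following is only an illustrative sketch of what such a scoring pass could look like. The three metrics come from the description above; how they are combined (an unweighted sum here), the quantile used for the threshold, and every function name are assumptions made for the example.

```python
import math


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


def dynamic_range_bits(values, eps=1e-12):
    """log2 of the ratio between the largest and smallest nonzero magnitude."""
    mags = [abs(v) for v in values if abs(v) > eps]
    return math.log2(max(mags) / min(mags))


def aspect_ratio(shape):
    """Long side over short side of a 2-D weight matrix."""
    rows, cols = shape
    return max(rows, cols) / min(rows, cols)


def sensitivity_score(values, shape):
    # Hypothetical combination: unweighted sum, ignoring negative kurtosis.
    return (max(excess_kurtosis(values), 0.0)
            + dynamic_range_bits(values)
            + aspect_ratio(shape))


def threshold_from_scores(scores, quantile):
    """Pick the score sitting at the given quantile of the observed scores."""
    ordered = sorted(scores)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def assign_format(score, threshold):
    # Tensors at or above the threshold keep the higher-precision format.
    return "FP8" if score >= threshold else "NVFP4"


# Toy tensors standing in for real weights: (flattened values, 2-D shape).
toy = {
    "a": ([-1.0, 1.0, -1.0, 1.0], (2, 2)),
    "b": ([0.01, -0.02, 0.01, 8.0], (1, 4)),
}
scores = {name: sensitivity_score(vals, shape) for name, (vals, shape) in toy.items()}
threshold = threshold_from_scores(list(scores.values()), quantile=0.5)
formats = {name: assign_format(score, threshold) for name, score in scores.items()}
print(formats)
```

Deriving the threshold from the score distribution, rather than fixing it by hand, means the same procedure can be rerun on a different checkpoint without retuning constants.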
+
+The analysis yields the following `convert_to_quant` parameters. The conversion takes about 140 minutes on an RTX 5060 and produces an 11.11 GiB safetensors file.
+```bash
+$ convert_to_quant \
+    -i Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v-bf16.safetensors \
+    --nvfp4 --wan --comfy_quant --save-quant-metadata \
+    --custom-layers "blocks\.(0|1|2|3)\.cross_attn\.k\.weight|blocks\.(0|1|2|3)\.cross_attn\.v\.weight|blocks\.(0|1|2|3)\.cross_attn\.q\.weight|blocks\.(0|1|2|3)\.cross_attn\.o\.weight|blocks\.(0|1|2|3)\.cross_attn\.v_img\.weight|blocks\.(4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31|32|33|34|35)\.cross_attn\.v_img\.weight|blocks\.(0|1|2|3)\.self_attn\.k\.weight|blocks\.(4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31|32|33|34|35)\.self_attn\.k\.weight|blocks\.(36|37|38|39)\.self_attn\.k\.weight|blocks\.(0|1|2|3)\.self_attn\.v\.weight|blocks\.(0|1|2|3)\.self_attn\.o\.weight|blocks\.(4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31|32|33|34|35)\.self_attn\.o\.weight|blocks\.(0|1|2|3)\.ffn\.0\.weight|blocks\.(36|37|38|39)\.ffn\.0\.weight|blocks\.(4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31|32|33|34|35)\.ffn\.2\.weight" \
+    --custom-type fp8 \
+    -o Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v-nvfp4_p68.safetensors
+```
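The `--custom-layers` value is hard to read as a single line, so a small script can expand it into a per-tensor assignment. The sketch below rebuilds the same pattern from (layer, first block, last block) spans and applies it to hypothetical tensor names of the form `blocks.<n>.<layer>.weight`. It assumes 40 blocks with the 12 layer types from the README's table (40 × 12 = 480), and that a pattern hit anywhere in the name selects the tensor for `--custom-type`; the actual matching rules of `convert_to_quant` may differ.

```python
import re
from collections import Counter

NUM_BLOCKS = 40  # blocks 0-39 (assumed from the README's table)
LAYER_TYPES = [
    "self_attn.q", "self_attn.k", "self_attn.v", "self_attn.o",
    "cross_attn.q", "cross_attn.k", "cross_attn.v", "cross_attn.o",
    "cross_attn.k_img", "cross_attn.v_img", "ffn.0", "ffn.2",
]

# (layer type, first block, last block), in the order used by --custom-layers.
FP8_SPANS = [
    ("cross_attn.k", 0, 3), ("cross_attn.v", 0, 3), ("cross_attn.q", 0, 3),
    ("cross_attn.o", 0, 3), ("cross_attn.v_img", 0, 3), ("cross_attn.v_img", 4, 35),
    ("self_attn.k", 0, 3), ("self_attn.k", 4, 35), ("self_attn.k", 36, 39),
    ("self_attn.v", 0, 3), ("self_attn.o", 0, 3), ("self_attn.o", 4, 35),
    ("ffn.0", 0, 3), ("ffn.0", 36, 39), ("ffn.2", 4, 35),
]


def build_pattern(spans):
    """Rebuild the alternation-style regex from block spans."""
    parts = []
    for layer, first, last in spans:
        blocks = "|".join(str(b) for b in range(first, last + 1))
        parts.append(rf"blocks\.({blocks})\.{re.escape(layer)}\.weight")
    return "|".join(parts)


def assign_formats(pattern, num_blocks=NUM_BLOCKS):
    """Map every eligible tensor name to FP8 (pattern hit) or NVFP4 (default)."""
    rx = re.compile(pattern)
    assignment = {}
    for block in range(num_blocks):
        for layer in LAYER_TYPES:
            name = f"blocks.{block}.{layer}.weight"
            assignment[name] = "FP8" if rx.search(name) else "NVFP4"
    return assignment


assignment = assign_formats(build_pattern(FP8_SPANS))
counts = Counter(assignment.values())
print(counts["FP8"], counts["NVFP4"])
```

Under these assumptions the pattern selects 172 of the 480 tensors for FP8 and leaves 308 for NVFP4. Printing `assignment` grouped by block is a quick way to cross-check a command line against a per-layer table.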
+The table below details the quantization format applied per layer type across block ranges:
+| Blocks | self_attn.q | self_attn.k | self_attn.v | self_attn.o | cross_attn.q | cross_attn.k | cross_attn.v | cross_attn.o | cross_attn.k_img | cross_attn.v_img | ffn.0 | ffn.2 |
+|--------|-------------|-------------|-------------|-------------|--------------|--------------|--------------|--------------|------------------|------------------|-------|-------|
+| 0–3 | NVFP4 | FP8 | FP8 | FP8 | FP8 | FP8 | FP8 | FP8 | NVFP4 | FP8 | NVFP4 | NVFP4 |
+| 4–9 | NVFP4 | FP8 | NVFP4 | FP8/NVFP4 (50/50) | NVFP4 | FP8 | FP8 | NVFP4 | NVFP4 | FP8 | NVFP4 | FP8/NVFP4 (50/50) |
+| 10–15 | NVFP4 | FP8 | NVFP4 | NVFP4 | NVFP4 | FP8 | FP8/NVFP4 (50/50) | NVFP4 | NVFP4 | FP8 | FP8/NVFP4 (50/50) | FP8/NVFP4 (67/33) |
+| 16–22 | NVFP4 | FP8 | NVFP4 | FP8/NVFP4 (29/71) | NVFP4 | FP8 | NVFP4 | NVFP4 | NVFP4 | NVFP4 | FP8/NVFP4 (57/43) | FP8/NVFP4 (43/57) |
+| 23–39 | NVFP4 | FP8 | NVFP4 | FP8/NVFP4 (12/88) | NVFP4 | FP8 | NVFP4 | NVFP4 | NVFP4 | NVFP4 | FP8/NVFP4 (35/65) | NVFP4 |

## License Agreement
+This model is licensed under the [Apache 2.0 License](LICENSE.txt). You retain full ownership of your generated content, but are solely responsible for its use in compliance with the license terms and applicable laws.

## Acknowledgements
+Big kudos to the contributors to the [Wan2.1](https://huggingface.co/Wan-AI/Wan2.1-T2V-14B) and [Self-Forcing](https://huggingface.co/gdhe17/Self-Forcing/tree/main) repositories for their open research, and to [silveroxides](https://huggingface.co/silveroxides) for their quantization tools.