Commit 0a7bbdd
0 Parent(s)
Super-squash branch 'main' using huggingface_hub

Files changed:
- .gitattributes +37 -0
- README.md +48 -0
- dit_int8_full.pt +3 -0
- vae_full.pt +3 -0

.gitattributes ADDED
@@ -0,0 +1,37 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+onnx/dit_fp16.onnx.data filter=lfs diff=lfs merge=lfs -text
+onnx/dit_int8.onnx.data filter=lfs diff=lfs merge=lfs -text
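
The rules in this `.gitattributes` hunk decide which files in the commit are stored through Git LFS. As a rough illustration, the sketch below checks a few paths against a subset of the patterns. It is only an approximation: it uses Python's `fnmatch` rather than Git's own pattern matcher (which treats `/` and `**` differently), and the helper name `tracked_by_lfs` is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Subset of the patterns added in the hunk above.
LFS_PATTERNS = [
    "*.pt",
    "*.safetensors",
    "*.onnx",
    "onnx/dit_fp16.onnx.data",
    "onnx/dit_int8.onnx.data",
]


def tracked_by_lfs(path, patterns=LFS_PATTERNS):
    """Approximate check: does `path` match any LFS pattern?

    Patterns without a slash are compared against the file name only;
    patterns with a slash are compared against the whole relative path.
    """
    name = PurePosixPath(path).name
    for pattern in patterns:
        target = path if "/" in pattern else name
        if fnmatchcase(target, pattern):
            return True
    return False


for candidate in ["dit_int8_full.pt", "vae_full.pt", "README.md", "onnx/dit_int8.onnx.data"]:
    print(candidate, tracked_by_lfs(candidate))
```

Under this approximation the two `.pt` files in the commit match `*.pt`, which is consistent with their diffs showing LFS pointer files rather than binary content.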

README.md ADDED
@@ -0,0 +1,48 @@
+---
+license: mit
+base_model: stepfun-ai/Step1X-Edit
+tags:
+- depth-estimation
+- normal-estimation
+- quantized
+- int8
+---
+
+# FE2E INT8 (Pre-quantized for CPU)
+
+Pre-quantized INT8 model for [FE2E](https://github.com/AMAP-ML/FE2E) (CVPR 2026) depth + normal estimation.
+
+## Files
+
+| File | Size | Description |
+|------|------|-------------|
+| dit_int8_full.pt | 11.6 GB | Step1X-Edit DiT + LDRN LoRA merged, dynamic INT8 quantized (torch.save full model) |
+| vae_full.pt | 320 MB | VAE decoder, FP32 (torch.save full model) |
+| dit_int8.pt | 11.6 GB | State dict only (for advanced use) |
+| vae_fp32.pt | 320 MB | State dict only (for advanced use) |
+
+## How it was made
+
+1. Loaded FP8 base model () on GPU
+2. Cast to FP32 on CPU
+3. Merged LDRN LoRA () in full precision
+4. Applied dynamic quantization (INT8, Linear layers)
+5. Saved full model via `torch.save`
+
+## Usage
+
+
+
+
+Load with to avoid doubling memory. Requires ~12 GB RAM.
+
+## Performance
+
+- GPU (RTX 5090, FP8): 2.1s per image
+- CPU (HF Space, INT8): ~29 min for 768x1024
+- Single denoise step, outputs depth + normal simultaneously
+
+## Credits
+
+- [FE2E](https://github.com/AMAP-ML/FE2E) (CVPR 2026)
+- [Step1X-Edit](https://github.com/stepfun-ai/Step1X-Edit) base model
+- [rkfg/Step1X-Edit-FP8](https://huggingface.co/rkfg/Step1X-Edit-FP8) FP8 quantization
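
The README's "How it was made" and "Usage" sections describe quantizing Linear layers to INT8, saving the full model object, and loading it back on CPU. The original snippets did not survive in this view, so the following is a minimal sketch under stated assumptions rather than the author's code: the tiny `nn.Sequential` is a hypothetical stand-in for the merged DiT, the file name is made up, and the specific load option the README mentions for avoiding doubled memory is not reproduced here.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical toy stand-in for the merged FP32 model (the real DiT is far larger).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4)).float().eval()

# Dynamic INT8 quantization of Linear layers only: weights are stored as int8,
# activations are quantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Save the whole module object rather than only a state dict.
path = os.path.join(tempfile.mkdtemp(), "toy_int8_full.pt")  # hypothetical name
torch.save(quantized, path)

# A pickled full model needs weights_only=False; only do this for files you trust.
restored = torch.load(path, map_location="cpu", weights_only=False)
restored.eval()

with torch.no_grad():
    out = restored(torch.randn(2, 16))
print(out.shape)
```

Saving the full module means the loader does not have to rebuild and re-quantize the architecture first, at the cost of relying on pickle and on a compatible PyTorch version at load time.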

dit_int8_full.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:92c6886eaa8ead7f7c1dcf34464f9e79a5e99c25151b81900f0e9cb3c05254c6
+size 12438035751
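
The three lines above are a Git LFS pointer: the repository stores only the object's SHA-256 digest and byte size, while the binary itself lives in LFS storage. One way to check a downloaded file against such a pointer could look like the sketch below; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the demo uses a small temporary file instead of the 12 GB checkpoint.

```python
import hashlib
import os
import tempfile


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so multi-gigabyte files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:92c6886eaa8ead7f7c1dcf34464f9e79a5e99c25151b81900f0e9cb3c05254c6
size 12438035751
"""
pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])

# Demo on a small temporary file with a pointer built for its own content.
payload = b"stand-in bytes, not the real checkpoint"
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(demo_path, "wb") as handle:
    handle.write(payload)
demo_pointer = {
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
print(matches_pointer(demo_path, demo_pointer))
```

Comparing the size first is a cheap way to reject truncated downloads before hashing the whole file.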

vae_full.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8aab5c70ac1a0a52bf29e34f945ad45661e98bee886f2925e931b47c6ebd47d3
+size 335398161
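
The `size` fields in the two LFS pointers are exact byte counts, whereas the README table lists rounded sizes. A short conversion, sketched below with a hypothetical helper `size_units`, suggests the README figures correspond to binary units (GiB and MiB) rather than decimal GB and MB.

```python
def size_units(n_bytes):
    """Express a byte count in decimal (GB, MB) and binary (GiB, MiB) units."""
    return {
        "GB": n_bytes / 1e9,
        "GiB": n_bytes / 2**30,
        "MB": n_bytes / 1e6,
        "MiB": n_bytes / 2**20,
    }


# Byte counts copied from the two pointer files in this commit.
dit = size_units(12438035751)
vae = size_units(335398161)

print(f"dit_int8_full.pt: {dit['GB']:.2f} GB = {dit['GiB']:.2f} GiB")
print(f"vae_full.pt: {vae['MB']:.1f} MB = {vae['MiB']:.1f} MiB")
```

Rounded, the binary values come to about 11.6 GiB and 320 MiB, which lines up with the "11.6 GB" and "320 MB" entries in the README's Files table.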