	---
	license: apache-2.0
	library_name: diffusers
	tags:
	- hsigene
	- hyperspectral
	- latent-diffusion
	- controlnet
	- arxiv:2409.12470
	pipeline_tag: image-to-image
	---

	> [!WARNING]
	> The checkpoint conversion has not been fully validated. If you encounter pipeline loading failures or unexpected output, please contact me at bili_sakura@zju.edu.cn.

	# BiliSakura/HSIGene

	HSIGene, a hyperspectral image generation model, converted to diffusers format. It supports task-specific conditioning through local controls (HED, MLSD, sketch, segmentation), global controls (content or text), or metadata embeddings, and outputs 48-band hyperspectral images at 256×256 pixels.

	> Source: [HSIGene](https://arxiv.org/abs/2409.12470). Converted to diffusers format. The model directory is self-contained, so inference does not require an external project.

	## Repository Structure (after conversion)

	| Component               | Path                      |
	|-------------------------|---------------------------|
	| UNet (LocalControlUNet) | `unet/`                   |
	| VAE                     | `vae/`                    |
	| Text encoder (CLIP)     | `text_encoder/`           |
	| Local adapter           | `local_adapter/`          |
	| Global content adapter  | `global_content_adapter/` |
	| Global text adapter     | `global_text_adapter/`    |
	| Metadata encoder        | `metadata_encoder/`       |
	| Scheduler               | `scheduler/`              |
	| Pipeline                | `pipeline_hsigene.py`     |
	| Config                  | `model_index.json`        |
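Because loading depends on every entry in the table above being present, it can help to verify a downloaded copy before calling `from_pretrained`. The snippet below is a minimal sketch; `missing_components` is a hypothetical helper written for this README, not part of the repository or of diffusers, and it only checks that the listed paths exist.

```python
from pathlib import Path

# Entries copied from the repository structure table.
EXPECTED_DIRS = [
    "unet",
    "vae",
    "text_encoder",
    "local_adapter",
    "global_content_adapter",
    "global_text_adapter",
    "metadata_encoder",
    "scheduler",
]
EXPECTED_FILES = ["pipeline_hsigene.py", "model_index.json"]


def missing_components(model_dir):
    """Return the expected entries that are absent from model_dir (hypothetical helper)."""
    root = Path(model_dir)
    # Directories are reported with a trailing slash, matching the table.
    missing = [d + "/" for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

An empty list from `missing_components("/path/to/BiliSakura/HSIGene")` means all listed paths exist; it does not validate the weights themselves.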

	## Usage

	Inference Demo (`DiffusionPipeline.from_pretrained`)

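	```python
	from diffusers import DiffusionPipeline

	# Load the converted pipeline from a local copy of this repository.
	pipe = DiffusionPipeline.from_pretrained(
	    "/path/to/BiliSakura/HSIGene",
	    trust_remote_code=True,  # the pipeline class lives in custom code
	    custom_pipeline="path/to/pipeline_hsigene.py",
	    model_path="path/to/BiliSakura/HSIGene",
	)
	pipe = pipe.to("cuda")  # move the pipeline to the GPU
	```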

	Dependencies: `pip install diffusers transformers torch einops safetensors`

	### Per-Condition Inference Demos (Not Combined)

	`local_conditions` shape: `(B, 18, H, W)`; `global_conditions` shape: `(B, 768)`; `metadata` shape: `(7,)` or `(B, 7)`.
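A quick shape check before calling the pipeline can catch layout mistakes early. The snippet below is a minimal sketch of the shapes listed above; `check_condition_shapes` is a hypothetical helper, not part of the HSIGene pipeline. It assumes only that each input exposes a `.shape` attribute, as `torch.Tensor` and NumPy arrays do.

```python
def check_condition_shapes(local_conditions=None, global_conditions=None, metadata=None):
    """Raise ValueError if a provided input does not match its documented shape."""
    if local_conditions is not None:
        shape = tuple(local_conditions.shape)
        # Documented as (B, 18, H, W).
        if len(shape) != 4 or shape[1] != 18:
            raise ValueError(f"local_conditions must be (B, 18, H, W), got {shape}")
    if global_conditions is not None:
        shape = tuple(global_conditions.shape)
        # Documented as (B, 768).
        if len(shape) != 2 or shape[1] != 768:
            raise ValueError(f"global_conditions must be (B, 768), got {shape}")
    if metadata is not None:
        shape = tuple(metadata.shape)
        # Documented as (7,) or (B, 7).
        if shape != (7,) and not (len(shape) == 2 and shape[1] == 7):
            raise ValueError(f"metadata must be (7,) or (B, 7), got {shape}")
    return True
```

For example, `check_condition_shapes(local_conditions=hed_local)` returns `True` for a `(1, 18, 256, 256)` tensor and raises `ValueError` for a `(1, 3, 256, 256)` one.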

	```python
	# HED condition
	output = pipe(prompt="", local_conditions=hed_local, global_conditions=None, metadata=None)
	```

	```python
	# MLSD condition
	output = pipe(prompt="", local_conditions=mlsd_local, global_conditions=None, metadata=None)
	```

	```python
	# Sketch condition
	output = pipe(prompt="", local_conditions=sketch_local, global_conditions=None, metadata=None)
	```

	```python
	# Segmentation condition
	output = pipe(prompt="", local_conditions=seg_local, global_conditions=None, metadata=None)
	```

	```python
	# Content condition (global)
	output = pipe(prompt="", local_conditions=None, global_conditions=content_global, metadata=None)
	```

	```python
	# Text condition
	output = pipe(prompt="Wasteland", local_conditions=None, global_conditions=None, metadata=None)
	```

	```python
	# Metadata condition
	output = pipe(prompt="", local_conditions=None, global_conditions=None, metadata=metadata_vec)
	```
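The demos above all follow one pattern: exactly one input is set and the rest stay at `""` or `None`. The snippet below is a minimal sketch of that pattern; `single_condition_kwargs` and the condition names it accepts are hypothetical conveniences, not part of the pipeline's API.

```python
# Conditions that the demos pass through `local_conditions`.
LOCAL_KINDS = {"hed", "mlsd", "sketch", "segmentation"}


def single_condition_kwargs(kind, value):
    """Build pipeline keyword arguments with exactly one condition active."""
    kwargs = {
        "prompt": "",
        "local_conditions": None,
        "global_conditions": None,
        "metadata": None,
    }
    if kind in LOCAL_KINDS:
        kwargs["local_conditions"] = value
    elif kind == "content":
        kwargs["global_conditions"] = value
    elif kind == "text":
        kwargs["prompt"] = value
    elif kind == "metadata":
        kwargs["metadata"] = value
    else:
        raise ValueError(f"unknown condition kind: {kind!r}")
    return kwargs
```

With this helper, the text demo becomes `output = pipe(**single_condition_kwargs("text", "Wasteland"))`.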

	## Model Sources

	- Paper: [HSIGene: A Foundation Model For Hyperspectral Image Generation](https://arxiv.org/abs/2409.12470)
	- Checkpoint: [GoogleDrive](https://drive.google.com/file/d/1euJAbsxCgG1wIu_Eh5nPfmiSP9suWsR4/view?usp=drive_link)
	- Annotators: [BaiduNetdisk](https://pan.baidu.com/s/1K1Y__blA6uJVV9l1QG7QvQ?pwd=98f1) (extraction code: 98f1); place the downloaded files in `data_prepare/annotator/ckpts`

	## Citation

	```bibtex
	@article{pangHSIGeneFoundationModel2026,
	title = {{{HSIGene}}: {{A Foundation Model}} for {{Hyperspectral Image Generation}}},
	shorttitle = {{{HSIGene}}},
	author = {Pang, Li and Cao, Xiangyong and Tang, Datao and Xu, Shuang and Bai, Xueru and Zhou, Feng and Meng, Deyu},
	year = 2026,
	month = jan,
	journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
	volume = {48},
	number = {1},
	pages = {730--746},
	issn = {1939-3539},
	doi = {10.1109/TPAMI.2025.3610927},
	urldate = {2026-01-02},
	keywords = {Adaptation models,Computational modeling,Controllable generation,deep learning,diffusion model,Diffusion models,Foundation models,hyperspectral image synthesis,Hyperspectral imaging,Image synthesis,Noise reduction,Reliability,Superresolution,Training}
	}

	```