SwiFT (4D Swin fMRI Transformer, Mamba-free predecessor of NeuroSTORM) -- SwiFT contrastive-pretrained backbone

Description

SwiFT (Kim, Kwon, Moon, Cha et al., arXiv:2307.05916) is a 4D Swin Transformer for fMRI BOLD volumes. It is the Mamba-free predecessor of NeuroSTORM: both share the same 4-stage Swin topology (depths [2, 2, 6, 2], channels [36, 72, 144, 288], 4D window [4, 4, 4, 4]), but SwiFT uses conventional WindowAttention4D multi-head self-attention as the per-window mixer where NeuroSTORM uses a Mamba selective-scan SSM.
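As a rough illustration of how the stage configuration above relates to the input and output shapes quoted under Intended use, the sketch below walks a (96, 96, 96, 20) clip through the four stages. The patch-embedding stride PATCH and the spatial-only downsampling factor MERGE are assumptions chosen to be consistent with the shapes stated in this card; they are not read from the bundle, and stage_shapes is a hypothetical helper.

```python
# Shape walk-through only; no model weights are involved.
DEPTHS = [2, 2, 6, 2]            # blocks per stage (from this card)
CHANNELS = [36, 72, 144, 288]    # channels per stage (from this card)
PATCH = (6, 6, 6, 1)             # ASSUMED patch-embedding stride (x, y, z, t)
MERGE = (2, 2, 2, 1)             # ASSUMED downsampling between stages (spatial only)


def stage_shapes(input_shape=(96, 96, 96, 20)):
    """Return one (channels, x, y, z, t) tuple per stage."""
    # Patch embedding divides each axis by its stride.
    grid = tuple(size // stride for size, stride in zip(input_shape, PATCH))
    shapes = []
    for index, channels in enumerate(CHANNELS):
        if index > 0:
            # Every stage after the first halves the spatial grid.
            grid = tuple(g // m for g, m in zip(grid, MERGE))
        shapes.append((channels, *grid))
    return shapes


for depth, shape in zip(DEPTHS, stage_shapes()):
    print(f"depth={depth} feature map={shape}")
```

Under these assumptions the last entry matches the (288, 2, 2, 2, 20) output shape given in this card, with the temporal axis left at 20 frames throughout.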

v0 ships two variants:

  • contrastive -- the contrastive-pretraining checkpoint (SimCLR-style backbone trained on a multi-cohort fMRI corpus). Use as a frozen feature extractor for downstream tasks.
  • hcp-sex -- the supervised fine-tune on HCP-YA sex classification. Same architecture as contrastive; weights have been further fine-tuned end-to-end. Use as a starting point for additional fine-tuning or as an HCP-specific feature extractor.

The bundle stores only the backbone (SwinTransformer4D). The consumer-side SimCLR projection (emb_mlp) and the downstream task heads (clf.head, reg.head) are training-time plumbing and are dropped at extraction.
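A minimal sketch of what dropping the training-time modules could look like when filtering a checkpoint's state dict. The module names come from this card, but the exact key layout (for example, whether keys carry an extra wrapper prefix) is an assumption, and backbone_only is a hypothetical helper rather than part of ilex or the upstream code.

```python
# ASSUMED key prefixes for the modules this card says are dropped.
DROPPED_PREFIXES = ("emb_mlp.", "clf.head.", "reg.head.")


def backbone_only(state_dict):
    """Keep every entry whose key does not belong to a dropped module."""
    return {
        key: value
        for key, value in state_dict.items()
        if not key.startswith(DROPPED_PREFIXES)
    }


# Toy state dict with placeholder values standing in for tensors.
toy_state = {
    "layers.0.blocks.0.attn.qkv.weight": "tensor",
    "norm.weight": "tensor",
    "emb_mlp.0.weight": "tensor",
    "clf.head.weight": "tensor",
    "reg.head.bias": "tensor",
}
kept = backbone_only(toy_state)
print(sorted(kept))
```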

Intended use

fMRI backbone for representation learning. Input: a (1, 96, 96, 96, 20) MNI152 BOLD clip; the consumer handles registration and normalisation. Output: the deepest-stage backbone feature map, shape (288, 2, 2, 2, 20), suitable for downstream linear probes or MLP heads. The bundle ships model.norm and model.head as static fields so that the state dict round-trips, but the JAX forward call does NOT invoke them (matching the upstream encoder-only forward).
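One way to feed the feature map into a linear probe is to average it over the spatial and temporal axes first. The sketch below assumes global average pooling and a two-class probe with placeholder zero weights; neither is prescribed by this card or by the upstream code, and the random array only stands in for a real backbone output. It uses NumPy for portability; jax.numpy offers the same mean and matmul operations.

```python
import numpy as np


def pool_features(feature_map):
    """Average a (C, X, Y, Z, T) feature map over space and time, giving (C,)."""
    if feature_map.ndim != 5:
        raise ValueError(f"expected 5 axes (C, X, Y, Z, T), got {feature_map.ndim}")
    return feature_map.mean(axis=(1, 2, 3, 4))


# Stand-in for the backbone output described above (not real model features).
rng = np.random.default_rng(0)
features = rng.standard_normal((288, 2, 2, 2, 20)).astype(np.float32)

embedding = pool_features(features)

# Hypothetical linear probe: 2 classes, untrained placeholder parameters.
probe_weights = np.zeros((2, 288), dtype=np.float32)
probe_bias = np.zeros(2, dtype=np.float32)
logits = probe_weights @ embedding + probe_bias

print(embedding.shape, logits.shape)
```

Pooling discards the temporal structure; a probe that needs it could instead pool over the three spatial axes only and keep a (288, 20) sequence.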

Usage

from ilex.models.swift import SwiFT
model = SwiFT.from_pretrained('ilex-hub/swift.contrastive.1')

Authors

Kim P. Y., Kwon J., Moon T., Cha J. (Seoul National University M.IN.D Lab + Connectome Lab)

Citation

Kim P. Y., Kwon J., Moon T., Cha J. et al. (2023). SwiFT -- Swin 4D fMRI Transformer. arXiv 2307.05916.

References

  • Kim P. Y., Kwon J., Moon T., Cha J. et al. (2023). SwiFT -- Swin 4D fMRI Transformer. arXiv:2307.05916.
  • Liu Z. et al. (2021). Swin Transformer -- Hierarchical Vision Transformer using Shifted Windows. arXiv:2103.14030.
  • Upstream code + weights -- github.com/Transconnectome/SwiFT (Apache-2.0).

License

HF Hub license tag: apache-2.0

Effective terms: Apache-2.0 (both upstream code and the in-tree pretrained .ckpt checkpoints at github.com/Transconnectome/SwiFT). The ilex JAX / Equinox port code is separately Apache-2.0 / GPL-3.0.

Upstream license reference: https://www.apache.org/licenses/LICENSE-2.0

Copyright

SwiFT is copyright (c) Transconnectome / Seoul National University 2023, Apache-2.0-licensed on the upstream code + the in-tree pretrained checkpoints (github.com/Transconnectome/SwiFT). The ilex JAX / Equinox port code is separately licensed under Apache-2.0 / GPL-3.0.

Upstream source

Original weights / reference implementation: https://github.com/Transconnectome/SwiFT

Provenance

This artefact was produced by ilex's save/load pipeline. The architecture is implemented in ilex.models.swift.SwiFT and the weights have been converted from their upstream format. See the upstream source above for the canonical reference.

Model size: 4.32M params (Safetensors, tensor type F32)