25 GB
6,055 files
Updated 6 days ago
| Name | Size | Xet hash |
|---|---|---|
| manifests | 2 items | |
| logs | 1 item | |
| cache | 6,050 items | |
| README.md | 2.86 kB | 90ee7bb4 |
| .gitattributes | 2.5 kB | 738f1125 |
BrainAge Golden Preprocessed Cache
6,050 preprocessed brain MRI tensors ready for training a brain-age prediction model. Skip the 40+ hour preprocessing step and jump straight to model training.
What's inside
Each .pt file (one per subject) contains:
| Key | Type | Shape | Description |
|---|---|---|---|
| volume | float16 | (128, 144, 112) | Z-normed T1w brain in MNI space, trilinear-resized |
| tab | float32 | (86,) | 70 regional volumes (log1p/12) + 3 sex one-hot + 13 site one-hot |
| age | float32 | scalar | Chronological age in years |
| meta | dict | n/a | subject_id, site, sex, age, split |
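The layout of the tab vector can be unpacked with a short helper. This is a minimal sketch, not the author's code: it assumes the 86 values are stored in the order listed above (regions, then sex, then site) and that "log1p/12" means log(1 + volume) / 12. The function names are hypothetical.

```python
import math

# Documented layout of the tab vector: 70 + 3 + 13 = 86 values.
N_REGIONS, N_SEX, N_SITE = 70, 3, 13


def split_tab(tab):
    """Split an 86-element tab vector into (regions, sex_onehot, site_onehot).

    Assumes the blocks are stored in the order given in the table above.
    Accepts any iterable of numbers, including a 1-D torch tensor.
    """
    values = [float(v) for v in tab]
    expected = N_REGIONS + N_SEX + N_SITE
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    regions = values[:N_REGIONS]
    sex = values[N_REGIONS:N_REGIONS + N_SEX]
    site = values[N_REGIONS + N_SEX:]
    return regions, sex, site


def encode_volume(volume):
    """Assumed reading of "log1p/12": log(1 + volume) / 12."""
    return math.log1p(volume) / 12.0


def decode_volume(encoded):
    """Inverse of the assumed transform: expm1(12 * encoded)."""
    return math.expm1(12.0 * encoded)
```

With a loaded subject, `regions, sex, site = split_tab(data["tab"])` gives the three blocks, and `decode_volume(regions[0])` recovers the first regional volume under the assumption above.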
Stats
| Metric | Value |
|---|---|
| Total subjects | 6,050 |
| Age range | 0 – 86 years |
| Source datasets | 12 (BCP, Calgary, ds002726, ds000248, PTBP, IXI, MPI-Leipzig, AOMIC, NKI-Rockland, ABIDE-I, ABIDE-II, ADHD-200) |
| Volume shape | 128 × 144 × 112 (D × H × W) |
| Tabular dim | 86 (70 regions + 3 sex + 13 site) |
| File size | ~4 MB each |
| Total size | ~24 GB |
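The file and total sizes in the table follow from the tensor shapes and dtypes. The sketch below reproduces that arithmetic; it counts raw tensor bytes only and ignores the meta dict and serialization overhead, so real files will be slightly larger.

```python
import math

# Bytes per element for the dtypes listed in the "What's inside" table.
DTYPE_BYTES = {"float16": 2, "float32": 4}


def tensor_bytes(shape, dtype):
    """Raw storage of a dense tensor: element count times bytes per element."""
    return math.prod(shape) * DTYPE_BYTES[dtype]


volume_bytes = tensor_bytes((128, 144, 112), "float16")
tab_bytes = tensor_bytes((86,), "float32")
age_bytes = tensor_bytes((), "float32")  # scalar: math.prod(()) == 1

per_subject = volume_bytes + tab_bytes + age_bytes
total = per_subject * 6050  # one file per subject

print(f"per subject: {per_subject / 1e6:.2f} MB")  # 4.13 MB
print(f"total: {total / 1e9:.1f} GB")              # 25.0 GB
```

Almost all of the space is the float16 volume; storing it as float32 would roughly double the download.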
Preprocessing pipeline applied
Raw T1w NIfTI
→ HD-BET skull-strip (GPU)
→ N4 bias correction (ANTs)
→ Affine registration to MNI152 1mm
→ Z-score intensity normalization
→ Harvard-Oxford atlas segmentation (69 regions)
→ Volume measurement + rescaling to native space
→ Tensor packaging (.pt)
Quick start
from huggingface_hub import snapshot_download
import torch
# Download (~24 GB)
snapshot_download(
    "bilalahmad176176/BrainAge-Golden-Preprocessed",
    repo_type="dataset",
    local_dir="cache/",
)
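# Load one subject (weights_only=False unpickles arbitrary objects, so only load files you trust)
data = torch.load("cache/cache/IXI002.pt", weights_only=False)
print(data["volume"].shape) # torch.Size([128, 144, 112]), dtype float16
print(data["tab"].shape) # torch.Size([86]), dtype float32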
print(data["age"]) # e.g. 36.2
print(data["meta"]) # {'subject_id': 'IXI002', 'site': 'DataSet-6_IXI', ...}
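For training outside the provided pipeline, the per-subject files can be wrapped in a PyTorch `Dataset`. This is a minimal sketch under the schema documented above, not the loader used by `pipeline_v2`; the class name `BrainAgeCache` is hypothetical, and it ignores the `split` field in `meta`.

```python
from pathlib import Path

import torch
from torch.utils.data import Dataset


class BrainAgeCache(Dataset):
    """Hypothetical wrapper: one item per .pt file in the cache directory."""

    def __init__(self, cache_dir):
        # Sort so that indices are stable across runs.
        self.paths = sorted(Path(cache_dir).glob("*.pt"))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # Each file stores a plain dict (volume, tab, age, meta).
        sample = torch.load(self.paths[idx], weights_only=False)
        # Cast float16 -> float32 and add a channel axis: (1, D, H, W).
        volume = sample["volume"].float().unsqueeze(0)
        tab = sample["tab"]
        # Works whether age was saved as a Python float or a 0-d tensor.
        age = torch.as_tensor(sample["age"], dtype=torch.float32)
        return volume, tab, age
```

After the download above, `torch.utils.data.DataLoader(BrainAgeCache("cache/cache"), batch_size=4, shuffle=True)` yields batches of `(volume, tab, age)` with volume shape `(4, 1, 128, 144, 112)`.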
Train a model
# Generate split
python -m pipeline_v2.data_split \
    --manifests Golden-0-to-25/manifest.csv Golden-25plus/manifest.csv \
    --out cache/split.csv
# Train
python -m pipeline_v2.train \
    --cache_dir cache/cache \
    --split_csv cache/split.csv \
    --out_ckpt brainage_sfcn.pt \
    --epochs 60 --batch 4
Related
- Raw dataset: bilalahmad176176/BrainAge-Golden-Raw
- 3D Viewer demo: bilalahmad176176/BrainAge-3D-Viewer
Citation
Please cite the original source studies listed in the raw dataset manifests.
Last updated: Jun 22