Stage Five — ViT-Small/B32 (ImageNet Subset) Energy-Scaling Validation

Rendered Frame Theory (RFT)
Author: Liam S. Grinstead
Date: Oct‑2025


📄 Abstract

Stage Five scales RFT from ViT‑Tiny to ViT‑Small/B32, testing whether coherence‑linked efficiency persists at higher depth and embedding dimension. Using a consistent telemetry schema (drift, flux, E_ret, coherence, J/step, ΔT), RFT (the DCLR optimiser with the Ψ–Ω Orbital Coupler) is compared with an Adam baseline under matched conditions. Results show reduced energy per step and stable drift/flux at comparable accuracy, confirming that RFT’s efficiency gains hold as model capacity increases.


🎯 Objective

Validate that RFT’s energy and stability advantages generalise to ViT‑Small/B32 by measuring J/step, drift, flux, and accuracy on an ImageNet‑like workload, with bf16 autocast where available and identical hyperparameters across modes.


⚙️ Methodology

  • Model: ViT‑Small, patch size 32, dim 384, depth 12, heads 6, MLP ratio 4
  • Data: ImageNet subset loaded via torchvision's ImageFolder (recommended), or a synthetic fallback for quick verification
  • Setup: Python 3.10, PyTorch ≥ 2.1, A100/H100 (bf16 autocast if available), seed 1234
  • Metrics: Loss, accuracy, J/step (NVML if present; proxy otherwise), drift, flux, energy‑retention (E_ret), coherence (coh), ΔT
  • Parity: Same batch size, learning rate, and number of steps across RFT and BASE
  • Orbital Coupler: Ψ–Ω drift/flux synchronisation each iteration
  • Optimisers: DCLR (RFT) vs Adam (BASE)
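
The J/step metric listed above (NVML if present, proxy otherwise) could be collected along the following lines. This is a minimal sketch, not the code in stage5.py: the helper names `make_energy_reader` and `joules_per_step`, the GPU index 0, and the 250 W proxy constant are all assumptions made for illustration. The NVML path uses `nvmlDeviceGetTotalEnergyConsumption` from the `pynvml` bindings, which reports cumulative energy in millijoules on GPUs that support it.

```python
import time

ASSUMED_PROXY_WATTS = 250.0  # assumption: flat power draw used when NVML is unavailable


def make_energy_reader(gpu_index=0):
    """Return (read_joules, source); read_joules() gives a cumulative energy reading."""
    try:
        import pynvml  # optional dependency

        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)  # probe; raises if unsupported

        def read_joules():
            # NVML reports millijoules accumulated since the driver was loaded.
            return pynvml.nvmlDeviceGetTotalEnergyConsumption(handle) / 1000.0

        return read_joules, "nvml"
    except Exception:
        def read_joules():
            # Proxy: elapsed wall-clock seconds times an assumed constant wattage.
            return time.perf_counter() * ASSUMED_PROXY_WATTS

        return read_joules, "proxy"


def joules_per_step(read_joules, step_fn, steps):
    """Run step_fn `steps` times and return the mean energy per step in joules."""
    start = read_joules()
    for _ in range(steps):
        step_fn()
    return (read_joules() - start) / steps
```

For the parity condition to hold, the same reader would wrap both the RFT and BASE runs. On a GPU, pending work should also be synchronised before each reading so that the energy is attributed to the right steps.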

📊 Results

  • RFT (DCLR + Ψ–Ω): Reduced energy per step compared to Adam, with tightly bounded drift and smooth flux.
  • Baseline (Adam): Higher J/step and less stable drift/flux behaviour at matched accuracy.
  • Synthetic fallback: Reproduced the same qualitative efficiency pattern, confirming that gains arise from optimiser–telemetry dynamics rather than dataset artefacts.
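
A comparison of this kind can be checked directly from the JSONL telemetry. The sketch below averages energy per step for each mode. The field names `mode` and `j_per_step` are hypothetical, since the log schema is not spelled out here, and the sample records are made-up values used only to exercise the function.

```python
import json
from collections import defaultdict


def mean_energy_by_mode(lines, mode_key="mode", energy_key="j_per_step"):
    """Average the energy field per mode over an iterable of JSONL lines."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        totals[record[mode_key]] += float(record[energy_key])
        counts[record[mode_key]] += 1
    return {mode: totals[mode] / counts[mode] for mode in totals}


# Illustrative records only; these are not measured results.
sample = [
    '{"mode": "RFT", "step": 1, "j_per_step": 2.0}',
    '{"mode": "RFT", "step": 2, "j_per_step": 4.0}',
    '{"mode": "BASE", "step": 1, "j_per_step": 6.0}',
]
means = mean_energy_by_mode(sample)
```

Because the function accepts any iterable of lines, an open file handle on the run's log can be passed in place of `sample`, once the key names are adjusted to the real schema.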

💡 Discussion

Scaling from ViT‑Tiny to ViT‑Small/B32 preserves RFT’s advantages in attention‑heavy architectures. The energy reduction with stable drift/flux strengthens the claim that coherence‑linked control is architecture‑agnostic and scales with depth and embedding dimension.
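
To make the scale of this stage concrete, the configuration listed under Methodology (patch size 32, dim 384, depth 12, MLP ratio 4) can be turned into a token count and an approximate parameter count. The sketch assumes a standard ViT layout with one class token, learned positional embeddings, and biases on every linear layer, plus a 224×224 RGB input and a 1000‑class head. None of these are stated above, and the function name is illustrative.

```python
def vit_param_count(image_size=224, patch=32, dim=384, depth=12, mlp_ratio=4,
                    in_chans=3, num_classes=1000):
    """Return (sequence_length, parameter_count) for a standard ViT layout."""
    tokens = (image_size // patch) ** 2 + 1            # patches plus one class token
    patch_embed = in_chans * patch * patch * dim + dim  # conv projection and bias
    cls_and_pos = dim + tokens * dim                    # class token, positional table
    attn = (dim * 3 * dim + 3 * dim) + (dim * dim + dim)  # qkv and output projection
    hidden = mlp_ratio * dim
    mlp = (dim * hidden + hidden) + (hidden * dim + dim)
    norms = 2 * (2 * dim)                               # two LayerNorms per block
    block = attn + mlp + norms
    final_norm = 2 * dim
    head = dim * num_classes + num_classes
    total = patch_embed + cls_and_pos + depth * block + final_norm + head
    return tokens, total


tokens, total = vit_param_count()
print(tokens, total)  # 50 22878952
```

Under these assumptions the model processes 50 tokens per image and has roughly 22.9 M parameters. The number of heads does not appear in the calculation because splitting attention into heads leaves the projection sizes unchanged.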


✅ Conclusion

RFT maintains its efficiency and stability benefits at ViT‑Small/B32 scale, validating the energy‑scaling hypothesis and setting the stage for ViT‑Base and multi‑modal fusion in later stages.


📂 Reproducibility

  • Script: stage5.py
  • Log Output: stage5_vit_small_b32.jsonl
  • Seed: 1234
  • Hardware: A100/H100 (CPU fallback supported)
  • Sealing: All runs sealed with SHA‑512 hashes
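
The sealing step can be reproduced with the standard library. The sketch below hashes a log file in chunks and returns its SHA‑512 digest. The function name is an assumption, and how stage5.py stores or compares the resulting hash is not specified here.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha512_of_file("stage5_vit_small_b32.jsonl")` after a run gives a 128‑character hex string that can be compared against a published seal.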

🚀 Usage

  • RFT mode:
    python stage5.py --mode RFT --steps 1000 --batch 256 --lr 5e-4 --data_dir /path/to/imagenet_subset
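
For readers adapting the script, the flags in the command above could be parsed roughly as follows. This is a sketch, not the parser in stage5.py. The flag names come from the command, the `BASE` choice comes from the parity note under Methodology, and the defaults and the rule that an omitted `--data_dir` selects the synthetic fallback are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags shown in the usage command."""
    parser = argparse.ArgumentParser(description="Stage Five ViT-Small/B32 run (sketch)")
    parser.add_argument("--mode", choices=["RFT", "BASE"], required=True)
    parser.add_argument("--steps", type=int, default=1000)   # assumed default
    parser.add_argument("--batch", type=int, default=256)    # assumed default
    parser.add_argument("--lr", type=float, default=5e-4)    # assumed default
    parser.add_argument("--data_dir", default=None,
                        help="ImageFolder root; omitted means synthetic data (assumed)")
    return parser


args = build_parser().parse_args(
    ["--mode", "RFT", "--steps", "1000", "--batch", "256", "--lr", "5e-4"]
)
```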