📊 Monthly Seasonal Forecast Model (SOTA 2025)

A monthly seasonal forecasting model that combines recent ideas from time series research: residual cycle forecasting, seasonal patching, and RevIN. On M4 Monthly it is competitive with the Chronos-Bolt-Small foundation model (47M parameters) while using only about 2.5M parameters.

🏆 Results on M4 Monthly (48,000 test series)

| Model | sMAPE ↓ | MASE ↓ | OWA ↓ | Params |
|---|---|---|---|---|
| Seasonal Naive (baseline) | 15.99 | 1.260 | 1.000 | - |
| Naive (baseline) | 15.26 | 1.205 | 0.955 | - |
| CycleNet (MLP+RCF) | 13.41 | 0.989 | 0.812 | 215K |
| SeasonalPatchTST (Transformer+RCF) | 13.31 | 0.978 | 0.805 | 2.3M |
| Ensemble (Ours) | 13.15 | 0.964 | 0.794 | 2.5M |
| Chronos-Bolt-Small (SOTA foundation) | 13.03 | 0.956 | 0.787 | 47M |

Our lightweight ensemble achieves OWA = 0.794, within 0.9% of Chronos-Bolt-Small (OWA = 0.787), a foundation model with roughly 19x as many parameters (47M vs 2.5M).
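The OWA column can be reproduced from the sMAPE and MASE columns. The sketch below assumes OWA is the mean of the two metrics after each is divided by the Seasonal Naive value, which is what makes the Seasonal Naive row equal 1.000. The function name `owa` is illustrative, and the inputs are the rounded table values, so the last digit may differ from the table for some rows.

```python
# Baseline row from the results table (Seasonal Naive).
BASE_SMAPE, BASE_MASE = 15.99, 1.260


def owa(smape: float, mase: float) -> float:
    """Mean of sMAPE and MASE, each taken relative to the baseline."""
    return 0.5 * (smape / BASE_SMAPE + mase / BASE_MASE)


ensemble = owa(13.15, 0.964)
chronos_bolt_small = owa(13.03, 0.956)

# Relative OWA gap between the ensemble and Chronos-Bolt-Small, in percent.
gap_pct = 100.0 * (ensemble / chronos_bolt_small - 1.0)

print(f"ensemble OWA           = {ensemble:.3f}")
print(f"Chronos-Bolt-Small OWA = {chronos_bolt_small:.3f}")
print(f"relative gap           = {gap_pct:.1f}%")
```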

🔬 Architecture

CycleNet (MLP + Residual Cycle Forecasting)

Based on CycleNet:

  • Learns a 12-month periodic cycle parameter
  • Subtracts learned cycle → forecasts residuals → adds future cycle
  • RevIN (Reversible Instance Normalization) for distribution shift
  • 4-layer MLP backbone with GELU activation
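The subtract, forecast, add-back flow listed above can be sketched in a few lines of dependency-free Python. This is an illustration, not the model's code: `rcf_forecast` and `mean_residual_model` are hypothetical names, the cycle is a fixed list rather than a learned parameter, and a mean-repeating function stands in for the 4-layer MLP. RevIN is left out to keep the sketch short.

```python
CYCLE_LEN = 12   # W: one slot per calendar month
CONTEXT = 48     # months of history
HORIZON = 18     # months to forecast


def rcf_forecast(history, cycle, start_phase, residual_model):
    """Forecast with Residual Cycle Forecasting.

    history: observed values (the context window).
    cycle: CYCLE_LEN values (learned parameters in the real model).
    start_phase: position in the cycle of history[0].
    residual_model: callable mapping context residuals to HORIZON residuals.
    """
    # 1. Subtract the cycle value that matches each time step's phase.
    residuals = [
        y - cycle[(start_phase + t) % CYCLE_LEN] for t, y in enumerate(history)
    ]
    # 2. Forecast the residuals (the MLP backbone in the model card).
    future_residuals = residual_model(residuals)
    # 3. Add the cycle back, continuing the phase past the end of the context.
    return [
        r + cycle[(start_phase + len(history) + h) % CYCLE_LEN]
        for h, r in enumerate(future_residuals)
    ]


def mean_residual_model(residuals):
    """Hypothetical stand-in for the MLP: repeat the mean residual."""
    level = sum(residuals) / len(residuals)
    return [level] * HORIZON


# Toy series: constant level 100 plus a fixed monthly pattern.
pattern = [5, 3, 0, -2, -4, -5, -3, 0, 2, 4, 3, -3]
history = [100 + pattern[t % CYCLE_LEN] for t in range(CONTEXT)]
forecast = rcf_forecast(history, pattern, 0, mean_residual_model)
print(forecast[:3])
```

Because the cycle carries the seasonal shape, the backbone only has to model what remains after the cycle is removed, which in this toy case is a flat level.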

SeasonalPatchTST (Transformer + RCF)

Combines PatchTST with CycleNet innovations:

  • 12-month patches aligned with annual seasonality
  • CLS token + 4-layer Transformer encoder with 8 attention heads
  • CycleNet RCF decomposition + RevIN normalization
  • Pre-LN architecture for training stability
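To make the seasonal patching concrete, the sketch below splits a 48-month context into 12-month patches and counts the tokens the encoder would receive. It assumes non-overlapping patches (stride equal to the patch length), which the card does not state, and `to_patches` is an illustrative helper rather than part of the released code.

```python
PATCH_LEN = 12   # one patch per year of monthly data
CONTEXT = 48     # months of history


def to_patches(series, patch_len=PATCH_LEN):
    """Split a series into non-overlapping patches of patch_len values."""
    if len(series) % patch_len != 0:
        raise ValueError("series length must be a multiple of patch_len")
    return [series[i:i + patch_len] for i in range(0, len(series), patch_len)]


context = list(range(CONTEXT))  # placeholder values 0..47
patches = to_patches(context)

# The encoder would see one CLS token plus one token per patch.
num_tokens = 1 + len(patches)

print(len(patches), "patches of length", len(patches[0]))
print("tokens fed to the encoder:", num_tokens)
```

With this alignment, position j inside every patch refers to the same calendar month, so each token summarizes one full year.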

Learned Ensemble

  • Sigmoid-gated weighted average of CycleNet + PatchTST
  • Weight learned on validation set
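A sigmoid gate keeps the blend weight between 0 and 1 while leaving an unconstrained parameter to optimize. The sketch below assumes a single scalar gate fitted by plain gradient descent on validation MSE. The card only says the weight is learned on the validation set, so the optimizer, the loss, and the names `blend` and `fit_gate` are assumptions, and the numbers are toy values.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def blend(pred_a, pred_b, logit):
    """Weighted average with weight w = sigmoid(logit) on pred_a."""
    w = sigmoid(logit)
    return [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]


def fit_gate(pred_a, pred_b, target, lr=0.5, steps=500):
    """Fit the single gate logit by gradient descent on validation MSE."""
    logit = 0.0  # start from an equal 50/50 blend
    n = len(target)
    for _ in range(steps):
        w = sigmoid(logit)
        # d(MSE)/dw; the chain rule through the sigmoid adds w * (1 - w).
        grad_w = sum(
            2.0 * (w * a + (1.0 - w) * b - y) * (a - b)
            for a, b, y in zip(pred_a, pred_b, target)
        ) / n
        logit -= lr * grad_w * w * (1.0 - w)
    return logit


# Toy validation data: the two models err in opposite directions.
val_target = [10.0, 12.0, 14.0, 16.0]
cyclenet_val = [11.0, 11.0, 15.0, 15.0]
patchtst_val = [7.0, 15.0, 11.0, 19.0]

gate_logit = fit_gate(cyclenet_val, patchtst_val, val_target)
weight = sigmoid(gate_logit)
blended = blend(cyclenet_val, patchtst_val, gate_logit)
print(f"weight on CycleNet: {weight:.3f}")
```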

📈 Key Innovations

  1. Residual Cycle Forecasting (RCF): From CycleNet; learns a W=12 annual cycle and forecasts the residuals
  2. Seasonal Patching: 12-month patch size matched to annual cycle (vs typical 16 or 32)
  3. RevIN Normalization: Handles diverse scales across 48K series (Macro, Finance, Demographics)
  4. Value-flipping + Scaling Augmentation: From Sundial (ICML 2025 Oral)
  5. CLS Token Aggregation: Global representation for multi-step forecasting
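As an illustration of item 3, the sketch below shows how RevIN-style normalization removes scale differences between series. It is a simplified, dependency-free version: per-window mean and standard deviation only, with the learnable affine parameters of RevIN omitted. The function names are illustrative.

```python
import math

EPS = 1e-5  # small constant for numerical stability


def revin_normalize(window):
    """Return the normalized window plus the statistics needed to undo it."""
    mean = sum(window) / len(window)
    var = sum((x - mean) ** 2 for x in window) / len(window)
    std = math.sqrt(var + EPS)
    return [(x - mean) / std for x in window], (mean, std)


def revin_denormalize(values, stats):
    """Map model outputs back to the original scale of the series."""
    mean, std = stats
    return [v * std + mean for v in values]


small = [10.0, 12.0, 11.0, 13.0]
large = [x * 1000.0 for x in small]  # same shape, 1000x the scale

small_norm, small_stats = revin_normalize(small)
large_norm, large_stats = revin_normalize(large)

# After normalization the two series look (almost) identical to the model.
max_diff = max(abs(a - b) for a, b in zip(small_norm, large_norm))
restored = revin_denormalize(large_norm, large_stats)
print(f"largest difference after normalization: {max_diff:.2e}")
```

In a full model, the forecast produced in normalized space would be passed through the denormalization step with the statistics of its own context window.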

🚀 Usage

📚 Training Details

  • Dataset: M4 Monthly (48,000 series from autogluon/chronos_datasets)
  • Context: 48 months → Predict 18 months
  • Optimizer: AdamW (lr=1e-3 CycleNet / 5e-4 PatchTST, weight_decay=0.01)
  • Schedule: Cosine annealing
  • Early stopping: Patience=12, best val MSE checkpoint
  • Augmentation: Value-flipping (10%), random scaling ±20%
  • 288K training windows from sliding window extraction
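The window extraction and augmentation steps might look like the sketch below. Several details are assumptions because the card does not specify them: the stride, the reading of "value-flipping" as multiplying a window by -1, applying one shared factor to context and target, and the helper names `sliding_windows` and `augment`.

```python
import random

CONTEXT, HORIZON = 48, 18


def sliding_windows(series, stride=1):
    """Yield (context, target) pairs of length CONTEXT and HORIZON."""
    total = CONTEXT + HORIZON
    for start in range(0, len(series) - total + 1, stride):
        yield series[start:start + CONTEXT], series[start + CONTEXT:start + total]


def augment(context, target, rng, flip_prob=0.10, max_scale=0.20):
    """Sign flip with probability flip_prob, then scale by a random factor."""
    # Assumption: "value-flipping" means negating the whole window.
    sign = -1.0 if rng.random() < flip_prob else 1.0
    scale = sign * rng.uniform(1.0 - max_scale, 1.0 + max_scale)
    # One shared factor keeps the context and its target consistent.
    return [scale * x for x in context], [scale * y for y in target]


series = [float(v) for v in range(70)]  # toy series of 70 months
windows = list(sliding_windows(series))
rng = random.Random(0)
aug_context, aug_target = augment(*windows[0], rng)

print(len(windows), "windows from a series of length", len(series))
```

For scale, 288K windows over 48,000 series is an average of six windows per series.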

🌍 Comparison with Foundation Models (2025 SOTA)

| Model | Paper | Params | Reported highlight |
|---|---|---|---|
| Chronos-2 | Amazon, Oct 2025 | 120M | 90.7% fev-bench win rate |
| Sundial | Tsinghua, ICML 2025 Oral | 128M | 1st in MASE on GIFT-Eval |
| Timer-S1 | Tsinghua, Mar 2025 | 8.3B MoE | Best CRPS on GIFT-Eval |
| Chronos-Bolt | Amazon | 205M | 250x faster |
| Ours (Ensemble) | This work | 2.5M | Competitive OWA |

📄 References
