📊 Monthly Seasonal Forecast Model (SOTA 2025)

Monthly seasonal forecasting that combines recent innovations from time series research. On M4 Monthly it is competitive with the Chronos-Bolt-Small foundation model (47M params) while using only ~2.5M parameters.

🏆 Results on M4 Monthly (48,000 test series)

Model	sMAPE ↓	MASE ↓	OWA ↓	Params
Seasonal Naive (baseline)	15.99	1.260	1.000	-
Naive (baseline)	15.26	1.205	0.955	-
CycleNet (MLP+RCF)	13.41	0.989	0.812	215K
SeasonalPatchTST (Transformer+RCF)	13.31	0.978	0.805	2.3M
Ensemble (Ours)	13.15	0.964	0.794	2.5M
Chronos-Bolt-Small (SOTA foundation)	13.03	0.956	0.787	47M

Our lightweight ensemble achieves OWA=0.794, within 0.9% of the Chronos-Bolt-Small foundation model (OWA=0.787), which has roughly 19x as many parameters (47M vs 2.5M).
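The OWA column can be reproduced from the sMAPE and MASE columns. The sketch below assumes OWA is the mean of the two metric ratios taken against the Seasonal Naive row, which is what the table's OWA=1.000 baseline suggests (the M4 competition itself uses Naive2 as the reference). The function name `owa` is illustrative.

```python
def owa(smape, mase, base_smape, base_mase):
    """Mean of the sMAPE ratio and the MASE ratio against a baseline."""
    return 0.5 * (smape / base_smape + mase / base_mase)


# Baseline values: the Seasonal Naive row of the results table.
BASE_SMAPE, BASE_MASE = 15.99, 1.260

# (sMAPE, MASE) pairs copied from the results table.
rows = {
    "Naive": (15.26, 1.205),
    "CycleNet": (13.41, 0.989),
    "Ensemble": (13.15, 0.964),
    "Chronos-Bolt-Small": (13.03, 0.956),
}

scores = {name: owa(s, m, BASE_SMAPE, BASE_MASE) for name, (s, m) in rows.items()}
for name, value in scores.items():
    print(f"{name}: OWA = {value:.3f}")

# Relative OWA gap between the ensemble and Chronos-Bolt-Small.
gap = scores["Ensemble"] / scores["Chronos-Bolt-Small"] - 1
print(f"Ensemble vs Chronos-Bolt-Small: {gap:.1%}")
```

Under that assumption the computed values match the table to three decimals, and the relative gap rounds to the 0.9% quoted above.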

🔬 Architecture

CycleNet (MLP + Residual Cycle Forecasting)

Based on CycleNet:

Learns a 12-month periodic cycle parameter
Subtracts learned cycle → forecasts residuals → adds future cycle
RevIN (Reversible Instance Normalization) for distribution shift
4-layer MLP backbone with GELU activation
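The data flow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the model's actual code: RevIN is applied before cycle removal, the cycle is a 12-value vector in normalized space, and `mean_backbone` is a hypothetical stand-in for the 4-layer GELU MLP. In the demo the cycle is set by hand rather than learned.

```python
from statistics import fmean, pstdev

CYCLE = 12  # W = 12 months


def rcf_forecast(context, start_month, cycle, horizon, backbone, eps=1e-5):
    """RevIN -> subtract cycle -> forecast residuals -> add cycle -> undo RevIN."""
    # RevIN: normalize with this window's own mean and standard deviation.
    mean, std = fmean(context), pstdev(context) + eps
    x = [(v - mean) / std for v in context]

    # Remove the cycle, indexed by calendar position (t mod 12).
    residual = [v - cycle[(start_month + t) % CYCLE] for t, v in enumerate(x)]

    # The backbone only ever sees the residual series.
    future_residual = backbone(residual, horizon)

    # Add the cycle back at the future calendar positions, then reverse RevIN.
    first_future = start_month + len(context)
    y = [r + cycle[(first_future + h) % CYCLE] for h, r in enumerate(future_residual)]
    return [v * std + mean for v in y]


def mean_backbone(residual, horizon):
    """Hypothetical placeholder for the MLP: repeats the residual mean."""
    return [fmean(residual)] * horizon


# Toy check: a perfectly periodic series with a matching hand-set cycle.
pattern = [10, 12, 15, 20, 26, 30, 32, 31, 25, 18, 13, 11]
context = pattern * 4  # 48 months of history
mean, std = fmean(context), pstdev(context) + 1e-5
cycle = [(v - mean) / std for v in pattern]  # set by hand, not learned
forecast = rcf_forecast(context, 0, cycle, 18, mean_backbone)
```

With a cycle that matches the data exactly, the residuals are zero and the 18-month forecast simply continues the annual pattern, which is the behaviour RCF is meant to make easy for the backbone.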

SeasonalPatchTST (Transformer + RCF)

Combines PatchTST with CycleNet innovations:

12-month patches aligned with annual seasonality
CLS token + 4-layer Transformer encoder with 8 attention heads
CycleNet RCF decomposition + RevIN normalization
Pre-LN architecture for training stability
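To make the patching step concrete, here is a small sketch of how a 48-month context could be cut into 12-month patches. It assumes non-overlapping patches (stride equal to patch length) and a context that starts on a cycle boundary; the helper name `seasonal_patches` is illustrative.

```python
PATCH_LEN = 12  # one patch per 12-month cycle


def seasonal_patches(context, patch_len=PATCH_LEN):
    """Split a context window into consecutive, non-overlapping patches."""
    if len(context) % patch_len:
        raise ValueError("context length must be a multiple of patch_len")
    return [context[i:i + patch_len] for i in range(0, len(context), patch_len)]


context = list(range(48))        # stand-in for 48 monthly values
patches = seasonal_patches(context)
num_tokens = 1 + len(patches)    # CLS token + one token per patch

print(len(patches), num_tokens)
```

With this layout, position k inside every patch corresponds to the same month of the cycle, so the encoder compares like with like across years.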

Learned Ensemble

Sigmoid-gated weighted average of CycleNet + PatchTST
Weight learned on validation set
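A sigmoid-gated blend of two forecasts can be sketched as follows. This assumes a single scalar gate shared across all series and horizons, fitted by gradient descent on validation MSE; the function names, learning rate, and step count are illustrative choices rather than the model's actual settings.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def blend(cyclenet_pred, patchtst_pred, logit):
    """Weighted average with weight w = sigmoid(logit) on the CycleNet forecast."""
    w = sigmoid(logit)
    return [w * c + (1.0 - w) * p for c, p in zip(cyclenet_pred, patchtst_pred)]


def fit_gate(cyclenet_preds, patchtst_preds, targets, lr=0.1, steps=1000):
    """Gradient descent on validation MSE with respect to the gate logit."""
    logit = 0.0  # starts at w = 0.5
    n = len(targets)
    for _ in range(steps):
        w = sigmoid(logit)
        # d(MSE)/dw, averaged over all validation points.
        grad_w = sum(
            2.0 * (w * c + (1.0 - w) * p - y) * (c - p)
            for c, p, y in zip(cyclenet_preds, patchtst_preds, targets)
        ) / n
        # Chain rule through the sigmoid: dw/dlogit = w * (1 - w).
        logit -= lr * grad_w * w * (1.0 - w)
    return logit


# Toy validation set: one model is biased by +1, the other by -3.
targets = [10.0, 20.0, 30.0, 40.0]
cyclenet_preds = [y + 1.0 for y in targets]
patchtst_preds = [y - 3.0 for y in targets]

logit = fit_gate(cyclenet_preds, patchtst_preds, targets)
weight = sigmoid(logit)
blended = blend(cyclenet_preds, patchtst_preds, logit)
```

In this toy case the biases cancel at a weight of 0.75, so the fitted gate should settle close to that value. The sigmoid keeps the weight in (0, 1) without needing a constrained optimizer.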

📈 Key Innovations

Residual Cycle Forecasting (RCF): From CycleNet; learns an annual cycle of length W=12 and forecasts the residuals
Seasonal Patching: 12-month patch size matched to annual cycle (vs typical 16 or 32)
RevIN Normalization: Handles diverse scales across 48K series (Macro, Finance, Demographics)
Value-flipping + Scaling Augmentation: From Sundial (ICML 2025 Oral)
CLS Token Aggregation: Global representation for multi-step forecasting
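The augmentation item can be illustrated with a short sketch. It assumes that value-flipping means negating the series, that the 10% figure listed under Training Details is a per-sample probability, and that scaling draws a uniform factor within ±20%. The function `augment` and its parameter names are hypothetical.

```python
import random


def augment(context, target, rng, flip_prob=0.10, scale_range=0.20):
    """Apply one random sign flip and one random scale to a (context, target) pair."""
    sign = -1.0 if rng.random() < flip_prob else 1.0
    scale = rng.uniform(1.0 - scale_range, 1.0 + scale_range)
    factor = sign * scale
    # The same factor is used on both halves so the pair stays consistent.
    return [v * factor for v in context], [v * factor for v in target]


rng = random.Random(0)
context = [100.0, 110.0, 120.0]
target = [130.0, 140.0]
aug_context, aug_target = augment(context, target, rng)

# Empirical flip rate over many draws.
flips = sum(augment([1.0], [1.0], rng)[0][0] < 0 for _ in range(10_000))
print(f"flip rate: {flips / 10_000:.3f}")
```

Applying the same factor to context and target matters: augmenting only the input would change the mapping the model is asked to learn.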

🚀 Usage

📚 Training Details

Dataset: M4 Monthly (48,000 series from autogluon/chronos_datasets)
Context: 48 months → Predict 18 months
Optimizer: AdamW (lr=1e-3 CycleNet / 5e-4 PatchTST, weight_decay=0.01)
Schedule: Cosine annealing
Early stopping: Patience=12, best val MSE checkpoint
Augmentation: Value-flipping (10%), random scaling ±20%
Training windows: 288K, extracted with a sliding window
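Sliding-window extraction with the stated 48-month context and 18-month horizon could look like the sketch below. The stride is not given here, so `stride=1` is only a placeholder, and the helper name `sliding_windows` is illustrative. For scale, 288K windows over 48,000 series works out to six windows per series on average.

```python
CONTEXT_LEN = 48  # months of history fed to the model
HORIZON = 18      # months to predict


def sliding_windows(series, context_len=CONTEXT_LEN, horizon=HORIZON, stride=1):
    """Return every (context, target) pair that fits inside the series."""
    total = context_len + horizon
    windows = []
    for start in range(0, len(series) - total + 1, stride):
        context = series[start:start + context_len]
        target = series[start + context_len:start + total]
        windows.append((context, target))
    return windows


series = list(range(70))  # stand-in for one monthly series of length 70
windows = sliding_windows(series)
print(len(windows))
```

A series shorter than 66 points (48 + 18) yields no windows under this scheme, so short series would need padding or a shorter context if they are to contribute training samples.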

🌍 Comparison with Foundation Models (2025 SOTA)

Model	Paper	Params	Headline result
Chronos-2	Amazon, Oct 2025	120M	90.7% win rate on fev-bench
Sundial	Tsinghua, ICML 2025 Oral	128M	1st in MASE on GIFT-Eval
Timer-S1	Tsinghua, Mar 2025	8.3B MoE	Best CRPS on GIFT-Eval
Chronos-Bolt (Base)	Amazon	205M	Up to 250x faster than the original Chronos
Ours (Ensemble)	This work	2.5M	OWA 0.794 on M4 Monthly

📄 References
