Tube Next-Hop Policy

A GATv2 encoder + MLP decoder trained to predict optimal next-hop routing decisions on the London Underground graph. Given a current station and a destination, the model outputs a probability distribution over adjacent stations.

Greedy rollout achieves 1.09× Dijkstra ratio (optimal shortest travel time) with 100.0% success rate across all 239,610 origin–destination pairs.

Intended Use

This model is a research artifact and demonstration of learned graph routing. It predicts the next station to visit on a shortest-time path between any two London Underground stations, using only the graph topology and travel-time edge weights from TfL's GTFS timetable feed.

Potential applications include learned routing components in transit planning systems, GNN routing benchmarks, and educational demonstrations of graph-attention-based policy learning.

Architecture

Component Details
Encoder 16-layer GATv2, d=512, 8 heads
Decoder 3-layer MLP with adjacency masking
Graph 490 stations, 19 lines, 1752 directed edges
Parameters 10,168,299
Training signal KL divergence vs. Q-soft targets from Floyd–Warshall
Label smoothing 0.1
Scheduled sampling 0.5

Evaluation Results

Evaluated on a held-out set of 23,961 OD pairs (489,981 next-hop steps), with greedy rollout to completion.

Metric Value
Rollout success rate 100.0%
Dijkstra ratio (travel time vs. optimal) 1.09
Step accuracy (single-step top-1) 79.1%
Length ratio (hops vs. optimal hops) inf

Note on step accuracy: The 79.1% top-1 step accuracy reflects that many nodes have multiple equally-optimal next hops. The model distributes probability across these alternatives, which is correct behavior — the rollout metrics confirm optimal routing.

Training Data

The graph topology and edge travel times are extracted from Transport for London's GTFS timetable feed. Floyd–Warshall all-pairs shortest paths provide the Q-value supervision signal.

  • 239,610 OD pairs → 4,891,856 next-hop training steps
  • 90/10 OD-pair split (215,649 train / 23,961 val)
  • Batch size: 4,096 steps

Training Details

  • Optimizer: AdamW, lr=3e-04
  • Compiled with torch.compile(mode='default')
  • Training time: ~58 minutes
  • Hardware: single GPU

Limitations

  • The model is specific to the London Underground graph topology at the time of the GTFS snapshot. It will not generalize to other transit networks without retraining.
  • Edge weights represent scheduled travel times, not real-time conditions.
  • The adjacency mask is fixed at inference time — the model cannot handle station closures or service disruptions without mask modification.

Usage

pip install tubeulator-models[inference]
from tubeulator_models import TubeRouter

router = TubeRouter.from_pretrained("permutans/tube-nexthop-policy")
route = router.route("West Ham", "Shoreditch")
print(route)
# West Ham
#   → [district] Bromley-by-Bow (2.0m)
#   ...
# ✓ 8 hops · 2 lines · 1 transfer · 18.0 min

# With waypoints
route = router.route("Camden Town", "Canary Wharf", via=["King's Cross"])

For CLI usage:

pip install tubeulator-models[cli]
tm-infer policy --model permutans/tube-nexthop-policy -o "West Ham" -d Shoreditch

Links

  • Code: tubeulator-models
  • Companion model: Distance field (value head) for travel-time estimation

Part of the Model Trains series. Trained on GTFS timetable data from Transport for London.

Downloads last month
1
Safetensors
Model size
10.4M params
Tensor type
F32
·
BOOL
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Evaluation results

  • Rollout Success Rate on TfL GTFS Timetable (London Underground)
    self-reported
    1.000
  • Dijkstra Ratio on TfL GTFS Timetable (London Underground)
    self-reported
    1.088
  • Step Accuracy on TfL GTFS Timetable (London Underground)
    self-reported
    0.791
  • Length Ratio on TfL GTFS Timetable (London Underground)
    self-reported
    inf