TensorRT ScatterElements GPU and mapped-host OOB write PoC
This repository demonstrates that the stock TensorRT ScatterElements plugin uses runtime indices in GPU pointer arithmetic without normalization or bounds validation.
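For comparison, the ONNX ScatterElements definition accepts indices in the range [-s, s-1] along an axis of size s, with negative values counting from the end. The following is a minimal pure-Python sketch of the normalization and bounds check that such an operator would be expected to perform. It is not the plugin's code, and the names `normalize_index` and `scatter_elements_1d` are hypothetical helpers introduced only for illustration.

```python
def normalize_index(index, axis_size):
    """Map an ONNX-style index in [-axis_size, axis_size - 1] to [0, axis_size - 1].

    Hypothetical helper: raises IndexError instead of letting an
    out-of-range value reach any pointer arithmetic.
    """
    if not -axis_size <= index < axis_size:
        raise IndexError(
            f"index {index} is out of range for an axis of size {axis_size}"
        )
    # Negative indices count from the end, so -1 selects the last element.
    return index + axis_size if index < 0 else index


def scatter_elements_1d(data, indices, updates):
    """Return a copy of `data` with updates[i] written at normalized indices[i]."""
    if len(indices) != len(updates):
        raise ValueError("indices and updates must have the same length")
    out = list(data)
    for index, value in zip(indices, updates):
        out[normalize_index(index, len(out))] = value
    return out
```

With this sketch, `scatter_elements_1d([0.0] * 4, [-1], [7.0])` writes to the last element, and an index of `4` on the same input raises `IndexError` rather than touching memory outside the buffer.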
Tested configurations:
- TensorRT 11.1.0.106 on Windows
- TensorRT 10.16.1.11 on Linux
- Triton Inference Server 2.69.0 with TensorRT 10.16.1.11
- NVIDIA RTX 3080 Laptop GPU
The artifacts are synthetic. They do not execute host code or access files.
Local primitive
Build the engine and run the local proof:
python make_engine.py
python run_probe.py
The proof checks three cases:
- An out-of-range positive index updates a separate CUDA allocation.
- The documented-valid index -1 writes before the output buffer instead of selecting its last element.
- An in-range control leaves the marker allocation unchanged.
host_pinned_probe.py additionally targets a benign cudaHostAllocPortable marker through its CUDA device mapping:
python host_pinned_probe.py
Expected result:
CROSS_ALLOCATION_WRITE=True
VALID_NEGATIVE_INDEX_OOB_WRITE=True
CONTROL_CLEAN=True
MAPPED_HOST_WRITE=True
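When comparing a captured run against the expected result above, a small checker can save manual reading. The sketch below assumes the probe output contains `NAME=True` or `NAME=False` lines exactly as shown; `parse_flags` and `missing_or_false` are hypothetical helpers, not part of the repository.

```python
# Flag names copied from the expected result listed above.
EXPECTED_FLAGS = (
    "CROSS_ALLOCATION_WRITE",
    "VALID_NEGATIVE_INDEX_OOB_WRITE",
    "CONTROL_CLEAN",
    "MAPPED_HOST_WRITE",
)


def parse_flags(text):
    """Collect NAME=True / NAME=False lines into a dict of booleans."""
    flags = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and value in ("True", "False"):
            flags[key] = value == "True"
    return flags


def missing_or_false(text):
    """Return the expected flags that are absent or not True in `text`."""
    flags = parse_flags(text)
    return [name for name in EXPECTED_FLAGS if flags.get(name) is not True]
```

An empty list from `missing_or_false(captured_output)` means every expected flag was reported as `True`.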
Triton cross-model proof
The Linux plans were built with TensorRT 10.16.1.11:
- scatter_elements_trt10.plan
- victim_slow_trt10.plan
start_triton.sh expects the official Triton 2.69.0 standalone bundle under ~/triton-2.69/server/tritonserver and the Python environment under ~/triton-2.69/venv.
Run the control (in-range index 0) and then the attack (index 740), each against a freshly started server:
# control
./start_triton.sh
python triton_setup.py
python triton_race_probe.py --skip-load 0 200 0.0
# attack
./start_triton.sh
python triton_setup.py
python triton_race_probe.py --skip-load 740 200 0.0
triton_setup.py uses Triton's model-load API to automate operator setup. With --skip-load, triton_race_probe.py makes no model-control request; the attack phase consists solely of ordinary inference requests to the deployed scatter_writer model while a co-resident model is executing.
Expected result:
# control
victim_before=31343.25
victim_after=31343.25
TARGET_CHANGED=False
# attack
victim_before=31343.25
victim_after=32680.5
TARGET_CHANGED=True
The fixed index 740 was reproduced against fresh, uninstrumented Triton processes. cuda_trace.c is included only to document how the relative GPU offset was initially measured; it is not loaded for the final control or attack runs.
The same primitive reaches Triton's CUDA-pinned host pool. triton_host_metadata_output.txt records a remote inference request changing a live pool metadata word at 0x204c00030 from 0x10000000 to 0x44a72800. GDB was used only to read the before/after bytes; the server and attack request were uninstrumented.
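One way to read the numbers above, offered as an observation rather than something the repository states: reinterpreting the new metadata word as an IEEE-754 float32 yields the same value as the change in the victim output. The sketch below only performs that arithmetic on the published values. Whether both effects come from the same update value is an assumption.

```python
import struct


def word_as_float32(word):
    """Reinterpret a 32-bit unsigned integer as an IEEE-754 float32."""
    return struct.unpack("<f", struct.pack("<I", word))[0]


# Values copied from the text above.
after_word = 0x44A72800
victim_before = 31343.25
victim_after = 32680.5

print(word_as_float32(after_word))   # 1337.25
print(victim_after - victim_before)  # 1337.25
```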
Evidence
- windows_stock_output.txt: local TensorRT 11.1 proof
- linux_stock_output.txt: local TensorRT 10.16 proof
- triton_control_output.txt: 200 in-range inference requests
- triton_attack_repeatability.txt: three fresh Triton attack runs
- windows_host_pinned_output.txt and linux_host_pinned_output.txt: mapped host-marker writes
- triton_host_metadata_output.txt: remote write into live Triton pinned-pool metadata
SHA-256
02cc96cfa0c55c21058c0266ec85f72326f0c19cf245f01dac95f858d20c16fd scatter_elements.engine
37640bd4f3d153cb23e3e147f8127d7ce0558065fd09de21593d9a26f52bc4ae scatter_elements_trt10.plan
1f51adc48f98615d548a5a3a7f0f320e3c1b23e7f48205b21554ba850379d7b0 victim_slow_trt10.plan
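The digests above can be checked before loading any artifact. A minimal sketch using Python's standard `hashlib` follows. It assumes the three files sit in a single directory, and `sha256_of` and `verify` are hypothetical helper names.

```python
import hashlib
from pathlib import Path

# Digests copied from the list above.
EXPECTED = {
    "scatter_elements.engine":
        "02cc96cfa0c55c21058c0266ec85f72326f0c19cf245f01dac95f858d20c16fd",
    "scatter_elements_trt10.plan":
        "37640bd4f3d153cb23e3e147f8127d7ce0558065fd09de21593d9a26f52bc4ae",
    "victim_slow_trt10.plan":
        "1f51adc48f98615d548a5a3a7f0f320e3c1b23e7f48205b21554ba850379d7b0",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large engine files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Return {file name: "ok" | "MISMATCH" | "missing"} for each listed artifact."""
    results = {}
    for name, expected in EXPECTED.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif sha256_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "MISMATCH"
    return results
```

Calling `verify(".")` from the repository root returns one status per artifact. Anything other than `"ok"` means the file should not be treated as the published artifact.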