TLB Shootdown IPI-Storm Slippage Playbook

2026-03-17 · finance

Why this exists

Execution systems can show stable network telemetry while still suffering sudden latency-tail blowups.

One under-modeled cause is TLB shootdown storms: frequent page-table changes force cross-core IPIs (inter-processor interrupts), stalling critical threads at the worst moments.

If strategy, risk, and gateway threads share NUMA nodes/cores with memory-churn-heavy processes, these shootdowns can create bursty dispatch delays that look like "random slippage" unless you model them explicitly.


Core failure mode

TLB shootdowns occur when memory mapping metadata changes (unmap/remap, page migration, aggressive allocator behavior, THP split/defrag paths). The kernel sends IPIs so other cores invalidate stale TLB entries.
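On Linux, cumulative shootdown IPI counts are exposed per core in the `TLB` row of `/proc/interrupts`. A minimal parsing sketch (the helper name is ours and the sample text below is synthetic):

```python
# Sketch: extract per-core TLB shootdown IPI counts from /proc/interrupts.
# The "TLB:" row label and "TLB shootdowns" description are standard on
# Linux x86; SAMPLE is synthetic data for illustration.

def parse_tlb_shootdowns(interrupts_text: str) -> list[int]:
    """Return per-CPU cumulative TLB shootdown counts, or [] if absent."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "TLB:":
            # Numeric fields after the label are per-CPU counters; the
            # trailing description ("TLB shootdowns") is non-numeric.
            return [int(f) for f in fields[1:] if f.isdigit()]
    return []

SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 LOC:  123456789  123456790  123456791  123456792   Local timer interrupts
 TLB:      10234      98765       4321     555555   TLB shootdowns
"""

print(parse_tlb_shootdowns(SAMPLE))  # → [10234, 98765, 4321, 555555]
```

Sampling this counter at a fixed interval and differencing gives a per-core shootdown rate, the raw signal behind everything below.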

During stressed windows:

  • critical threads get preempted mid-path to service shootdown IPIs
  • dispatch timing jitters as cores stall on TLB invalidation
  • post-burst recovery releases queued work all at once, compounding delay

Result: arrival-to-send drift rises exactly when microstructure is least forgiving.


Slippage decomposition with shootdown term

For parent order i:

[ IS_i = C_{delay} + C_{impact} + C_{miss} + C_{tlb} ]

Where:

  • C_{delay}: cost of delay between decision and order send
  • C_{impact}: market impact of executed child orders
  • C_{miss}: opportunity cost of unfilled quantity
  • C_{tlb}: incremental cost attributable to kernel interference, decomposed as

[ C_{tlb} = C_{ipi-preempt} + C_{dispatch-jitter} + C_{burst-recovery} ]
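A minimal sketch of the decomposition as code (the field names and the basis-point sign convention are illustrative, not a fixed schema):

```python
from dataclasses import dataclass

@dataclass
class SlippageBreakdown:
    # All costs in basis points of arrival price; positive = cost.
    c_delay: float   # arrival-to-send delay cost
    c_impact: float  # market impact of executed child orders
    c_miss: float    # opportunity cost of unfilled quantity
    c_tlb: float     # kernel-interference uplift: sum of the
                     # ipi-preempt, dispatch-jitter, and
                     # burst-recovery sub-terms

    def implementation_shortfall(self) -> float:
        """IS_i = C_delay + C_impact + C_miss + C_tlb."""
        return self.c_delay + self.c_impact + self.c_miss + self.c_tlb

bd = SlippageBreakdown(c_delay=1.2, c_impact=2.5, c_miss=0.8, c_tlb=0.6)
print(bd.implementation_shortfall())
```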


Feature set (production-ready)

1) Kernel/CPU pressure features

Collect per-host, per-core, and per-NUMA metrics:

  • TLB shootdown IPI counts (e.g. the TLB row in /proc/interrupts)
  • involuntary context switches and run-queue delay for pinned threads
  • IRQ/softirq time share per core and per NUMA node

2) Memory-churn context features

  • page-fault and page-migration rates
  • THP split/collapse and compaction/defrag events
  • map/unmap churn from processes co-resident with execution threads

3) Execution-path timing features

  • arrival-to-send latency distribution (mean, q95, q99)
  • dispatch jitter and detected stall/gap durations in critical threads

4) Microstructure outcome features

  • realized slippage vs arrival price
  • fill/miss rates and queue-position loss after stalls

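The kernel-pressure features above can be derived from cumulative counters. A sketch, assuming per-core cumulative shootdown counts sampled at a fixed interval (the `RollingZScore` helper and its window size are illustrative choices):

```python
from collections import deque
import statistics

def shootdown_rate(prev_counts, curr_counts, dt_s):
    """Per-core shootdown IPIs/sec from two cumulative snapshots."""
    return [(c - p) / dt_s for p, c in zip(prev_counts, curr_counts)]

class RollingZScore:
    """Z-score of the latest value against a short rolling baseline,
    so 'elevated' is defined relative to this host's recent normal."""
    def __init__(self, window=60):
        self.buf = deque(maxlen=window)

    def update(self, x):
        z = 0.0
        if len(self.buf) >= 2:
            mu = statistics.fmean(self.buf)
            sd = statistics.pstdev(self.buf) or 1.0  # avoid divide-by-zero
            z = (x - mu) / sd
        self.buf.append(x)
        return z
```

A per-core rate plus its rolling z-score is usually enough input for the regime labeler below; richer models can add the churn and timing features alongside.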

Labeling scheme for supervised overlay

Create host-time regime labels:

  • CLEAN: baseline shootdown rate, no detected stalls
  • WATCH: elevated shootdown rate or early dispatch jitter
  • STORM: sustained IPI bursts with measured dispatch stalls
  • SAFE_DEGRADED: post-storm recovery under conservative tactics

Use hysteresis (entry/exit thresholds) + minimum dwell time to avoid state flapping.
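The hysteresis plus dwell-time rule can be sketched as a small state machine. Thresholds and the two-state CLEAN/STORM simplification are illustrative; a production labeler would carry WATCH and SAFE_DEGRADED as well:

```python
def label_regimes(rates, enter=500.0, exit_=200.0, min_dwell=3):
    """Label each shootdown-rate sample CLEAN or STORM.

    Entry (enter) and exit (exit_) thresholds differ (hysteresis), and a
    transition is only allowed after min_dwell samples in the current
    state, which suppresses state flapping on noisy rates.
    """
    state, dwell, labels = "CLEAN", min_dwell, []
    for r in rates:
        want = state
        if state == "CLEAN" and r >= enter:
            want = "STORM"
        elif state == "STORM" and r <= exit_:
            want = "CLEAN"
        if want != state and dwell >= min_dwell:
            state, dwell = want, 0
        labels.append(state)
        dwell += 1
    return labels

# A one-sample dip below the exit threshold does not flip the state back:
print(label_regimes([100, 600, 100, 600, 600, 600, 100, 100, 100, 100]))
```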


Modeling architecture

Use a two-layer design:

  1. Baseline slippage model

    • your existing impact/fill/deadline model conditioned on market state
  2. Kernel-interference uplift model

    • predicts incremental uplift:
      • delta_is_mean
      • delta_is_q95
      • delta_miss_prob

Final estimate:

[ \hat{IS}_{final} = \hat{IS}_{baseline} + \Delta\hat{IS}_{tlb} ]

Train with matched controls (same symbol/session/liquidity/volatility bins) so host-side interference uplift is disentangled from market regime changes.
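A toy version of the matched-control uplift estimate. The flat `(bin_key, regime, is_bps)` sample format is an assumption; a real pipeline would build `bin_key` from symbol/session/liquidity/volatility bins:

```python
from collections import defaultdict
from statistics import fmean

def interference_uplift(samples):
    """Estimate mean IS uplift attributable to interference.

    samples: iterable of (bin_key, regime, is_bps), where bin_key groups
    observations with matched market conditions. Uplift is the within-bin
    mean IS difference (STORM minus CLEAN), averaged over bins containing
    both regimes, so market-regime effects cancel out.
    """
    by_bin = defaultdict(lambda: {"CLEAN": [], "STORM": []})
    for key, regime, is_bps in samples:
        by_bin[key][regime].append(is_bps)
    deltas = [fmean(g["STORM"]) - fmean(g["CLEAN"])
              for g in by_bin.values() if g["CLEAN"] and g["STORM"]]
    return fmean(deltas) if deltas else 0.0
```

The same within-bin differencing generates training targets (`delta_is_mean`, and analogously `delta_is_q95` / `delta_miss_prob`) for the uplift model.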


Online controller policy

State CLEAN

  • normal tactics; no interference adjustment

State WATCH

  • tighten child-order pacing and pre-stage failover routing

State STORM

  • suppress aggressive tactics; route high-urgency flow to healthy hosts

State SAFE_DEGRADED

  • resume via bounded recovery pacing; never flush the backlog immediately

Add cooldown before returning to aggressive tactics.
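A sketch of the four-state controller with an explicit cooldown. Thresholds and the de-escalation path through SAFE_DEGRADED are illustrative policy choices:

```python
class InterferenceController:
    """Escalate on rising shootdown pressure; de-escalate only through a
    cooldown so tactics do not snap back to aggressive immediately."""

    def __init__(self, watch_at=200.0, storm_at=500.0, cooldown=5):
        self.watch_at, self.storm_at, self.cooldown = watch_at, storm_at, cooldown
        self.state, self.quiet = "CLEAN", 0  # quiet = consecutive calm samples

    def step(self, rate):
        """Advance one sample of shootdown rate; return the new state."""
        if rate >= self.storm_at:
            self.state, self.quiet = "STORM", 0
        elif rate >= self.watch_at:
            if self.state != "STORM":
                self.state = "WATCH"
            self.quiet = 0
        else:
            self.quiet += 1
            if self.state in ("STORM", "SAFE_DEGRADED"):
                # Leave STORM via SAFE_DEGRADED, then cool down to CLEAN.
                self.state = "SAFE_DEGRADED"
                if self.quiet >= self.cooldown:
                    self.state, self.quiet = "CLEAN", 0
            elif self.state == "WATCH" and self.quiet >= self.cooldown:
                self.state, self.quiet = "CLEAN", 0
        return self.state
```

Note that a storm never relaxes directly to CLEAN: it must pass through SAFE_DEGRADED and sit out the cooldown first.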


Desk metrics to monitor

  • arrival-to-send drift (mean and tail quantiles) by regime state
  • realized IS uplift in WATCH/STORM windows vs matched CLEAN windows
  • miss probability and queue-position loss after detected stalls
  • per-host shootdown IPI rate alongside regime-state occupancy

Slice by host, symbol-liquidity bucket, strategy type, and session segment.


Mitigation ladder (practical)

  1. Isolation first

    • pin critical execution threads to shielded cores
    • isolate noisy background services from execution NUMA domains
  2. Memory-path hygiene

    • reduce high-frequency map/unmap behavior in latency-critical processes
    • avoid allocator patterns that trigger mapping churn
    • tune/contain auto-NUMA migration where harmful
  3. Execution dampers

    • bounded recovery pacing after detected stalls
    • prevent immediate backlog flush that destroys queue priority
  4. Failover discipline

    • route high-urgency flow away from storming hosts
    • keep host health score in pre-trade routing logic
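The "bounded recovery pacing" damper from step 3 can be as simple as a capped per-tick backlog drain (the function name, tick abstraction, and units are illustrative):

```python
def paced_release(backlog, max_per_tick, ticks):
    """Drain a post-stall dispatch backlog at a capped per-tick rate
    instead of flushing it at once (which would destroy queue priority
    and spike impact). Returns (per-tick releases, leftover backlog)."""
    released, remaining = [], backlog
    for _ in range(ticks):
        n = min(remaining, max_per_tick)
        released.append(n)
        remaining -= n
        if remaining == 0:
            break
    return released, remaining

print(paced_release(10, 3, 5))  # → ([3, 3, 3, 1], 0)
```

Anything left in `remaining` at the end of the pacing window is a candidate for cancellation or reroute rather than a late flush.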

Validation drills

  1. Historical replay drill

    • replay known IPI-storm windows and verify early WATCH detection
  2. Counterfactual dispatch drill

    • compare naive catch-up vs bounded recovery pacing
  3. Confounder drill

    • separate kernel-interference effects from exchange/network-wide events
  4. Failover drill

    • validate stateful reroute to healthy hosts without tactic thrash

Anti-patterns

  • writing off interference-driven losses as "random slippage" noise
  • flushing the dispatch backlog immediately after a stall
  • retuning impact models to absorb kernel-side timing losses
  • regime labels that flap without hysteresis, dwell time, or cooldown


Bottom line

TLB shootdown IPI storms are a real execution tax: not loud enough to look like outages, but large enough to erode basis points through timing-quality decay.

If you do not model host-interference uplift explicitly, your slippage controller will misattribute losses to market noise and overfit the wrong levers.