IOMMU TLB Flush-Storm DMA-Remap Slippage Playbook

2026-03-19 · finance

Why this exists

Execution stacks can pass ordinary CPU/network health checks and still leak p95/p99 implementation shortfall.

One under-modeled source is IOMMU translation pressure: NIC DMA mappings/unmappings trigger IOTLB invalidations, and under bursty traffic or allocator churn this can become a flush storm.

When that happens, packet/descriptor handling latency becomes uneven, and order-path timing starts paying an invisible tax.


Core failure mode

Under high packet turnover and frequent DMA map updates:

  • IOTLB invalidations pile up into a flush storm, forcing repeated translation re-walks
  • DMA completion latency becomes uneven across NIC queues
  • packet/descriptor handling stalls intermittently, smearing order-path timing

Result: tail slippage rises even if average RTT/CPU still looks "normal."


Slippage decomposition with IOMMU term

For parent order (i):

[ IS_i = C_{delay} + C_{impact} + C_{miss} + C_{iommu} ]

Where:

[ C_{iommu} = C_{\text{dma-jitter}} + C_{\text{service-burst}} + C_{\text{queue-decay}} ]

  • C_{dma-jitter}: cost of uneven DMA translation/completion timing on the packet path
  • C_{service-burst}: cost of bursty descriptor servicing while flushes drain
  • C_{queue-decay}: cost of queue-position and signal decay while sends are delayed
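
To make the accounting concrete, here is a toy per-order decomposition; every component value (in bps) is purely illustrative, not calibrated:

```python
# Hypothetical per-order decomposition; all values below are illustrative.
c_delay, c_impact, c_miss = 1.8, 2.4, 0.6                      # baseline terms
c_dma_jitter, c_service_burst, c_queue_decay = 0.3, 0.5, 0.2   # IOMMU overlay terms

c_iommu = c_dma_jitter + c_service_burst + c_queue_decay
is_total = c_delay + c_impact + c_miss + c_iommu

print(round(c_iommu, 2))   # 1.0
print(round(is_total, 2))  # 5.8
```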


Feature set (production-ready)

1) Host / DMA-path features

2) Execution timing features

3) Outcome features
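
A sketch of what one training row could look like, assuming hypothetical field names for each feature family (the real schema depends on the desk's telemetry):

```python
from dataclasses import dataclass

@dataclass
class RemapPressureRow:
    # 1) host / DMA-path features (names assumed)
    iotlb_inval_rate: float      # IOTLB invalidations per second
    dma_map_churn: float         # DMA map/unmap calls per second
    numa_remote_ratio: float     # share of DMA buffers on a remote NUMA node
    # 2) execution timing features (names assumed)
    tick_to_order_p95_us: float  # p95 tick-to-order latency, microseconds
    ack_jitter_us: float         # venue-ack timing jitter, microseconds
    # 3) outcome features (label)
    is_bps: float                # realized implementation shortfall, bps

row = RemapPressureRow(12_000.0, 45_000.0, 0.18, 9.5, 2.1, 5.8)
print(row.is_bps)  # 5.8
```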


Model architecture

Use a baseline + remap-overlay design:

  1. Baseline slippage model
    • spread/impact/fill/deadline stack
  2. IOMMU pressure overlay
    • predicts incremental uplift:
      • delta_is_mean
      • delta_is_q95

Final estimate:

[ \hat{IS}_{final} = \hat{IS}_{baseline} + \Delta\hat{IS}_{iommu} ]

Train with matched market windows (symbol/session/volatility/liquidity bucket) across different remap-pressure states to isolate infra effects from market confounders.


Regime controller

State A: MAP_STABLE (remap pressure low; run the normal execution policy)

State B: PRESSURE_WATCH (pressure indicators rising; tighten pacing and pre-position fallbacks)

State C: FLUSH_STORM (storm active; contain bursts and defer non-urgent flow)

State D: SAFE_DMA_CONTAIN (reroute urgent flow to low-pressure host/queue pools until pressure clears)

Use hysteresis + minimum dwell time to prevent policy flapping.
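
One way to implement the hysteresis-plus-dwell rule, assuming a scalar remap-pressure score in [0, 1]; the thresholds and the score itself are assumptions, not production values:

```python
# Four-state ladder with separate enter/exit thresholds (hysteresis) and a
# minimum dwell time so the controller cannot flap between states.
STATES = ["MAP_STABLE", "PRESSURE_WATCH", "FLUSH_STORM", "SAFE_DMA_CONTAIN"]
ENTER = [0.4, 0.7, 0.9]   # pressure needed to escalate past each boundary
EXIT  = [0.3, 0.6, 0.8]   # pressure must drop below this to de-escalate

class RegimeController:
    def __init__(self, min_dwell=5):
        self.idx = 0              # start in MAP_STABLE
        self.ticks_in_state = 0
        self.min_dwell = min_dwell

    def step(self, pressure: float) -> str:
        self.ticks_in_state += 1
        if self.ticks_in_state >= self.min_dwell:
            if self.idx < len(ENTER) and pressure >= ENTER[self.idx]:
                self.idx += 1
                self.ticks_in_state = 0
            elif self.idx > 0 and pressure < EXIT[self.idx - 1]:
                self.idx -= 1
                self.ticks_in_state = 0
        return STATES[self.idx]

ctrl = RegimeController(min_dwell=3)
print([ctrl.step(p) for p in [0.5, 0.5, 0.5, 0.5]])
# ['MAP_STABLE', 'MAP_STABLE', 'PRESSURE_WATCH', 'PRESSURE_WATCH']
```

Note that escalation moves one state per tick at most, which itself damps flapping during noisy pressure spikes.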


Desk metrics

Track by host pool, NIC model/driver, NUMA placement, symbol-liquidity bucket, and session segment.
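
A minimal sketch of that keying, with illustrative field values (the field names are assumptions, not a fixed schema):

```python
from collections import defaultdict

# Slippage samples keyed by the tracking dimensions listed above.
samples = defaultdict(list)

def record(host_pool, nic_driver, numa_node, liq_bucket, session, is_bps):
    samples[(host_pool, nic_driver, numa_node, liq_bucket, session)].append(is_bps)

record("pool-a", "mlx5", 0, "liquid", "open", 4.2)
record("pool-a", "mlx5", 0, "liquid", "open", 6.0)

key = ("pool-a", "mlx5", 0, "liquid", "open")
print(len(samples[key]))  # 2
```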


Mitigation ladder

  1. Mapping churn reduction
    • prefer stable DMA mapping strategies and buffer lifecycle discipline
  2. NUMA and queue locality hygiene
    • align NIC queues, CPU affinity, and memory locality
  3. Burst-containment execution policy
    • bounded catch-up pacing over panic flushes
  4. Topology-aware routing
    • route urgent flow away from hosts/queues with rising DFI/PSO
  5. Change-aware recalibration
    • re-fit overlay after kernel/NIC-driver/IOMMU config updates
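
Step 4 can be as simple as a pressure-weighted host choice. DFI and PSO are taken here as the per-host pressure scores the text refers to; the equal weighting and sample values are assumptions:

```python
# Pick the execution host with the lowest combined remap-pressure score.
def pick_host(hosts):
    """hosts: list of (name, dfi, pso) tuples, scores in [0, 1]."""
    return min(hosts, key=lambda h: 0.5 * h[1] + 0.5 * h[2])[0]

hosts = [("exec-01", 0.8, 0.7), ("exec-02", 0.2, 0.3), ("exec-03", 0.5, 0.4)]
print(pick_host(hosts))  # exec-02
```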

Failure drills (must run)

  1. Flush-burst replay drill
    • verify early transition to PRESSURE_WATCH
  2. Storm containment drill
    • confirm bounded recovery beats panic catch-up on q95 IS
  3. Confounder separation drill
    • distinguish remap-pressure effects from pure venue/network latency shocks
  4. Fallback path drill
    • validate safe reroute to low-pressure host/queue pools under stress
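
For drill 2, the pass criterion can be expressed directly as a quantile comparison; the replayed IS samples below are illustrative drill outputs, not real measurements:

```python
from statistics import quantiles

def q95(xs):
    # 95th percentile via statistics.quantiles (default exclusive method).
    return quantiles(xs, n=20)[-1]

bounded = [3.0, 3.2, 3.5, 4.0, 4.1, 4.3, 4.6, 5.0, 5.2, 5.5]   # bounded catch-up replay
panic   = [3.1, 3.4, 4.0, 4.8, 5.5, 6.2, 7.0, 8.1, 9.4, 11.0]  # panic catch-up replay

assert q95(bounded) < q95(panic)   # drill passes: bounded recovery wins on q95 IS
print("containment drill passed")
```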

Anti-patterns

  • treating IOMMU as a binary on/off toggle rather than a dynamic translation-pressure source
  • panic catch-up of deferred flow after a storm instead of bounded pacing
  • trusting average RTT/CPU dashboards while p95/p99 shortfall leaks
  • keeping a stale overlay fit after kernel/NIC-driver/IOMMU config updates

Bottom line

IOMMU is often viewed as a security/performance toggle, but in execution systems the real issue is translation-pressure dynamics.

If IOTLB flush storms are not modeled as a slippage factor, tail cost will keep leaking through “normal-looking” infra dashboards.