RX Interrupt Coalescing as a Hidden Slippage Engine (Practical Playbook)

2026-03-16 · finance

Audience: low-latency execution teams running Linux NIC stacks (kernel or user-space) between market data ingest and routing


Why this matters

Execution teams often model slippage with market-state features (spread, volatility, imbalance, queue signals) and strategy-state features (urgency, residual, participation).

But many desks still ignore a control-plane source of cost drift:

  1. market-data packets arrive at the NIC,
  2. the NIC coalesces interrupts (rx-usecs, rx-frames, adaptive modes),
  3. packets are delivered to software in bursts,
  4. decision loops and child-order emits phase-lock to those bursts,
  5. queue entry timing degrades and tail markouts widen.

This appears as “random latency noise” in TCA, while the root cause is often deterministic batching in the receive path.


1) Mechanism: how coalescing leaks into execution cost

Interrupt coalescing delays IRQ delivery until either:

  1. a timeout expires (rx-usecs microseconds after the first pending packet), or
  2. enough packets have accumulated (rx-frames).

So the application sees packet arrivals in clusters instead of with near-original micro-timing.

Let true packet arrivals be \(\{t_i\}\), and app-visible arrivals be \(\{\tilde t_i\}\). Under coalescing:

\[ \tilde t_i = t_i + \delta_i, \]

where \(\delta_i\) is state-dependent (traffic intensity, queue occupancy, coalescing settings, NAPI/softirq contention).

The desk then optimizes on \(\tilde t_i\), not \(t_i\): queue-join decisions, cancel/replace timing, and aggression triggers all key off the distorted timeline.
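The distortion \(\delta_i\) can be simulated directly. A minimal sketch under a simplified NIC model (an interrupt fires either when rx-usecs have elapsed since the first pending packet or when rx-frames packets are pending; real NICs and NAPI add further structure):

```python
def coalesce(arrivals, rx_usecs=50, rx_frames=8):
    """Simulate NIC interrupt coalescing: packets queue until either
    rx_usecs have elapsed since the first pending packet or rx_frames
    packets are pending; the whole batch is then delivered at once.
    `arrivals` are true wire timestamps in seconds, sorted ascending."""
    timeout = rx_usecs * 1e-6
    delivered, pending = [], []
    for t in arrivals:
        # Fire any timer-triggered interrupt that expires before this arrival.
        while pending and t >= pending[0] + timeout:
            fire = pending[0] + timeout
            delivered += [fire] * len(pending)
            pending = []
        pending.append(t)
        if len(pending) >= rx_frames:      # frame-count trigger
            delivered += [t] * len(pending)
            pending = []
    if pending:                            # flush the final batch on timeout
        delivered += [pending[0] + timeout] * len(pending)
    return delivered

# Example: three packets 10 us apart, then a straggler 500 us later.
# The first three are delivered together when the 50 us timer fires.
deliveries = coalesce([0.0, 10e-6, 20e-6, 500e-6])
```

Subtracting `arrivals` from `deliveries` element-wise gives the per-packet \(\delta_i\) that the rest of this note treats as the hidden cost driver.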


2) Slippage branch model with coalescing distortion

For each parent order, model three execution branches:

  1. TIMELY branch: low delivery distortion, normal queue interaction, cost \(C_T\)
  2. BURST-SYNC branch: clustered decisions/dispatch, queue competition rises, cost \(C_B\)
  3. LATE-RECOVERY branch: residual catch-up after stale windows, aggressive cleanup, cost \(C_L\)

Expected cost:

\[ E[C] = p_T C_T + p_B C_B + p_L C_L, \]

with typical ordering \(C_T < C_B < C_L\).

Most desks only optimize \(C_T\)-centric behavior. The practical gain comes from reducing \(p_B\) and \(p_L\).
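A toy illustration of the branch arithmetic, with hypothetical per-branch costs in bps and made-up probability mixes (real values would come from your TCA data):

```python
def expected_cost(p, c):
    """E[C] = p_T*C_T + p_B*C_B + p_L*C_L over the three branches."""
    assert abs(sum(p) - 1.0) < 1e-9, "branch probabilities must sum to 1"
    return sum(pi * ci for pi, ci in zip(p, c))

# Hypothetical branch costs in bps, ordered C_T < C_B < C_L.
costs = (1.0, 3.0, 8.0)
# Hypothetical probability mixes before/after anti-burst controls:
# mass moves from BURST-SYNC and LATE-RECOVERY into TIMELY.
before = (0.70, 0.22, 0.08)
after = (0.88, 0.09, 0.03)
saving = expected_cost(before, costs) - expected_cost(after, costs)
```

Even though \(C_T\) is untouched, shifting probability mass out of the expensive branches drops expected cost from 2.00 to 1.39 bps in this made-up mix.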


3) Detection metrics (new KPI set)

3.1 Market Data Distortion Ratio (MDDR)

Compare ingress-level inter-arrival structure (packet capture / NIC timestamp) vs app-level event spacing:

\[ MDDR = \frac{Q95(\Delta \tilde t)}{Q95(\Delta t)}. \]

Persistent \(MDDR \gg 1\) indicates receive-path time dilation.

3.2 Burst Delivery Concentration (BDC)

In short windows \(w\) (e.g., 100–500 \(\mu s\)):

\[ BDC = \frac{\max_w N_w}{\sum_w N_w}, \]

where \(N_w\) is the number of delivered market-data events in window \(w\). High BDC means software sees compressed bursts.
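MDDR and BDC are cheap to compute from two timestamp streams. A sketch, assuming lists of wire-level and app-level event times in seconds (the 250 µs window is one of the illustrative values above):

```python
def q(xs, p):
    """p-th percentile with linear interpolation between closest ranks."""
    s = sorted(xs)
    k = (len(s) - 1) * p / 100.0
    f, c = int(k), min(int(k) + 1, len(s) - 1)
    return s[f] + (s[c] - s[f]) * (k - f)

def mddr(wire_ts, app_ts):
    """Q95 of app-level inter-arrival gaps over Q95 of wire-level gaps."""
    gaps = lambda ts: [b - a for a, b in zip(ts, ts[1:])]
    return q(gaps(app_ts), 95) / q(gaps(wire_ts), 95)

def bdc(app_ts, window=250e-6):
    """Share of delivered events landing in the single busiest window."""
    counts = {}
    for t in app_ts:
        b = int(t // window)
        counts[b] = counts.get(b, 0) + 1
    return max(counts.values()) / len(app_ts)
```

For example, 100 wire events spaced evenly at 10 µs but delivered to the app in ten batches of ten produce an MDDR near 10 and a BDC far above the uniform baseline.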

3.3 Decision-Burst Coupling (DBC)

Correlation between delivery bursts and child-order emit bursts. High DBC means execution cadence is being driven by NIC/software batching rather than market intent.

3.4 Coalescing Tax Estimate (CTE)

\[ CTE_{\tau} = E[M_{\tau} \mid \text{high BDC}] - E[M_{\tau} \mid \text{low BDC}], \]

for markout horizons \(\tau \in \{1s, 5s, 30s\}\).
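A minimal estimator of the conditional markout difference, assuming per-episode markouts and BDC readings are already joined; the 0.5 split threshold is arbitrary here and would be set from the observed BDC distribution in practice:

```python
def coalescing_tax(markouts, bdc_values, threshold=0.5):
    """CTE for one horizon: mean markout over high-BDC episodes minus
    mean markout over low-BDC episodes, split at a BDC threshold.
    A negative value means high-BDC episodes carry worse markouts."""
    hi = [m for m, b in zip(markouts, bdc_values) if b >= threshold]
    lo = [m for m, b in zip(markouts, bdc_values) if b < threshold]
    if not hi or not lo:
        raise ValueError("need episodes on both sides of the threshold")
    return sum(hi) / len(hi) - sum(lo) / len(lo)
```

In production this would be computed per \(\tau\) and per strategy-symbol-venue cell, not pooled, since the tax concentrates unevenly.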

3.5 Data/Control Causality Drift (DCCD)

Fraction of episodes where control acknowledgements (ACK/drop-copy/risk updates) arrive with inconsistent ordering relative to market-data transition timing under heavy coalescing.


4) Feature contract additions for slippage models

Add control-plane features explicitly:

  1. trailing-window MDDR, BDC, and DBC
  2. CTE by markout horizon
  3. DCCD breach flags
  4. active coalescing configuration (rx-usecs, rx-frames, adaptive mode) per receive queue

Without these, models misattribute control artifacts to “market regime shifts.”


5) Live state machine

Use a coalescing-aware execution controller:

  1. FLOW_CLEAN

    • low MDDR/BDC/DBC
    • standard tactic menu
  2. DELIVERY_CLUSTERED

    • rising BDC + mild CTE
    • reduce cancel/replace churn, smooth dispatch
  3. PHASE_LOCKED

    • high DBC + adverse CTE
    • anti-burst pacing, tighter aggression gates, reserve capacity for cancels/exits
  4. SAFE_PACING

    • sustained high CTE or DCCD breaches
    • prioritize completion reliability and tail containment over marginal spread capture

Require hysteresis and minimum dwell to avoid state flapping.
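A sketch of the controller skeleton. All thresholds below are placeholders, dwell-based gating stands in for full hysteresis (separate entry/exit thresholds would be the next refinement), and transitions move one state at a time:

```python
STATES = ["FLOW_CLEAN", "DELIVERY_CLUSTERED", "PHASE_LOCKED", "SAFE_PACING"]

class CoalescingController:
    """Minimal controller sketch: pick a target state from the KPIs,
    then step toward it one state at a time, only after a minimum
    dwell (in KPI updates) to avoid state flapping."""

    def __init__(self, min_dwell=5):
        self.state = "FLOW_CLEAN"
        self.min_dwell = min_dwell
        self.dwell = 0

    def _target(self, mddr, bdc, dbc, cte):
        # Placeholder thresholds; calibrate from the KPI dashboards.
        if cte > 2.0 or dbc > 0.8:
            return "SAFE_PACING" if cte > 4.0 else "PHASE_LOCKED"
        if bdc > 0.4 or mddr > 3.0:
            return "DELIVERY_CLUSTERED"
        return "FLOW_CLEAN"

    def update(self, mddr, bdc, dbc, cte):
        self.dwell += 1
        target = self._target(mddr, bdc, dbc, cte)
        if target != self.state and self.dwell >= self.min_dwell:
            # One step toward the target; reset dwell on every transition.
            i, j = STATES.index(self.state), STATES.index(target)
            self.state = STATES[i + (1 if j > i else -1)]
            self.dwell = 0
        return self.state
```

De-escalation is gated by the same dwell as escalation, so a brief clean window inside a clustered regime does not bounce tactics back prematurely.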


6) Controls that usually work

Control A — Separate low-latency feed paths from non-critical traffic

Avoid sharing queue/CPU paths between latency-critical market data and background/control traffic where possible.

Control B — Coalescing policy by traffic class

For low-latency feed handlers, test lower rx-usecs / rx-frames or non-adaptive settings (on kernel stacks these are exposed via ethtool -C). Throughput-optimized defaults are often too batchy for micro-timing-sensitive execution.

Control C — NAPI/CPU affinity hygiene

Pin IRQ and poll-heavy paths to stable CPU sets; reduce contention from unrelated workloads.

Control D — Anti-burst dispatch shaping

If delivery bursts are unavoidable, shape child-order emission with bounded micro-jitter and debt-aware pacing to break phase-lock loops.
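Bounded micro-jitter shaping can be as simple as scheduling emits with a minimum gap plus a small random offset. A sketch (the gap and jitter values are illustrative; debt-aware pacing, which would adapt the gap to outstanding residual, is omitted):

```python
import random

def shape_dispatch(n_orders, now, min_gap=50e-6, jitter=20e-6, seed=None):
    """Schedule n_orders child-order emits starting at `now` (seconds):
    enforce a minimum inter-send gap and add bounded uniform jitter so
    emit timing stops phase-locking to delivery bursts.
    Returns the scheduled send times, strictly increasing as long as
    jitter < min_gap."""
    rng = random.Random(seed)
    times, t = [], now
    for _ in range(n_orders):
        times.append(t + rng.uniform(0.0, jitter))
        t += min_gap
    return times
```

With these defaults, consecutive emits are separated by 30–70 µs instead of firing as one cluster the instant a coalesced batch lands.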

Control E — Causality guardrails

When DCCD breaches threshold, down-weight fragile microstructure features and shift to robust fallback tactics until ordering confidence recovers.


7) Validation protocol

Offline replay

Replay captured dual-timeline data (wire vs app timestamps) under candidate coalescing and pacing settings; re-estimate branch probabilities and the KPI set.

Shadow mode

Run the controller with no routing impact; log recommended state and tactic deltas against live behavior and projected CTE.

Canary rollout

Route a small, symbol/venue-scoped share of flow through the controller with hard rollback criteria.

Promotion gates:

  1. CTE and tail markouts (q95/q99) improve
  2. no degradation in fill rate or completion reliability
  3. no increase in DCCD breaches or state flapping

8) Failure patterns to avoid

  1. Single throughput-tuned NIC profile for all workloads
    Great for bulk traffic, harmful for micro-timing-sensitive routing.

  2. Adaptive coalescing with no observability
    Auto-tuned settings can drift silently across regimes.

  3. Treating packet batching as harmless jitter
    It can systematically alter queue-entry timing and branch probabilities.

  4. No dual-timeline instrumentation
    If you only log app timestamps, coalescing effects are nearly invisible.

  5. Mean-only optimization
    Tail costs (q95/q99) are where batching damage concentrates.


9) 10-day implementation plan

Days 1–2
Instrument ingress-vs-app timing and queue-level coalescing metadata.

Days 3–4
Build MDDR/BDC/DBC/CTE dashboards by strategy-symbol-venue.

Days 5–6
Estimate branch model (TIMELY / BURST-SYNC / LATE-RECOVERY).

Days 7–8
Enable controller in shadow mode with anti-burst shaping and causality guards.

Day 9
Canary with hard rollback criteria.

Day 10
Publish v1 runbook; schedule weekly recalibration by regime.


Bottom line

RX interrupt coalescing is not just a systems tuning detail. In low-latency execution, it can become a first-class slippage driver by reshaping event timing, synchronizing decision bursts, and inflating tail cleanup cost.

If you measure and control coalescing-induced clustering explicitly, you can often recover meaningful basis points without touching alpha logic.

