Net DIM Adaptive-Interrupt Oscillation Slippage Playbook
Date: 2026-03-22
Category: research
Scope: How dynamic interrupt moderation (DIM) mode-flips in Linux NIC drivers create tail-latency bursts and execution slippage
Why this matters
Most low-latency teams tune NIC coalescing once and move on.
But when adaptive moderation (Net DIM / driver-specific DIM logic) is enabled, the NIC+driver can continuously move between interrupt profiles (e.g., low-usec/high-irq vs high-usec/low-irq). Under mixed burst regimes, this can become a flip-flop control loop instead of stable optimization.
For execution stacks, that looks like:
- sudden p95/p99 decision-to-wire delay expansion,
- clustered child-order dispatch instead of smooth cadence,
- queue-position decay on passive flow,
- urgency overshoot and higher implementation shortfall.
This is not a hard outage. It is a state-dependent latency tax that hides inside "normal adaptive behavior."
Failure mechanism (one timeline)
- Market-data/order-ack load alternates between microbursts and lulls.
- DIM classifier marks one interval as throughput-favoring (more moderation).
- Next interval flips to latency-favoring (less moderation).
- Profile keeps hopping (left/right in moderation space) before the system settles.
- Softirq/NAPI service-time variance increases; packet release becomes bursty.
- Strategy reacts to stale or phase-shifted market snapshots.
- Slippage tails rise even if median latency barely moves.
Pathology: oscillatory moderation creates cadence distortion.
Extend slippage decomposition with DIM oscillation tax
\[ IS = IS_{market} + IS_{impact} + IS_{timing} + IS_{fees} + \underbrace{IS_{dim}}_{\text{adaptive moderation tax}} \]
Practical approximation:
\[ IS_{dim,t} \approx a\cdot PFS_t + b\cdot FRS_t + c\cdot JI95_t + d\cdot SDR_t \]
Where:
- \(PFS\): profile-flip score (how often the moderation profile changes),
- \(FRS\): flip-reversal score (A→B→A churn rate),
- \(JI95\): inter-interrupt jitter p95,
- \(SDR\): send-debatch ratio (actual dispatch clumping vs intended cadence).
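The approximation above can be sketched as a plain weighted sum. This is a minimal illustration, not a calibrated model: the class names, the feature units, and every weight value are assumptions made for the example, and the weights would need to be fitted on your own execution data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimFeatures:
    pfs: float   # profile flips per second (assumed unit)
    frs: float   # A->B->A reversals per second (assumed unit)
    ji95: float  # inter-interrupt jitter p95 in microseconds (assumed unit)
    sdr: float   # send-debatch ratio, dimensionless


@dataclass(frozen=True)
class DimTaxWeights:
    a: float
    b: float
    c: float
    d: float


def is_dim_bps(f: DimFeatures, w: DimTaxWeights) -> float:
    """Linear approximation IS_dim ~= a*PFS + b*FRS + c*JI95 + d*SDR."""
    return w.a * f.pfs + w.b * f.frs + w.c * f.ji95 + w.d * f.sdr


# Placeholder weights for illustration only; they are not fitted values.
weights = DimTaxWeights(a=0.02, b=0.10, c=0.001, d=0.05)
calm = DimFeatures(pfs=0.5, frs=0.0, ji95=40.0, sdr=1.1)
stressed = DimFeatures(pfs=6.0, frs=2.0, ji95=300.0, sdr=2.5)

print(is_dim_bps(calm, weights), is_dim_bps(stressed, weights))
```

Keeping the tax as a separate additive term makes it easy to report alongside the other slippage components and to zero it out when DIM is disabled.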
What to measure (production features)
1) Profile Flip Score (PFS)
\[ PFS = \frac{\#(\text{profile changes})}{\Delta t} \]
Compute per RX/TX queue, then aggregate with a traffic-weighted average.
2) Flip Reversal Score (FRS)
\[ FRS = \frac{\#(A \to B \to A\ \text{within short window})}{\Delta t} \]
Separates healthy adaptation from unstable oscillation.
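A minimal sketch of both scores, assuming you already sample an integer moderation-profile index per queue at a fixed cadence. The function names and the `max_dwell_samples` parameter (how long the middle profile B may be held for an A→B→A sequence to count as a reversal) are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def _runs(profiles: Sequence[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive duplicates into (profile, first_sample_index) runs."""
    runs: List[Tuple[int, int]] = []
    for i, p in enumerate(profiles):
        if not runs or runs[-1][0] != p:
            runs.append((p, i))
    return runs


def profile_flip_score(profiles: Sequence[int], window_s: float) -> float:
    """PFS: number of profile changes per second over the window."""
    return max(len(_runs(profiles)) - 1, 0) / window_s


def flip_reversal_score(profiles: Sequence[int], window_s: float,
                        max_dwell_samples: int = 2) -> float:
    """FRS: A->B->A sequences per second where B lasted at most max_dwell_samples."""
    runs = _runs(profiles)
    reversals = 0
    for (a, _), (_, b_start), (c, c_start) in zip(runs, runs[1:], runs[2:]):
        if a == c and (c_start - b_start) <= max_dwell_samples:
            reversals += 1
    return reversals / window_s


# Hypothetical samples for one queue over a 10-second window.
samples = [2, 2, 3, 2, 2, 3, 4, 5, 5, 5]
print(profile_flip_score(samples, 10.0), flip_reversal_score(samples, 10.0))
```

In this sample the tail of the sequence (3, 4, 5) is a one-directional move and adds to PFS only, while the early 2, 3, 2, 3 churn is what FRS picks up.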
3) Moderation Span (MS)
Difference between the maximum and minimum effective coalescing interval (usec) visited within a short time bucket.
High MS combined with high PFS is usually the most toxic combination.
4) Inter-Interrupt Jitter p95/p99 (JI95/JI99)
Compute per queue from IRQ timestamp deltas. Rising tails indicate unstable pacing.
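One possible reading of this metric is the tail percentile of the raw inter-interrupt gaps. The sketch below assumes nanosecond IRQ timestamps for a single queue and uses a nearest-rank percentile; the helper names and the sample timestamps are hypothetical.

```python
import math
from typing import List, Sequence


def gaps_us(irq_ts_ns: Sequence[int]) -> List[float]:
    """Consecutive IRQ timestamp deltas, converted from ns to microseconds."""
    return [(b - a) / 1000.0 for a, b in zip(irq_ts_ns, irq_ts_ns[1:])]


def percentile_nearest_rank(values: Sequence[float], pct: float) -> float:
    """Smallest sample with at least pct percent of samples at or below it."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical timestamps: steady 10 us pacing with a single 60 us stall.
ts_ns = [0, 10_000, 20_000, 30_000, 90_000, 100_000]
deltas = gaps_us(ts_ns)
ji50 = percentile_nearest_rank(deltas, 50)
ji95 = percentile_nearest_rank(deltas, 95)
print(ji50, ji95)
```

The median stays at the steady pacing while the p95 captures the stall, which is the pattern to watch for on production queues.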
5) NAPI Cycle Skew (NCS)
Variance of packets-per-poll and time-per-poll across consecutive cycles.
6) Dispatch Clump Factor (DCF)
\[ DCF = \frac{\text{p95 child-order inter-send gap}}{\text{p50 child-order inter-send gap}} \]
Inflates when dispatch becomes lumpy.
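A short sketch of the ratio using the standard library's `statistics.quantiles`. The function name and the two synthetic send schedules are assumptions; real input would be child-order send timestamps for one parent order or one strategy.

```python
from statistics import quantiles
from typing import Sequence


def dispatch_clump_factor(send_ts_us: Sequence[float]) -> float:
    """DCF = p95 / p50 of the gaps between consecutive child-order sends."""
    gaps = [b - a for a, b in zip(send_ts_us, send_ts_us[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three send timestamps")
    cuts = quantiles(gaps, n=100, method="inclusive")  # 99 cut points
    p50, p95 = cuts[49], cuts[94]
    if p50 <= 0:
        raise ValueError("median inter-send gap must be positive")
    return p95 / p50


# Intended cadence: one child order every 100 us.
smooth = [i * 100.0 for i in range(21)]

# Lumpy dispatch: bursts of sends 10 us apart separated by 370 us pauses.
clumped = [0.0]
for _ in range(5):
    for gap in (10.0, 10.0, 10.0, 370.0):
        clumped.append(clumped[-1] + gap)

print(dispatch_clump_factor(smooth), dispatch_clump_factor(clumped))
```

Both schedules send the same number of orders, but only the second one inflates the ratio, which is why DCF is more informative than a simple send rate.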
7) Decision-to-Wire Tail (DWT95/DWT99)
Critical execution latency metric; should be conditioned on DIM regime.
8) DIM Stress Markout Gap (DSMG)
Matched-cohort markout delta between dim_stress=1 and baseline windows.
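As a rough sketch, the gap can be computed by comparing stressed and baseline fills inside each cohort and weighting by the number of stressed fills. The record layout, the cohort keys, the sample symbols, and the weighting choice are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import fmean
from typing import Hashable, Iterable, Optional, Tuple

Fill = Tuple[Hashable, int, float]  # (cohort_key, dim_stress flag, markout in bps)


def dim_stress_markout_gap(fills: Iterable[Fill]) -> Optional[float]:
    """Stress-weighted mean of (stressed markout - baseline markout) per cohort."""
    buckets = defaultdict(lambda: {0: [], 1: []})
    for key, stress, markout in fills:
        buckets[key][int(stress)].append(markout)

    num = den = 0.0
    for groups in buckets.values():
        # Only cohorts observed in both regimes are comparable.
        if groups[0] and groups[1]:
            gap = fmean(groups[1]) - fmean(groups[0])
            weight = len(groups[1])
            num += weight * gap
            den += weight
    return num / den if den else None


# Hypothetical fills; the second cohort has no baseline and is skipped.
fills = [
    (("SYM_A", "open"), 0, -0.5),
    (("SYM_A", "open"), 0, -0.7),
    (("SYM_A", "open"), 1, -1.4),
    (("SYM_B", "midday"), 1, -2.0),
]
print(dim_stress_markout_gap(fills))
```

Dropping unmatched cohorts keeps the estimate conservative: a symbol that only ever traded during stress windows cannot say anything about the DIM effect.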
Minimal model architecture
Stage 1: DIM stress classifier
Inputs:
- PFS, FRS, MS,
- JI95/JI99,
- NCS,
- softirq load asymmetry,
- queue occupancy burstiness.
Output:
- \(P(\text{DIM\_STRESS})\)
Stage 2: Conditional execution cost model
Predict:
- \(E[IS]\), \(q95(IS)\), and completion risk, all conditioned on DIM stress state.
Key interaction:
\[ \Delta IS \sim \beta_1\,\text{urgency} + \beta_2\,\text{DIM\_STRESS} + \beta_3\,(\text{urgency} \times \text{DIM\_STRESS}) \]
Interpretation: urgent flow pays disproportionately during oscillation regimes.
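To make the interpretation concrete, one can read the interaction as a change in the marginal cost of urgency. The step below assumes the relation is treated as a linear model for the mean and that DIM_STRESS is a 0/1 indicator; neither assumption is stated in the model itself.

```latex
\[
\frac{\partial\,\Delta IS}{\partial\,\text{urgency}}
  = \beta_1 + \beta_3\,\text{DIM\_STRESS}
  =
  \begin{cases}
    \beta_1            & \text{if } \text{DIM\_STRESS} = 0,\\
    \beta_1 + \beta_3  & \text{if } \text{DIM\_STRESS} = 1.
  \end{cases}
\]
```

Under these assumptions, a positive fitted \(\beta_3\) is the statistical signature of urgent flow paying extra during oscillation, separate from the level shift \(\beta_2\) that all flow pays.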
Controller state machine
GREEN: DIM_STABLE
- Low PFS/FRS, normal jitter tails
- Default execution policy
YELLOW: DIM_DRIFT
- PFS rising, occasional reversals
- Actions:
  - mild child-size trim (e.g., -5% to -10%),
  - add micro-jitter to avoid synchronized sends,
  - increase sensitivity of stale-signal guards.
ORANGE: DIM_OSCILLATION
- High PFS + high FRS + DWT95 expansion
- Actions:
  - cap urgency catch-up,
  - reduce fan-out concurrency,
  - prefer simpler tactics less sensitive to sub-ms timing,
  - temporarily pin fixed coalescing profile on critical queues (if operationally safe).
RED: DIM_UNSTABLE_TAIL
- Persistent oscillation + markout degradation
- Actions:
  - containment mode for non-urgent parent flow,
  - tighter participation caps,
  - incident workflow: queue-level diagnostics + profile retune.
Use hysteresis plus a minimum dwell time per state so the controller itself does not flap.
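A minimal sketch of such a controller, driven only by the stress probability. The threshold values, the 5-second dwell, and the choice to escalate immediately while gating de-escalation on dwell are assumptions for illustration, not recommended settings.

```python
from dataclasses import dataclass

STATES = ["GREEN", "YELLOW", "ORANGE", "RED"]
# Entry thresholds sit above exit thresholds; the band between is the hysteresis.
ENTER = {"YELLOW": 0.30, "ORANGE": 0.55, "RED": 0.80}
EXIT = {"YELLOW": 0.20, "ORANGE": 0.45, "RED": 0.70}


@dataclass
class DimStateController:
    min_dwell_s: float = 5.0
    state: str = "GREEN"
    entered_at: float = 0.0

    def update(self, p_stress: float, now_s: float) -> str:
        idx = STATES.index(self.state)
        target = idx
        # Escalate while p_stress clears the next state's entry threshold.
        while target + 1 < len(STATES) and p_stress >= ENTER[STATES[target + 1]]:
            target += 1
        if target == idx:
            # De-escalate only once p_stress falls below the exit threshold.
            while target > 0 and p_stress < EXIT[STATES[target]]:
                target -= 1
        escalating = target > idx
        dwell_ok = (now_s - self.entered_at) >= self.min_dwell_s
        # Escalation is immediate; de-escalation waits for the minimum dwell.
        if target != idx and (escalating or dwell_ok):
            self.state = STATES[target]
            self.entered_at = now_s
        return self.state


ctl = DimStateController(min_dwell_s=5.0)
trace = [ctl.update(p, t) for p, t in
         [(0.35, 0.0), (0.25, 1.0), (0.10, 2.0), (0.10, 6.0), (0.85, 6.5)]]
print(trace)
```

In the trace, the reading of 0.25 stays in YELLOW because it is inside the hysteresis band, and the first reading of 0.10 stays in YELLOW because the dwell has not elapsed.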
Engineering mitigations (highest ROI first)
1) Separate critical vs non-critical queues
Keep latency-sensitive execution traffic away from queues carrying noisy bulk/background flow.
2) Bound adaptation range
Narrow the allowable moderation-profile span on execution-critical queues to prevent wide oscillation.
3) Tune sampling/decision cadence
DIM reacts to sample deltas; overly reactive settings can chase noise.
4) Queue-local policy, not host-global policy
Different queues have different burst structure; a one-size adaptive policy is fragile.
5) Couple DIM telemetry into execution controller
Treat DIM regime as a first-class model feature, not a postmortem artifact.
6) Run controlled burst-replay tests before rollout
Validate profile stability under synthetic open/close and event-driven bursts.
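For the pinning and range-bounding mitigations, a small helper can build the `ethtool -C` invocations without running them, so they can be reviewed or logged first. The interface name and usec values are placeholders, and whether a given driver honors `adaptive-rx`/`adaptive-tx` or per-direction usec settings varies, so check `ethtool -c` on the target NIC before relying on this.

```python
import shlex
from typing import List


def pin_coalescing_cmd(iface: str, rx_usecs: int, tx_usecs: int) -> List[str]:
    """Build an ethtool command that disables adaptive moderation and fixes usecs."""
    if rx_usecs < 0 or tx_usecs < 0:
        raise ValueError("coalescing usecs must be non-negative")
    return ["ethtool", "-C", iface,
            "adaptive-rx", "off", "adaptive-tx", "off",
            "rx-usecs", str(rx_usecs), "tx-usecs", str(tx_usecs)]


def restore_adaptive_cmd(iface: str) -> List[str]:
    """Build the command that hands control back to adaptive moderation."""
    return ["ethtool", "-C", iface, "adaptive-rx", "on", "adaptive-tx", "on"]


# Placeholder interface and values; this only prints, it does not execute anything.
pin = pin_coalescing_cmd("eth0", rx_usecs=8, tx_usecs=16)
restore = restore_adaptive_cmd("eth0")
print(shlex.join(pin))
print(shlex.join(restore))
```

Building the restore command alongside the pin command makes it harder to leave a host stuck on a fixed profile after an incident.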
Validation protocol
- Label dim_stress windows with thresholds over PFS/FRS/JI95.
- Build matched cohorts by symbol, spread, volatility, participation, and time bucket.
- Estimate \(\Delta E[IS]\), \(\Delta q95(IS)\), and completion-risk uplift.
- Shadow controller actions (no-trade-impact mode) first.
- Promote only if out-of-sample tails improve without throughput collapse.
Practical observability checklist
- Queue-level IRQ timestamps and profile index history
- Per-queue coalescing settings snapshots over time
- /proc/interrupts and softirq imbalance metrics
- NAPI poll cycle duration/packet-count distribution
- Decision-to-wire latency by queue and host
- Fill quality conditioned on DIM regime labels
Success criterion: tail execution stability under bursty flow, not just higher average throughput.
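As a starting point for the /proc/interrupts item, the sketch below parses a snapshot into per-queue, per-CPU counts and derives a simple imbalance ratio. The sample text, the IRQ names, and the max-over-mean ratio are assumptions for illustration; IRQ naming differs across drivers, and production monitoring would diff two snapshots to get rates rather than use lifetime totals.

```python
from typing import Dict, List

# Hypothetical snapshot in the general layout of /proc/interrupts.
SAMPLE = """\
           CPU0       CPU1
 24:        100          5   PCI-MSI 524288-edge      eth0-TxRx-0
 25:          7        300   PCI-MSI 524289-edge      eth0-TxRx-1
NMI:          0          0   Non-maskable interrupts
ERR:          0
"""


def parse_irq_counts(text: str, needle: str) -> Dict[str, List[int]]:
    """Return {irq_name: per-CPU counts} for rows whose last field contains needle."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row lists one column per CPU
    out: Dict[str, List[int]] = {}
    for line in lines[1:]:
        parts = line.split()
        if len(parts) < ncpu + 2 or needle not in parts[-1]:
            continue
        out[parts[-1]] = [int(x) for x in parts[1:1 + ncpu]]
    return out


def queue_imbalance(counts: Dict[str, List[int]]) -> float:
    """Busiest queue's total interrupts divided by the mean total across queues."""
    totals = [sum(v) for v in counts.values()]
    return max(totals) / (sum(totals) / len(totals))


counts = parse_irq_counts(SAMPLE, "eth0")
print(counts, queue_imbalance(counts))
```

On a live host the same parser can be fed the contents of /proc/interrupts read at two points in time, with the per-CPU differences used in place of the raw counts.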
Pseudocode sketch
# Helper functions and dim_stress_model are placeholders for site-specific code.
features = collect_dim_features()  # PFS, FRS, MS, JI95, NCS, DWT95
p_stress = dim_stress_model.predict_proba(features)
state = decode_dim_state(p_stress, features)

if state == "GREEN":
    params = normal_policy()
elif state == "YELLOW":
    params = mild_clip_trim_and_send_jitter()
elif state == "ORANGE":
    params = cap_urgency_catchup_reduce_fanout()
else:  # RED
    params = containment_policy_with_tail_budget()

execute_with(params)
log(state=state, p_stress=p_stress)
Bottom line
Adaptive interrupt moderation is useful, but in mixed burst regimes it can become a control-loop instability source.
If you do not model DIM oscillation, you will misclassify infra-induced timing error as market randomness. Treat moderation state transitions as a live slippage feature, and attach explicit guardrails before tails eat your edge.
References
- Linux kernel docs, Net DIM (Generic Network Dynamic Interrupt Moderation):
  https://docs.kernel.org/networking/net_dim.html
- Linux kernel docs, NAPI:
  https://docs.kernel.org/networking/napi.html
- Linux kernel docs, Scaling in the networking stack (RSS/RPS/RFS/XPS):
  https://docs.kernel.org/networking/scaling.html
- Linux kernel docs, Amazon ENA driver (adaptive moderation notes):
  https://www.kernel.org/doc/html/latest/networking/device_drivers/ethernet/amazon/ena.html
- ethtool manual (coalescing visibility/control):
  https://man7.org/linux/man-pages/man8/ethtool.8.html