Net DIM Adaptive-Interrupt Oscillation Slippage Playbook

Date: 2026-03-22
Category: research
Scope: How dynamic interrupt moderation (DIM) mode-flips in Linux NIC drivers create tail-latency bursts and execution slippage

Why this matters

Most low-latency teams tune NIC coalescing once and move on.

But when adaptive moderation (Net DIM / driver-specific DIM logic) is enabled, the NIC+driver can continuously move between interrupt profiles (e.g., low-usec/high-irq vs high-usec/low-irq). Under mixed burst regimes, this can become a flip-flop control loop instead of stable optimization.

For execution stacks, that looks like:

- sudden p95/p99 decision-to-wire delay expansion,
- clustered child-order dispatch instead of smooth cadence,
- queue-position decay on passive flow,
- urgency overshoot and higher implementation shortfall.

This is not a hard outage. It is a state-dependent latency tax that hides inside “normal adaptive behavior.”

Failure mechanism (one timeline)

1. Market-data/order-ack load alternates between microbursts and lulls.
2. DIM classifier marks one interval as throughput-favoring (more moderation).
3. Next interval flips to latency-favoring (less moderation).
4. Profile keeps hopping (left/right in moderation space) before the system settles.
5. Softirq/NAPI service-time variance increases; packet release becomes bursty.
6. Strategy reacts to stale or phase-shifted market snapshots.
7. Slippage tails rise even if median latency barely moves.

Pathology: oscillatory moderation creates cadence distortion.

Extend slippage decomposition with DIM oscillation tax

\[ IS = IS_{market} + IS_{impact} + IS_{timing} + IS_{fees} + \underbrace{IS_{dim}}_{\text{adaptive moderation tax}} \]

Practical approximation:

\[ IS_{dim,t} \approx a\cdot PFS_t + b\cdot FRS_t + c\cdot JI95_t + d\cdot SDR_t \]

Where:

- \(PFS\): profile-flip score (how often the moderation profile changes),
- \(FRS\): flip-reversal score (A→B→A churn rate),
- \(JI95\): p95 of inter-interrupt jitter,
- \(SDR\): send-debatch ratio (actual dispatch clumping vs intended cadence).
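The linear approximation above can be sketched in a few lines. This is only an illustration: the `DimFeatures` container, the function name, the units, and every coefficient value are assumptions, not fitted numbers from this playbook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimFeatures:
    pfs: float   # profile flips per second (assumed unit)
    frs: float   # A->B->A reversals per second (assumed unit)
    ji95: float  # p95 inter-interrupt jitter in microseconds (assumed unit)
    sdr: float   # send-debatch ratio, dimensionless


def is_dim_estimate(f: DimFeatures, a: float, b: float, c: float, d: float) -> float:
    """Evaluate IS_dim ~ a*PFS + b*FRS + c*JI95 + d*SDR for one time bucket."""
    return a * f.pfs + b * f.frs + c * f.ji95 + d * f.sdr


# Hypothetical coefficients for illustration only; fit them on your own data.
example = DimFeatures(pfs=2.0, frs=0.5, ji95=40.0, sdr=1.5)
estimate = is_dim_estimate(example, a=0.05, b=0.20, c=0.01, d=0.10)
```

In practice the coefficients would come from a regression of realized shortfall on these features, with the result expressed in whatever unit the rest of the decomposition uses.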

What to measure (production features)

1) Profile Flip Score (PFS)

\[ PFS = \frac{\#(\text{profile changes})}{\Delta t} \]

Compute it per RX/TX queue, then aggregate across queues with a traffic-weighted average.
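A minimal sketch of the per-queue score and the traffic-weighted aggregate, assuming the moderation profile index is sampled at regular intervals and that packet counts are used as weights. Both function names are hypothetical.

```python
from typing import Sequence


def profile_flip_score(profile_ix: Sequence[int], window_s: float) -> float:
    """Profile-index changes between consecutive samples, per second."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    changes = sum(1 for prev, cur in zip(profile_ix, profile_ix[1:]) if prev != cur)
    return changes / window_s


def traffic_weighted_pfs(per_queue_pfs: Sequence[float],
                         per_queue_packets: Sequence[int]) -> float:
    """Aggregate per-queue PFS, weighting each queue by its packet count."""
    total = sum(per_queue_packets)
    if total == 0:
        return 0.0
    return sum(p * w for p, w in zip(per_queue_pfs, per_queue_packets)) / total
```

For example, a queue whose sampled profile index reads `[1, 1, 2, 2, 1, 3]` over a 2-second window changed profile three times, giving a PFS of 1.5 per second.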

2) Flip Reversal Score (FRS)

\[ FRS = \frac{\#(A \to B \to A\ \text{within short window})}{\Delta t} \]

Separates healthy adaptation from unstable oscillation.
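One way to count reversals is sketched below. It assumes timestamped profile samples and interprets "within short window" as the time spent in the intermediate profile B before returning to A; that interpretation and the function name are assumptions.

```python
from typing import List, Sequence, Tuple


def flip_reversal_score(samples: Sequence[Tuple[float, int]],
                        reversal_window_s: float,
                        window_s: float) -> float:
    """samples: (timestamp_s, profile_index) pairs in time order."""
    # Collapse samples to change points: keep one only if the profile differs
    # from the previously kept one.
    changes: List[Tuple[float, int]] = []
    for ts, prof in samples:
        if not changes or changes[-1][1] != prof:
            changes.append((ts, prof))

    reversals = 0
    for (_, p0), (t1, _), (t2, p2) in zip(changes, changes[1:], changes[2:]):
        # A -> B -> A, where the stay in B lasted at most reversal_window_s.
        if p0 == p2 and (t2 - t1) <= reversal_window_s:
            reversals += 1
    return reversals / window_s
```

A slow A→B→A excursion (long dwell in B) is not counted, which is what lets this score separate adaptation from churn.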

3) Moderation Span (MS)

Difference between the maximum and minimum effective coalescing values (in usec) visited within a short time bucket.

High MS + high PFS is usually the most toxic combination.
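A sketch of the bucketed span, assuming samples of the effective coalescing value in usec with timestamps in seconds. The function name and the bucket size are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def moderation_span_by_bucket(samples: Iterable[Tuple[float, int]],
                              bucket_s: float) -> Dict[int, int]:
    """Map bucket index -> (max - min) coalescing usec seen in that bucket."""
    buckets: Dict[int, List[int]] = defaultdict(list)
    for ts, usecs in samples:
        buckets[int(ts // bucket_s)].append(usecs)
    return {b: max(v) - min(v) for b, v in sorted(buckets.items())}
```

A bucket that visits 8, 64 and 16 usec has a span of 56, while a bucket that stays on one value has a span of 0.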

4) Inter-Interrupt Jitter p95/p99 (JI95/JI99)

Compute per queue from IRQ timestamp deltas. Rising tails indicate unstable pacing.
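A stdlib-only sketch, assuming jitter is defined as the absolute deviation of each inter-interrupt gap from the median gap and that percentiles use the nearest-rank method. Both choices are assumptions; other jitter definitions are equally defensible.

```python
import math
from statistics import median
from typing import List, Sequence, Tuple


def nearest_rank(sorted_vals: List[float], q: float) -> float:
    """Nearest-rank percentile of an ascending list, q in (0, 100]."""
    rank = max(1, math.ceil(q / 100 * len(sorted_vals)))
    return sorted_vals[rank - 1]


def inter_interrupt_jitter(irq_ts_us: Sequence[float]) -> Tuple[float, float]:
    """Return (JI95, JI99) from per-queue IRQ timestamps in microseconds."""
    gaps = [b - a for a, b in zip(irq_ts_us, irq_ts_us[1:])]
    if not gaps:
        raise ValueError("need at least two timestamps")
    mid = median(gaps)
    deviations = sorted(abs(g - mid) for g in gaps)
    return nearest_rank(deviations, 95), nearest_rank(deviations, 99)
```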

5) NAPI Cycle Skew (NCS)

Variance of packets-per-poll and time-per-poll across consecutive cycles.
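A minimal sketch, assuming each NAPI poll cycle is recorded as a `(packets, duration_us)` pair and that population variance is the intended statistic. The function name is hypothetical.

```python
from statistics import pvariance
from typing import Sequence, Tuple


def napi_cycle_skew(polls: Sequence[Tuple[int, float]]) -> Tuple[float, float]:
    """Return (variance of packets-per-poll, variance of time-per-poll)."""
    if len(polls) < 2:
        raise ValueError("need at least two poll cycles")
    packets = [p for p, _ in polls]
    durations = [d for _, d in polls]
    return pvariance(packets), pvariance(durations)
```

Because the two variances have different units, they are returned separately rather than combined into one number.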

6) Dispatch Clump Factor (DCF)

\[ DCF = \frac{\text{p95 child-order inter-send gap}}{\text{p50 child-order inter-send gap}} \]

Inflates when dispatch becomes lumpy.
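The ratio can be computed directly from child-order send timestamps. The sketch below assumes nearest-rank percentiles and timestamps in microseconds; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def nearest_rank(sorted_vals: List[float], q: float) -> float:
    """Nearest-rank percentile of an ascending list, q in (0, 100]."""
    rank = max(1, math.ceil(q / 100 * len(sorted_vals)))
    return sorted_vals[rank - 1]


def dispatch_clump_factor(send_ts_us: Sequence[float]) -> float:
    """p95 / p50 of the gaps between consecutive child-order sends."""
    gaps = sorted(b - a for a, b in zip(send_ts_us, send_ts_us[1:]))
    if not gaps:
        raise ValueError("need at least two send timestamps")
    p50 = nearest_rank(gaps, 50)
    if p50 == 0:
        return math.inf  # median gap of zero: dispatch is fully clumped
    return nearest_rank(gaps, 95) / p50
```

A perfectly even cadence gives a factor of 1.0; a few long stalls between clumps push the p95 gap, and therefore the factor, upward.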

7) Decision-to-Wire Tail (DWT95/DWT99)

The critical execution-latency metric. Report it conditioned on DIM regime rather than pooled, so oscillation windows are not averaged away.

8) DIM Stress Markout Gap (DSMG)

Matched-cohort markout delta between dim_stress=1 and baseline windows.
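A sketch of the matched-cohort delta, assuming each fill already carries a cohort key, a 0/1 `dim_stress` label and a markout in bps. Weighting cohorts by their stressed-fill count is an assumption, as is the function name.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Hashable, Iterable, List, Optional, Tuple


def dim_stress_markout_gap(
    fills: Iterable[Tuple[Hashable, int, float]]
) -> Optional[float]:
    """fills: (cohort_key, dim_stress, markout_bps). Returns the weighted gap."""
    groups: Dict[Hashable, Dict[int, List[float]]] = defaultdict(
        lambda: {0: [], 1: []}
    )
    for key, stress, markout in fills:
        groups[key][int(bool(stress))].append(markout)

    weighted_sum, weight_total = 0.0, 0
    for g in groups.values():
        if g[0] and g[1]:  # only cohorts observed in both regimes are matched
            weighted_sum += (mean(g[1]) - mean(g[0])) * len(g[1])
            weight_total += len(g[1])
    return weighted_sum / weight_total if weight_total else None
```

Cohorts seen in only one regime are dropped rather than imputed, which keeps the comparison like-for-like at the cost of sample size.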

Minimal model architecture

Stage 1: DIM stress classifier

Inputs:

PFS, FRS, MS,
JI95/JI99,
NCS,
softirq load asymmetry,
queue occupancy burstiness.

Output:

\(P(\text{DIM\_STRESS})\)

Stage 2: Conditional execution cost model

Predict:

\(E[IS]\), \(q95(IS)\), and completion risk, each conditioned on DIM stress state.

Key interaction:

\[ \Delta IS \sim \beta_1\,\text{urgency} + \beta_2\,\text{DIM\_STRESS} + \beta_3\,(\text{urgency} \times \text{DIM\_STRESS}) \]

Interpretation: urgent flow pays disproportionately during oscillation regimes.
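To make the interaction term concrete, here is a sketch for the simplest case, assuming both urgency and DIM_STRESS are 0/1 flags and the model includes an intercept. In that saturated case the coefficients follow directly from the four cell means, so no regression library is needed. The function name is hypothetical, and continuous urgency would need an ordinary least-squares fit instead.

```python
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def interaction_from_cells(
    rows: Iterable[Tuple[int, int, float]]
) -> Tuple[float, float, float]:
    """rows: (urgent, dim_stress, delta_is_bps). Returns (beta1, beta2, beta3)."""
    cells: Dict[Tuple[int, int], List[float]] = {
        (u, s): [] for u in (0, 1) for s in (0, 1)
    }
    for urgent, stress, y in rows:
        cells[(int(urgent), int(stress))].append(y)
    if any(not v for v in cells.values()):
        raise ValueError("every (urgent, dim_stress) cell needs observations")

    m = {k: mean(v) for k, v in cells.items()}
    beta1 = m[(1, 0)] - m[(0, 0)]                          # urgency effect, no stress
    beta2 = m[(0, 1)] - m[(0, 0)]                          # stress effect, not urgent
    beta3 = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]  # extra cost when both
    return beta1, beta2, beta3
```

A positive `beta3` is the quantitative form of the interpretation above: the cost of urgency is larger inside stress windows than outside them.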

Controller state machine

GREEN — DIM_STABLE

Low PFS/FRS, normal jitter tails
Default execution policy

YELLOW — DIM_DRIFT

PFS rising, occasional reversals
Actions:
- mild child-size trim (e.g., -5% to -10%),
- add micro-jitter to avoid synchronized sends,
- increase sensitivity of stale-signal guards.

ORANGE — DIM_OSCILLATION

High PFS + high FRS + DWT95 expansion
Actions:
- cap urgency catch-up,
- reduce fan-out concurrency,
- prefer simpler tactics less sensitive to sub-ms timing,
- temporarily pin fixed coalescing profile on critical queues (if operationally safe).

RED — DIM_UNSTABLE_TAIL

Persistent oscillation + markout degradation
Actions:
- containment mode for non-urgent parent flow,
- tighter participation caps,
- incident workflow: queue-level diagnostics + profile retune.

Use hysteresis + minimum dwell to avoid control flapping.
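A minimal sketch of hysteresis plus minimum dwell, assuming the state is driven by a single stress probability. The threshold values, the 5-second dwell, and the choice to escalate immediately but de-escalate only after the dwell has elapsed are all assumptions for illustration.

```python
from dataclasses import dataclass

STATES = ("GREEN", "YELLOW", "ORANGE", "RED")
# Hypothetical thresholds on P(DIM_STRESS). State i is entered at ENTER[i] and
# left downward only once the probability falls below EXIT[i] (< ENTER[i]).
ENTER = (0.0, 0.30, 0.55, 0.80)
EXIT = (0.0, 0.20, 0.45, 0.70)


@dataclass
class DimStateMachine:
    min_dwell_s: float = 5.0
    level: int = 0
    since_s: float = 0.0

    def update(self, now_s: float, p_stress: float) -> str:
        target = self.level
        # Escalate while the next state's entry threshold is met.
        while target < len(STATES) - 1 and p_stress >= ENTER[target + 1]:
            target += 1
        # De-escalate while below the exit threshold of the candidate state.
        while target > 0 and p_stress < EXIT[target]:
            target -= 1

        if target > self.level:
            self.level, self.since_s = target, now_s  # escalate immediately
        elif target < self.level and now_s - self.since_s >= self.min_dwell_s:
            self.level, self.since_s = target, now_s  # relax only after dwell
        return STATES[self.level]
```

With these numbers, a reading of 0.6 moves GREEN to ORANGE at once, a later 0.5 keeps ORANGE because it is still above the 0.45 exit threshold, and a drop to 0.1 only returns to GREEN once five seconds have passed since the escalation.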

Engineering mitigations (highest ROI first)

1. Separate critical vs non-critical queues
   Keep latency-sensitive execution traffic away from queues carrying noisy bulk/background flow.
2. Bound adaptation range
   Narrow the allowable moderation-profile span on execution-critical queues to prevent wide oscillation.
3. Tune sampling/decision cadence
   DIM reacts to sample deltas; overly reactive settings can chase noise.
4. Queue-local policy, not host-global policy
   Different queues have different burst structure; a one-size-fits-all adaptive policy is fragile.
5. Couple DIM telemetry into the execution controller
   Treat DIM regime as a first-class model feature, not a postmortem artifact.
6. Run controlled burst-replay tests before rollout
   Validate profile stability under synthetic open/close and event-driven bursts.
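Where a mitigation ends up replacing adaptive moderation with a fixed profile on selected queues, the change is typically applied through ethtool's coalescing options. The sketch below only builds the argument list and does not run anything. The helper name and the example values are hypothetical, and whether a given driver honors `adaptive-rx`/`adaptive-tx` or per-queue coalescing is driver-dependent, so check the ethtool manual and the driver documentation before relying on it.

```python
from typing import List, Optional


def pin_coalescing_argv(dev: str, rx_usecs: int, tx_usecs: int,
                        queue_mask: Optional[int] = None) -> List[str]:
    """Build an ethtool argv that disables adaptive moderation and fixes usecs."""
    if rx_usecs < 0 or tx_usecs < 0:
        raise ValueError("usec values must be non-negative")
    settings = ["adaptive-rx", "off", "adaptive-tx", "off",
                "rx-usecs", str(rx_usecs), "tx-usecs", str(tx_usecs)]
    if queue_mask is None:
        # Device-wide form: ethtool --coalesce <dev> ...
        return ["ethtool", "--coalesce", dev] + settings
    # Per-queue form: ethtool --per-queue <dev> queue_mask <hex> --coalesce ...
    return ["ethtool", "--per-queue", dev, "queue_mask", hex(queue_mask),
            "--coalesce"] + settings
```

Keeping command construction separate from execution makes the change easy to log, review, and dry-run before it touches a production interface.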

Validation protocol

1. Label dim_stress windows with thresholds over PFS/FRS/JI95.
2. Build matched cohorts by symbol, spread, volatility, participation, and time bucket.
3. Estimate \(\Delta E[IS]\), \(\Delta q95(IS)\), and completion-risk uplift.
4. Shadow controller actions (no-trade-impact mode) first.
5. Promote only if out-of-sample tails improve without throughput collapse.
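The labeling and estimation steps might look like the sketch below. The threshold values, the two-of-three labeling rule, and the nearest-rank q95 are all assumptions; the protocol above only says that labels come from thresholds over PFS/FRS/JI95. The inputs to `tail_deltas` are assumed to be already cohort-matched.

```python
import math
from statistics import mean
from typing import Sequence, Tuple


def label_dim_stress(pfs: float, frs: float, ji95_us: float,
                     pfs_thr: float = 2.0, frs_thr: float = 0.5,
                     ji95_thr_us: float = 50.0) -> int:
    """Hypothetical rule: stressed when at least two of three features exceed."""
    hits = (pfs > pfs_thr) + (frs > frs_thr) + (ji95_us > ji95_thr_us)
    return int(hits >= 2)


def q95(values: Sequence[float]) -> float:
    """Nearest-rank 95th percentile."""
    s = sorted(values)
    return s[max(1, math.ceil(0.95 * len(s))) - 1]


def tail_deltas(is_stress: Sequence[float],
                is_base: Sequence[float]) -> Tuple[float, float]:
    """Return (delta E[IS], delta q95(IS)) between stressed and baseline fills."""
    return mean(is_stress) - mean(is_base), q95(is_stress) - q95(is_base)
```

Running the same two functions on an out-of-sample period, with the controller in shadow mode, gives the before/after comparison the promotion step asks for.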

Practical observability checklist

- Queue-level IRQ timestamps and profile index history
- Per-queue coalescing settings snapshots over time
- /proc/interrupts and softirq imbalance metrics
- NAPI poll cycle duration/packet-count distribution
- Decision-to-wire latency by queue and host
- Fill quality conditioned on DIM regime labels

Success criterion: tail execution stability under bursty flow, not just higher average throughput.
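As one example of collecting the /proc/interrupts item from the checklist, the sketch below parses a snapshot into per-CPU counts for IRQs whose description contains a given substring. The sample text and IRQ names are made up for illustration; real layouts vary by kernel, platform and driver, so treat the parser as a starting point.

```python
from typing import Dict, List

SAMPLE = """\
           CPU0       CPU1
  24:     100200        300   PCI-MSI 524288-edge      eth0-TxRx-0
  25:          5        900   PCI-MSI 524289-edge      eth0-TxRx-1
 NMI:          0          0   Non-maskable interrupts
"""


def parse_proc_interrupts(text: str, name_filter: str) -> Dict[str, List[int]]:
    """Map IRQ name (last description token) -> per-CPU cumulative counts."""
    lines = text.strip("\n").splitlines()
    n_cpus = len(lines[0].split())  # header row lists one column per CPU
    result: Dict[str, List[int]] = {}
    for line in lines[1:]:
        _, _, rest = line.partition(":")
        tokens = rest.split()
        counts: List[int] = []
        for tok in tokens[:n_cpus]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        desc = " ".join(tokens[len(counts):])
        if desc and name_filter in desc:
            result[desc.split()[-1]] = counts
    return result
```

The counters are cumulative since boot, so rates and per-CPU imbalance come from differencing two snapshots taken a known interval apart.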

Pseudocode sketch

features = collect_dim_features()  # PFS, FRS, MS, JI95, NCS, DWT95
p_stress = dim_stress_model.predict_proba(features)
state = decode_dim_state(p_stress, features)

if state == "GREEN":
    params = normal_policy()
elif state == "YELLOW":
    params = mild_clip_trim_and_send_jitter()
elif state == "ORANGE":
    params = cap_urgency_catchup_reduce_fanout()
else:  # RED
    params = containment_policy_with_tail_budget()

execute_with(params)
log(state=state, p_stress=p_stress)
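A runnable variant of the sketch is shown below, with the helper functions replaced by simple stand-ins. Every threshold, parameter name and parameter value is hypothetical and chosen only to mirror the action lists in the state machine; the `markout_degraded` feature key is likewise an assumption.

```python
from typing import Any, Dict

# Hypothetical policy table; values illustrate the actions per state.
POLICY: Dict[str, Dict[str, Any]] = {
    "GREEN":  {"child_size_scale": 1.00, "send_jitter_us": 0,  "max_fanout": 8},
    "YELLOW": {"child_size_scale": 0.92, "send_jitter_us": 50, "max_fanout": 8},
    "ORANGE": {"child_size_scale": 0.92, "send_jitter_us": 50, "max_fanout": 4},
    "RED":    {"child_size_scale": 0.75, "send_jitter_us": 50, "max_fanout": 2},
}


def decode_dim_state(p_stress: float, features: Dict[str, Any]) -> str:
    """Map stress probability (plus a markout flag) to a controller state."""
    if p_stress >= 0.80 and features.get("markout_degraded", False):
        return "RED"     # persistent oscillation plus markout degradation
    if p_stress >= 0.55:
        return "ORANGE"
    if p_stress >= 0.30:
        return "YELLOW"
    return "GREEN"


def params_for(p_stress: float, features: Dict[str, Any]) -> Dict[str, Any]:
    """Look up execution parameters for the decoded state."""
    return POLICY[decode_dim_state(p_stress, features)]
```

This stateless mapping does not by itself provide hysteresis or minimum dwell; a production controller would wrap it so that the state cannot flap between adjacent readings.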

Bottom line

Adaptive interrupt moderation is useful, but in mixed burst regimes it can become a control-loop instability source.

If you do not model DIM oscillation, you will misclassify infra-induced timing error as market randomness. Treat moderation state transitions as a live slippage feature, and attach explicit guardrails before tails eat your edge.

References

Linux kernel docs — Net DIM (Generic Network Dynamic Interrupt Moderation):
https://docs.kernel.org/networking/net_dim.html
Linux kernel docs — NAPI:
https://docs.kernel.org/networking/napi.html
Linux kernel docs — Scaling in the networking stack (RSS/RPS/RFS/XPS):
https://docs.kernel.org/networking/scaling.html
Linux kernel docs — Amazon ENA driver (adaptive moderation notes):
https://www.kernel.org/doc/html/latest/networking/device_drivers/ethernet/amazon/ena.html
ethtool manual (coalescing visibility/control):
https://man7.org/linux/man-pages/man8/ethtool.8.html