SoftIRQ Backlog + NAPI Budget-Overrun Slippage Playbook

2026-03-19 · finance

Why this exists

Execution hosts can pass basic health checks (CPU%, median latency, no packet loss alarms) and still leak p95/p99 implementation shortfall.

One frequent blind spot is receive-path backlog pressure in the Linux networking stack.

If this is not modeled explicitly, desks often label it as “random market turbulence” while the host is injecting a repeatable timing tax.


Core failure mode

Under bursty feed conditions, the kernel receive path can become phase-unstable:

  1. NIC interrupts/NAPI polls deliver packets faster than softirq can drain.
  2. Per-CPU softnet backlog rises (/proc/net/softnet_stat).
  3. Polling cycles hit netdev_budget / netdev_budget_usecs limits.
  4. Work is deferred to next softirq rounds (or ksoftirqd under stress).
  5. Data age distribution widens; stale packets are processed “too late but still valid.”
  6. Child-order timing clusters and misses best-queue windows.

Result: tail slippage inflation with seemingly normal medians.
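Steps 2 and 3 above are directly observable. A minimal sketch (assuming the classic column layout of /proc/net/softnet_stat, where the first three hex columns are processed, dropped, and time_squeeze; newer kernels append further columns, which this ignores):

```python
def parse_softnet_stat(text):
    """Return one dict per CPU from /proc/net/softnet_stat contents.

    Columns are hex counters; only the first three are read here:
    processed (packets drained by softirq), dropped (backlog overflow),
    time_squeeze (polls cut short by netdev_budget / netdev_budget_usecs).
    """
    rows = []
    for cpu, line in enumerate(text.strip().splitlines()):
        cols = [int(c, 16) for c in line.split()]
        rows.append({
            "cpu": cpu,
            "processed": cols[0],
            "dropped": cols[1],
            "time_squeeze": cols[2],
        })
    return rows


def read_softnet_stat(path="/proc/net/softnet_stat"):
    with open(path) as f:
        return parse_softnet_stat(f.read())
```

A rising time_squeeze delta between samples is the direct counter for step 3; a nonzero dropped delta means outright backlog overflow, which should page rather than merely feed a model.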


Slippage decomposition with backlog term

For parent order (i):

[ IS_i = C_{delay} + C_{impact} + C_{miss} + C_{rx-backlog} ]

Where:

[ C_{rx-backlog} = C_{stale-signal} + C_{dispatch-phase} + C_{queue-reset} ]
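A literal transcription of the two equations above, with all costs in bps. The field names mirror the terms in the decomposition; the class itself is only an accounting convenience, not a model:

```python
from dataclasses import dataclass


@dataclass
class SlippageDecomp:
    # Baseline IS terms (bps)
    c_delay: float
    c_impact: float
    c_miss: float
    # RX-backlog sub-terms (bps)
    c_stale_signal: float
    c_dispatch_phase: float
    c_queue_reset: float

    @property
    def c_rx_backlog(self) -> float:
        # C_{rx-backlog} = C_{stale-signal} + C_{dispatch-phase} + C_{queue-reset}
        return self.c_stale_signal + self.c_dispatch_phase + self.c_queue_reset

    @property
    def implementation_shortfall(self) -> float:
        # IS_i = C_{delay} + C_{impact} + C_{miss} + C_{rx-backlog}
        return self.c_delay + self.c_impact + self.c_miss + self.c_rx_backlog
```

Keeping the three backlog sub-terms separate is the point: it lets post-trade attribution distinguish a stale-signal problem from a dispatch-phase one.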


Feature set (production-ready)

1) Kernel/network pressure features
  • per-CPU softnet backlog depth and drop counts (/proc/net/softnet_stat)
  • time-squeeze rate: polls cut short by netdev_budget / netdev_budget_usecs
  • ksoftirqd activity on critical RX CPUs

2) Execution timing features
  • feed data age at decision time
  • child-order dispatch phase and burst clustering

3) Outcome features
  • realized IS and its C_{rx-backlog} decomposition terms
  • best-queue window hit/miss rate


Practical metrics

Track by host, CPU isolation profile, NIC queue mapping, and session segment.


Model architecture

Use a baseline + infra-overlay design:

  1. Baseline slippage model
    • spread/impact/urgency/deadline under healthy infra assumptions
  2. RX backlog overlay
    • predicts incremental mean/tail IS uplift from SBI/TSE/FA95/DBI

Final estimator:

[ \hat{IS}_{final} = \hat{IS}_{baseline} + \Delta\hat{IS}_{rx-backlog} ]

Train in matched windows (symbol liquidity, volatility regime, session phase) to avoid confounding infra stress with market-state shifts.
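A toy version of the two-stage fit: a linear baseline estimated on healthy-infra windows only, then a linear overlay fitted to the baseline's residuals against infra-pressure features. The function names and linear forms here are illustrative placeholders for whatever models the desk actually uses:

```python
import numpy as np


def _design(X):
    # Prepend an intercept column.
    return np.column_stack([np.ones(len(X)), X])


def fit_baseline_plus_overlay(X_market, X_infra, y, healthy):
    """healthy: boolean mask of observations with a clean receive path."""
    # Stage 1: baseline slippage model, healthy windows only, so infra
    # stress cannot leak into the market-state coefficients.
    beta, *_ = np.linalg.lstsq(_design(X_market[healthy]), y[healthy], rcond=None)
    # Stage 2: overlay explains the remaining shortfall with infra features.
    resid = y - _design(X_market) @ beta
    gamma, *_ = np.linalg.lstsq(_design(X_infra), resid, rcond=None)
    return beta, gamma


def predict_is(beta, gamma, X_market, X_infra):
    return _design(X_market) @ beta + _design(X_infra) @ gamma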


Regime controller

State A: CLEAR
  • healthy receive path; run the baseline execution policy

State B: PRESSURED
  • backlog indicators rising; cap self-inflicted burstiness

State C: SATURATED
  • sustained budget exhaustion; apply maximum pacing caps and prepare reroute

State D: SAFE_CONTAIN
  • reroute flow to low-pressure hosts until indicators recover

Use hysteresis + minimum dwell time to prevent policy flapping.
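One way to sketch the controller: a scalar pressure score (e.g. a blend of backlog depth and time-squeeze rate), enter thresholds set above exit thresholds to form the hysteresis band, a minimum dwell in ticks before any transition, and escalation to SAFE_CONTAIN after sustained saturation. Every threshold below is an illustrative placeholder, not a calibrated value:

```python
class RegimeController:
    """CLEAR -> PRESSURED -> SATURATED -> SAFE_CONTAIN over a pressure score.

    Enter thresholds sit above exit thresholds (hysteresis band), and no
    transition fires before min_dwell ticks in the current state.
    """
    ENTER = {"PRESSURED": 0.4, "SATURATED": 0.8}   # escalate at these
    EXIT = {"PRESSURED": 0.3, "SATURATED": 0.6}    # de-escalate below these

    def __init__(self, min_dwell=5, contain_after=20):
        self.state = "CLEAR"
        self.dwell = 0                      # ticks spent in current state
        self.min_dwell = min_dwell
        self.contain_after = contain_after  # saturated ticks before containment

    def _target(self, score):
        if score >= self.ENTER["SATURATED"]:
            return "SATURATED"
        if self.state in ("SATURATED", "SAFE_CONTAIN") and score >= self.EXIT["SATURATED"]:
            return self.state               # hold: inside the hysteresis band
        if score >= self.ENTER["PRESSURED"]:
            return "PRESSURED"
        if self.state != "CLEAR" and score >= self.EXIT["PRESSURED"]:
            return "PRESSURED"              # hold pressured inside its band
        return "CLEAR"

    def update(self, score):
        self.dwell += 1
        target = self._target(score)
        if target != self.state and self.dwell >= self.min_dwell:
            self.state, self.dwell = target, 0
        # Sustained saturation escalates to containment (reroute policy).
        if self.state == "SATURATED" and self.dwell >= self.contain_after:
            self.state, self.dwell = "SAFE_CONTAIN", 0
        return self.state
```

The hysteresis band means a score oscillating between 0.3 and 0.4 never flaps the policy, and min_dwell stops a single-tick spike from forcing a transition on its own.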


Mitigation ladder

  1. Queue/CPU topology hygiene
    • align RSS queueing, IRQ affinity, and execution thread pinning
  2. NAPI budget tuning by host class
    • adjust netdev_budget / netdev_budget_usecs with canary guardrails
  3. Softirq isolation strategy
    • keep noisy workloads off critical RX CPUs
  4. Backpressure-aware execution pacing
    • cap self-inflicted burstiness when SBI rises
  5. Post-change recalibration
    • retrain overlay after kernel/NIC driver/queue-map changes
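For step 2, the two knobs live under /proc/sys/net/core. A small helper to snapshot them before and after a canary change (the base path is parameterized so the helper can be pointed at a copy for testing; 300 packets / 2000 usecs are common distro defaults, not universal):

```python
import os

BUDGET_KNOBS = ("netdev_budget", "netdev_budget_usecs")


def read_napi_budgets(base="/proc/sys/net/core"):
    """Snapshot the NAPI budget sysctls as ints, keyed by knob name."""
    values = {}
    for knob in BUDGET_KNOBS:
        with open(os.path.join(base, knob)) as f:
            values[knob] = int(f.read().strip())
    return values
```

Logging this snapshot alongside FA95/DBI tails makes the budget-sensitivity drill below reproducible: every A/B leg carries the exact budget settings it ran under.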

Failure drills (must run)

  1. Burst replay drill
    • synthetic feed spikes to validate state transitions (CLEAR -> PRESSURED -> SATURATED)
  2. Budget-sensitivity drill
    • A/B test budget settings and compare FA95/DBI/RUL tails
  3. Queue-map drift drill
    • verify routing still degrades gracefully after IRQ/RSS changes
  4. Containment reroute drill
    • prove deterministic fallback to low-pressure hosts under sustained saturation

Anti-patterns

  • treating normal CPU% and median latency as proof of a healthy receive path
  • writing backlog-driven tail slippage off as “random market turbulence”
  • tuning netdev_budget fleet-wide without canaries or rollback criteria
  • leaving the overlay model stale after kernel, driver, or queue-map changes

Bottom line

Softirq backlog and NAPI budget exhaustion are not mere OS tuning trivia in low-latency execution.

They are regime variables that reshape feed freshness, dispatch cadence, and queue outcomes. Modeling them explicitly converts invisible infra drag into measurable, controllable slippage risk.