SoftIRQ Backlog + NAPI Budget-Overrun Slippage Playbook
Why this exists
Execution hosts can pass basic health checks (CPU%, median latency, no packet-loss alarms) and still leak implementation shortfall into the p95/p99 tails.
One frequent blind spot is receive-path backlog pressure in Linux networking:
- NET_RX softirq work accumulates,
- NAPI polling hits budget/time ceilings,
- packet processing spills into later cycles,
- market-data freshness decays in bursts,
- decision and dispatch timing drift out of phase,
- queue priority decays before the strategy notices.
If this is not modeled explicitly, desks often label it as “random market turbulence” while the host is injecting a repeatable timing tax.
Core failure mode
Under bursty feed conditions, the kernel receive path can become phase-unstable:
- NIC interrupts/NAPI polls deliver packets faster than softirq can drain.
- Per-CPU softnet backlog rises (/proc/net/softnet_stat).
- Polling cycles hit netdev_budget/netdev_budget_usecs limits.
- Work is deferred to later softirq rounds (or to ksoftirqd under stress).
- Data age distribution widens; stale packets are processed “too late but still valid.”
- Child-order timing clusters and misses best-queue windows.
Result: tail slippage inflation with seemingly normal medians.
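The backlog and squeeze counters above can be read directly from /proc/net/softnet_stat. A minimal parsing sketch, assuming the classic column layout (column meanings vary by kernel version; historically column 0 = packets processed, column 1 = dropped, column 2 = time_squeeze, all hex):

```python
def parse_softnet_stat(text):
    """Return (processed, dropped, time_squeeze) per CPU row.

    Assumes the classic /proc/net/softnet_stat layout: hex columns,
    col 0 = processed, col 1 = dropped, col 2 = time_squeeze.
    Newer kernels append extra columns; verify against your kernel docs.
    """
    rows = []
    for line in text.strip().splitlines():
        cols = [int(c, 16) for c in line.split()]
        rows.append((cols[0], cols[1], cols[2]))
    return rows

# Synthetic two-CPU snapshot (values are illustrative only):
sample = (
    "0002ab10 00000000 00000013 00000000 00000000 00000000 "
    "00000000 00000000 00000000 00000000 00000000\n"
    "0001ff00 00000002 00000000 00000000 00000000 00000000 "
    "00000000 00000000 00000000 00000000 00000000\n"
)
stats = parse_softnet_stat(sample)
# CPU 0 shows 0x13 = 19 time_squeeze events; CPU 1 dropped 2 packets.
```

In production you would poll the real file on a timer and feed deltas between snapshots, not absolute counters, into the pressure metrics.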
Slippage decomposition with backlog term
For parent order (i):
[ IS_i = C_{delay} + C_{impact} + C_{miss} + C_{rx-backlog} ]
Where:
[ C_{rx-backlog} = C_{stale-signal} + C_{dispatch-phase} + C_{queue-reset} ]
- Stale-signal cost: decisions made on aged book snapshots
- Dispatch-phase cost: bursty send cadence from delayed processing
- Queue-reset cost: extra cancel/reprice/cross actions after timing miss
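Purely illustrative arithmetic for the decomposition above (all bps values are hypothetical):

```python
def rx_backlog_cost(stale_signal, dispatch_phase, queue_reset):
    """C_rx-backlog as the sum of its three components (all in bps)."""
    return stale_signal + dispatch_phase + queue_reset

def implementation_shortfall(c_delay, c_impact, c_miss, c_rx_backlog):
    """IS_i per the decomposition above (all in bps)."""
    return c_delay + c_impact + c_miss + c_rx_backlog

# Hypothetical parent order: backlog stress adds 0.9 bps on top of a
# 4.1 bps baseline (delay + impact + miss).
c_rx = rx_backlog_cost(stale_signal=0.4, dispatch_phase=0.3, queue_reset=0.2)
is_i = implementation_shortfall(c_delay=1.0, c_impact=2.5, c_miss=0.6,
                                c_rx_backlog=c_rx)
```

The point of keeping the backlog term separate is attribution: the 0.9 bps here is chargeable to the host, not to the market.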
Feature set (production-ready)
1) Kernel/network pressure features
- softnet backlog growth/decay by CPU (softnet_stat rows)
- time_squeeze event rate (budget/time exhaustion proxy)
- dropped packet counters at softnet/NIC ring levels
- ksoftirqd runtime share vs direct softirq execution share
- NAPI poll cycle depth and per-cycle packet drain estimate
2) Execution timing features
- feed event age-at-decision quantiles (p50/p95/p99)
- decision-to-send latency quantiles by burst bucket
- cancel/replace ACK drift during elevated time_squeeze
- inter-dispatch burstiness index (DBI)
- open/close session-edge stress interaction terms
3) Outcome features
- passive fill ratio change vs backlog regime
- short-horizon markout ladder (10ms/100ms/1s/5s)
- completion shortfall under constant urgency target
- regime label: CLEAR, PRESSURED, SATURATED, SAFE_CONTAIN
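The markout ladder in the outcome set can be sketched as follows; the prices, horizons, and sign convention (positive = favorable to the filled side) are assumptions for illustration:

```python
def markout_ladder(fill_price, side, mids_by_horizon):
    """Signed markout per horizon: side * (mid(t_fill + h) - fill_price).

    side: +1 for a buy fill, -1 for a sell fill.
    mids_by_horizon: {horizon_label: mid price h after the fill}.
    Positive values mean the fill looks favorable at that horizon.
    """
    return {h: side * (mid - fill_price) for h, mid in mids_by_horizon.items()}

# Hypothetical buy fill at 100.00 with mids sampled at each horizon:
ladder = markout_ladder(
    fill_price=100.00,
    side=+1,
    mids_by_horizon={"10ms": 100.01, "100ms": 100.02, "1s": 99.99, "5s": 100.03},
)
```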
Practical metrics
- SBI (Softirq Backlog Index): normalized backlog pressure score
- TSE (Time Squeeze Events): budget/time overrun intensity
- FA95 (Feed Age p95): market-data age at decision p95
- DBI (Dispatch Burst Index): child-order cadence clustering score
- RUL (RX-path Uplift Loss): realized IS minus baseline IS during backlog stress
Track by host, CPU isolation profile, NIC queue mapping, and session segment.
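Two of these metrics admit compact definitions. Below is a sketch assuming FA95 as a plain empirical p95 and DBI as the coefficient of variation of inter-dispatch gaps; both definitions are assumptions, so substitute your desk's canonical quantile and burstiness estimators in production:

```python
import statistics

def fa95(feed_ages_us):
    """Empirical p95 of feed-event age at decision time (microseconds)."""
    xs = sorted(feed_ages_us)
    idx = min(len(xs) - 1, int(0.95 * len(xs)))
    return xs[idx]

def dbi(dispatch_times_us):
    """Dispatch Burst Index: coefficient of variation of inter-send gaps.

    ~0 for a perfectly even cadence, near 1 for Poisson-like sends,
    well above 1 when child orders cluster into bursts.
    """
    gaps = [b - a for a, b in zip(dispatch_times_us, dispatch_times_us[1:])]
    return statistics.pstdev(gaps) / statistics.fmean(gaps)

# Illustrative samples: a feed-age series with a bursty tail, plus an
# even and a clustered send cadence.
ages = [50, 60, 55, 70, 1200, 65, 58, 61, 900, 62]
even_sends = [0, 100, 200, 300, 400, 500]
bursty_sends = [0, 1, 2, 3, 500, 501, 502, 503]
```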
Model architecture
Use a baseline + infra-overlay design:
- Baseline slippage model
- spread/impact/urgency/deadline under healthy infra assumptions
- RX backlog overlay
- predicts incremental mean/tail IS uplift from SBI/TSE/FA95/DBI
Final estimator:
[ \hat{IS}_{final} = \hat{IS}_{baseline} + \Delta\hat{IS}_{rx-backlog} ]
Train in matched windows (symbol liquidity, volatility regime, session phase) to avoid confounding infra stress with market-state shifts.
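A toy version of the two-part estimator; both models below are hypothetical linear stand-ins with invented coefficients, where production would use trained models:

```python
def baseline_is(spread_bps, impact_bps, urgency):
    """Healthy-infra slippage estimate (illustrative linear form, bps)."""
    return 0.5 * spread_bps + impact_bps * urgency

def rx_backlog_uplift(sbi, tse_rate, fa95_us, dbi):
    """Incremental IS predicted from RX-path pressure features (illustrative)."""
    return 0.8 * sbi + 0.05 * tse_rate + 0.001 * fa95_us + 0.3 * max(0.0, dbi - 1.0)

def final_is(spread_bps, impact_bps, urgency, sbi, tse_rate, fa95_us, dbi):
    """IS_final = IS_baseline + delta IS_rx-backlog."""
    return (baseline_is(spread_bps, impact_bps, urgency)
            + rx_backlog_uplift(sbi, tse_rate, fa95_us, dbi))
```

Keeping the overlay additive makes the infra tax directly reportable: it is the difference between the two terms, per order, per host.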
Regime controller
State A: CLEAR
- low SBI/TSE, tight feed-age distribution
- normal tactic selection and pacing
State B: PRESSURED
- rising backlog and intermittent time-squeeze
- reduce unnecessary cancel/reprice churn, apply mild pacing smoothing
State C: SATURATED
- sustained squeezes, widened FA95/DBI
- stricter burst caps, queue-preserving behavior, conservative urgency escalation
State D: SAFE_CONTAIN
- persistent saturation + deadline risk
- route urgent flow to clean hosts/queues, throttle tactical complexity, prioritize safe completion over optimistic queue capture
Use hysteresis + minimum dwell time to prevent policy flapping.
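The four states plus hysteresis and dwell can be sketched as a small controller; the single pressure score and the thresholds below are invented for illustration (in practice the score would blend SBI/TSE/FA95):

```python
STATES = ("CLEAR", "PRESSURED", "SATURATED", "SAFE_CONTAIN")

class RegimeController:
    """Escalate above up[i], de-escalate below down[i], one step per tick.

    Hysteresis (down[i] < up[i]) plus a minimum dwell time prevent
    policy flapping when pressure oscillates around a threshold.
    """

    def __init__(self, up=(0.3, 0.6, 0.85), down=(0.2, 0.45, 0.7), min_dwell=5):
        self.up, self.down, self.min_dwell = up, down, min_dwell
        self.state = 0   # index into STATES
        self.dwell = 0   # ticks spent in the current state

    def step(self, pressure):
        self.dwell += 1
        if self.dwell >= self.min_dwell:
            if self.state < 3 and pressure >= self.up[self.state]:
                self.state, self.dwell = self.state + 1, 0
            elif self.state > 0 and pressure < self.down[self.state - 1]:
                self.state, self.dwell = self.state - 1, 0
        return STATES[self.state]
```

Because each transition costs at least min_dwell ticks, a single spike cannot jump CLEAR straight to SAFE_CONTAIN, and recovery is similarly damped.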
Mitigation ladder
- Queue/CPU topology hygiene
- align RSS queueing, IRQ affinity, and execution thread pinning
- NAPI budget tuning by host class
- adjust netdev_budget/netdev_budget_usecs with canary guardrails
- Softirq isolation strategy
- keep noisy workloads off critical RX CPUs
- Backpressure-aware execution pacing
- cap self-inflicted burstiness when SBI rises
- Post-change recalibration
- retrain overlay after kernel/NIC driver/queue-map changes
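Backpressure-aware pacing can be as simple as scaling a per-window dispatch cap by the backlog index; the knee and floor values below are assumptions:

```python
def burst_cap(base_cap, sbi, knee=0.3, floor_frac=0.25):
    """Shrink the per-window child-order cap once SBI passes the knee.

    Linear roll-off above the knee, floored at floor_frac of the base
    cap so the strategy can still complete. Thresholds are hypothetical.
    """
    if sbi <= knee:
        return base_cap
    scale = max(floor_frac, 1.0 - (sbi - knee))
    return max(1, round(base_cap * scale))
```

The floor matters: the goal is to stop adding self-inflicted burstiness, not to starve a parent order that still has a deadline.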
Failure drills (must run)
- Burst replay drill
- synthetic feed spikes to validate state transitions (CLEAR -> PRESSURED -> SATURATED)
- Budget-sensitivity drill
- A/B test budget settings and compare FA95/DBI/RUL tails
- Queue-map drift drill
- verify routing still degrades gracefully after IRQ/RSS changes
- Containment reroute drill
- prove deterministic fallback to low-pressure hosts under sustained saturation
Anti-patterns
- Trusting average CPU or median latency as sufficient health signal
- Ignoring time_squeeze because packet drops look low
- Overfitting execution tactics while data-age distribution is unstable
- Applying one global kernel budget profile to all host roles
Bottom line
Softirq backlog and NAPI budget exhaustion are not mere OS tuning trivia in low-latency execution.
They are regime variables that reshape feed freshness, dispatch cadence, and queue outcomes. Modeling them explicitly converts invisible infra drag into measurable, controllable slippage risk.