XPS TX-Queue Polarization & Wire-Cadence Slippage Playbook
Why this exists
Execution stacks can look healthy on classic dashboards (median decision latency, low drop rate, acceptable CPU headroom) while still bleeding p95/p99 implementation shortfall.
A frequent blind spot is TX-path queue polarization:
- CPU→TX queue mappings (XPS) drift away from thread/IRQ reality,
- a subset of TX rings becomes chronically hot,
- qdisc -> driver -> NIC dequeue cadence turns bursty,
- cancel/replace/order-send timing loses microstructure phase alignment,
- queue-priority outcomes decay in tails.
This is an infra-originated slippage tax that is often mislabeled as "alpha decay" or "random venue noise."
Core failure mode
- XPS maps too many active senders onto a small set of TX queues.
- Hot TX queues absorb bursty enqueue pressure; cold queues idle.
- Driver/NIC completion locality diverges from application locality, increasing lock/cache contention.
- Wire-time spacing becomes uneven (packet bunching + short droughts).
- Order/cancel cadence phase-shifts versus true order-book replenishment cadence.
- Passive queue-capture probability falls; corrective aggression rises.
Result: tail slippage inflation with deceptively stable medians.
Slippage decomposition with TX-polarization term
For parent order (i):
[ IS_i = C_{impact} + C_{timing} + C_{routing} + C_{tx-pol} ]
Where:
[ C_{tx-pol} = C_{wire-jitter} + C_{completion-drift} + C_{queue-miss} ]
- (C_{wire-jitter}): submit→wire timing variance from TX ring hot spots
- (C_{completion-drift}): ACK/completion timing distortion from poor locality/lock contention
- (C_{queue-miss}): adverse queue-priority outcomes after cadence mismatch
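The decomposition above can be sketched as a residual calculation. The function name and the bps inputs below are illustrative; in practice the impact/timing/routing terms would come from an existing cost model, and C_tx-pol is whatever shortfall they leave unexplained:

```python
# Hypothetical residual form of the decomposition:
#   C_tx-pol = IS - (C_impact + C_timing + C_routing)
# All names are illustrative, not from a production system.

def tx_pol_residual(is_total_bps: float,
                    c_impact_bps: float,
                    c_timing_bps: float,
                    c_routing_bps: float) -> float:
    """TX-polarization cost as the residual shortfall, in basis points."""
    return is_total_bps - (c_impact_bps + c_timing_bps + c_routing_bps)

# Example: 9.0 bps measured shortfall, 5.0 modeled impact, 1.5 timing,
# 0.5 routing leaves 2.0 bps attributable to the TX-polarization term.
residual = tx_pol_residual(9.0, 5.0, 1.5, 0.5)
```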
Production feature set
1) TX queue / kernel features
- XPS maps per queue: `/sys/class/net/<dev>/queues/tx-<n>/xps_cpus` and `/sys/class/net/<dev>/queues/tx-<n>/xps_rxqs`
- per-queue TX packet/byte counters (`ethtool -S <dev>`)
- per-CPU NET_TX softirq load (`/proc/softirqs`)
- TX completion IRQ distribution (`/proc/interrupts`)
- qdisc backlog/dequeue/requeue stats (`tc -s qdisc show dev <dev>`)
- BQL pressure hints (`/sys/class/net/<dev>/queues/tx-<n>/byte_queue_limits/*`, where supported)
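A minimal sketch of extracting per-queue TX packet counters from `ethtool -S` text. Counter naming is driver-specific; the `tx_queue_<n>_packets` pattern below matches some Intel drivers and is an assumption, not a universal format — adjust the regex to your driver's output:

```python
import re

# Driver-specific assumption: counters named like "tx_queue_<n>_packets".
TX_QUEUE_RE = re.compile(r"tx_queue_(\d+)_packets:\s*(\d+)")

def per_queue_tx_packets(ethtool_output: str) -> dict[int, int]:
    """Return {queue_index: packet_count} parsed from `ethtool -S <dev>` text."""
    return {int(q): int(n) for q, n in TX_QUEUE_RE.findall(ethtool_output)}

# Illustrative output: queues 0 and 2 hot, queue 1 nearly idle.
sample = """
     tx_queue_0_packets: 120000
     tx_queue_1_packets: 450
     tx_queue_2_packets: 119500
"""
counts = per_queue_tx_packets(sample)
```

Feeding successive snapshots through this parser and differencing gives the per-queue rates that the concentration metrics below consume.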
2) execution-timing features
- decision→send syscall latency quantiles
- send syscall→NIC timestamp/wire proxy quantiles
- cancel/replace submit spacing CV and burst scores
- completion/ACK delay conditioned on TX-queue-hotness regime
3) outcome features
- passive fill ratio in `BALANCED` vs `POLARIZED` windows
- short-horizon markout ladder (10ms/100ms/1s/5s)
- incremental IS by urgency bucket under equal market state
Practical metrics (new)
TQCI (TX Queue Concentration Index): [ TQCI = \frac{\max_q \lambda^{tx}_q}{\frac{1}{Q}\sum_q \lambda^{tx}_q} - 1 ] where (\lambda^{tx}_q) is the per-queue TX packet rate and (Q) is the number of active TX queues.
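TQCI follows directly from per-queue rates; a balanced host scores 0.0 and the score grows as one queue dominates:

```python
def tqci(per_queue_rates: list[float]) -> float:
    """TQCI = max_q rate_q / mean_q rate_q - 1; 0.0 means perfectly balanced."""
    mean = sum(per_queue_rates) / len(per_queue_rates)
    return max(per_queue_rates) / mean - 1.0
```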
XMD (XPS Map Drift): distance between the configured CPU→queue map and the observed sender/IRQ affinity reality.
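The text leaves the distance metric for XMD open; one concrete choice (an assumption, not prescribed above) is the mean Jaccard distance between each queue's configured CPU set and the CPU set actually observed sending on it:

```python
# XMD as mean per-queue Jaccard distance between configured and observed CPU
# sets. The Jaccard choice is illustrative; any set/distribution distance works.

def xmd(configured: dict[int, set[int]], observed: dict[int, set[int]]) -> float:
    """0.0 = observed affinity matches the XPS map; 1.0 = fully divergent."""
    dists = []
    for q, conf in configured.items():
        obs = observed.get(q, set())
        union = conf | obs
        dists.append((1.0 - len(conf & obs) / len(union)) if union else 0.0)
    return sum(dists) / len(dists)
```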
WCV95 (Wire Cadence Variability p95): p95 coefficient of variation of inter-send/inter-wire spacing over rolling windows.
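A sketch of WCV95 over inter-send gaps. Non-overlapping windows and the window size are simplifying assumptions; a production version would use rolling windows aligned to wire timestamps:

```python
import statistics

def wcv95(gaps: list[float], window: int = 100) -> float:
    """p95 across windows of the CV (pstdev / mean) of inter-send gaps."""
    cvs = []
    # Non-overlapping windows for simplicity; rolling windows work the same way.
    for i in range(0, len(gaps) - window + 1, window):
        w = gaps[i:i + window]
        cvs.append(statistics.pstdev(w) / statistics.fmean(w))
    cvs.sort()
    return cvs[min(len(cvs) - 1, int(0.95 * len(cvs)))]
```

Perfectly even pacing yields 0.0; a single bursty window drags the p95 up even when most windows look clean, which is exactly the tail behavior the metric is meant to surface.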
CDI (Completion Drift Index): divergence between enqueue-time and completion-time locality/timing distributions.
RPU-TX (Realized Polarization Uplift, TX): matched-window tail-IS uplift attributable to TX polarization.
Track by host, NIC/driver, kernel version, XPS profile, qdisc profile, and strategy cohort.
Identification strategy (causal, not just correlation)
Use a matched-window design:
- Match on spread, volatility, participation, urgency, and session segment.
- Compare high-TQCI/XMD windows vs low-TQCI/XMD windows within same host class.
- Add host and strategy fixed effects with interactions (`TQCI × urgency`, `XMD × volatility`).
- Run controlled canaries by rebalancing XPS maps (CPU- and/or RXQ-based) while holding strategy logic constant.
If tail IS improves after map rebalance and cadence metrics normalize, the uplift is infra-causal.
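The matched-window comparison can be sketched as a paired difference. The `Window` fields, thresholds, and the nearest-neighbour matching below are illustrative stand-ins for a proper matching design (propensity or caliper matching on the full covariate set):

```python
from dataclasses import dataclass

@dataclass
class Window:
    tqci: float
    spread_bps: float     # market-state covariates to match on
    vol_bps: float
    tail_is_bps: float    # p95 implementation shortfall in the window

def matched_uplift(high: list[Window], low: list[Window]) -> float:
    """Mean tail-IS uplift of high-TQCI windows over their nearest low-TQCI match."""
    diffs = []
    for h in high:
        # Nearest neighbour on (spread, vol): a crude stand-in for real matching.
        m = min(low, key=lambda w: (w.spread_bps - h.spread_bps) ** 2
                                   + (w.vol_bps - h.vol_bps) ** 2)
        diffs.append(h.tail_is_bps - m.tail_is_bps)
    return sum(diffs) / len(diffs)
```

A persistently positive `matched_uplift` that collapses after an XPS rebalance canary is the infra-causal signature the playbook is looking for.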
Regime controller
State A: TX_BALANCED
- low TQCI/XMD/WCV95
- normal pacing and placement policy
State B: TX_DRIFT
- moderate concentration + rising cadence variance
- reduce cancel churn, modestly smooth child cadence
State C: TX_POLARIZED
- sustained hot rings + completion drift
- tighter burst caps, shorter passive horizon, stronger queue-risk penalties
State D: TX_CONTAIN
- persistent polarization + deadline stress
- reroute urgent flow to cleaner hosts/queues; prioritize certainty over queue capture
Use hysteresis + minimum dwell times to avoid policy flapping.
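The four-state controller with hysteresis and minimum dwell can be sketched as below. The enter/exit thresholds and dwell counts are placeholders to be calibrated per host class, and only TQCI drives transitions here for brevity (a real controller would also gate on XMD/WCV95/CDI):

```python
STATES = ["TX_BALANCED", "TX_DRIFT", "TX_POLARIZED", "TX_CONTAIN"]

class RegimeController:
    """Single-step state machine: enter thresholds above exit thresholds
    (hysteresis), and no transition before min_dwell updates in a state."""

    def __init__(self, enter=(0.5, 1.5, 3.0), exit=(0.3, 1.0, 2.0), min_dwell=5):
        self.enter, self.exit, self.min_dwell = enter, exit, min_dwell
        self.state = 0   # index into STATES
        self.dwell = 0   # updates spent in current state

    def update(self, tqci: float) -> str:
        self.dwell += 1
        if self.dwell >= self.min_dwell:
            if self.state < 3 and tqci >= self.enter[self.state]:
                self.state, self.dwell = self.state + 1, 0   # escalate
            elif self.state > 0 and tqci < self.exit[self.state - 1]:
                self.state, self.dwell = self.state - 1, 0   # de-escalate
        return STATES[self.state]
```

Because exit thresholds sit below enter thresholds and dwell resets on every transition, a TQCI value oscillating around a boundary cannot flap the policy.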
Mitigation ladder (ops + model)
- Audit and flatten XPS mapping
- rebalance `xps_cpus`/`xps_rxqs` so active sender sets are not over-collapsed.
- Align completion locality
- verify TX completion IRQ affinity and app thread pinning coherence.
- Control qdisc pacing behavior
- validate `fq` (or chosen qdisc) settings (`quantum`, `maxrate`, pacing mode) against the burst envelope.
- Validate BQL behavior under burst load
- detect queue overfill/underfill oscillation and retune driver/stack knobs where possible.
- Elevate TX-polarization features into live policy
- downshift tactical aggressiveness when TQCI/XMD/WCV95 breach guardrails.
- Recalibrate after kernel/driver/qdisc changes
- infra upgrades invalidate old coefficients and thresholds.
Failure drills (must run)
- Synthetic TX-map skew drill
- intentionally collapse many active CPUs into few TX queues in staging.
- Completion-affinity drift drill
- perturb IRQ affinity and validate CDI detection + containment response.
- Burst replay drill
- replay high-burst sessions and verify regime transitions suppress tail IS.
- Rollback drill
- prove deterministic return to baseline XPS/qdisc profile and stable cadence.
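For the synthetic TX-map skew drill, the `xps_cpus` sysfs attribute takes a comma-separated hex CPU bitmask. A small helper for rendering that mask (writing it requires root and a real device, so only the formatting is shown; the 8-hex-digit-per-word grouping matches the kernel's bitmap format):

```python
def cpus_to_xps_mask(cpus: set[int]) -> str:
    """Render a CPU set as comma-separated 32-bit hex words, most significant
    word first, as accepted by /sys/class/net/<dev>/queues/tx-<n>/xps_cpus."""
    mask = 0
    for c in cpus:
        mask |= 1 << c
    words = []
    while True:
        words.append(f"{mask & 0xFFFFFFFF:08x}")
        mask >>= 32
        if not mask:
            break
    return ",".join(reversed(words))

# Drill usage (requires root; path is the standard sysfs XPS attribute):
#   echo "0000000f" > /sys/class/net/eth0/queues/tx-0/xps_cpus
# collapses CPUs 0-3 onto tx-0 to provoke polarization in staging.
```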
Anti-patterns
- Treating XPS as one-time bring-up config
- Monitoring only aggregate TX throughput, ignoring per-queue concentration
- Using average send latency while ignoring inter-send cadence tails
- Tuning strategy logic without TX queue observability
Bottom line
RX-path balance is only half the story.
If TX queue polarization is left unmodeled, you get hidden cadence distortion that quietly taxes queue priority and markouts.
Make TX concentration and cadence first-class slippage features (TQCI/XMD/WCV95/CDI), and you convert a "mysterious tail" into an observable, controllable execution risk budget.
References
- Linux networking scaling guide (RPS/RFS/XPS, sysfs maps): https://docs.kernel.org/networking/scaling.html
- SMP IRQ affinity documentation: https://docs.kernel.org/core-api/irq/irq-affinity.html
- `ethtool` manual (per-queue stats and queue config visibility): https://man7.org/linux/man-pages/man8/ethtool.8.html
- `tc-fq` manual (pacing/qdisc behavior): https://man7.org/linux/man-pages/man8/tc-fq.8.html
- LWN: XPS background: https://lwn.net/Articles/409862/ and https://lwn.net/Articles/412062/
- BQL background and queue-latency intuition: https://www.coverfire.com/articles/queueing-in-the-linux-network-stack/