TCP Autocorking & Not-Sent Queue Drain-Burst Slippage Playbook

2026-03-23 · finance

Date: 2026-03-23
Category: research
Scope: How Linux TCP autocorking and not-sent queue buildup create a hidden freeze→flush child-order cadence and tail slippage

Why this matters

Many execution stacks optimize for median wire latency and miss a transport behavior that quietly taxes tails: small writes are intentionally coalesced, then flushed in bursts.

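On Linux, with net.ipv4.tcp_autocorking enabled (the default), the kernel may briefly hold small writes while an earlier packet for the same flow is still waiting in the qdisc or device transmit queue, so that consecutive small writes coalesce into fewer packets. Combined with application pacing jitter or busy sender threads, this can accumulate bytes in the TCP not-sent queue. When flush conditions trigger, child orders can leave in a compressed burst.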

That burstiness degrades queue priority and increases impact/markout risk exactly when strategies think they are sending "smoothly".


Failure mechanism (operator timeline)

  1. Strategy emits frequent small writes (child-order/control frames).
  2. Kernel autocork logic defers immediate packetization for efficiency.
  3. Not-sent queue depth grows during micro-pauses or scheduling hiccups.
  4. Flush trigger fires (TX completion of the earlier packet, ACK progression, send-path wakeup, uncork-like condition).
  5. Accumulated bytes drain rapidly within a few scheduling quanta.
  6. Venue sees clustered child arrivals instead of intended cadence.
  7. Queue-age and adverse-selection penalties rise.

This is a transport pacing artifact, not pure market microstructure drift.
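
To make the timeline concrete, here is a toy Python model of steps 3 to 6. It is a deliberate simplification, not a simulation of the kernel's autocork logic: the function names, the hold window, and the flush time are all hypothetical. Writes issued during the hold window leave together at the flush instant, which is enough to show how an even decision cadence turns into a clustered wire cadence.

```python
def wire_times(decision_times_ms, hold_start_ms, flush_ms):
    """Return the departure time of each write in a toy hold/flush model.

    Writes issued in [hold_start_ms, flush_ms) are held and leave together at
    flush_ms; all other writes leave immediately.
    """
    return [
        flush_ms if hold_start_ms <= t < flush_ms else t
        for t in decision_times_ms
    ]


def gaps(times):
    """Gaps between consecutive departures."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Intended cadence: one child order every 2 ms (illustrative values).
decisions = [0, 2, 4, 6, 8, 10, 12]
wire = wire_times(decisions, hold_start_ms=3, flush_ms=9)

print(gaps(decisions))  # [2, 2, 2, 2, 2, 2]
print(gaps(wire))       # [2, 7, 0, 0, 1, 2]
```

The application log shows the first gap list; the venue sees the second, with three children arriving at the same instant.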


Extend slippage decomposition with a cork/queue term

[ IS = IS_{market} + IS_{impact} + IS_{timing} + IS_{fees} + \underbrace{IS_{cork}}_{\text{autocork + not-sent drain tax}} ]

Operational approximation:

[ IS_{cork,t} \approx a\cdot NSQ95_t + b\cdot FCI_t + c\cdot BDR_t + d\cdot CJD_t + e\cdot NUD_t ]

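Where NSQ95, FCI, BDR, CJD, and NUD are the metrics defined in the next section, each measured over window t, and a through e are coefficients to be estimated from your own execution data.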
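
A minimal sketch of the operational approximation in Python. The `CorkFeatures` container, the coefficient values, and the units are illustrative assumptions; real coefficients have to be fitted on your own fills.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorkFeatures:
    nsq95: float  # p95 of tcpi_notsent_bytes in the window (bytes)
    fci: float    # Flush Compression Index (ratio)
    bdr: float    # Burst-Drain Ratio (fraction of sent bytes)
    cjd: float    # Child Join Delay (same time unit used when fitting)
    nud: float    # Not-Sent Urgency Divergence (bytes)


def is_cork(f: CorkFeatures, a: float, b: float, c: float, d: float, e: float) -> float:
    """Linear approximation: a*NSQ95 + b*FCI + c*BDR + d*CJD + e*NUD."""
    return a * f.nsq95 + b * f.fci + c * f.bdr + d * f.cjd + e * f.nud


# Made-up feature values and coefficients, for illustration only.
window = CorkFeatures(nsq95=4096, fci=3.0, bdr=0.25, cjd=0.4, nud=1024)
estimate = is_cork(window, a=0.0001, b=0.05, c=0.8, d=0.5, e=0.0002)
print(round(estimate, 4))
```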


Production metrics to add

1) Not-Sent Queue p95 (NSQ95)

[ NSQ95 = p95(\text{tcpi\_notsent\_bytes}) ]

Track by strategy, symbol bucket, and urgency tier.
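
One way to collect the raw samples is to read `tcpi_notsent_bytes` from `getsockopt(TCP_INFO)`. The sketch below assumes Linux and assumes the field sits at byte offset 144 of `struct tcp_info`; that offset is derived from the field order in `linux/tcp.h` and should be verified against your kernel headers. The helper names are hypothetical, and the percentile uses the nearest-rank method, which is a choice rather than something the metric definition dictates.

```python
import socket
import struct

# ASSUMPTION: offset of tcpi_notsent_bytes inside struct tcp_info, derived from
# the field order in linux/tcp.h. Verify against your kernel headers.
NOTSENT_OFFSET = 144


def parse_notsent_bytes(tcp_info: bytes) -> int:
    """Extract tcpi_notsent_bytes (native-endian u32) from a raw tcp_info buffer."""
    if len(tcp_info) < NOTSENT_OFFSET + 4:
        raise ValueError("tcp_info buffer too short to contain tcpi_notsent_bytes")
    return struct.unpack_from("=I", tcp_info, NOTSENT_OFFSET)[0]


def read_notsent_bytes(sock: socket.socket) -> int:
    """Sample one connected TCP socket via getsockopt(TCP_INFO). Linux only."""
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 256)
    return parse_notsent_bytes(raw)


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty sample list."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


# Synthetic samples: the queue is empty 90% of the time, 3000 bytes otherwise.
nsq95 = p95([0] * 90 + [3000] * 10)
print(nsq95)  # 3000
```

Sampling on a timer will miss short-lived backlog, so sampling at write time (just before or after each send) is worth considering.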

2) Flush Compression Index (FCI)

[ FCI = \frac{p95(\Delta N_{wire}/\Delta t)}{p50(\Delta N_{wire}/\Delta t)} ]

Measures how sharply transmission rate spikes during drain windows.
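
A small sketch of the FCI computation, assuming you already have bytes-on-the-wire counts for fixed-length windows (for example from a capture or NIC timestamps). The function names and the sample data are hypothetical.

```python
def percentile(values, pct):
    """Nearest-rank percentile (pct is an integer in 1..100)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    rank = (pct * len(ordered) + 99) // 100  # integer ceil(pct/100 * n)
    return ordered[rank - 1]


def flush_compression_index(bytes_per_window, window_ms):
    """FCI = p95(rate) / p50(rate), where rate = bytes per window / window length."""
    rates = [n / window_ms for n in bytes_per_window]
    median = percentile(rates, 50)
    if median == 0:
        raise ValueError("median rate is zero; widen the window or drop idle periods")
    return percentile(rates, 95) / median


# 18 steady windows of 200 bytes, then two drain windows of 2400 bytes.
windows = [200] * 18 + [2400] * 2
fci = flush_compression_index(windows, window_ms=1.0)
print(fci)  # 12.0
```

An FCI near 1 means the p95 rate is close to the median; larger values mean drain windows run well above the typical rate.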

3) Burst-Drain Ratio (BDR)

[ BDR = \frac{\text{bytes drained in top 1\% drain windows}}{\text{total sent bytes}} ]

A high BDR means a disproportionate share of outbound bytes leaves in a handful of short bursts.
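
A sketch of BDR over the same kind of per-window byte counts. The function name is hypothetical, and rounding the top 1% up to at least one window is an assumption about how to handle small samples.

```python
def burst_drain_ratio(bytes_per_window, top_percent=1):
    """Share of all sent bytes that left in the busiest top_percent of windows."""
    total = sum(bytes_per_window)
    if total == 0:
        return 0.0
    # Integer ceil(top_percent/100 * n), so at least one window is always counted.
    k = (top_percent * len(bytes_per_window) + 99) // 100
    busiest = sorted(bytes_per_window, reverse=True)[:k]
    return sum(busiest) / total


# 198 quiet windows of 100 bytes plus two burst windows of 5000 bytes.
windows = [100] * 198 + [5000] * 2
bdr = burst_drain_ratio(windows)
print(round(bdr, 4))
```

Here 1% of the windows carry roughly a third of the bytes, which is the signature the metric is meant to surface.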

4) Child Join Delay (CJD)

[ CJD = p95(t_{wire}-t_{decision}) - p50(t_{wire}-t_{decision}) ]

Captures coalescing-driven tail expansion beyond baseline jitter.
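
A sketch of CJD from paired decision and wire timestamps. It assumes both clocks are synchronized and expressed in the same unit (microseconds here); the function names and the sample data are hypothetical.

```python
def percentile(values, pct):
    """Nearest-rank percentile (pct is an integer in 1..100)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    rank = (pct * len(ordered) + 99) // 100  # integer ceil(pct/100 * n)
    return ordered[rank - 1]


def child_join_delay(decision_ts, wire_ts):
    """CJD = p95(t_wire - t_decision) - p50(t_wire - t_decision)."""
    delays = [w - d for d, w in zip(decision_ts, wire_ts)]
    return percentile(delays, 95) - percentile(delays, 50)


# 20 child orders, one per millisecond; two of them wait 3000 us instead of 100 us.
decision_us = [i * 1000 for i in range(20)]
delay_us = [100] * 18 + [3000] * 2
wire_us = [d + x for d, x in zip(decision_us, delay_us)]

cjd = child_join_delay(decision_us, wire_us)
print(cjd)  # 2900
```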

5) Not-Sent Urgency Divergence (NUD)

Urgency-weighted average not-sent bytes. Detects when high-priority flow is delayed behind transport accumulation.
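
The text gives no formula for NUD, so the sketch below takes the description literally: an average of not-sent bytes weighted by an urgency score attached to each sample. The scoring scale and the function name are assumptions.

```python
def not_sent_urgency_divergence(samples):
    """Urgency-weighted average of not-sent bytes.

    samples: iterable of (urgency_weight, notsent_bytes) pairs, one per
    observation. Weights are assumed to be non-negative.
    """
    samples = list(samples)
    total_weight = sum(w for w, _ in samples)
    if total_weight == 0:
        raise ValueError("all urgency weights are zero")
    return sum(w * nsb for w, nsb in samples) / total_weight


# Two low-urgency observations with an empty queue, one urgent observation
# stuck behind 4000 unsent bytes.
observations = [(1, 0), (1, 0), (3, 4000)]
nud = not_sent_urgency_divergence(observations)
print(nud)  # 2400.0
```

The unweighted mean of the same observations is about 1333 bytes, so the weighted figure rises when backlog coincides with urgent flow.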

6) Compression-Conditioned Markout Delta (CCMD)

Matched-cohort post-fill markout delta between COMPRESSED_DRAIN and SMOOTH_DRAIN windows.
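
One simple reading of "matched cohort" is exact matching on a cohort key. The sketch below groups fills by such a key, compares mean markout between the two regimes inside each cohort, and averages the per-cohort deltas. The cohort keys, the equal weighting across cohorts, and the sample values are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def ccmd(fills):
    """Compression-Conditioned Markout Delta.

    fills: iterable of (cohort_key, regime, markout) where regime is either
    "COMPRESSED_DRAIN" or "SMOOTH_DRAIN". Cohorts missing either regime are
    skipped. Returns None when no cohort has both.
    """
    by_cohort = defaultdict(lambda: {"COMPRESSED_DRAIN": [], "SMOOTH_DRAIN": []})
    for cohort, regime, markout in fills:
        by_cohort[cohort][regime].append(markout)
    deltas = [
        mean(group["COMPRESSED_DRAIN"]) - mean(group["SMOOTH_DRAIN"])
        for group in by_cohort.values()
        if group["COMPRESSED_DRAIN"] and group["SMOOTH_DRAIN"]
    ]
    return mean(deltas) if deltas else None


# Hypothetical cohorts keyed by (symbol bucket, urgency tier); markouts in bps.
fills = [
    (("bucket_a", "high"), "COMPRESSED_DRAIN", -1.5),
    (("bucket_a", "high"), "SMOOTH_DRAIN", -0.5),
    (("bucket_b", "low"), "COMPRESSED_DRAIN", -1.0),
    (("bucket_b", "low"), "SMOOTH_DRAIN", -0.5),
    (("bucket_c", "low"), "SMOOTH_DRAIN", -0.2),  # unmatched, ignored
]
print(ccmd(fills))  # -0.75
```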


Modeling architecture

Stage 1: transport-compression regime detector

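Inputs: the transport-compression features NSQ95, FCI, BDR, CJD, and NUD.

Output: p_comp, the probability that the current window is in the compressed-drain regime.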

Stage 2: conditional slippage model

Predict E[IS], tail IS, and completion risk conditioned on regime probability.

Key interaction:

[ \Delta IS \sim \beta_1\,\text{urgency} + \beta_2\,\text{comp} + \beta_3(\text{urgency} \times \text{comp}) ]

Urgent schedules usually pay the largest tax when compression is high.
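
The interaction term is what makes the compression tax depend on urgency: the marginal cost of compression is beta_2 + beta_3 * urgency. A short numeric sketch, with made-up coefficients chosen only to show the shape:

```python
def delta_is(urgency, comp, beta1, beta2, beta3):
    """Delta IS = beta1*urgency + beta2*comp + beta3*(urgency * comp)."""
    return beta1 * urgency + beta2 * comp + beta3 * urgency * comp


def compression_tax(urgency, beta1, beta2, beta3):
    """Extra slippage from moving comp from 0 to 1 at a fixed urgency."""
    return delta_is(urgency, 1.0, beta1, beta2, beta3) - delta_is(urgency, 0.0, beta1, beta2, beta3)


# Hypothetical coefficients; fit your own before drawing conclusions.
BETAS = dict(beta1=0.5, beta2=1.0, beta3=2.0)

print(compression_tax(0.25, **BETAS))  # 1.5
print(compression_tax(1.0, **BETAS))   # 3.0
```

With a positive beta_3, the same compression event costs the urgent schedule twice as much as the patient one in this example.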


Controller state machine

GREEN — SMOOTH_DRAIN

YELLOW — ACCUMULATION_RISK

ORANGE — COMPRESSED_DRAIN

RED — CONTAINMENT

Use hysteresis + minimum dwell time to avoid policy thrash.
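
A minimal sketch of hysteresis plus minimum dwell for the four states. The thresholds on p_comp, the dwell length, and the one-level-per-update rule are hypothetical choices, not values from this playbook.

```python
from dataclasses import dataclass

STATES = ["GREEN", "YELLOW", "ORANGE", "RED"]

# Hypothetical thresholds on p_comp. Escalate above ENTER[level]; de-escalate
# below EXIT[level - 1]. EXIT < ENTER creates the hysteresis band.
ENTER = [0.30, 0.55, 0.80]  # GREEN->YELLOW, YELLOW->ORANGE, ORANGE->RED
EXIT = [0.20, 0.45, 0.70]   # YELLOW->GREEN, ORANGE->YELLOW, RED->ORANGE


@dataclass
class Controller:
    min_dwell: int = 3  # updates to stay in a state before any transition
    level: int = 0      # index into STATES
    dwell: int = 0      # updates spent in the current state

    def update(self, p_comp: float) -> str:
        self.dwell += 1
        if self.dwell >= self.min_dwell:
            if self.level < len(STATES) - 1 and p_comp > ENTER[self.level]:
                self.level, self.dwell = self.level + 1, 0
            elif self.level > 0 and p_comp < EXIT[self.level - 1]:
                self.level, self.dwell = self.level - 1, 0
        return STATES[self.level]


ctrl = Controller(min_dwell=2)
trace = [ctrl.update(p) for p in [0.35, 0.35, 0.25, 0.25, 0.15, 0.15]]
print(trace)
# ['GREEN', 'YELLOW', 'YELLOW', 'YELLOW', 'GREEN', 'GREEN']
```

In the trace, p_comp = 0.25 sits inside the band between 0.20 and 0.30, so the controller stays in YELLOW instead of flapping. A production controller may want escalation to bypass the dwell timer so that containment is never delayed.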


Engineering mitigations (high ROI first)

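  1. Measure before toggling knobs
    Add socket-level tcpi_notsent_bytes telemetry (read from TCP_INFO) and decision→wire traces first; the TCPAutoCorking counter in /proc/net/netstat shows how often the kernel autocorks.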

  2. Stabilize application write cadence
    Avoid pathological tiny-write storms; use deterministic micro-batching rather than accidental coalescing.

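  3. Review tcp_autocorking behavior in low-latency paths
    net.ipv4.tcp_autocorking is a sysctl rather than a per-socket option, so test disabling it on the hosts or network namespaces that carry latency-critical sockets.

  4. Tune tcp_notsent_lowat with guardrails
    Use the net.ipv4.tcp_notsent_lowat sysctl or the per-socket TCP_NOTSENT_LOWAT option to prevent excessive hidden backlog while preserving throughput where needed.

  5. Pair TCP_NODELAY decisions with pacing discipline
    TCP_NODELAY disables Nagle but does not by itself turn off autocorking, which is governed by the sysctl above. Blindly disabling either can shift cost into packet-rate overhead; validate end-to-end.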

  6. Add compression-aware routing penalties
    Down-rank routes/hosts showing repeated compressed-drain signatures.
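
For items 3 to 5, a sketch of how the knobs can be inspected and set from Python on Linux. `TCP_NODELAY` and `TCP_NOTSENT_LOWAT` are real socket options; the helper names and the 16 KiB value are hypothetical, and the fallback constant 25 is the Linux value of `TCP_NOTSENT_LOWAT`, used only if the Python build does not export it.

```python
import socket
from pathlib import Path

# Linux value of TCP_NOTSENT_LOWAT, used only as a fallback (assumption for
# builds where the socket module does not define the constant).
TCP_NOTSENT_LOWAT = getattr(socket, "TCP_NOTSENT_LOWAT", 25)


def autocorking_enabled(path="/proc/sys/net/ipv4/tcp_autocorking"):
    """Return True/False from the sysctl file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        return None


def build_sockopts(nodelay, notsent_lowat_bytes=None):
    """Return (level, option, value) tuples for a latency-critical socket."""
    opts = [(socket.IPPROTO_TCP, socket.TCP_NODELAY, int(nodelay))]
    if notsent_lowat_bytes is not None:
        opts.append((socket.IPPROTO_TCP, TCP_NOTSENT_LOWAT, notsent_lowat_bytes))
    return opts


def apply_sockopts(sock, opts):
    """Apply each option with setsockopt; raises OSError if the kernel refuses."""
    for level, option, value in opts:
        sock.setsockopt(level, option, value)


# Illustrative profile: Nagle off, at most ~16 KiB of unsent data before the
# socket stops reporting itself writable.
profile = build_sockopts(nodelay=True, notsent_lowat_bytes=16 * 1024)
print(autocorking_enabled(), profile)
```

Keeping the option list as data makes it easy to canary one profile against another and to log exactly what each socket was given.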


Validation protocol

  1. Label COMPRESSED_DRAIN windows via NSQ95 + FCI + BDR thresholds.
  2. Build matched cohorts by symbol, spread, volatility, participation, and urgency.
  3. Estimate uplift in mean and q95 slippage plus completion shortfall.
  4. Canary mitigations (cadence control, knob changes) on subset traffic.
  5. Promote only if tail improvements persist without unacceptable throughput regression.
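
Steps 1 and 3 can be sketched as follows. The thresholds are placeholders, and requiring all three metrics to exceed their threshold is an assumption (the protocol does not say whether the conditions are combined with AND or OR). The slippage samples are invented.

```python
from statistics import mean

# Hypothetical thresholds; calibrate on your own telemetry.
THRESHOLDS = {"nsq95": 8192, "fci": 4.0, "bdr": 0.30}


def label_window(nsq95, fci, bdr, thr=THRESHOLDS):
    """Label a window COMPRESSED_DRAIN only if all three metrics exceed their threshold."""
    compressed = nsq95 > thr["nsq95"] and fci > thr["fci"] and bdr > thr["bdr"]
    return "COMPRESSED_DRAIN" if compressed else "SMOOTH_DRAIN"


def q95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    return ordered[(95 * len(ordered) + 99) // 100 - 1]


def uplift(compressed_slippage, smooth_slippage):
    """Mean and q95 slippage difference between matched cohorts (compressed minus smooth)."""
    return {
        "mean": mean(compressed_slippage) - mean(smooth_slippage),
        "q95": q95(compressed_slippage) - q95(smooth_slippage),
    }


print(label_window(nsq95=12000, fci=6.0, bdr=0.4))  # COMPRESSED_DRAIN
print(label_window(nsq95=12000, fci=2.0, bdr=0.4))  # SMOOTH_DRAIN
result = uplift([1.0, 2.0, 3.0, 6.0], [1.0, 1.0, 2.0, 2.0])
print(result)  # {'mean': 1.5, 'q95': 4.0}
```

Point estimates like these need confidence intervals before a promotion decision; bootstrap resampling within cohorts is one option.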

Practical observability checklist

Success criterion: stable tail execution quality under sender-side coalescing pressure, not just good median send latency.


Pseudocode sketch

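# Every helper called below is a placeholder for a component described above.
features = collect_transport_compression_features()  # NSQ95, FCI, BDR, CJD, NUD
p_comp = compression_model.predict_proba(features)   # Stage 1: regime probability
state = decode_compression_state(p_comp, features)   # should apply hysteresis + minimum dwell

# Map each controller state to an execution policy.
if state == "GREEN":       # SMOOTH_DRAIN
    params = baseline_policy()
elif state == "YELLOW":    # ACCUMULATION_RISK
    params = tighten_pacing_and_reduce_fragmentation()
elif state == "ORANGE":    # COMPRESSED_DRAIN
    params = controlled_micro_batch_release()
else:                      # RED: CONTAINMENT
    params = containment_with_hard_tail_budget()

execute_with(params)
# Log state and probability together so transitions can be audited later.
log(state=state, p_comp=p_comp)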

Bottom line

Autocorking and not-sent queue dynamics can make a strategy look smooth in application logs while arriving bursty at the venue. Model this transport-compression branch explicitly, or tail slippage will keep masquerading as market noise.

