Path-MTU Black-Hole & MSS-Collapse Slippage Playbook
Date: 2026-03-22
Category: research
Scope: How PMTU discovery failure (ICMP/PTB blind spots) creates hidden decision-to-fill latency tails and slippage drift
Why this matters
Many execution systems treat transport latency as a smooth background process.
But PMTU failure creates a branching transport regime:
- larger packets get dropped on a path bottleneck,
- sender waits for retransmission/timeout and eventually shrinks MSS,
- send cadence shifts from smooth flow to stall→burst behavior.
In practice, this looks like random microstructure toxicity, while root cause is often path-level packetization failure.
Failure mechanism (operator timeline)
- Route path includes a lower-MTU segment (overlay, tunnel, VPN, middlebox path).
- Sender transmits packets sized for a larger MTU belief.
- PMTU signal is missing, filtered, delayed, or distrusted (classic black-hole condition).
- Larger packets repeatedly fail; retransmission and RTO pressure rise.
- Stack falls back to smaller effective MSS (or probes down/up slowly).
- Order-flow dispatch cadence becomes discontinuous; child-order timing drifts.
- Queue priority decays and deadline urgency overpays into thinner books.
This is not a strategy bug; it is a transport-state regime shift.
Extend slippage decomposition with PMTU-blackhole term
[ IS = IS_{market} + IS_{impact} + IS_{timing} + IS_{fees} + \underbrace{IS_{pmtu}}_{\text{PMTU black-hole tax}} ]
Operational approximation:
[ IS_{pmtu,t} \approx a\cdot LSR_t + b\cdot RTO95_t + c\cdot MFD_t + d\cdot PRL_t + e\cdot SBC_t ]
Where:
- (LSR): large-segment retransmission rate,
- (RTO95): p95 retransmission-timeout burden,
- (MFD): MSS fallback depth,
- (PRL): probe recovery latency (time to regain stable MSS),
- (SBC): send-burst compression after stall windows.
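The operational approximation can be sketched as a plain linear combination. The coefficients below are illustrative placeholders, not fitted values; in practice a..e would be estimated by regressing realized shortfall on these transport features.

```python
def is_pmtu_estimate(lsr, rto95_ms, mfd, prl_ms, sbc,
                     a=2.0, b=0.01, c=5.0, d=0.002, e=0.5):
    """Linear approximation of the PMTU black-hole slippage term (bps).

    Coefficients a..e are illustrative placeholders; fit them against
    realized shortfall before trusting the output.
    """
    return a * lsr + b * rto95_ms + c * mfd + d * prl_ms + e * sbc

# Example: a moderately stressed window
print(is_pmtu_estimate(lsr=0.08, rto95_ms=240.0, mfd=0.35,
                       prl_ms=1500.0, sbc=4.0))
```

The point of keeping the form linear is interpretability: each feature's marginal contribution to the PMTU tax is directly auditable.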
What to measure in production
1) Large-Segment Retransmission Rate (LSR)
[ LSR = \frac{\#(\text{retransmissions on segments} > MSS_{safe})}{\#(\text{all segments} > MSS_{safe})} ]
Rising LSR with stable exchange-side health is a strong PMTU stress hint.
2) MSS Fallback Depth (MFD)
[ MFD = 1 - \frac{MSS_{effective}}{MSS_{baseline}} ]
Large MFD indicates costly downshift from expected wire efficiency.
3) Probe Recovery Latency (PRL)
Time from first black-hole signature to restored stable effective MSS.
Long PRL means prolonged degraded execution cadence.
4) Send-Burst Compression (SBC)
[ SBC = \frac{p95(\Delta t_{child\_send})}{p50(\Delta t_{child\_send})} ]
SBC expansion captures stall→flush packetization behavior leaking into execution timing.
5) Decision-to-Wire Tail Expansion (DWT95/99)
Primary KPI for policy impact.
Compare PMTU_STABLE vs PMTU_STRESS windows by cohort.
6) Markout Degradation Under PMTU Stress (MDP)
Matched-cohort post-fill markout delta between normal vs PMTU-stress episodes.
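The first four metrics are cheap to compute from flow telemetry. A minimal sketch, assuming segment records carry a byte size and a retransmission flag, and that MSS_SAFE and MSS_BASELINE are locally chosen constants:

```python
import statistics

MSS_SAFE = 1200       # assumed conservative "large segment" threshold (bytes)
MSS_BASELINE = 1460   # assumed baseline MSS for an untunneled 1500-MTU path

def lsr(segments):
    """Large-Segment Retransmission Rate over (size, retx) records."""
    large = [s for s in segments if s["size"] > MSS_SAFE]
    if not large:
        return 0.0
    return sum(s["retx"] for s in large) / len(large)

def mfd(mss_effective):
    """MSS Fallback Depth: 1 - effective/baseline."""
    return 1.0 - mss_effective / MSS_BASELINE

def sbc(child_send_gaps_ms):
    """Send-Burst Compression: p95/p50 of inter-child-send gaps."""
    q = statistics.quantiles(child_send_gaps_ms, n=100)
    return q[94] / q[49]   # 95th / 50th percentile cut points
```

PRL is omitted here because it needs episode boundaries (first black-hole signature, restored MSS), which depend on how your telemetry timestamps state transitions.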
Minimal model architecture
Stage 1: PMTU stress classifier
Inputs:
- retransmission profile by payload size,
- effective MSS evolution,
- RTO tails,
- burst-compression metrics,
- path/route class metadata (tunnel/VPN/overlay tags).
Output:
- (P(\text{PMTU\_STRESS}))
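Stage 1 does not need to be elaborate. A sketch using a logistic score over the features above; the weights here are hand-set for illustration only, where a production classifier would be trained on labeled PMTU-stress windows:

```python
import math

# Illustrative hand-set weights; a real deployment would fit these
# (e.g. logistic regression) on labeled PMTU_STRESS windows.
WEIGHTS = {"lsr": 12.0, "mfd": 6.0, "rto95_ms": 0.004,
           "sbc": 0.4, "tunnel_path": 1.5}
BIAS = -5.0

def p_pmtu_stress(features):
    """Map transport features to P(PMTU_STRESS) via a logistic score."""
    z = BIAS + sum(WEIGHTS[k] * features.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))
```

Note the route-class tag (tunnel_path) enters as a prior shift: encapsulated paths start closer to the stress boundary.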
Stage 2: Conditional cost model
Predict:
- (E[IS]), (q95(IS)), completion risk conditioned on PMTU stress probability.
Include interaction term:
[ \Delta IS \sim \beta_1\,\text{urgency} + \beta_2\,\text{pmtu} + \beta_3\,(\text{urgency} \times \text{pmtu}) ]
Urgency tends to become most expensive exactly when PMTU instability is active.
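The interaction term is the part worth verifying empirically. A synthetic sketch, assuming NumPy is available, where the ground truth makes urgency several times more expensive under PMTU stress and ordinary least squares recovers the interaction coefficient:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
urgency = rng.uniform(0.0, 1.0, n)
pmtu = rng.integers(0, 2, n).astype(float)   # stress indicator 0/1
# Synthetic ground truth: urgency costs 2 bps/unit normally,
# plus 6 bps/unit extra when PMTU stress is active.
d_is = 1.0 + 2.0 * urgency + 1.5 * pmtu + 6.0 * urgency * pmtu \
       + rng.normal(0.0, 0.2, n)

# Design matrix: intercept, urgency, pmtu, urgency x pmtu
X = np.column_stack([np.ones(n), urgency, pmtu, urgency * pmtu])
beta, *_ = np.linalg.lstsq(X, d_is, rcond=None)
print(beta.round(2))   # approximately [1.0, 2.0, 1.5, 6.0]
```

A materially positive beta_3 is the quantitative form of the claim above: urgency overpays precisely when the transport path is unstable.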
Controller state machine
GREEN — PMTU_STABLE
- Stable MSS, low retransmission tails
- Normal execution policy
YELLOW — PMTU_SUSPECT
- LSR/RTO rising, early MSS instability
- Actions:
- reduce burst fan-out,
- tighten pacing jitter bounds,
- increase transport observability sampling.
ORANGE — PMTU_BLACKHOLE_LIKELY
- Persistent large-packet failure + fallback behavior
- Actions:
- switch to conservative packetization profile,
- reduce aggression on thin books,
- prioritize robust route class (known-good path).
RED — PMTU_CONTAINMENT
- Repeated collapse/reprobe loops + slippage tail blowout
- Actions:
- containment execution mode,
- strict participation caps,
- incident escalation with packet evidence.
Apply hysteresis + minimum dwell time to prevent policy thrash.
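The hysteresis-plus-dwell rule can be sketched as a small controller. The escalate/de-escalate thresholds on P(PMTU_STRESS) and the dwell length are illustrative; the gap between the two threshold sets is what provides the hysteresis:

```python
STATES = ["GREEN", "YELLOW", "ORANGE", "RED"]
# Illustrative thresholds on P(PMTU_STRESS); escalation triggers above
# the upper band, de-escalation below the lower band (hysteresis gap).
ESCALATE = {"GREEN": 0.30, "YELLOW": 0.60, "ORANGE": 0.85}
DEESCALATE = {"RED": 0.70, "ORANGE": 0.45, "YELLOW": 0.15}
MIN_DWELL = 5   # ticks a state must be held before any transition

class PmtuController:
    def __init__(self):
        self.state = "GREEN"
        self.dwell = 0

    def step(self, p_stress):
        """Advance one tick; return the (possibly updated) state."""
        self.dwell += 1
        if self.dwell < MIN_DWELL:
            return self.state          # dwell lock: no thrash
        i = STATES.index(self.state)
        if self.state in ESCALATE and p_stress >= ESCALATE[self.state]:
            self.state, self.dwell = STATES[i + 1], 0
        elif self.state in DEESCALATE and p_stress <= DEESCALATE[self.state]:
            self.state, self.dwell = STATES[i - 1], 0
        return self.state
```

Because transitions are one step at a time and gated by dwell, a single noisy probability spike cannot jump the book straight into containment mode.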
Engineering mitigations (ROI order)
- Enable packetization-layer PMTU probing (PLPMTUD) where appropriate.
  Linux: net.ipv4.tcp_mtu_probing (0/1/2) with an explicit policy.
- Tune probe controls intentionally.
  tcp_base_mss, tcp_mtu_probe_floor, tcp_probe_interval, and tcp_probe_threshold should be reviewed for latency-critical links.
- Audit ICMP/PTB handling across network boundaries.
  Classic PMTUD relies on receiving trustworthy path-size feedback (IPv4 Fragmentation Needed / IPv6 Packet Too Big).
- Use MSS clamping on known encapsulation edges.
  Tunnels/overlays frequently create hidden MTU cliffs; enforce conservative MSS at boundaries.
- Canary network changes with PMTU telemetry gates.
  Promote only if LSR/RTO tails and MFD remain stable.
- Tag and route by path reliability class.
  Treat PMTU reliability as a first-class route feature in execution-stack decisions.
Validation protocol
- Label PMTU stress windows using LSR + MFD + PRL thresholds.
- Build matched cohorts by symbol, spread, volatility, participation, venue, and time bucket.
- Estimate uplift in (E[IS]), (q95(IS)), and completion shortfall.
- Run canary policy: probing/tuning + route-class controls.
- Promote only when tail-cost reduction persists without unacceptable completion drag.
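The matched-cohort uplift estimate can be sketched as a within-cohort difference of means. This assumes each fill record carries a precomputed cohort key (symbol, spread/volatility/participation buckets, venue, time bucket), a PMTU-stress label, and realized shortfall in bps:

```python
from collections import defaultdict
from statistics import mean

def pmtu_uplift(fills):
    """Mean IS uplift (stress minus normal) across matched cohorts.

    Cohorts that lack observations in either regime are dropped, so the
    comparison is always like-for-like. Returns None if no cohort has both.
    """
    cohorts = defaultdict(lambda: {"stress": [], "normal": []})
    for f in fills:
        regime = "stress" if f["pmtu_stress"] else "normal"
        cohorts[f["cohort"]][regime].append(f["is_bps"])
    deltas = [mean(c["stress"]) - mean(c["normal"])
              for c in cohorts.values() if c["stress"] and c["normal"]]
    return mean(deltas) if deltas else None
```

The same routine applied to q95(IS) instead of the mean gives the tail-uplift number that actually gates canary promotion.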
Practical observability checklist
- Effective MSS time series per flow class
- Retransmission by payload-size bucket
- RTO distribution conditioned on PMTU state
- Stall→burst send cadence metrics
- Decision-to-wire latency split by PMTU state
- Cohort markout comparison under PMTU stress
- Packet captures around collapse/reprobe episodes
Success criterion: stable tail latency and fill quality during path-MTU disturbances, not just normal-window average throughput.
Pseudocode sketch
features = collect_pmtu_features()  # LSR, MFD, RTO95, PRL, SBC
p_stress = pmtu_stress_model.predict_proba(features)
state = decode_pmtu_state(p_stress, features)
if state == "GREEN":
    params = default_execution_policy()
elif state == "YELLOW":
    params = bounded_fanout_with_tighter_pacing()
elif state == "ORANGE":
    params = conservative_packetization_and_route_hardening()
else:  # RED
    params = containment_mode_with_tail_budget_lock()
execute_with(params)
log(state=state, p_stress=p_stress)
Bottom line
PMTU failures are a hidden transport tax: they turn packetization assumptions into latency regime shifts, then into execution slippage tails.
Model PMTU stress as a first-class feature, instrument collapse/recovery dynamics, and wire explicit controller actions before path-level packet loss silently bills your basis points.
References
- RFC 1191 — Path MTU Discovery (IPv4): https://www.rfc-editor.org/rfc/rfc1191
- RFC 8201 — Path MTU Discovery for IP version 6: https://www.rfc-editor.org/rfc/rfc8201
- RFC 4821 — Packetization Layer Path MTU Discovery (PLPMTUD): https://www.rfc-editor.org/rfc/rfc4821
- RFC 8899 — Packetization Layer Path MTU Discovery for Datagram Transports (DPLPMTUD): https://www.rfc-editor.org/rfc/rfc8899
- Linux kernel IP/TCP sysctl documentation (tcp_mtu_probing, probe controls, PMTU behavior): https://www.kernel.org/doc/html/latest/networking/ip-sysctl.html