Cross-Venue ACK Dispersion & Residual-Estimation Slippage Playbook
Why this matters
Most execution logic assumes your remaining parent size is known almost exactly in real time.
That assumption breaks in fragmented markets:
- child orders are sent to multiple venues,
- acknowledgements/fills arrive with different latencies,
- cancel confirms arrive even later,
- router state can be temporarily wrong about true residual.
When residual state is stale, the controller makes expensive choices:
- Under-estimates residual → slows down too early, then pays late catch-up convexity.
- Over-estimates residual → keeps pressing and risks overfill/self-competition churn.
- Ping-pongs urgency on noisy ACK updates, paying spread and queue-reset tax.
Execution slippage is then driven by state-estimation lag, not only market impact.
Core failure mechanism
Let:
- R_true(t): true remaining parent quantity,
- R_est(t): router-estimated remaining quantity,
- e_R(t) = R_est(t) - R_true(t): residual estimation error.
If ACK/fill/cancel events are delayed and heterogeneous across venues, e_R(t) becomes regime-dependent.
A simple dispersion driver:
D_ack(t) = p90(ack_latency_v,t) - p10(ack_latency_v,t)
for active venues v.
As D_ack grows, event-ordering ambiguity rises; residual control acts on stale state and branch losses become nonlinear near deadline windows.
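A minimal sketch of the dispersion driver, assuming each active venue is summarized by one recent ACK latency figure in milliseconds (for example a rolling median). The function names and the linear-interpolation percentile are illustrative choices, not part of any router API.

```python
def percentile(values, q):
    """Linear-interpolation percentile; q is a fraction in [0, 1]."""
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def ack_dispersion(latency_ms_by_venue):
    """D_ack: p90 minus p10 of per-venue ACK latency across active venues."""
    latencies = list(latency_ms_by_venue.values())
    return percentile(latencies, 0.90) - percentile(latencies, 0.10)


# Hypothetical venue labels and latencies, for illustration only.
snapshot = {"V1": 1.0, "V2": 2.0, "V3": 3.0, "V4": 4.0, "V5": 5.0}
d_ack = ack_dispersion(snapshot)
```

With only a handful of venues the p90/p10 spread is close to a max-minus-min range, so a production version would probably pool per-event latencies rather than one summary value per venue.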
Branch-cost decomposition
Expected cost can be decomposed as:
E[Cost] = C_base + C_underfill_catchup + C_overfill_unwind + C_churn + C_toxicity_miss
Where:
- C_base: normal spread/impact/fees under synchronized state,
- C_underfill_catchup: late aggressive completion when residual was under-estimated,
- C_overfill_unwind: liquidation/hedge correction cost after over-execution,
- C_churn: cancel/replace and queue-reset costs from oscillating urgency,
- C_toxicity_miss: wrong passive/aggressive mix chosen due to stale residual confidence.
The hidden tax is convex because mis-estimation interacts with time pressure.
Metric stack
1) ACK Dispersion Spread (ADS)
ADS = p90_ack_latency - p10_ack_latency across active venues (the same quantity as D_ack above).
- High ADS means asynchronous state truth.
2) Residual Error Band (REB)
Posterior interval width for the residual estimate, e.g. q90(R_true | events) - q10(R_true | events).
- Wide REB means controller should reduce brittle one-shot urgency shifts.
3) Deadline Catch-up Convexity (DCC)
Marginal expected cost per residual unit as time-to-deadline shrinks.
- Rising DCC means residual uncertainty is becoming expensive fast.
4) Overfill Pressure Index (OPI)
Probability-weighted overfill risk under current in-flight child orders and cancel-lag.
- High OPI means continued aggression can create unwind drag.
5) ACK Ordering Violation Rate (AOVR)
Rate of event sequences implying ambiguous or inverted local ordering for state updates.
- Elevated AOVR means event ingest rules are too naive for current latency regime.
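One way AOVR might be measured, sketched under the assumption that every inbound event carries a per-venue sequence number that should strictly increase in local arrival order. The `(venue, seq)` tuple shape is hypothetical; real feeds may only expose timestamps or per-order cumulative quantities.

```python
def ordering_violation_rate(events):
    """Fraction of events whose venue sequence number does not advance.

    events: iterable of (venue, seq) in local arrival order.
    A repeated or lower seq for the same venue counts as a violation.
    """
    last_seq = {}
    violations = 0
    total = 0
    for venue, seq in events:
        total += 1
        prev = last_seq.get(venue)
        if prev is not None and seq <= prev:
            violations += 1  # duplicate or inverted ordering
        else:
            last_seq[venue] = seq
    return violations / total if total else 0.0


# One inverted event (seq 2 after seq 3 on venue V1) out of four.
aovr = ordering_violation_rate([("V1", 1), ("V1", 3), ("V1", 2), ("V2", 1)])
```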
Control policy state machine
STATE 1 — SYNCHRONIZED
Conditions:
- low ADS,
- narrow REB,
- low AOVR.
Policy:
- standard residual-tracking controller,
- normal venue allocation and urgency schedule.
STATE 2 — DESYNC_WATCH
Conditions:
- ADS/REB rising,
- occasional ordering violations,
- moderate DCC.
Policy:
- downweight fast urgency flips,
- increase residual confidence threshold before large aggression jumps,
- cap simultaneous in-flight child notional.
STATE 3 — DESYNC_ACTIVE
Conditions:
- persistent high ADS/AOVR,
- wide REB,
- rising OPI or DCC.
Policy:
- switch to uncertainty-aware residual posterior control,
- prioritize venues with stable ACK behavior,
- constrain cancel/reprice churn and avoid oscillatory response.
STATE 4 — SAFE
Conditions:
- residual confidence collapse near deadline,
- combined overfill + catch-up risk exceeds budget.
Policy:
- halt non-essential micro-updates,
- execute pre-approved completion ladder with bounded overfill risk,
- prefer reliability over local spread optics.
Use asymmetric hysteresis to avoid state flapping.
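A minimal sketch of asymmetric hysteresis over the four states, assuming the ADS/REB/AOVR/OPI/DCC inputs have already been collapsed into a single stress score in [0, 1]. That score, the thresholds, and the calm-tick count are all hypothetical. Escalation is immediate. De-escalation moves one level at a time and only after the score has stayed below a lower exit level for several consecutive updates.

```python
from dataclasses import dataclass

STATES = ["SYNCHRONIZED", "DESYNC_WATCH", "DESYNC_ACTIVE", "SAFE"]


@dataclass
class DesyncStateMachine:
    enter: tuple = (0.3, 0.6, 0.9)  # score needed to enter states 1..3
    exit_margin: float = 0.1        # must fall this far below the enter level
    calm_ticks_required: int = 3    # consecutive calm updates before stepping down
    level: int = 0
    calm_ticks: int = 0

    def update(self, score: float) -> str:
        target = sum(score >= th for th in self.enter)
        if target > self.level:
            # Escalate at once, possibly skipping levels.
            self.level = target
            self.calm_ticks = 0
        elif self.level > 0 and score < self.enter[self.level - 1] - self.exit_margin:
            # Calm reading: step down only after enough of them in a row.
            self.calm_ticks += 1
            if self.calm_ticks >= self.calm_ticks_required:
                self.level -= 1
                self.calm_ticks = 0
        else:
            # Inside the hysteresis band: hold state, reset the calm counter.
            self.calm_ticks = 0
        return STATES[self.level]


machine = DesyncStateMachine()
trace = [machine.update(s) for s in (0.7, 0.55, 0.4, 0.4, 0.4)]
```

In this trace the machine jumps straight to DESYNC_ACTIVE at 0.7, holds through 0.55 because that reading is still inside the band, and drops to DESYNC_WATCH only on the third consecutive 0.4.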
Modeling pattern (production)
Residual posterior instead of point estimate
- maintain P(R_true | event stream, per-venue latency model),
- propagate uncertainty to tactic scoring, not just dashboarding.
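A rough Monte Carlo sketch of a residual posterior, assuming each in-flight child order is summarized by its open quantity and a probability that it has already filled but the fill has not yet been acknowledged. Both that probability (which would come from the per-venue latency model) and the all-or-nothing fill simplification are assumptions made for illustration.

```python
import random


def residual_posterior(confirmed_residual, in_flight, n_samples=2000, seed=0):
    """Sample R_true and return (q10, q90, REB width).

    confirmed_residual: residual implied by acknowledged events only.
    in_flight: list of (open_qty, p_filled_but_unacked) per child order.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Each child is treated as either fully filled or untouched.
        hidden_fills = sum(qty for qty, p in in_flight if rng.random() < p)
        samples.append(confirmed_residual - hidden_fills)
    samples.sort()
    q10 = samples[int(0.10 * (n_samples - 1))]
    q90 = samples[int(0.90 * (n_samples - 1))]
    return q10, q90, q90 - q10


# Hypothetical parent with 1000 confirmed residual and two unacked children.
q10, q90, reb = residual_posterior(1000, [(200, 0.5), (300, 0.2)])
```

The interval width returned here is one concrete reading of the REB metric; negative samples would correspond to overfill scenarios.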
Per-venue ACK latency regime model
- online estimates by venue × session segment × stress state,
- detect sudden latency skew shifts (not only mean drift).
Uncertainty-aware action objective
- optimize mean + q95 cost with explicit overfill and deadline penalties,
- penalize tactic sensitivity to residual uncertainty.
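The objective above could be sketched as follows, assuming each candidate tactic comes with sampled end-of-horizon residuals (negative meaning overfill) and a flat base cost. The penalty rates, tactic names, and the equal weighting of mean and q95 are hypothetical placeholders.

```python
def scenario_cost(base_cost, residual_after, overfill_penalty, deadline_penalty):
    """Cost of one scenario: base cost plus explicit overfill/deadline penalties."""
    overfill = max(0, -residual_after)
    unfinished = max(0, residual_after)
    return base_cost + overfill_penalty * overfill + deadline_penalty * unfinished


def tail_aware_score(costs, tail_weight=1.0):
    """Mean cost plus a weighted empirical 95th-percentile cost."""
    xs = sorted(costs)
    mean = sum(xs) / len(xs)
    q95 = xs[min(len(xs) - 1, (95 * len(xs)) // 100)]
    return mean + tail_weight * q95


def pick_tactic(residuals_by_tactic, base_cost_by_tactic,
                overfill_penalty=1.0, deadline_penalty=0.5):
    """Return (best tactic name, scores) where lower score is better."""
    scores = {}
    for name, residuals in residuals_by_tactic.items():
        costs = [
            scenario_cost(base_cost_by_tactic[name], r, overfill_penalty, deadline_penalty)
            for r in residuals
        ]
        scores[name] = tail_aware_score(costs)
    return min(scores, key=scores.get), scores


# "passive" is cheaper on average but leaves 100 units unfinished in 1 of 20 scenarios.
residuals = {"passive": [0] * 19 + [100], "steady": [0] * 20}
base_costs = {"passive": 1.0, "steady": 4.0}
best, scores = pick_tactic(residuals, base_costs)
```

Here the passive tactic has the lower mean cost (3.5 versus 4.0) but loses once the q95 term is added, which is the kind of sensitivity to residual uncertainty the objective is meant to penalize.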
Event-order robust ingestion contract
- idempotent apply rules,
- monotonic sequence/causality guards where available,
- reconciliation path for delayed cancel/fill confirms.
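A small sketch of an idempotent, order-robust apply rule, assuming fill reports carry a cumulative filled quantity per child order. That is a common report shape, but the field and class names here are illustrative. Keeping only the highest cumulative quantity seen per child makes duplicates and late-arriving stale reports harmless.

```python
class ResidualLedger:
    """Tracks parent residual from per-child cumulative fill reports."""

    def __init__(self, parent_qty):
        self.parent_qty = parent_qty
        self.cum_filled = {}  # child_id -> highest cumulative filled qty seen

    def apply_fill_report(self, child_id, cum_qty):
        """Apply a report; return False if it is a duplicate or stale."""
        prev = self.cum_filled.get(child_id, 0)
        if cum_qty <= prev:
            return False  # monotonic guard: never move cumulative qty backwards
        self.cum_filled[child_id] = cum_qty
        return True

    @property
    def residual(self):
        return self.parent_qty - sum(self.cum_filled.values())


ledger = ResidualLedger(parent_qty=1000)
ledger.apply_fill_report("c1", 100)
ledger.apply_fill_report("c1", 300)
stale_applied = ledger.apply_fill_report("c1", 100)  # late, out-of-order report
dup_applied = ledger.apply_fill_report("c1", 300)    # exact duplicate
```

This covers fills only; delayed cancel confirms would still need a separate reconciliation path as listed above.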
Counterfactual replay with latency perturbations
- inject ACK-latency shocks and ordering permutations into historical runs,
- verify state transitions and tail-cost protection before production promotion.
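A minimal sketch of the latency-perturbation step, assuming historical events are dicts with `venue` and `arrival_ms` keys (a hypothetical schema). It adds a per-venue shock plus optional jitter and re-sorts by perturbed arrival time, so the replayed controller sees a different event order.

```python
import random


def perturb_arrivals(events, shock_ms_by_venue, jitter_ms=0.0, seed=0):
    """Return events re-ordered by arrival time after injected ACK delays."""
    rng = random.Random(seed)
    perturbed = []
    for ev in events:
        delay = shock_ms_by_venue.get(ev["venue"], 0.0) + rng.uniform(0.0, jitter_ms)
        perturbed.append({**ev, "arrival_ms": ev["arrival_ms"] + delay})
    return sorted(perturbed, key=lambda e: e["arrival_ms"])


history = [
    {"venue": "V1", "kind": "fill", "arrival_ms": 10.0},
    {"venue": "V2", "kind": "cancel_ack", "arrival_ms": 12.0},
]
# A 5 ms shock on V1 makes its fill arrive after V2's cancel confirm.
replayed = perturb_arrivals(history, {"V1": 5.0})
order = [e["venue"] for e in replayed]
```

Feeding `replayed` through the same ingestion and state-machine code as production is what would actually exercise the transitions; this helper only produces the perturbed stream.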
Practical rollout checklist
- Implement ADS/REB/OPI/AOVR dashboards by strategy and venue bucket.
- Add DESYNC state machine with manual override + incident logging.
- Limit in-flight exposure as a function of REB and DCC.
- Canary in one strategy with strict q95 and overfill guardrails.
- Weekly attribution: split slippage into synchronized vs desynchronized windows.
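For the in-flight exposure item, one possible shape is a cap that shrinks as REB widens relative to parent size and as DCC rises relative to a reference level. The functional form, `dcc_ref`, and the floor fraction below are assumptions for illustration, not calibrated values.

```python
def in_flight_cap(base_cap, reb_width, parent_qty, dcc, dcc_ref, floor_frac=0.1):
    """Shrink the in-flight cap as residual uncertainty and deadline convexity grow.

    reb_width / parent_qty and dcc / dcc_ref are dimensionless stress ratios.
    The cap never falls below floor_frac * base_cap, so the order can still progress.
    """
    reb_ratio = max(0.0, reb_width) / parent_qty
    dcc_ratio = max(0.0, dcc) / dcc_ref
    cap = base_cap / (1.0 + reb_ratio + dcc_ratio)
    return max(cap, floor_frac * base_cap)


calm_cap = in_flight_cap(1000, reb_width=0, parent_qty=10_000, dcc=0.0, dcc_ref=1.0)
stressed_cap = in_flight_cap(1000, reb_width=5_000, parent_qty=10_000, dcc=0.5, dcc_ref=1.0)
```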
Bottom line
In fragmented execution, residual size is not a constant truth; it is a latency-conditioned belief state.
If your router acts on stale residual estimates as if they were exact, you quietly pay slippage through late catch-up, overfill unwind, and churn. Model ACK dispersion explicitly and let uncertainty drive control logic, especially when deadline convexity turns small state errors into expensive outcomes.