Drop-Copy Lag & Phantom Residual Slippage Playbook
Why this matters
Some execution stacks treat drop-copy as the source of truth for fills/positions during live slicing. When drop-copy lags (or arrives out of order), the router can hallucinate residual quantity that is already filled, then submit extra child orders.
That creates a hidden slippage loop:
- perceived underfill ->
- unnecessary catch-up aggression ->
- overfill unwind / hedge churn ->
- extra spread + impact + fee drag.
This is not classic market impact. It is a state-consistency tax.
Failure pattern (execution branch view)
At each decision point for parent order P:
- Q_true: actual remaining quantity
- Q_seen: remaining quantity inferred from delayed drop-copy
- delta = Q_seen - Q_true
If delta > 0, the strategy believes it is underfilled and over-trades by up to delta.
Expected incremental cost from hallucinated residual:
E[ExtraCost] = E[ delta * (spread + impact + urgency_penalty) ] + E[unwind_cost]
The worst tails appear near deadlines (close, auction cutoff, schedule end), where urgency multipliers are steep.
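As a rough illustration, the expected-cost expression can be evaluated over a sample of decision points. This is a minimal sketch, not a production cost model: the `DecisionPoint` record and its field names are hypothetical, per-share costs are assumed to share one currency unit, and delta is clamped at zero on the assumption that only positive divergence drives over-trading.

```python
from dataclasses import dataclass


@dataclass
class DecisionPoint:
    """One decision point for a parent order (hypothetical record layout)."""
    q_seen: float           # residual inferred from drop-copy
    q_true: float           # actual residual
    spread: float           # per-share spread cost
    impact: float           # per-share impact cost
    urgency_penalty: float  # per-share urgency cost
    unwind_cost: float      # total cost of unwinding any overfill


def extra_cost(dp: DecisionPoint) -> float:
    """Cost attributable to the hallucinated residual at one decision point."""
    # Assumption: only positive divergence (phantom residual) causes over-trading.
    phantom = max(dp.q_seen - dp.q_true, 0.0)
    per_share = dp.spread + dp.impact + dp.urgency_penalty
    return phantom * per_share + dp.unwind_cost


def expected_extra_cost(points: list) -> float:
    """Sample mean of extra_cost, used as an estimate of E[ExtraCost]."""
    if not points:
        return 0.0
    return sum(extra_cost(p) for p in points) / len(points)
```

Because urgency_penalty enters per share, the same delta costs more when the decision point sits close to a deadline.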
Core metrics
1) Drop-Copy Delay Quantile (DCDQ)
DCDQ_p95 = p95( t_dropcopy_fill - t_true_fill )
Track by venue, symbol-liquidity bucket, and time-of-day regime.
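One way to compute DCDQ per bucket is sketched below. The bucket key, the tuple layout, and the nearest-rank percentile convention are assumptions chosen for illustration; any consistent quantile estimator would serve.

```python
import math
from collections import defaultdict


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def dcdq_by_bucket(fills, pct=95.0):
    """Compute the drop-copy delay quantile per bucket.

    fills: iterable of (bucket_key, t_true_fill, t_dropcopy_fill), where the
    bucket key might be (venue, liquidity_bucket, time_of_day_regime) and the
    timestamps share one clock and unit.
    """
    delays = defaultdict(list)
    for bucket, t_true, t_dropcopy in fills:
        delays[bucket].append(t_dropcopy - t_true)
    return {bucket: percentile_nearest_rank(d, pct) for bucket, d in delays.items()}
```

The delay is only meaningful if both timestamps come from synchronized clocks, so clock skew between the gateway and the drop-copy consumer should be bounded before trusting the quantile.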
2) Phantom Residual Ratio (PRR)
PRR = sum(max(Q_seen - Q_true, 0)) / sum(parent_qty)
Measures how much parent size was falsely considered "remaining".
3) Catch-up Overshoot Ratio (COR)
COR = sum(overfill_qty_due_to_state_lag) / sum(parent_qty)
Quantifies over-trading induced by stale execution state.
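PRR and COR can be computed from the same per-parent records. The sketch below assumes one summary record per parent order; the `ParentSummary` type and its fields are hypothetical, and attributing overfill to state lag is taken as already done upstream.

```python
from dataclasses import dataclass


@dataclass
class ParentSummary:
    """Per-parent summary (hypothetical layout)."""
    parent_qty: float
    q_seen: float        # residual the strategy believed it had
    q_true: float        # residual it actually had
    overfill_qty: float  # overfill attributed to stale state


def phantom_residual_ratio(parents):
    """PRR = sum(max(Q_seen - Q_true, 0)) / sum(parent_qty)."""
    total = sum(p.parent_qty for p in parents)
    if total <= 0:
        raise ValueError("total parent quantity must be positive")
    phantom = sum(max(p.q_seen - p.q_true, 0.0) for p in parents)
    return phantom / total


def catchup_overshoot_ratio(parents):
    """COR = sum(overfill_qty_due_to_state_lag) / sum(parent_qty)."""
    total = sum(p.parent_qty for p in parents)
    if total <= 0:
        raise ValueError("total parent quantity must be positive")
    return sum(p.overfill_qty for p in parents) / total
```

A PRR well above COR suggests the controller saw phantom residual but mostly did not act on it; the two converging suggests stale state is translating directly into over-trading.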
4) Reconciliation Shock Cost (RSC)
RSC = unwind_cost + forced_hedge_adjustment_cost
Use arrival-to-unwind and markout-adjusted accounting.
5) State Divergence Half-life (SDH)
Median time from divergence detection (Q_seen != Q_true) to restored consistency.
Modeling framework
A) Two-layer state model
- Latent true state from exchange/order-gateway ACK + execution reports
- Observed state from drop-copy stream used by strategy
Model P(divergence | latency, venue, load, message_rate, reconnect_state) and E[cost | divergence].
B) Tail-first objective
Do not optimize only mean bps.
Use:
J = E[slippage] + lambda1 * q95(slippage) + lambda2 * RSC
The q95 and RSC terms are there because divergence events are sparse but expensive, so a mean-only objective barely registers them.
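The objective J can be evaluated on a sample of per-order slippage as sketched here. The function name, the nearest-rank q95 estimator, and the assumption that slippage and RSC are expressed in the same unit (for example bps) are illustrative choices, not part of the playbook.

```python
import math


def tail_first_objective(slippage, rsc, lambda1, lambda2):
    """J = E[slippage] + lambda1 * q95(slippage) + lambda2 * RSC.

    slippage: non-empty list of per-order slippage values.
    rsc: reconciliation shock cost in the same unit as slippage.
    """
    if not slippage:
        raise ValueError("slippage sample must be non-empty")
    mean = sum(slippage) / len(slippage)
    ordered = sorted(slippage)
    # Nearest-rank 95th percentile (one of several valid conventions).
    rank = max(1, math.ceil(95 * len(ordered) / 100))
    q95 = ordered[rank - 1]
    return mean + lambda1 * q95 + lambda2 * rsc
```

Raising lambda1 or lambda2 makes a candidate policy pay more for rare divergence episodes, even when its mean slippage looks competitive.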
C) Deadline interaction term
Include time_to_deadline interaction:
cost ~ divergence * f(time_to_deadline)
This captures the convex urgency tax paid when stale state persists close to the deadline.
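The playbook does not fix a functional form for f. Purely as an illustration, the sketch below assumes an exponential form that is convex and grows as the deadline approaches; `alpha` and `tau_s` are hypothetical parameters that would need to be fitted.

```python
import math


def deadline_multiplier(time_to_deadline_s, alpha=3.0, tau_s=600.0):
    """Hypothetical f(time_to_deadline): near 1 far from the deadline, 1 + alpha at it."""
    t = max(time_to_deadline_s, 0.0)
    return 1.0 + alpha * math.exp(-t / tau_s)


def divergence_cost(divergence_qty, base_cost_per_share, time_to_deadline_s):
    """cost ~ divergence * f(time_to_deadline), with a flat per-share base cost."""
    return divergence_qty * base_cost_per_share * deadline_multiplier(time_to_deadline_s)
```

In a regression setting the same idea would appear as an interaction feature, divergence multiplied by a transform of time to deadline, rather than a hand-set multiplier.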
Execution controller (state machine)
STATE 1: SYNCED
Criteria:
- DCDQ p95 below threshold
- PRR low and stable
Policy:
- Normal slicing logic
- Full tactic set allowed
STATE 2: UNCERTAIN
Criteria:
- DCDQ rising, intermittent divergence
Policy:
- Reduce max child size
- Increase minimum inter-dispatch gap
- Require dual confirmation for aggressive catch-up
STATE 3: DEGRADED
Criteria:
- Persistent divergence or reconnect burst
- PRR/COR breach
Policy:
- Freeze non-essential repricing churn
- Prefer queue-preserving amend over cancel/replace
- Cap urgency escalation slope
- Enforce reconciliation window before large sweeps
STATE 4: SAFE_RECONCILE
Criteria:
- Severe divergence + deadline stress
Policy:
- Stop autonomous catch-up
- Run deterministic reconciliation snapshot
- Resume only after state lock restored
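The four states above can be sketched as a pure classification function over a metrics snapshot. Every threshold value and field name here is a placeholder, and "DCDQ rising" is approximated by a warning threshold rather than a trend test, which is a simplification.

```python
from dataclasses import dataclass
from enum import Enum


class SyncState(Enum):
    SYNCED = 1
    UNCERTAIN = 2
    DEGRADED = 3
    SAFE_RECONCILE = 4


@dataclass(frozen=True)
class Thresholds:
    """Placeholder limits; real values would come from calibration."""
    dcdq_p95_warn_s: float = 0.25
    prr_breach: float = 0.02
    cor_breach: float = 0.01
    prr_severe: float = 0.10
    deadline_stress_s: float = 300.0


@dataclass(frozen=True)
class Snapshot:
    dcdq_p95_s: float
    prr: float
    cor: float
    time_to_deadline_s: float
    intermittent_divergence: bool = False
    persistent_divergence: bool = False
    reconnect_burst: bool = False


def classify(s: Snapshot, th: Thresholds = Thresholds()) -> SyncState:
    """Map a snapshot to a controller state, checking the most severe state first."""
    if s.prr >= th.prr_severe and s.time_to_deadline_s <= th.deadline_stress_s:
        return SyncState.SAFE_RECONCILE
    if (s.persistent_divergence or s.reconnect_burst
            or s.prr >= th.prr_breach or s.cor >= th.cor_breach):
        return SyncState.DEGRADED
    if s.dcdq_p95_s >= th.dcdq_p95_warn_s or s.intermittent_divergence:
        return SyncState.UNCERTAIN
    return SyncState.SYNCED
```

Checking the most severe state first keeps the mapping deterministic when several criteria hold at once. A live controller would likely add hysteresis so the state does not flap around a threshold.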
Practical guardrails
Confidence-weighted residual
- Use Q_effective = w * Q_seen + (1-w) * Q_conservative
- Decrease w as DCDQ/PRR worsens.
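A minimal sketch of the confidence-weighted residual follows. The mapping from DCDQ and PRR to w is an assumption (a simple reciprocal penalty with hypothetical scale parameters), and Q_conservative is taken here to be a cautious lower estimate of the residual supplied by the caller.

```python
def confidence_weight(dcdq_p95_s, prr, dcdq_scale_s=0.5, prr_scale=0.02):
    """Hypothetical mapping to w in (0, 1]; w falls as DCDQ or PRR worsens."""
    penalty = max(dcdq_p95_s, 0.0) / dcdq_scale_s + max(prr, 0.0) / prr_scale
    return 1.0 / (1.0 + penalty)


def effective_residual(q_seen, q_conservative, w):
    """Q_effective = w * Q_seen + (1 - w) * Q_conservative."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * q_seen + (1.0 - w) * q_conservative
```

With w near 1 the controller trusts the drop-copy residual; as w shrinks, sizing decisions drift toward the conservative estimate and catch-up pressure falls automatically.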
Dual-feed sanity checks
- Compare gateway-native execution view vs drop-copy view.
- Trigger degraded state if disagreement persists beyond tolerance.
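The persistence rule can be sketched as a small monitor that tracks how long the two views have disagreed. The class name, the quantity tolerance, and the time tolerance are hypothetical; timestamps are assumed to be monotonic seconds.

```python
class DualFeedMonitor:
    """Flags degradation when gateway and drop-copy fills disagree for too long."""

    def __init__(self, qty_tolerance, max_disagreement_s):
        self.qty_tolerance = qty_tolerance
        self.max_disagreement_s = max_disagreement_s
        self._disagree_since = None  # time the current disagreement started

    def update(self, now_s, gateway_filled, dropcopy_filled):
        """Return True if the degraded state should be triggered."""
        if abs(gateway_filled - dropcopy_filled) <= self.qty_tolerance:
            self._disagree_since = None  # views agree, reset the timer
            return False
        if self._disagree_since is None:
            self._disagree_since = now_s
        return now_s - self._disagree_since >= self.max_disagreement_s
```

A short tolerance absorbs ordinary transit delay between the feeds, so only sustained disagreement escalates.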
Reconciliation cadence by regime
- Tighten cadence around open/close/auction windows.
Replayable incident packet
- Persist event timeline: send, ACK, fill, drop-copy arrival, decision.
- Needed for root-cause attribution and model recalibration.
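An incident packet might be persisted as an ordered event list serialized to JSON. The schema below (field names, nanosecond timestamps, a `note` field) is an illustrative assumption rather than a prescribed format.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class EventKind(str, Enum):
    SEND = "send"
    ACK = "ack"
    FILL = "fill"
    DROPCOPY_ARRIVAL = "dropcopy_arrival"
    DECISION = "decision"


@dataclass
class TimelineEvent:
    ts_ns: int       # event time in nanoseconds on a common clock
    kind: EventKind
    order_id: str
    qty: int = 0
    note: str = ""


@dataclass
class IncidentPacket:
    parent_id: str
    events: list = field(default_factory=list)

    def record(self, event: TimelineEvent) -> None:
        self.events.append(event)

    def to_json(self) -> str:
        """Serialize with events sorted by timestamp so replay is deterministic."""
        ordered = sorted(self.events, key=lambda e: e.ts_ns)
        rows = [{**asdict(e), "kind": e.kind.value} for e in ordered]
        return json.dumps({"parent_id": self.parent_id, "events": rows})
```

Recording the decision event alongside the feed events is what lets a replay show which residual the controller believed at the moment it acted.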
No blind urgency under uncertainty
- If state confidence falls, urgency multiplier must saturate (hard cap).
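The hard cap can be expressed in a few lines. The confidence floor and cap values below are placeholders, and `state_confidence` is assumed to be a score in [0, 1] produced elsewhere (for instance the weight w used for the residual).

```python
def capped_urgency(raw_multiplier, state_confidence,
                   confidence_floor=0.8, cap_when_uncertain=1.5):
    """Saturate the urgency multiplier whenever state confidence is below the floor."""
    if state_confidence < confidence_floor:
        return min(raw_multiplier, cap_when_uncertain)
    return raw_multiplier
```

The cap only binds when confidence is low, so normal deadline urgency is unaffected while the feeds agree.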
Validation plan
Offline
- Reconstruct true-vs-seen residual timelines from historical packets.
- Counterfactual replay: baseline controller vs divergence-aware controller.
- Compare mean, q95, q99 slippage and RSC.
Shadow live
- Run divergence-aware policy in observe-only mode.
- Emit would-have-acted decisions and projected cost deltas.
Canary
- Start with low ADV symbols and low participation caps.
- Auto-rollback if completion shortfall or q95 cost breaches policy bands.
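The auto-rollback rule reduces to a band check. The `PolicyBands` type and its limits are hypothetical; completion shortfall is assumed to be a fraction of parent quantity left unfilled and cost is assumed to be in bps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyBands:
    """Hypothetical canary limits."""
    max_completion_shortfall: float  # fraction of parent quantity unfilled
    max_q95_cost_bps: float


def should_rollback(completion_shortfall, q95_cost_bps, bands: PolicyBands) -> bool:
    """Roll back if either completion or tail cost leaves its policy band."""
    return (completion_shortfall > bands.max_completion_shortfall
            or q95_cost_bps > bands.max_q95_cost_bps)
```

Checking completion alongside cost guards against a controller that lowers tail slippage simply by trading less.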
Operator checklist
- DCDQ monitored per venue/session bucket
- PRR/COR/RSC dashboards with alert thresholds
- State machine transitions audited and explainable
- SAFE_RECONCILE tested in drills (not only incidents)
- Weekly recalibration includes divergence features
Bottom line
When fill state is delayed, execution can pay a phantom residual tax that looks like random market noise. Treat drop-copy lag as a first-class risk factor, model it explicitly, and bind urgency to state confidence.
That is how you reduce tail slippage without sacrificing completion discipline.