Parent-Child Desync Orphan-Child-Order Slippage Playbook

2026-03-15 · finance

Date: 2026-03-15
Category: research
Focus: Modeling and controlling slippage when parent schedulers lose authoritative state and orphan child orders keep trading.


1) Why this failure mode matters

Execution stacks usually assume one invariant:

parent state == live child-order state

In production, that invariant can fail during short network partitions, gateway restarts, drop-copy lag, or timeout-driven retries:

  1. parent believes child orders are canceled/expired,
  2. one or more child orders are still live at venue/broker,
  3. parent launches replacement flow,
  4. effective participation doubles (or worse),
  5. cleanup happens late and at worse prices.

This creates a slippage branch that often gets mislabeled as "market moved." In reality, it is state-desynchronization slippage.


2) Mechanism map

2.1 Desync entry points

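Common triggers are the ones named in section 1: short network partitions, gateway restarts, drop-copy lag, and timeout-driven retries.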

2.2 Orphan lifecycle

The expensive part is usually not the orphan itself, but the late discovery + urgent correction loop.


3) Cost decomposition

Let total execution cost be:

\[ C_{\text{total}} = C_{\text{base}} + C_{\text{overparticipation}} + C_{\text{collision}} + C_{\text{cleanup}} \]

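Where the base term is the cost the order would incur with parent and child state in sync, and the other three terms are the desync-attributable components that also make up the numerator of OST in section 5.4.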

Expected-value framing by state:

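\[ \mathbb{E}[C] = p_S C_S + p_D C_D \]

Here \(S\) is the synced state and \(D\) the desynced state, so \(p_D\) is the probability of a desync and \(C_D\) is the cost incurred given one. The goal is not only to lower \(p_D\) but also to shrink \(C_D\) through fast containment.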
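As a rough illustration of the two levers, the sketch below evaluates the expected-cost formula for a baseline, for halved incident probability, and for halved desync cost. It assumes a two-state model with p_S = 1 - p_D, and every number (probabilities and costs in basis points) is hypothetical, chosen only to show the arithmetic.

```python
def expected_cost(p_desync: float, cost_synced: float, cost_desync: float) -> float:
    """E[C] = p_S * C_S + p_D * C_D, assuming only two states so p_S = 1 - p_D."""
    if not 0.0 <= p_desync <= 1.0:
        raise ValueError("p_desync must be a probability")
    return (1.0 - p_desync) * cost_synced + p_desync * cost_desync


# Hypothetical inputs: costs in basis points, not taken from any real book.
baseline = expected_cost(p_desync=0.02, cost_synced=5.0, cost_desync=40.0)
fewer_incidents = expected_cost(p_desync=0.01, cost_synced=5.0, cost_desync=40.0)
faster_containment = expected_cost(p_desync=0.02, cost_synced=5.0, cost_desync=20.0)

print(f"baseline           {baseline:.2f} bps")
print(f"fewer incidents    {fewer_incidents:.2f} bps")
print(f"faster containment {faster_containment:.2f} bps")
```

With these made-up inputs, halving the desync cost helps about as much as halving the incident rate, which is why containment speed deserves its own target.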


4) Feature set for modeling

4.1 State-consistency features

4.2 Orphan-risk features

4.3 Market interaction features

Desync is most costly when containment occurs in thin/high-volatility intervals.


5) Operational metrics

5.1 OER — Orphan Exposure Ratio

\[ \mathrm{OER} = \frac{\text{unlinked\_live\_notional}}{\text{active\_parent\_notional} + \epsilon} \]

Direct measure of hidden live risk.
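A minimal sketch of how OER might be computed from a venue-side snapshot. It assumes that "unlinked" means a live child whose parent id is missing or does not match any parent the scheduler considers active; the `LiveChild` type and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping, Optional


@dataclass(frozen=True)
class LiveChild:
    order_id: str
    parent_id: Optional[str]  # None when the link to a parent is unknown
    notional: float           # open notional still live at the venue


def orphan_exposure_ratio(
    venue_live: Iterable[LiveChild],
    active_parent_notional: Mapping[str, float],
    eps: float = 1e-9,
) -> float:
    """OER = unlinked live notional / (active parent notional + eps)."""
    # Assumption: a child is unlinked if its parent id is absent or not active.
    unlinked = sum(
        c.notional for c in venue_live if c.parent_id not in active_parent_notional
    )
    return unlinked / (sum(active_parent_notional.values()) + eps)


snapshot = [
    LiveChild("c1", "P1", 100_000.0),  # linked to an active parent
    LiveChild("c2", None, 50_000.0),   # no parent link at all
    LiveChild("c3", "P9", 25_000.0),   # parent not known as active
]
oer = orphan_exposure_ratio(snapshot, {"P1": 300_000.0})
print(f"OER = {oer:.3f}")
```

The epsilon term only guards against division by zero when no parent is active; in that case any live child produces a very large OER, which is the desired alarm.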

5.2 RDL — Reconciliation Delay Lag

\[ \mathrm{RDL} = \mathrm{p95}\left(t_{\text{recognized by parent}} - t_{\text{live at venue}}\right) \]

How long the parent is blind to true state.
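One way to compute RDL is sketched below. It takes each lag as the parent's recognition time minus the venue's live time, so the value is nonnegative and matches "how long the parent is blind". The percentile uses the nearest-rank method, which is one of several reasonable p95 definitions; the function names and the timestamp pairs are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def p95_nearest_rank(values: Sequence[float]) -> float:
    """Nearest-rank 95th percentile: the value at rank ceil(0.95 * n)."""
    if not values:
        raise ValueError("need at least one observation")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def reconciliation_delay_lag(events: Iterable[Tuple[float, float]]) -> float:
    """events: (t_live_at_venue, t_recognized_by_parent) pairs in seconds."""
    lags = [recognized - live for live, recognized in events]
    return p95_nearest_rank(lags)


# Hypothetical sample: 20 state changes recognized 1..20 seconds late.
sample = [(0.0, float(k)) for k in range(1, 21)]
rdl = reconciliation_delay_lag(sample)
print(f"RDL (p95) = {rdl:.1f} s")
```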

5.3 DPP — Duplicate Participation Pressure

\[ \mathrm{DPP} = \frac{\text{realized\_participation}}{\text{target\_participation}} \]

DPP > 1 indicates overparticipation from desync/retry overlap.
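The sketch below shows how an orphan can push DPP toward 2. It assumes participation is measured as executed quantity over market volume in the same interval, and that orphan fills can be separated from fills the parent is tracking; the parameter names and quantities are hypothetical.

```python
def duplicate_participation_pressure(
    tracked_qty: float,
    orphan_qty: float,
    market_volume: float,
    target_participation: float,
) -> float:
    """DPP = realized participation / target participation."""
    if market_volume <= 0 or target_participation <= 0:
        raise ValueError("market_volume and target_participation must be positive")
    # Realized participation counts every fill, including ones the parent cannot see.
    realized = (tracked_qty + orphan_qty) / market_volume
    return realized / target_participation


# Hypothetical interval: 10% target, replacement flow on target, orphan still filling.
dpp = duplicate_participation_pressure(
    tracked_qty=1_000, orphan_qty=900, market_volume=10_000, target_participation=0.10
)
print(f"DPP = {dpp:.2f}")
```

Here the parent sees itself exactly on target (1,000 of 10,000 shares), while the true participation is nearly double.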

5.4 OST — Orphan Slippage Tax

\[ \mathrm{OST} = \frac{C_{\text{overparticipation}} + C_{\text{collision}} + C_{\text{cleanup}}}{\text{executed\_notional}} \]

Primary KPI for this regime.
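A small sketch tying OST back to the cost decomposition in section 3. The `CostBreakdown` type is hypothetical, costs are assumed to be in the same currency as executed notional, and the figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostBreakdown:
    base: float
    overparticipation: float
    collision: float
    cleanup: float

    @property
    def total(self) -> float:
        # C_total from the decomposition: base plus the three desync terms.
        return self.base + self.desync_attributable

    @property
    def desync_attributable(self) -> float:
        return self.overparticipation + self.collision + self.cleanup


def orphan_slippage_tax(costs: CostBreakdown, executed_notional: float) -> float:
    """OST as a fraction of executed notional (multiply by 1e4 for bps)."""
    if executed_notional <= 0:
        raise ValueError("executed_notional must be positive")
    return costs.desync_attributable / executed_notional


# Hypothetical incident: currency units on 1,000,000 of executed notional.
costs = CostBreakdown(base=500.0, overparticipation=120.0, collision=50.0, cleanup=80.0)
ost = orphan_slippage_tax(costs, executed_notional=1_000_000.0)
print(f"OST = {ost * 1e4:.2f} bps of {costs.total:.0f} total cost")
```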


6) State machine and controls

SYNCED

DESYNC_SUSPECT

Triggered when OER or RDL crosses watch threshold.

ORPHAN_CONTAINMENT

Triggered on confirmed orphan(s).

RECONCILED_RECOVERY
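The four states can be sketched as a small transition function. Only two triggers are stated above (OER or RDL crossing a watch threshold, and confirmed orphans); the exit from containment, the return to SYNCED, and the threshold values below are assumptions made for illustration, not a prescribed policy.

```python
from enum import Enum


class SyncState(Enum):
    SYNCED = "SYNCED"
    DESYNC_SUSPECT = "DESYNC_SUSPECT"
    ORPHAN_CONTAINMENT = "ORPHAN_CONTAINMENT"
    RECONCILED_RECOVERY = "RECONCILED_RECOVERY"


def next_state(
    state: SyncState,
    *,
    oer: float,
    rdl_s: float,
    confirmed_orphans: int,
    oer_watch: float = 0.05,   # hypothetical watch threshold
    rdl_watch_s: float = 2.0,  # hypothetical watch threshold, in seconds
) -> SyncState:
    # A confirmed orphan overrides every other signal.
    if confirmed_orphans > 0:
        return SyncState.ORPHAN_CONTAINMENT
    # Assumption: containment ends once no confirmed orphan remains.
    if state is SyncState.ORPHAN_CONTAINMENT:
        return SyncState.RECONCILED_RECOVERY
    # Watch thresholds on OER or RDL raise (or keep) suspicion.
    if oer > oer_watch or rdl_s > rdl_watch_s:
        return SyncState.DESYNC_SUSPECT
    # Assumption: clean metrics return the parent to SYNCED.
    return SyncState.SYNCED


state = SyncState.SYNCED
for obs in (
    {"oer": 0.10, "rdl_s": 0.5, "confirmed_orphans": 0},
    {"oer": 0.10, "rdl_s": 0.5, "confirmed_orphans": 1},
    {"oer": 0.00, "rdl_s": 0.5, "confirmed_orphans": 0},
    {"oer": 0.00, "rdl_s": 0.1, "confirmed_orphans": 0},
):
    state = next_state(state, **obs)
    print(state.value)
```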


7) Practical modeling approach

  1. Reconstruct truth timeline from OMS, gateway logs, broker reports, drop-copy.
  2. Label incidents (no_desync, suspect, confirmed_orphan, cleanup).
  3. Estimate hazard of orphan creation after timeout/reconnect/failover events.
  4. Simulate policies:
    • immediate retry,
    • retry with authoritative snapshot gate,
    • containment-first then relaunch.
  5. Evaluate tail outcomes (q95 OST, max DPP, incident duration).
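Step 5 can be sketched as a per-policy tail summary over simulated or replayed incidents. The `IncidentResult` record, the policy labels, and the sample values are hypothetical; q95 uses the nearest-rank method, and the simulation that would produce these rows is out of scope here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Sequence


@dataclass(frozen=True)
class IncidentResult:
    policy: str
    ost_bps: float
    dpp: float
    duration_s: float


def q95(values: Sequence[float]) -> float:
    """Nearest-rank 95th percentile."""
    ordered = sorted(values)
    return ordered[(95 * len(ordered) + 99) // 100 - 1]


def tail_summary(results: Iterable[IncidentResult]) -> Dict[str, Dict[str, float]]:
    """Group incidents by policy and report q95 OST, max DPP, longest incident."""
    by_policy: Dict[str, List[IncidentResult]] = defaultdict(list)
    for r in results:
        by_policy[r.policy].append(r)
    return {
        policy: {
            "q95_ost_bps": q95([r.ost_bps for r in rows]),
            "max_dpp": max(r.dpp for r in rows),
            "max_duration_s": max(r.duration_s for r in rows),
        }
        for policy, rows in by_policy.items()
    }


# Hypothetical outcomes for two of the policies listed above.
results = [
    IncidentResult("immediate_retry", 1.0, 1.0, 5.0),
    IncidentResult("immediate_retry", 2.0, 1.0, 8.0),
    IncidentResult("immediate_retry", 3.0, 1.2, 20.0),
    IncidentResult("immediate_retry", 40.0, 2.0, 90.0),
    IncidentResult("snapshot_gate", 1.0, 1.0, 6.0),
    IncidentResult("snapshot_gate", 2.0, 1.0, 9.0),
    IncidentResult("snapshot_gate", 3.0, 1.0, 15.0),
    IncidentResult("snapshot_gate", 6.0, 1.0, 30.0),
]
summary = tail_summary(results)
for policy, stats in summary.items():
    print(policy, stats)
```

Comparing tails rather than means matters here because a policy can look fine on average while one bad incident dominates its realized cost.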

8) 30-day rollout plan

Week 1 — Data contract + identifiers

Week 2 — Shadow detection

Week 3 — Containment policy pilot

Week 4 — Scale + runbooks


9) Common anti-patterns


10) Bottom line

When parent and live child state diverge, your execution stack can trade more than intended without noticing immediately.

That hidden overparticipation is preventable slippage. Model desync risk explicitly, monitor orphan exposure in real time, and enforce containment-first recovery. If you can reduce reconciliation lag and incident tail size, you cut both impact cost and operational surprises.