FIX Pending-Replace, Late-Fill & Replace-Reject State-Ambiguity Slippage Playbook

2026-04-07 · finance


Why this matters

A cancel/replace request feels like a harmless refinement:

But FIX does not make a replace logically atomic.

Between OrderCancelReplaceRequest (35=G) and an explicit ExecType=Replace, the order sits in a strange in-between world:

That means a naïve controller can end up living in incompatible realities at once:

  1. the strategy thinks the new price/size is already active,
  2. the venue is still matching the old order,
  3. and fills or rejects arrive that update cumulative quantity under the old state while the scheduler has already started planning from the new one.

That gap turns directly into slippage:

This is not a corner-case protocol curiosity. It is a repeatable execution regime whenever the desk actively reprices or resizes live passive flow.


Failure mode in one line

A replace request is treated as economically final before it is actually accepted, so late fills and chained replace rejects corrupt residual state and turn ordinary repricing into avoidable slippage.


Protocol facts that matter operationally

A few FIX semantics drive the whole failure pattern:

  1. Pending Replace does not mean replaced.
    It only means the replace request has been received and is being processed.

  2. Pending Replace has higher OrdStatus precedence than Partially Filled.
    So an order can be economically “still trading under old parameters” while the reported order status shows Pending Replace.

  3. Fills during the pending window must use the original order parameters.
    Until the order is explicitly confirmed as Replaced, execution reports for late fills should still refer to the old effective state.

  4. An order cannot be considered replaced until a real replace confirmation arrives.
    Local intent is not venue truth.

  5. Chained replace requests are allowed by the protocol, but counterparties may reject them.
    In practice, a second replace often gets bounced with “already in pending cancel or pending replace status.”

  6. Reject messages are state-bearing.
    A replace reject is not just an error response. It tells you which order state still survives after the failed modification.

The practical consequence is simple: if your engine updates live-price, leaves, or queue expectations on replace-send rather than replace-ack, it is manufacturing fake certainty.
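
As a minimal sketch of how these semantics might be separated at the parsing layer, the function below maps an incoming message to a local event. The tag numbers and values are standard FIX (35=MsgType, 150=ExecType, 39=OrdStatus, 434=CxlRejResponseTo). The dict-of-strings message shape and the event names are assumptions made for illustration, and ExecType=F assumes a FIX 4.3+ session.

```python
# Assumption: messages arrive already parsed into {tag: value} string dicts.
MSG_TYPE, ORD_STATUS, EXEC_TYPE, CXL_REJ_RESPONSE_TO = "35", "39", "150", "434"


def classify(msg: dict) -> str:
    """Map a parsed FIX message to a hypothetical local event name."""
    msg_type = msg.get(MSG_TYPE)
    if msg_type == "9":  # OrderCancelReject
        # 434=2 means the reject answers a cancel/replace request.
        if msg.get(CXL_REJ_RESPONSE_TO) == "2":
            return "REPLACE_REJECTED"
        return "CANCEL_REJECTED"
    if msg_type != "8":  # not an ExecutionReport
        return "IGNORED"
    exec_type = msg.get(EXEC_TYPE)
    if exec_type == "5":  # replace confirmed: the only real transition
        return "REPLACE_CONFIRMED"
    if exec_type == "E":  # received and being processed, nothing more
        return "PENDING_REPLACE"
    if exec_type == "F":  # trade
        # OrdStatus can read Pending Replace while the old order still trades.
        if msg.get(ORD_STATUS) == "E":
            return "FILL_WHILE_PENDING"
        return "FILL"
    return "OTHER"
```

The point of the split is that a fill carrying OrdStatus=E is still a fill, and a 35=9 reject is routed as a state event rather than dropped into an error log.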


Observable signatures

1) Replace bursts with no corresponding improvement in fill quality

2) Fills arrive “from the past”

3) Residual jumps after replace reject or late replace ack

4) Chained replace attempts fail in bunches

5) Queue-preserving intent turns into queue-reset reality

6) Slippage tails concentrate around repricing windows


Mechanical path to slippage

Step 1) The strategy decides the live order should change

Maybe the quote moved, microprice tilted, toxicity rose, or the parent schedule wants a smaller/larger resting leaf.

Step 2) The engine sends OrderCancelReplaceRequest (35=G)

Locally, many systems now start behaving as if the new price/size is already active.

Step 3) The old order is still economically alive

Until the broker/venue confirms ExecType=Replace, the original order can still trade.

Step 4) FIX reports a mixed state

Typical path:

Step 5) Local controller compresses a concurrent state machine into a linear story

Typical bad assumptions:

Step 6) The router pays the tax

That tax shows up as one or more of:


Core model

Define:

Then during the ambiguity window:

S_obs(t) = mix(S_des(t), S_ven(t), replayed_execs(t), reject_semantics(t))

R_obs(t) = R_true(t) + epsilon_fill(t) + epsilon_chain(t) + epsilon_replace_assumption(t)

where:

A practical decomposition of the slippage tax is:

IS_pending_replace ≈ stale_residual_cost + queue_reset_cost + reject_recovery_cost + catchup_cost + overfill_cleanup_cost

Interpretation:


State ambiguity taxonomy

A) Replace-is-final optimism

The engine assumes that sending the replace makes the new order economically active.

Risk: old-price fills arrive after the scheduler already planned from new-price exposure.

B) Pending-replace blindness

The system sees Pending Replace and stops thinking about the old order’s fill risk.

Risk: live queue exposure continues, but residual logic acts as though the leaf has already moved.

C) Original-parameter disbelief

Execution reports during the pending window use old parameters, as FIX intends, but the local parser or state machine treats them as anomalous.

Risk: real completion is undercounted; duplicate-like cleanup flow appears.

D) Optimistic chain drift

Sender chains rapid replaces optimistically; venue accepts/rejects on a slower or more pessimistic basis.

Risk: OrigClOrdID relationships diverge, and the local controller chases the wrong active version.

E) Reject-as-transport-error

A replace reject is logged but not folded into authoritative state.

Risk: the controller keeps acting on the requested new state even though the venue remained on the old one.

F) Queue-value amnesia

The system focuses on parametric correctness but forgets the opportunity cost of repricing churn.

Risk: tiny theoretical quote improvements create larger realized queue loss.


Feature set worth modeling

Replace-path features

Timing features

State-integrity features

Execution-impact features


Highest-risk situations

1) High-frequency passive repricing

If the strategy nudges quotes often, replace ambiguity becomes a control-loop property rather than a rare exception.

2) Tight-deadline schedules

A small residual mistake near the end of the parent schedule can force a large cleanup burst.

3) Size-increase replaces

Increasing quantity while already part-filled is especially dangerous because the order may receive more fills before the new size is formally active.

4) Multi-venue routing with central residual control

One venue’s pending-replace ambiguity contaminates parent residual logic across all venues.

5) Counterparties that reject replace chaining

Fast local repricing logic collides with slower venue semantics and creates repeated chain breakage.

6) Queue-sensitive books

When queue value is large, unnecessary replace/retry loops cost more than the intended price improvement was ever worth.


Regime state machine

CLEAN

REPLACE_SENT

Trigger:

Actions:

PENDING_REPLACE_EXPOSED

Trigger:

Actions:

LATE_FILL_ON_OLD_STATE

Trigger:

Actions:

REPLACE_REJECT_RECOVERY

Trigger:

Actions:

REPLACED_CONFIRMED

Trigger:

Actions:

SAFE_CONTAIN

Trigger:

Actions:


Control rules that actually help

1) Separate desired state from effective state

Keep two ledgers: one for desired order parameters, and one for confirmed effective order parameters.

Never let desired state overwrite confirmed state.

2) Treat replace-send as a request, not a transition

The state transition is ExecType=Replace, not the local API call.

3) Credit all late fills immediately, even if they look temporally awkward

If the venue says the old order filled while replace was pending, that fill is real. Residual logic must absorb it before taking more risk.
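
Rules 1 to 3 can be sketched together. The class and method names below are hypothetical, and the sketch assumes the replace quantity is a total order quantity (so leaves is confirmed quantity minus cumulative fills). It is an illustration under those assumptions, not a full order model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Params:
    price: float
    qty: int  # total order quantity, not remaining quantity


@dataclass
class Leaf:
    confirmed: Params                 # what the venue last acknowledged
    desired: Optional[Params] = None  # requested, not yet acknowledged
    cum_qty: int = 0

    def send_replace(self, new: Params) -> None:
        # A request only: the confirmed ledger is untouched.
        self.desired = new

    def on_fill(self, last_qty: int) -> None:
        # Late fills are credited immediately, whatever the replace status.
        self.cum_qty += last_qty

    def on_replace_ack(self) -> None:
        # Only a venue confirmation promotes desired into confirmed.
        if self.desired is not None:
            self.confirmed, self.desired = self.desired, None

    def on_replace_reject(self) -> None:
        # The old order survives; drop the unfulfilled intent.
        self.desired = None

    def leaves(self) -> int:
        # Residual is always computed from confirmed parameters.
        return max(self.confirmed.qty - self.cum_qty, 0)
```

For example, with a confirmed quantity of 100, a replace to 150 in flight, and a late fill of 40, leaves() returns 60 until the replace is acknowledged and 110 afterwards.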

4) Add chain-integrity checks around OrigClOrdID

If the local optimistic chain and the venue-accepted chain diverge, stop repricing reflexively and reconcile first.
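
One way to express that check, as a sketch: rebuild the venue-accepted lineage from accepted (OrigClOrdID, ClOrdID) pairs and refuse to send a replace whose OrigClOrdID is not the venue-confirmed head. The function names and the list-of-pairs input are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def venue_head(root: str, accepted: List[Tuple[str, str]]) -> str:
    """Follow accepted (OrigClOrdID, ClOrdID) links from root to the live ClOrdID."""
    next_id: Dict[str, str] = dict(accepted)
    head = root
    seen = {root}
    while head in next_id:
        head = next_id[head]
        if head in seen:
            raise ValueError("cycle in accepted replace chain")
        seen.add(head)
    return head


def replace_targets_live_order(
    orig_cl_ord_id: str, root: str, accepted: List[Tuple[str, str]]
) -> bool:
    """True only if the replace would reference the venue-confirmed head."""
    return orig_cl_ord_id == venue_head(root, accepted)
```

Suppose order A was replaced by B (accepted), and a further replace C was sent and then rejected. A new replace that names C as its OrigClOrdID fails the check; one that names B passes.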

5) Penalize urgency when residual uncertainty is high

Uncertain residuals should lower confidence, not trigger aggressive completion.
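
A deliberately simple sketch of that idea follows. The linear confidence formula and both function names are assumptions chosen for illustration, not a calibrated model: confidence falls as the quantity tied up in unresolved replaces grows relative to the residual.

```python
def residual_confidence(residual_qty: int, ambiguous_qty: int) -> float:
    """1.0 with nothing in flight; 0.0 once in-flight quantity covers the residual."""
    if residual_qty <= 0:
        return 0.0
    return max(0.0, 1.0 - ambiguous_qty / residual_qty)


def scaled_urgency(base_urgency: float, residual_qty: int, ambiguous_qty: int) -> float:
    """Shrink the scheduler's urgency by residual confidence; never amplify it."""
    return base_urgency * residual_confidence(residual_qty, ambiguous_qty)
```

The design choice that matters is the direction: ambiguity can only reduce urgency, so an uncertain residual never triggers a catch-up burst by itself.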

6) Put a hard budget on replace churn

Tiny quote improvements should not be allowed to create endless pending / reject / retry loops.
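
A sliding-window counter is one possible form for such a budget. The class name, the parameters, and the policy of refusing to chain onto a pending replace are all assumptions here; a venue that handles chaining well might justify a looser rule.

```python
from collections import deque


class ReplaceBudget:
    """Allow at most max_replaces sends per window_s seconds, per leaf."""

    def __init__(self, max_replaces: int, window_s: float) -> None:
        self.max_replaces = max_replaces
        self.window_s = window_s
        self._sent = deque()  # send timestamps, oldest first

    def allow(self, now: float, pending: bool) -> bool:
        # Assumed policy: never chain onto an unresolved replace.
        if pending:
            return False
        # Drop sends that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) >= self.max_replaces:
            return False
        self._sent.append(now)
        return True
```

A refused request costs nothing at the venue: the old order keeps its queue position while the strategy waits for the window to clear.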

7) Attribute pending-replace slippage separately in TCA

Otherwise the desk blames volatility, spread, or venue toxicity for a state-machine bug.
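
A first-pass attribution can be as simple as tagging fills by whether they landed inside a pending-replace window. The sketch below assumes fills are (timestamp, quantity, slippage per share) tuples and windows are (replace send time, resolution time) pairs; both shapes and the function name are hypothetical. It does not capture downstream catch-up cost, which needs a longer attribution horizon.

```python
from typing import List, Tuple

Fill = Tuple[float, int, float]  # (timestamp, qty, slippage per share)
Window = Tuple[float, float]     # (replace send time, ack or reject time)


def split_slippage(fills: List[Fill], windows: List[Window]) -> Tuple[float, float]:
    """Return (pending_replace_cost, other_cost) in currency units."""
    pending_cost = 0.0
    other_cost = 0.0
    for ts, qty, slip in fills:
        in_window = any(start <= ts <= end for start, end in windows)
        if in_window:
            pending_cost += qty * slip
        else:
            other_cost += qty * slip
    return pending_cost, other_cost
```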


TCA / KPI layer

Track these explicitly:

Segment by:


Validation approach

Replay / backtest questions

  1. After a replace request, how often did the order still receive fills under the old parameters?
  2. How often did the strategy behave as though the new state was already live before actual replace confirmation?
  3. What fraction of replace bursts created no measurable improvement in fill quality but did create queue loss?
  4. How much catch-up flow would disappear if residuals were recalculated from authoritative late fills earlier?
  5. How much tail slippage clusters around pending-replace dwell and replace-reject episodes?

Shadow-mode checks

Failure-injection drills

Simulate:

If the engine only behaves correctly when replace is immediate and clean, it is not production-ready.


Common anti-patterns


Minimal implementation sketch

A robust controller usually needs:

  1. authoritative per-leaf replace state machine

    • WORKING -> REPLACE_SENT -> PENDING_REPLACE -> {REPLACED_CONFIRMED, REPLACE_REJECT_RECOVERY}
    • with fills allowed to land under original parameters throughout the pending window
  2. dual-ledger state model

    • desired order parameters
    • confirmed effective order parameters
  3. chain-integrity validator

    • reconstruct active lineage from ClOrdID, OrigClOrdID, reject semantics, and accepted replace events
  4. ambiguity-aware scheduler

    • slows replace cadence and suppresses urgency when residual confidence falls
  5. pending-replace ledger

    • stores send time, pending time, final outcome, late-fill quantity, and chain-depth information
  6. TCA attribution hooks

    • assigns realized cost to pending-replace ambiguity instead of generic market conditions
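
The per-leaf state machine in item 1 might be sketched as a transition table. The state names come from the list above. The event names, the direct REPLACE_SENT to confirmed/reject edges (for venues that skip the pending report), and the "reconciled" event that returns to WORKING are assumptions added to make the sketch complete.

```python
from enum import Enum, auto


class LeafState(Enum):
    WORKING = auto()
    REPLACE_SENT = auto()
    PENDING_REPLACE = auto()
    REPLACED_CONFIRMED = auto()
    REPLACE_REJECT_RECOVERY = auto()


S = LeafState
TRANSITIONS = {
    (S.WORKING, "send_replace"): S.REPLACE_SENT,
    (S.REPLACE_SENT, "pending_replace"): S.PENDING_REPLACE,
    # Assumed: some venues answer without an intermediate pending report.
    (S.REPLACE_SENT, "replace_ack"): S.REPLACED_CONFIRMED,
    (S.REPLACE_SENT, "replace_reject"): S.REPLACE_REJECT_RECOVERY,
    (S.PENDING_REPLACE, "replace_ack"): S.REPLACED_CONFIRMED,
    (S.PENDING_REPLACE, "replace_reject"): S.REPLACE_REJECT_RECOVERY,
    # Assumed: an explicit reconciliation step before normal working resumes.
    (S.REPLACED_CONFIRMED, "reconciled"): S.WORKING,
    (S.REPLACE_REJECT_RECOVERY, "reconciled"): S.WORKING,
}


def step(state: LeafState, event: str) -> LeafState:
    """Advance the leaf state; fills are legal everywhere and never move it."""
    if event == "fill":
        return state
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal event {event!r} in state {state.name}")
```

Treating a fill as a legal no-op in every state is what lets fills land under original parameters throughout the pending window without the state machine flagging them as anomalies.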

Bottom line

Replace logic is not cosmetic.

A price/size update that feels instantaneous in the strategy can remain economically old at the venue for longer than you think. During that window, fills still belong to the original order, chained modifies may be rejected, and FIX is explicit that Pending Replace is not replacement finality.

If the execution stack treats replace intent as venue truth, it quietly converts protocol ambiguity into queue loss, false urgency, and cleanup flow.

The fix is not “replace faster.”

The fix is to model pending replace as a first-class uncertainty regime, keep desired state separate from effective state, and only let venue-confirmed transitions rewrite the book.

