FIX Pending-Replace, Late-Fill & Replace-Reject State-Ambiguity Slippage Playbook

Why this matters

A cancel/replace request feels like a harmless refinement:

nudge the price by a tick,
resize the leaf,
update instructions,
keep working.

But FIX does not make a replace logically atomic.

Between OrderCancelReplaceRequest (35=G) and an explicit ExecType=Replace, the order sits in a strange in-between world:

the original order can still receive fills,
OrdStatus=Pending Replace has higher precedence than Partially Filled,
a second replace can be rejected because the first one is still pending,
and any fill arriving during the pending window must still describe the original order parameters.

That means a naïve controller can end up living in two incompatible realities at once:

the strategy thinks the new price/size is already active,
the venue is still matching the old order,
and fills or rejects arrive that update cumulative quantity under the old state while the scheduler has already started planning from the new one.

That gap turns directly into slippage:

false residual estimates,
replace-churn bursts,
accidental double aggression,
queue-rank destruction,
and catch-up flow fired from fake urgency.

This is not a corner-case protocol curiosity. It is a repeatable execution regime whenever the desk actively reprices or resizes live passive flow.

Failure mode in one line

A replace request is treated as economically final before it is actually accepted, so late fills and chained replace rejects corrupt residual state and turn ordinary repricing into avoidable slippage.

Protocol facts that matter operationally

A few FIX semantics drive the whole failure pattern:

Pending Replace does not mean replaced.
It only means the replace request has been received and is being processed.
Pending Replace has higher OrdStatus precedence than Partially Filled.
So an order can be economically “still trading under old parameters” while the reported order status shows Pending Replace.
Fills during the pending window must use the original order parameters.
Until the order is explicitly confirmed as Replaced, execution reports for late fills should still refer to the old effective state.
An order cannot be considered replaced until a real replace confirmation arrives.
Local intent is not venue truth.
Chained replace requests are allowed by the protocol, but counterparties may reject them.
In practice, a second replace often gets bounced with “already in pending cancel or pending replace status.”
Reject messages are state-bearing.
A replace reject is not just an error response. It tells you which order state still survives after the failed modification.

The practical consequence is simple: if your engine updates live-price, leaves, or queue expectations on replace-send rather than replace-ack, it is manufacturing fake certainty.
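
The precedence point can be made concrete with a small sketch. It assumes the relative OrdStatus ordering described in the FIX order-state-change matrices; the numeric ranks are recalled rather than quoted, so check them against the spec. Only the ordering matters for the illustration. The helper names `reported_ord_status` and `is_still_fillable` are hypothetical.

```python
# Relative precedence for a few OrdStatus (tag 39) values.
# Assumption: ranks are illustrative; only their ordering is relied on here.
ORD_STATUS_PRECEDENCE = {
    "6": 11,  # Pending Cancel
    "E": 10,  # Pending Replace
    "2": 7,   # Filled
    "1": 3,   # Partially Filled
    "0": 2,   # New
}

def reported_ord_status(concurrent_states):
    """Return the status that is reported when several states hold at once."""
    return max(concurrent_states, key=ORD_STATUS_PRECEDENCE.__getitem__)

def is_still_fillable(ord_status, leaves_qty):
    """Hypothetical helper: a non-terminal order with open quantity can still trade."""
    return ord_status in {"0", "1", "6", "E"} and leaves_qty > 0

# An order that is both partially filled and pending replace reports "E",
# which hides the partial fill. Fill state must come from CumQty/LeavesQty.
status = reported_ord_status({"1", "E"})
print(status, is_still_fillable(status, leaves_qty=600))  # E True
```

The design point is that OrdStatus is a summary, not a ledger: read fill progress from the quantity fields, never from the status code.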

Observable signatures

1) Replace bursts with no corresponding improvement in fill quality

OrderCancelReplaceRequest (35=G) volume rises.
Queue position deteriorates.
Markouts worsen instead of improving.
Venue exposure looks more chaotic than the scheduler intended.

2) Fills arrive “from the past”

A replace has already been sent.
Execution reports then land with old ClOrdID / old price / old leaves semantics.
Local logic treats them as stale, surprising, or duplicate-like.

3) Residual jumps after replace reject or late replace ack

Parent residual shrinks or expands suddenly after 35=9 or ExecType=Replace.
Scheduler emits catch-up or cleanup flow right after state reconciliation.
The damage clusters around active quote management periods.

4) Chained replace attempts fail in bunches

First replace goes Pending Replace.
Second replace is rejected because the order is still pending.
Local chain state drifts away from venue-accepted chain state.

5) Queue-preserving intent turns into queue-reset reality

Strategy wanted a tidy amendment loop.
Instead it created a noisy sequence of replace, reject, retry, and re-entry.
Passive edge decays even when spread and book shape looked favorable.

6) Slippage tails concentrate around repricing windows

Mean fill price may look acceptable.
p95 / p99 implementation shortfall worsens sharply during fast repricing regimes.
TCA attributes the pain to “volatility” unless replace-state ambiguity is measured explicitly.

Mechanical path to slippage

Step 1) The strategy decides the live order should change

Maybe the quote moved, microprice tilted, toxicity rose, or the parent schedule wants a smaller/larger resting leaf.

Step 2) The engine sends `OrderCancelReplaceRequest (35=G)`

Locally, many systems now start behaving as if the new price/size is already active.

Step 3) The old order is still economically alive

Until the broker/venue confirms ExecType=Replace, the original order can still trade.

Step 4) FIX reports a mixed state

Typical path:

original order partially fills,
Pending Replace arrives,
another fill lands using original parameters,
then the replace is either accepted or rejected,
while a second replace may also be rejected for chaining too early.

Step 5) Local controller compresses a concurrent state machine into a linear story

Typical bad assumptions:

“replace sent” means new price is already working,
Pending Replace means old queue exposure is gone,
late fill under old parameters is stale noise,
rejected chained replace means “just retry” without full reconciliation,
or new desired leaves can overwrite venue-confirmed leaves.

Step 6) The router pays the tax

That tax shows up as one or more of:

over-cancel / over-replace churn,
catch-up aggression after fake underfill,
accidental overfill after uncredited late fills,
passive queue loss from unnecessary re-entry,
or false hedge adjustments tied to incorrect residual state.

Core model

Define:

S_des(t): desired order state after local repricing logic
S_ven(t): venue-effective order state
R_true(t): true parent residual
R_obs(t): locally observed residual during replace ambiguity
P_rep(t): probability the replace is still pending, not final
F_old(t): fills arriving under original order parameters after replace-send
C_chain(t): chained replace depth not yet authoritatively accepted
Q_loss(t): queue-priority loss caused by unnecessary replace/retry/re-entry flow

Then, during the ambiguity window, the locally observed order state S_obs(t) and the locally observed residual R_obs(t) are:

S_obs(t) = mix(S_des(t), S_ven(t), replayed_execs(t), reject_semantics(t))

R_obs(t) = R_true(t) + epsilon_fill(t) + epsilon_chain(t) + epsilon_replace_assumption(t)

where:

epsilon_fill(t) comes from late fills under old parameters,
epsilon_chain(t) comes from optimistic replace chaining not matched by venue acceptance,
epsilon_replace_assumption(t) comes from treating replace-send as replace-finality.
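
A worked example of the residual identity, with made-up quantities, may help. It isolates epsilon_fill only (the other two error terms are set to zero) and shows how an uncredited late fill becomes an overfill if the controller completes from R_obs.

```python
# Illustrative numbers only; none of these come from real flow.
parent_qty = 10_000
filled_before_replace = 2_000   # credited before 35=G was sent
late_fill_old_params = 500      # fill on the original order during the pending window

# True residual credits every venue-reported fill.
r_true = parent_qty - filled_before_replace - late_fill_old_params

# A controller that discards the late fill as "stale" carries it as error.
epsilon_fill = late_fill_old_params
r_obs = r_true + epsilon_fill

# Completing the schedule from r_obs overshoots by the uncredited quantity.
overfill_if_completed = r_obs - r_true
print(r_true, r_obs, overfill_if_completed)  # 7500 8000 500
```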

A practical decomposition of the slippage tax is:

IS_pending_replace ≈ stale_residual_cost + queue_reset_cost + reject_recovery_cost + catchup_cost + overfill_cleanup_cost

Interpretation:

stale_residual_cost: you mis-size the remaining work,
queue_reset_cost: you burn priority through unnecessary replace/retry behavior,
reject_recovery_cost: you spend spread and time recovering from failed chain assumptions,
catchup_cost: fake lateness drives excess urgency,
overfill_cleanup_cost: late credited fills force cleanup or hedge unwind.

State ambiguity taxonomy

A) Replace-is-final optimism

The engine assumes that sending the replace makes the new order economically active.

Risk: old-price fills arrive after the scheduler already planned from new-price exposure.

B) Pending-replace blindness

The system sees Pending Replace and stops thinking about the old order’s fill risk.

Risk: live queue exposure continues, but residual logic acts as though the leaf has already moved.

C) Original-parameter disbelief

Execution reports during the pending window use old parameters, as FIX intends, but the local parser or state machine treats them as anomalous.

Risk: real completion is undercounted; duplicate-like cleanup flow appears.

D) Optimistic chain drift

Sender chains rapid replaces optimistically; venue accepts/rejects on a slower or more pessimistic basis.

Risk: OrigClOrdID relationships diverge, and the local controller chases the wrong active version.

E) Reject-as-transport-error

A replace reject is logged but not folded into authoritative state.

Risk: the controller keeps acting on the requested new state even though the venue remained on the old one.

F) Queue-value amnesia

The system focuses on parametric correctness but forgets the opportunity cost of repricing churn.

Risk: tiny theoretical quote improvements create larger realized queue loss.

Feature set worth modeling

Replace-path features

replace_req_count_1s
pending_replace_count
replace_reject_count_1m
replace_accept_rate
chained_replace_depth
second_replace_reject_rate
already_pending_replace_reject_rate

Timing features

replace_to_pending_ms
replace_to_final_ack_ms
replace_to_reject_ms
pending_replace_dwell_ms
replace_send_to_old_fill_ms_p50/p95
replace_chain_resolution_ms

State-integrity features

late_fill_under_old_params_qty
late_fill_under_old_params_rate
desired_vs_effective_price_gap_ticks
desired_vs_effective_open_qty_gap
origclordid_chain_mismatch_count
local_vs_venue_residual_gap

Execution-impact features

replace_retry_burst_factor
post_replace_reject_markout_1s_5s_30s
queue_reset_bps_estimate
catchup_qty_after_reconcile
cleanup_qty_after_late_fill
completion_deficit_after_replace_window_pct
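
A few of these features can be derived from a time-ordered event log for one leaf. The sketch below assumes a hypothetical event format (dicts with `kind`, `t_ms`, and `qty` for fills) and at most one replace in flight at a time; it is not tied to any particular FIX engine.

```python
def replace_features(events):
    """Compute a handful of replace-path features from time-ordered events."""
    in_flight_since = None   # send time of the unresolved replace, if any
    sent = acked = rejected = 0
    late_fill_qty = 0
    resolve_ms = []
    for ev in events:
        kind = ev["kind"]
        if kind == "replace_sent":
            sent += 1
            in_flight_since = ev["t_ms"]
        elif kind == "fill":
            # A fill while a replace is unresolved still belongs to the old parameters.
            if in_flight_since is not None:
                late_fill_qty += ev["qty"]
        elif kind in ("replace_ack", "replace_reject"):
            if kind == "replace_ack":
                acked += 1
            else:
                rejected += 1
            if in_flight_since is not None:
                resolve_ms.append(ev["t_ms"] - in_flight_since)
                in_flight_since = None
    return {
        "replace_req_count": sent,
        "replace_accept_rate": acked / sent if sent else None,
        "replace_reject_count": rejected,
        "late_fill_under_old_params_qty": late_fill_qty,
        "replace_to_final_ms_max": max(resolve_ms, default=None),
    }

log = [
    {"kind": "replace_sent", "t_ms": 0},
    {"kind": "pending_replace", "t_ms": 3},
    {"kind": "fill", "t_ms": 7, "qty": 200},    # late fill under old parameters
    {"kind": "replace_ack", "t_ms": 12},
    {"kind": "fill", "t_ms": 20, "qty": 100},   # fill under the new parameters
    {"kind": "replace_sent", "t_ms": 30},
    {"kind": "replace_reject", "t_ms": 34},
]
features = replace_features(log)
```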

Highest-risk situations

1) High-frequency passive repricing

If the strategy nudges quotes often, replace ambiguity becomes a control-loop property rather than a rare exception.

2) Tight-deadline schedules

A small residual mistake near the end of the parent schedule can force a large cleanup burst.

3) Size-increase replaces

Increasing quantity while already part-filled is especially dangerous because the order may receive more fills before the new size is formally active.

4) Multi-venue routing with central residual control

One venue’s pending-replace ambiguity contaminates parent residual logic across all venues.

5) Counterparties that reject replace chaining

Fast local repricing logic collides with slower venue semantics and creates repeated chain breakage.

6) Queue-sensitive books

When queue value is large, unnecessary replace/retry loops cost more than the intended price improvement was ever worth.

Regime state machine

CLEAN

No active replace ambiguity.
Normal routing and quote-update cadence.

REPLACE_SENT

Trigger:

35=G emitted.

Actions:

Mark the leaf as replace-pending risk, not as updated truth.
Continue tracking the old effective parameters as economically live.
Reduce residual-confidence score.

PENDING_REPLACE_EXPOSED

Trigger:

ExecType=Pending Replace or replace dwell exceeds threshold.

Actions:

Treat the old order as still fillable until explicit replace confirmation.
Suppress urgency jumps based solely on the desired new state.
Block rapid chained replaces unless policy explicitly allows them.

LATE_FILL_ON_OLD_STATE

Trigger:

execution report arrives under original parameters after replace-send.

Actions:

Credit completion immediately from authoritative cumulative fields.
Recompute residual from venue truth, not intent.
Prevent duplicate compensation flow.

REPLACE_REJECT_RECOVERY

Trigger:

OrderCancelReject (35=9) on the replace path.

Actions:

Treat the reject as state-bearing.
Rebuild the active ClOrdID / OrigClOrdID chain.
Reconcile whether the old order is still working, partially filled, or terminal.
Freeze aggressive quote-churn until the chain is coherent again.

REPLACED_CONFIRMED

Trigger:

authoritative ExecType=Replace received.

Actions:

Switch effective parameters to the new state.
Carry over cumulative quantity correctly.
Release normal repricing logic gradually, not instantly.

SAFE_CONTAIN

Trigger:

residual uncertainty too high,
chained replace rejects exceed threshold,
or local/venue chain state diverges.

Actions:

cap replace cadence,
prefer fewer larger decisions,
avoid cross-venue catch-up bursts,
and keep hedges conservative until state converges.
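
The regimes above can be sketched as a small transition function. The event names (`replace_sent`, `pending_replace`, `dwell_timeout`, `fill`, `replace_ack`, `replace_reject`, `released`, `reconciled`, `contain`) are hypothetical labels for the triggers listed in each regime, and the decision to raise `contain` is left to whatever threshold policy the desk uses.

```python
from enum import Enum, auto

class Regime(Enum):
    CLEAN = auto()
    REPLACE_SENT = auto()
    PENDING_REPLACE_EXPOSED = auto()
    LATE_FILL_ON_OLD_STATE = auto()
    REPLACE_REJECT_RECOVERY = auto()
    REPLACED_CONFIRMED = auto()
    SAFE_CONTAIN = auto()

# Regimes in which a replace is unresolved and the old order can still trade.
IN_FLIGHT = {
    Regime.REPLACE_SENT,
    Regime.PENDING_REPLACE_EXPOSED,
    Regime.LATE_FILL_ON_OLD_STATE,
}

def next_regime(regime, event):
    """Return the next regime; unknown (regime, event) pairs leave it unchanged."""
    if event == "contain":
        return Regime.SAFE_CONTAIN
    if regime is Regime.SAFE_CONTAIN:
        return Regime.CLEAN if event == "reconciled" else regime
    if regime is Regime.CLEAN and event == "replace_sent":
        return Regime.REPLACE_SENT
    if regime in IN_FLIGHT:
        if event == "replace_ack":
            return Regime.REPLACED_CONFIRMED
        if event == "replace_reject":
            return Regime.REPLACE_REJECT_RECOVERY
        if event == "fill":
            return Regime.LATE_FILL_ON_OLD_STATE
        if regime is Regime.REPLACE_SENT and event in ("pending_replace", "dwell_timeout"):
            return Regime.PENDING_REPLACE_EXPOSED
    if regime is Regime.REPLACED_CONFIRMED and event == "released":
        return Regime.CLEAN
    if regime is Regime.REPLACE_REJECT_RECOVERY and event == "reconciled":
        return Regime.CLEAN
    return regime

regime = Regime.CLEAN
trace = []
for event in ["replace_sent", "pending_replace", "fill", "replace_ack", "released"]:
    regime = next_regime(regime, event)
    trace.append(regime.name)
```

Note that `replace_sent` never moves the regime to REPLACED_CONFIRMED; only `replace_ack` does.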

Control rules that actually help

1) Separate desired state from effective state

Keep two ledgers:

desired: what the strategy wants,
effective: what the venue has confirmed.

Never let desired state overwrite confirmed state.
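
A minimal sketch of the two-ledger idea follows. The class and method names (`Params`, `Leaf`, `request_replace`, `on_replace_ack`, `on_replace_reject`) are hypothetical; the one deliberate choice is that the effective ledger is written only from values carried on the venue's confirmation, never copied from local intent.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Params:
    price: float
    qty: int

@dataclass
class Leaf:
    effective: Params                  # last venue-confirmed parameters
    desired: Optional[Params] = None   # in-flight request, not yet venue truth

    def request_replace(self, new: Params) -> None:
        # Records intent only; the effective ledger is untouched.
        self.desired = new

    def on_replace_ack(self, confirmed: Params) -> None:
        # Effective state comes from the confirmation, not from self.desired.
        self.effective = confirmed
        self.desired = None

    def on_replace_reject(self) -> None:
        # The venue stayed on the old parameters; drop the failed intent.
        self.desired = None

leaf = Leaf(effective=Params(price=100.00, qty=1000))
leaf.request_replace(Params(price=100.01, qty=1000))
price_while_pending = leaf.effective.price   # still the old price
leaf.on_replace_ack(Params(price=100.01, qty=1000))
```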

2) Treat replace-send as a request, not a transition

The state transition is ExecType=Replace, not the local API call.

3) Credit all late fills immediately, even if they look temporally awkward

If the venue says the old order filled while replace was pending, that fill is real. Residual logic must absorb it before taking more risk.
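
One way to make late-fill crediting robust is to drive the parent residual from cumulative quantity rather than from individual fill events. The sketch below assumes the counterparty reports CumQty (tag 14) cumulatively across the replace chain, and that `leaf_id` is some stable key for the whole chain (a hypothetical choice; OrderID or the root ClOrdID are candidates). Taking the maximum makes replays and out-of-order reports idempotent, at the cost of ignoring trade busts, which would need separate handling.

```python
class ParentResidual:
    """Parent residual derived from the highest CumQty seen per leaf chain."""

    def __init__(self, parent_qty):
        self.parent_qty = parent_qty
        self.cum_by_leaf = {}   # leaf_id -> highest CumQty seen so far

    def on_exec_report(self, leaf_id, cum_qty):
        # Max-merge: duplicates and stale reports cannot reduce credited quantity.
        # Limitation: a trade bust that lowers CumQty is not handled here.
        previous = self.cum_by_leaf.get(leaf_id, 0)
        self.cum_by_leaf[leaf_id] = max(previous, cum_qty)

    @property
    def residual(self):
        return self.parent_qty - sum(self.cum_by_leaf.values())

tracker = ParentResidual(parent_qty=10_000)
tracker.on_exec_report("leaf-1", cum_qty=2_000)   # fill before replace-send
tracker.on_exec_report("leaf-1", cum_qty=2_500)   # late fill while replace is pending
tracker.on_exec_report("leaf-1", cum_qty=2_500)   # replayed duplicate
tracker.on_exec_report("leaf-1", cum_qty=2_000)   # out-of-order stale report
print(tracker.residual)  # 7500
```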

4) Add chain-integrity checks around `OrigClOrdID`

If the local optimistic chain and the venue-accepted chain diverge, stop repricing reflexively and reconcile first.
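
A chain validator can be quite small. The sketch below is a hypothetical `ReplaceChain` that tracks the venue-confirmed ClOrdID separately from optimistically requested ones. It assumes a replace confirmation carries the new ClOrdID with the previous one as OrigClOrdID, and that a 35=9 reject carries the ClOrdID of the rejected request.

```python
class ReplaceChain:
    def __init__(self, root_clordid):
        self.active = root_clordid   # venue-confirmed live ClOrdID
        self.pending = {}            # requested ClOrdID -> OrigClOrdID it targets

    def on_replace_sent(self, clordid, orig_clordid):
        self.pending[clordid] = orig_clordid

    def on_replace_ack(self, clordid, orig_clordid):
        """Apply a confirmation; return False if it contradicts the local chain."""
        consistent = (
            orig_clordid == self.active
            and self.pending.get(clordid) == orig_clordid
        )
        self.pending.pop(clordid, None)
        self.active = clordid        # venue truth wins even on a mismatch
        return consistent

    def on_replace_reject(self, clordid):
        """Drop the rejected request; return requests that were chained onto it."""
        self.pending.pop(clordid, None)
        return [c for c, orig in self.pending.items() if orig == clordid]

chain = ReplaceChain("A")
chain.on_replace_sent("B", orig_clordid="A")
chain.on_replace_sent("C", orig_clordid="B")   # optimistic: chained before B is confirmed
orphans = chain.on_replace_reject("B")         # B fails, so C now targets nothing live
print(chain.active, orphans)  # A ['C']
```

A non-empty orphan list, or a `False` from `on_replace_ack`, is the signal to stop repricing and reconcile.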

5) Penalize urgency when residual uncertainty is high

Uncertain residuals should lower confidence, not trigger aggressive completion.
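
As a rough illustration, urgency can be scaled by a confidence term that shrinks as the local and venue residuals disagree. The linear form and the `max_gap_frac` threshold below are assumptions chosen for simplicity, not a recommended calibration.

```python
def adjusted_urgency(base_urgency, residual_gap, parent_qty, max_gap_frac=0.05):
    """Scale urgency down linearly as residual disagreement grows.

    residual_gap: local residual minus venue-reconciled residual (shares).
    max_gap_frac: gap, as a fraction of parent quantity, at which urgency hits zero.
    """
    gap_frac = abs(residual_gap) / parent_qty
    confidence = max(0.0, 1.0 - gap_frac / max_gap_frac)
    return base_urgency * confidence

calm = adjusted_urgency(0.8, residual_gap=0, parent_qty=10_000)
uncertain = adjusted_urgency(0.8, residual_gap=250, parent_qty=10_000)
blind = adjusted_urgency(0.8, residual_gap=900, parent_qty=10_000)
```

With these inputs, urgency is untouched when the gap is zero, roughly halved at a 2.5% gap, and zero once the gap exceeds the threshold.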

6) Put a hard budget on replace churn

Tiny quote improvements should not be allowed to create endless pending / reject / retry loops.
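
A hard budget can be as simple as a sliding-window counter per leaf. The sketch below is one hypothetical form (`ReplaceBudget`, with caller-supplied millisecond timestamps); the limits shown are placeholders.

```python
from collections import deque

class ReplaceBudget:
    """Allow at most max_replaces replace requests per rolling window."""

    def __init__(self, max_replaces, window_ms):
        self.max_replaces = max_replaces
        self.window_ms = window_ms
        self.sent = deque()   # send timestamps still inside the window

    def try_spend(self, now_ms):
        # Expire sends that have aged out of the window.
        while self.sent and now_ms - self.sent[0] >= self.window_ms:
            self.sent.popleft()
        if len(self.sent) >= self.max_replaces:
            return False      # over budget: the strategy must hold the current quote
        self.sent.append(now_ms)
        return True

budget = ReplaceBudget(max_replaces=2, window_ms=1000)
decisions = [budget.try_spend(t) for t in (0, 100, 200, 1000)]
print(decisions)  # [True, True, False, True]
```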

7) Attribute pending-replace slippage separately in TCA

Otherwise the desk blames volatility, spread, or venue toxicity for a state-machine bug.

TCA / KPI layer

Track these explicitly:

PRDW95 — Pending Replace Dwell p95
Tail time spent waiting for replace finality.
LFOPR — Late Fill on Old Params Rate
Fraction of replace requests followed by at least one fill still tied to the original order state.
LFOPQ — Late Fill on Old Params Quantity
Quantity filled under original parameters after replace-send.
CRMR — Chain Reject Mismatch Rate
Fraction of replace chains where local optimistic chaining diverges from venue-accepted chain state.
RUG — Residual Uncertainty Gap
|R_obs - R_reconciled| / parent_qty
RRT — Replace Retry Tax
Estimated bps lost from reject/retry/replace bursts.
PRM5 — Post-Replace-Window Markout 5s
Short-horizon markout after replace ambiguity episodes.
QRL — Queue Reset Loss
Estimated bps lost from unnecessary queue restarts after replace-state confusion.

Segment by:

venue,
liquidity bucket,
tactic,
time-to-deadline bucket,
and whether the replace changed price, size, or both.
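
Several of these KPIs fall out of a per-episode table. The sketch below assumes a hypothetical episode record with `dwell_ms`, `late_fill_qty`, `r_obs`, and `r_reconciled` fields, uses a nearest-rank percentile, and reports the worst-case RUG across episodes; all of those are illustrative choices.

```python
import math

def p95(values):
    """Nearest-rank 95th percentile of a non-empty sequence."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]

def replace_kpis(episodes, parent_qty):
    late = [e["late_fill_qty"] for e in episodes]
    gaps = [abs(e["r_obs"] - e["r_reconciled"]) for e in episodes]
    return {
        "PRDW95": p95([e["dwell_ms"] for e in episodes]),
        "LFOPR": sum(q > 0 for q in late) / len(episodes),
        "LFOPQ": sum(late),
        "RUG_max": max(gaps) / parent_qty,
    }

episodes = [
    {"dwell_ms": 5,  "late_fill_qty": 0,   "r_obs": 8000, "r_reconciled": 8000},
    {"dwell_ms": 8,  "late_fill_qty": 200, "r_obs": 8000, "r_reconciled": 7800},
    {"dwell_ms": 40, "late_fill_qty": 0,   "r_obs": 7800, "r_reconciled": 7800},
    {"dwell_ms": 12, "late_fill_qty": 300, "r_obs": 7800, "r_reconciled": 7500},
]
kpis = replace_kpis(episodes, parent_qty=10_000)
```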

Validation approach

Replay / backtest questions

After a replace request, how often did the order still receive fills under the old parameters?
How often did the strategy behave as though the new state was already live before actual replace confirmation?
What fraction of replace bursts created no measurable improvement in fill quality but did create queue loss?
How much catch-up flow would disappear if residuals were recalculated from authoritative late fills earlier?
How much tail slippage clusters around pending-replace dwell and replace-reject episodes?

Shadow-mode checks

Compare the live controller against a counterfactual that waits for ExecType=Replace before switching effective parameters.
Compare aggressive chained repricing against a policy with explicit chain-depth caps.
Measure p95/p99 shortfall and completion reliability, not just average quote responsiveness.

Failure-injection drills

Simulate:

delayed replace acknowledgements,
late fills under old parameters,
rejected second replace while first is pending,
reject-after-partial-fill paths,
and mismatched OrigClOrdID chain reconstruction.

If the engine only behaves correctly when replace is immediate and clean, it is not production-ready.
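
A drill harness can replay a scripted event sequence and compare the controller's view with venue truth after every event. The sketch below uses a made-up event format and illustrative quantities for the "fill while pending replace" path. `NaiveController` is deliberately wrong, to show the kind of violations the drill should surface.

```python
def fill_during_pending_replace_scenario():
    """Part-filled order, size-increase replace, late fill, then confirmation."""
    return [
        {"kind": "fill", "cum_qty": 1000},
        {"kind": "replace_sent", "new_qty": 12000},
        {"kind": "pending_replace"},
        {"kind": "fill", "cum_qty": 1100},   # late fill under original parameters
        {"kind": "replace_ack", "order_qty": 12000, "cum_qty": 1100},
    ]

class NaiveController:
    def __init__(self, order_qty):
        self.order_qty, self.cum_qty = order_qty, 0
        self.in_flight = False

    def on_event(self, ev):
        if ev["kind"] == "replace_sent":
            self.order_qty = ev["new_qty"]   # bug: mutates state on send
            self.in_flight = True
        elif ev["kind"] == "fill" and not self.in_flight:
            self.cum_qty = ev["cum_qty"]     # bug: drops fills while replace is pending
        elif ev["kind"] == "replace_ack":
            self.in_flight = False

def run_drill(controller, events, initial_qty):
    """Replay events and return (event_index, description) for each divergence."""
    violations = []
    venue_qty, venue_cum = initial_qty, 0
    for i, ev in enumerate(events):
        if ev["kind"] == "fill":
            venue_cum = ev["cum_qty"]
        elif ev["kind"] == "replace_ack":
            venue_qty = ev["order_qty"]
        controller.on_event(ev)
        if controller.order_qty != venue_qty:
            violations.append((i, "order_qty ahead of venue"))
        if controller.cum_qty != venue_cum:
            violations.append((i, "cum_qty behind venue"))
    return violations

violations = run_drill(
    NaiveController(order_qty=10000),
    fill_during_pending_replace_scenario(),
    initial_qty=10000,
)
```

A controller that passes this drill should return an empty list; the naive one diverges from the moment the replace is sent.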

Common anti-patterns

Replace-on-send state mutation: local order state flips before venue confirmation.
Single-ledger design: desired and effective state are merged together.
Late-fill skepticism: old-parameter fills are treated as stale or suspicious.
Chained-replace spam: second and third replaces are fired without waiting for chain integrity.
Reject-as-logline: replace reject is not fed back into the authoritative state machine.
No queue-value guardrail: every micro reprice is assumed worth the churn.
No dedicated TCA bucket: all cost gets mislabeled as market impact.

Minimal implementation sketch

A robust controller usually needs:

authoritative per-leaf replace state machine
- WORKING -> REPLACE_SENT -> PENDING_REPLACE -> {REPLACED_CONFIRMED, REPLACE_REJECT_RECOVERY}
- with fills allowed to land under original parameters throughout the pending window
dual-ledger state model
- desired order parameters
- confirmed effective order parameters
chain-integrity validator
- reconstruct active lineage from ClOrdID, OrigClOrdID, reject semantics, and accepted replace events
ambiguity-aware scheduler
- slows replace cadence and suppresses urgency when residual confidence falls
pending-replace ledger
- stores send time, pending time, final outcome, late-fill quantity, and chain-depth information
TCA attribution hooks
- assigns realized cost to pending-replace ambiguity instead of generic market conditions
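
As one possible shape for the pending-replace ledger entry, the sketch below stores the fields listed above in a dataclass. The class name, field names, and outcome labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PendingReplaceRecord:
    clordid: str                       # ClOrdID of the replace request
    orig_clordid: str                  # order version the request targets
    sent_ms: int                       # when 35=G was sent
    chain_depth: int = 1               # unconfirmed replaces stacked on this leaf
    pending_ms: Optional[int] = None   # when Pending Replace was reported, if ever
    final_ms: Optional[int] = None     # when the request was accepted or rejected
    outcome: Optional[str] = None      # "replaced" or "rejected" once resolved
    late_fill_qty: int = 0             # quantity filled under old parameters meanwhile

    def close(self, outcome, now_ms):
        self.outcome, self.final_ms = outcome, now_ms

    def dwell_ms(self):
        """Time from send to final outcome, or None while unresolved."""
        return None if self.final_ms is None else self.final_ms - self.sent_ms

record = PendingReplaceRecord(clordid="B", orig_clordid="A", sent_ms=0)
record.pending_ms = 3
record.late_fill_qty += 100
unresolved_dwell = record.dwell_ms()
record.close("replaced", now_ms=12)
```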

Bottom line

Replace logic is not cosmetic.

A price/size update that feels instantaneous in the strategy can remain economically old at the venue for longer than you think. During that window, fills still belong to the original order, chained modifies may be rejected, and FIX is explicit that Pending Replace is not replacement finality.

If the execution stack treats replace intent as venue truth, it quietly converts protocol ambiguity into queue loss, false urgency, and cleanup flow.

The fix is not “replace faster.”

The fix is to model pending replace as a first-class uncertainty regime, keep desired state separate from effective state, and only let venue-confirmed transitions rewrite the book.

References

FIX Trading Community — Order State Changes (OrdStatus precedences; pending replace semantics; scenario matrices): https://www.fixtrading.org/online-specification/order-state-changes/
B2BITS FIX 4.1 Dictionary — Execution Report (MsgType=8) (fills during pending replace use original order parameters; order not considered replaced until explicitly confirmed): https://www.b2bits.com/fixopaedia/fixdic41/message_Execution_Report_8.html
B2BITS FIX 4.4 Dictionary — Order Cancel/Replace Request (MsgType=G) (pending replace recommendation; chaining guidance; optimistic vs pessimistic chain handling): https://www.b2bits.com/fixopaedia/fixdic44/message_Order_Cancel_Replace_Request_G.html
OnixS FIX 4.2 Appendix D7 — Part-filled order, followed by cancel/replace request to increase order qty, execution occurs whilst order is pending replace: https://www.onixs.biz/fix-dictionary/4.2/app_d7.html
OnixS FIX 4.2 Appendix D17 — One cancel/replace request is issued followed immediately by another — broker rejects the second as order is pending replace: https://www.onixs.biz/fix-dictionary/4.2/app_d17.html
