Order Idempotency & Duplicate-Order Prevention Playbook (FIX/API)

Date: 2026-03-11
Category: knowledge
Domain: finance / execution engineering / trading operations

Why this matters

In live execution, the most expensive bug is often not a bad signal but a duplicated order.

Typical path:

you send an order,
ack is delayed or dropped,
your retry logic fires,
venue/broker treats retry as a new intent,
you get unintended extra exposure.

This is a reliability problem first, and a PnL problem immediately after.

Core principle

Treat order submission as an idempotent intent pipeline, not a best-effort message send.

Retries are inevitable.
Disconnect/reconnect is inevitable.
Sequence resets and replay flags happen.
Therefore duplicate-prevention must be a first-class control, not an afterthought.

Identifier semantics you must keep straight

1) `ClOrdID` (FIX Tag 11)

Client-assigned order identifier.
Must be unique at least within a trading day; for multi-day flows (e.g. GTC orders), encode the date/scope in the ID.
This is your primary idempotency key on outbound intent.

2) `OrigClOrdID` (FIX Tag 41)

Links a cancel or cancel/replace request to the ClOrdID of the order it targets.
Critical for lineage when modifying orders.
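
A minimal sketch of lineage tracking, assuming you store each request's `OrigClOrdID` link in a plain mapping. The helper name `root_cl_ord_id` and the `ORD-n` IDs are hypothetical; the point is only that every replace in a chain resolves back to a single original intent.

```python
def root_cl_ord_id(cl_ord_id: str, orig_of: dict[str, str]) -> str:
    """Follow OrigClOrdID links back to the first ClOrdID in the chain."""
    seen = set()
    current = cl_ord_id
    while current in orig_of:
        if current in seen:
            # A cycle means the lineage data is corrupt; fail loudly.
            raise ValueError(f"cycle in replace chain at {current}")
        seen.add(current)
        current = orig_of[current]
    return current


# Each cancel/replace carries its own new ClOrdID (11) and points at the
# prior one via OrigClOrdID (41). Mapping: new ClOrdID -> OrigClOrdID.
orig_of = {"ORD-2": "ORD-1", "ORD-3": "ORD-2"}
root = root_cl_ord_id("ORD-3", orig_of)
```

Resolving to the root lets the intent ledger attribute fills on any leg of the chain to one intent.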

3) `OrderID` (FIX Tag 37)

Broker/venue-assigned order identifier.
Useful for downstream reconciliation, but not your submission idempotency root.

4) `ExecID` (FIX Tag 17)

Unique identifier assigned by the sell side to each execution report.
Use for inbound dedupe/replay-safe fill processing.

5) `PossDupFlag` (43) / `PossResend` (97)

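PossDupFlag marks a possible retransmission of a message under the same sequence number (session-level resend).
PossResend marks a message that may duplicate one already sent under a different sequence number (application-level resend).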
Your consumers must be safe under replay.
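
As an illustration, a consumer might tag each inbound message with a replay hint before it reaches business logic. This is a sketch, not a FIX engine: `parse_fix` and `replay_hint` are hypothetical helpers, and the parser ignores repeating groups, checksums, and validation.

```python
SOH = "\x01"  # standard FIX field delimiter


def parse_fix(raw: str) -> dict[int, str]:
    """Split a tag=value FIX string into {tag: value} (no repeating groups)."""
    fields = {}
    for part in raw.strip(SOH).split(SOH):
        tag, _, value = part.partition("=")
        fields[int(tag)] = value
    return fields


def replay_hint(fields: dict[int, str]) -> str:
    """Classify a message by PossDupFlag (43) and PossResend (97)."""
    if fields.get(43) == "Y":
        return "poss_dup"
    if fields.get(97) == "Y":
        return "poss_resend"
    return "original"


msg = SOH.join(["35=8", "34=12", "43=Y", "17=EX1"]) + SOH
hint = replay_hint(parse_fix(msg))
```

The hint is advisory only. The actual dedupe decision should still rest on a stable key such as ExecID, since a replayed message is not guaranteed to carry either flag in every setup.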

Failure modes that create accidental duplicates

Ack-timeout blind retry
- “No ack in X ms => send new order” without stable ClOrdID reuse policy.
Session reconnect with volatile ID generator
- ID sequence restarts after process crash/redeploy.
Cancel/replace race
- Replace submitted while original ack state is unknown; both legs become live.
Gateway failover split-brain
- Primary and standby both emit the same strategy intent independently.
Replay-unaware execution consumer
- Duplicate ExecutionReport counted twice in position/PnL.
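
The split-brain mode above is often handled with a fencing token: every promotion bumps an epoch, and the egress point refuses sends that carry a stale one. The sketch below keeps the epoch in memory purely for illustration. `FencedSender` is a hypothetical name, and in a real deployment the check would have to live in a shared, strongly consistent component or at a single egress point.

```python
class FencedSender:
    """Rejects sends from any gateway instance holding a stale epoch."""

    def __init__(self) -> None:
        self.current_epoch = 0
        self.sent: list[str] = []

    def promote(self) -> int:
        """Called when an instance takes over; returns its new epoch."""
        self.current_epoch += 1
        return self.current_epoch

    def send(self, epoch: int, cl_ord_id: str) -> bool:
        if epoch != self.current_epoch:
            return False  # stale instance: fenced off, nothing goes out
        self.sent.append(cl_ord_id)
        return True


egress = FencedSender()
primary_epoch = egress.promote()   # primary starts
standby_epoch = egress.promote()   # standby takes over after failover
stale_ok = egress.send(primary_epoch, "mm1-000001")
fresh_ok = egress.send(standby_epoch, "mm1-000001")
```

Only the instance with the latest epoch gets through, so the same strategy intent cannot leave twice even if the old primary is still running.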

Practical architecture (minimal but robust)

1) Intent Ledger (authoritative)

Before sending to broker, persist:

intent_id (internal UUID),
deterministic cl_ord_id,
symbol/side/qty/price/tif/account fingerprint,
lifecycle state (created -> sent -> acked/rejected/canceled/filled),
timestamps and route metadata.

Rule: no outbound send without durable ledger write.
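
A minimal sketch of the write-before-send rule using SQLite from the Python standard library. The table layout, the function names (`open_ledger`, `record_intent`, `send_order`), and the `transport` callable are assumptions for illustration; a production ledger would carry the full field set listed above.

```python
import sqlite3
import uuid
from datetime import datetime, timezone
from typing import Callable


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS intents (
               intent_id   TEXT PRIMARY KEY,
               cl_ord_id   TEXT NOT NULL UNIQUE,
               fingerprint TEXT NOT NULL,
               state       TEXT NOT NULL,
               created_at  TEXT NOT NULL)"""
    )
    return conn


def record_intent(conn: sqlite3.Connection, cl_ord_id: str, fingerprint: str) -> str:
    """Persist the intent; the UNIQUE constraint rejects ClOrdID reuse."""
    intent_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "INSERT INTO intents VALUES (?, ?, ?, 'created', ?)",
            (intent_id, cl_ord_id, fingerprint,
             datetime.now(timezone.utc).isoformat()),
        )
    return intent_id


def send_order(conn: sqlite3.Connection, intent_id: str,
               transport: Callable[[str], None]) -> None:
    """Send only intents that exist in the ledger and were never sent."""
    row = conn.execute(
        "SELECT cl_ord_id, state FROM intents WHERE intent_id = ?",
        (intent_id,),
    ).fetchone()
    if row is None:
        raise LookupError("no ledger row; refusing to send")
    cl_ord_id, state = row
    if state != "created":
        raise RuntimeError(f"intent already in state {state!r}; not re-sending")
    with conn:  # mark 'sent' durably BEFORE touching the wire
        conn.execute("UPDATE intents SET state = 'sent' WHERE intent_id = ?",
                     (intent_id,))
    transport(cl_ord_id)


wire: list[str] = []
ledger = open_ledger()
iid = record_intent(ledger, "mm1-20260311-A-000001", "AAPL|BUY|100|LMT|189.50")
send_order(ledger, iid, wire.append)
```

Marking the row `sent` before calling the transport is a deliberate choice in this sketch: a crash between the two steps leaves an order that looks sent but may not be, which the retry contract treats as uncertain rather than silently re-sending.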

2) Deterministic ClOrdID policy

Recommended pattern:

<strategy>-<yyyymmdd>-<session>-<monotonic-seq>-<short-checksum>

Rules:

sequence survives process restart,
never regenerate different ID for same intent,
never reuse old IDs inside safety retention window.
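
One way to implement the pattern above is sketched here, assuming a SQLite-backed counter so the sequence survives restarts and a 4-hex-digit CRC32 fragment as the short checksum. The helper names, the zero-padded width, and the checksum choice are illustrative, and many venues cap ClOrdID length, so the format needs checking against the venue's specification.

```python
import os
import sqlite3
import tempfile
import zlib


def open_seq_store(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seq (scope TEXT PRIMARY KEY, value INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def next_seq(conn: sqlite3.Connection, scope: str) -> int:
    """Durably increment and return the counter for one strategy/day/session."""
    with conn:  # single transaction: the bump is committed before we return
        conn.execute("INSERT OR IGNORE INTO seq (scope, value) VALUES (?, 0)", (scope,))
        conn.execute("UPDATE seq SET value = value + 1 WHERE scope = ?", (scope,))
        (value,) = conn.execute(
            "SELECT value FROM seq WHERE scope = ?", (scope,)
        ).fetchone()
    return value


def make_cl_ord_id(strategy: str, yyyymmdd: str, session: str, seq: int) -> str:
    body = f"{strategy}-{yyyymmdd}-{session}-{seq:06d}"
    checksum = f"{zlib.crc32(body.encode()) & 0xFFFF:04x}"  # catches typos/truncation
    return f"{body}-{checksum}"


# Demo: the sequence continues after a simulated process restart.
path = os.path.join(tempfile.mkdtemp(), "seq.db")
conn = open_seq_store(path)
first = make_cl_ord_id("mm1", "20260311", "A", next_seq(conn, "mm1-20260311-A"))
conn.close()  # "crash"
conn = open_seq_store(path)
second = make_cl_ord_id("mm1", "20260311", "A", next_seq(conn, "mm1-20260311-A"))
conn.close()
```

The ID is generated once per intent and stored in the ledger; retries read it back from the ledger rather than calling the generator again.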

3) Retry contract

On timeout/uncertain state:

first action: query status (if supported),
if a resend is needed, retry with the same idempotency identity (same ClOrdID),
only create new intent with explicit human/strategy decision.
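
The contract above can be sketched as a small resolver. `query_status` and `resend` are hypothetical callables standing in for your gateway; each is assumed to return `"acked"`, `"rejected"`, or `None` when the state is still unknown. The sketch also assumes the venue rejects or collapses a duplicate ClOrdID, which should be verified per venue before relying on it.

```python
from enum import Enum
from typing import Callable, Optional

StatusFn = Callable[[str], Optional[str]]


class Outcome(Enum):
    ACKED = "acked"
    REJECTED = "rejected"
    UNKNOWN = "unknown"


def resolve_uncertain(cl_ord_id: str, query_status: StatusFn, resend: StatusFn,
                      max_resends: int = 2) -> Outcome:
    """Query first, then resend the SAME ClOrdID; never mint a new one here."""
    status = query_status(cl_ord_id)
    attempts = 0
    while status is None and attempts < max_resends:
        status = resend(cl_ord_id)  # same idempotency identity
        attempts += 1
        if status is None:
            status = query_status(cl_ord_id)
    if status is None:
        # Still unresolved: escalate to a human/strategy decision.
        return Outcome.UNKNOWN
    return Outcome(status)


# Demo: first query is inconclusive, one resend, second query finds the ack.
answers = iter([None, "acked"])
resent: list[str] = []
outcome = resolve_uncertain(
    "mm1-20260311-A-000001",
    query_status=lambda _id: next(answers),
    resend=lambda _id: resent.append(_id),  # list.append returns None
)
```

Note that the function has no code path that creates a new intent; `Outcome.UNKNOWN` is the signal that a person or the strategy must decide.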

4) Inbound dedupe keys

Maintain a durable processed set keyed on ExecID plus venue/session scope, and skip any execution report whose key is already present.

Rule: position/PnL updates must be idempotent.
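
A minimal in-memory sketch of an idempotent fill consumer. `PositionBook` and its fields are hypothetical; in production the seen-set would need to be durable and updated atomically with the position, which this example does not attempt.

```python
from dataclasses import dataclass, field


@dataclass
class PositionBook:
    positions: dict[str, int] = field(default_factory=dict)
    seen: set[tuple[str, str, str]] = field(default_factory=set)

    def apply_fill(self, venue: str, session: str, exec_id: str,
                   symbol: str, signed_qty: int) -> bool:
        """Apply a fill once; return False if it was already processed."""
        key = (venue, session, exec_id)
        if key in self.seen:
            return False  # replay: no side effects
        self.seen.add(key)
        self.positions[symbol] = self.positions.get(symbol, 0) + signed_qty
        return True


book = PositionBook()
applied_first = book.apply_fill("XNAS", "S1", "EX-1", "AAPL", 100)
applied_replay = book.apply_fill("XNAS", "S1", "EX-1", "AAPL", 100)
```

The dedupe check comes before any mutation, so replaying the whole execution stream leaves positions unchanged.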

5) Reconciliation loop

Continuously reconcile:

intent ledger,
broker order state,
executions/drop copy,
internal position.

Any divergence enters incident workflow (not silent auto-heal).
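
A sketch of one reconciliation pass over two of the four sources, assuming both sides can be reduced to a `ClOrdID -> state` mapping. The state names and break labels are illustrative; the function only reports divergences and deliberately does not try to repair them.

```python
LIVE_STATES = {"sent", "acked"}  # assumed ledger states that imply a broker-side order


def reconcile(ledger: dict[str, str], broker: dict[str, str]) -> list[tuple[str, str]]:
    """Return (cl_ord_id, reason) for every divergence between the two views."""
    breaks = []
    for cl_ord_id in sorted(ledger.keys() | broker.keys()):
        mine = ledger.get(cl_ord_id)
        theirs = broker.get(cl_ord_id)
        if mine is None:
            # Broker has an order we never recorded: possible duplicate/split-brain.
            breaks.append((cl_ord_id, "unknown_to_ledger"))
        elif theirs is None:
            if mine in LIVE_STATES:
                breaks.append((cl_ord_id, "missing_at_broker"))
        elif mine != theirs:
            breaks.append((cl_ord_id, f"state_mismatch:{mine}!={theirs}"))
    return breaks


ledger_view = {"A-1": "acked", "A-2": "sent", "A-3": "filled"}
broker_view = {"A-1": "acked", "A-3": "canceled", "A-9": "acked"}
found = reconcile(ledger_view, broker_view)
```

Each returned break would be opened as an incident item and tracked until both views agree.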

Control states (ops-friendly)

NORMAL

duplicate metrics near baseline,
ack latency within normal band,
no ledger/order-state breaks.

Action: standard operation.

DEGRADED

Triggers:

ack timeout percentile spike,
duplicate reject ratio rising,
reconnect frequency elevated.

Action:

reduce aggression,
widen retry timers,
force status-query-first path,
page operator if sustained.

DUP_RISK

Triggers:

duplicate rejects exceed threshold,
unresolved uncertain orders accumulate,
reconciliation breaks not converging.

Action:

disable auto-new intents for affected routes,
permit cancel-only / reduce-only,
require explicit operator release.

SAFE

Triggers:

split-brain suspicion,
ledger persistence instability,
exchange/broker state uncertainty too high.

Action:

hard stop new exposure,
preserve capital and audit trail,
recover state before reopening.
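
The four states can be sketched as an ordered enum with a trigger evaluator. Every threshold below is a placeholder, not a recommendation, and the field names are hypothetical. Requiring operator release to leave SAFE is an assumption extrapolated from the DUP_RISK rule.

```python
from dataclasses import dataclass
from enum import IntEnum


class ControlState(IntEnum):
    NORMAL = 0
    DEGRADED = 1
    DUP_RISK = 2
    SAFE = 3


@dataclass(frozen=True)
class Signals:
    ack_p99_ms: float = 0.0
    dup_reject_rate: float = 0.0
    reconnects_per_hour: int = 0
    uncertain_orders: int = 0
    unconverged_breaks: int = 0
    split_brain_suspected: bool = False
    ledger_write_failures: int = 0


def evaluate(s: Signals) -> ControlState:
    """Map current signals to the most severe state whose triggers fire."""
    if s.split_brain_suspected or s.ledger_write_failures > 0:
        return ControlState.SAFE
    if s.dup_reject_rate > 0.01 or s.uncertain_orders > 5 or s.unconverged_breaks > 0:
        return ControlState.DUP_RISK
    if s.ack_p99_ms > 500 or s.dup_reject_rate > 0.001 or s.reconnects_per_hour > 3:
        return ControlState.DEGRADED
    return ControlState.NORMAL


def next_state(current: ControlState, s: Signals,
               operator_release: bool = False) -> ControlState:
    """Escalate automatically; leave DUP_RISK/SAFE only on operator release."""
    observed = evaluate(s)
    if observed >= current or current <= ControlState.DEGRADED or operator_release:
        return observed
    return current


calm = Signals()
state_after_spike = next_state(ControlState.NORMAL, Signals(ack_p99_ms=900.0))
state_held = next_state(ControlState.DUP_RISK, calm)
state_released = next_state(ControlState.DUP_RISK, calm, operator_release=True)
```

Escalation is automatic while de-escalation from the two severe states is gated, so a brief quiet period cannot reopen new-intent flow on its own.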

Metrics that actually catch this early

Duplicate Reject Rate (DRR) = duplicate-ID rejects / new orders
Uncertain Order Count (UOC) = orders sent but still unresolved after the ack timeout and a status query
Replay Drop Rate (RDR) = replayed execution reports safely ignored / total exec reports
ID Collision Count (ICC) = attempted ClOrdID reuse events
Reconciliation Break Duration (RBD) = time from divergence detection to convergence

If DRR and UOC rise together, move to DEGRADED quickly.
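
The two ratio metrics and the joint-rise check can be computed from plain counters, as sketched below. The `Counters` fields and function names are hypothetical, and comparing against a single previous window is a simplification; a real detector would likely smooth over several windows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counters:
    new_orders: int
    duplicate_id_rejects: int
    exec_reports: int
    replayed_exec_reports_ignored: int
    uncertain_orders: int


def safe_ratio(numerator: int, denominator: int) -> float:
    return numerator / denominator if denominator else 0.0


def drr(c: Counters) -> float:
    """Duplicate Reject Rate = duplicate-ID rejects / new orders."""
    return safe_ratio(c.duplicate_id_rejects, c.new_orders)


def rdr(c: Counters) -> float:
    """Replay Drop Rate = replayed exec reports ignored / total exec reports."""
    return safe_ratio(c.replayed_exec_reports_ignored, c.exec_reports)


def should_degrade(prev: Counters, cur: Counters) -> bool:
    """True when DRR and UOC both rose versus the previous window."""
    return drr(cur) > drr(prev) and cur.uncertain_orders > prev.uncertain_orders


prev_window = Counters(1000, 1, 400, 2, 0)
cur_window = Counters(1000, 4, 400, 2, 3)
degrade = should_degrade(prev_window, cur_window)
```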

Hard guardrails (non-negotiable)

No ephemeral ID generators (memory-only counters are forbidden).
No side effects before dedupe check on inbound executions.
No auto-resubmit with new ClOrdID while prior state is unknown.
No silent healing of reconciliation breaks: alert and track an incident ID.
No deployment without duplicate-order game day (disconnect/replay/failover drills).

Short runbook for incidents

Freeze new risk on affected route.
Snapshot ledger + broker open orders + latest exec stream offsets.
Resolve uncertain intents by status query / broker desk confirmation.
Cancel unintended residuals.
Replay execution stream through idempotent consumer and verify position parity.
Postmortem with specific control change (timer, dedupe key, failover fencing, etc.).

References

B2BITS FIXopaedia — ClOrdID (Tag 11), uniqueness guidance
https://www.b2bits.com/fixopaedia/fixdic41/tag_11_ClOrdID.html
OnixS FIX Dictionary — duplicate ClOrdID state matrix example (FIX 4.4 Appendix D F.1.a)
https://www.onixs.biz/fix-dictionary/4.4/app_df.1.a.html
OnixS FIX Dictionary — PossDupFlag (43)
https://www.onixs.biz/fix-dictionary/4.4/tagnum_43.html
OnixS FIX Dictionary — PossResend (97)
https://www.onixs.biz/fix-dictionary/4.4/tagnum_97.html
B2BITS FIXopaedia — ExecID (Tag 17), uniqueness guidance
https://www.b2bits.com/fixopaedia/fixdic44/tag_17_ExecID_.html

One-line takeaway

In trading infra, “retry” without strict idempotency is just a polite word for accidental leverage.
