Drop Copy + Post-Trade Reconciliation Break-Management Playbook

Date: 2026-02-26 (KST)

TL;DR

Execution quality is not only about slippage at fill time. A desk can trade “well” intraday and still lose money (or create compliance risk) if post-trade records drift across OMS, broker drop copy, and clearing confirmations.

Use a three-layer reconciliation loop:

Real-time sanity (seconds): catch obvious mismatches early
Intraday break queue (minutes): classify and triage unresolved differences
End-of-day hard close (hours): force economic truth + audit trail before settlement windows

Treat breaks as an operational risk budget, not back-office noise.

1) Why this matters more in a tighter settlement world

As settlement cycles shorten (for example, US equities moved from T+2 to T+1 in May 2024, and other markets are considering similar moves), tolerance for manual repair shrinks.

Common failure pattern:

Front office assumes fills are final after ACK
Middle/back office discovers quantity/fee/counterparty mismatches late
Team burns time in manual chase during narrow cutoff windows
Economic exposure and regulatory reporting quality degrade

Practical implication: reconciliation latency is now a trading-system metric.

2) Canonical event model (single source of operational truth)

Build a normalized internal schema before matching anything. Do not reconcile directly across raw vendor payloads.

2.1 Required fields (minimum)

trade_date
symbol
side
exec_qty
exec_price
exec_id (venue/broker execution identifier)
order_id / cl_ord_id lineage
broker
venue
currency
fees_commission
gross_amount / net_amount
event_ts_exchange
event_ts_gateway
event_ts_ingest
correction_seq (for cancel/correct chains)

2.2 Key normalization rules

Normalize side representation (BUY/SELL only)
Normalize quantity units (shares/contracts/lots)
Normalize price precision and currency scales
Convert all timestamps to UTC internally, but keep the original timezone metadata
Preserve raw payload hash for audit replay

If you skip normalization, your “break rate” mostly measures schema inconsistency.
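
A minimal sketch of the normalization step, assuming a parser has already turned the vendor payload into a flat dict. The input keys (`ts`, `qty`, `px`), the `SIDE_MAP` codes, and the `CanonicalExec` type are hypothetical names for illustration; quantity-unit and price-scale conversion are venue-specific and left out.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal

# Hypothetical vendor side codes. FIX tag 54 uses "1" (buy) and "2" (sell);
# other codes (short sell, etc.) need an explicit desk decision.
SIDE_MAP = {"1": "BUY", "2": "SELL", "B": "BUY", "S": "SELL",
            "BUY": "BUY", "SELL": "SELL"}

@dataclass(frozen=True)
class CanonicalExec:
    symbol: str
    side: str                    # BUY or SELL only
    exec_qty: Decimal
    exec_price: Decimal
    event_ts_exchange: datetime  # always UTC
    source_utc_offset: str       # original offset kept as metadata
    raw_sha256: str              # hash of the untouched payload for audit replay

def normalize(raw_payload: bytes, fields: dict) -> CanonicalExec:
    ts = datetime.fromisoformat(fields["ts"])
    if ts.tzinfo is None:
        # Refuse to guess: a naive timestamp would silently mix local time and UTC.
        raise ValueError("timestamp must carry an explicit UTC offset")
    return CanonicalExec(
        symbol=fields["symbol"].strip().upper(),
        side=SIDE_MAP[fields["side"].strip().upper()],  # KeyError on unknown codes
        exec_qty=Decimal(fields["qty"]),
        exec_price=Decimal(fields["px"]),
        event_ts_exchange=ts.astimezone(timezone.utc),
        source_utc_offset=ts.strftime("%z"),
        raw_sha256=hashlib.sha256(raw_payload).hexdigest(),
    )
```

Unknown side codes and naive timestamps fail loudly here on purpose, so that schema problems surface at ingest instead of as breaks.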

3) Matching strategy: deterministic first, fuzzy second

3.1 Tier-1 deterministic match

Primary key (example priority):

broker + exec_id
venue_trade_id
order_id + fill_seq

If an exact key match exists, reconcile the economics (qty/price/fees/notional) under the tolerance rules.
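
The key priority above can be sketched as a lookup that tries the strongest populated key first. The field names (`fill_seq`, `venue_trade_id`, `id`) and the rule ids are assumptions for illustration; events are plain dicts already in canonical form.

```python
# Priority order mirrors the list above; rule ids are hypothetical labels.
KEY_PRIORITY = [
    ("broker_exec_id", ("broker", "exec_id")),
    ("venue_trade_id", ("venue_trade_id",)),
    ("order_fill_seq", ("order_id", "fill_seq")),
]

def populated_keys(event):
    """Yield (rule_id, values) for every fully populated key, strongest first."""
    for rule_id, fields in KEY_PRIORITY:
        values = tuple(event.get(f) for f in fields)
        if all(v not in (None, "") for v in values):
            yield rule_id, values

def match_exact(oms_events, drop_copy_events):
    """Return (oms_id, drop_copy_id, rule_id) links for exact key matches."""
    index = {}
    for ev in drop_copy_events:
        for key in populated_keys(ev):
            index.setdefault(key, []).append(ev["id"])
    links = []
    for ev in oms_events:
        for key in populated_keys(ev):
            if key in index:
                links.extend((ev["id"], right_id, key[0]) for right_id in index[key])
                break  # stop at the strongest key that produced a match
    return links
```

More than one link for the same OMS event is itself a signal (possible duplicate execution) and should be routed as a break, not silently accepted.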

3.2 Tier-2 constrained fuzzy match

When a deterministic key is missing (common in partial legacy integrations), match on:

same symbol + side
quantity within expected lot granularity
timestamp window (e.g., ±3s to ±30s, venue-dependent)
price within tick-based tolerance

Every fuzzy match must carry a confidence score, and any match whose risk exceeds a configured threshold must also be flagged for human review.
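
A sketch of a constrained fuzzy scorer under stated assumptions: quantities are already normalized and must be equal, the scoring formula is purely illustrative, and the review rule uses notional as the risk measure. `Fill`, `fuzzy_confidence`, and `needs_review` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from decimal import Decimal

@dataclass(frozen=True)
class Fill:
    symbol: str
    side: str
    qty: Decimal
    price: Decimal
    ts: datetime

def fuzzy_confidence(a: Fill, b: Fill, *, window: timedelta, tick: Decimal,
                     max_ticks: int = 1) -> float:
    """Return 0.0 if any constraint fails, otherwise a score in (0, 1]."""
    if (a.symbol, a.side) != (b.symbol, b.side) or a.qty != b.qty:
        return 0.0
    dt = abs(a.ts - b.ts)
    ticks_off = abs(a.price - b.price) / tick
    if dt > window or ticks_off > max_ticks:
        return 0.0
    # Illustrative scoring only: closer in time and price scores higher.
    time_score = 1.0 - 0.5 * (dt / window)
    price_score = 1.0 - 0.5 * float(ticks_off / max_ticks)
    return round(time_score * price_score, 4)

def needs_review(confidence: float, notional: Decimal, *, min_conf: float,
                 max_auto_notional: Decimal) -> bool:
    """Flag low-confidence or high-notional fuzzy matches for a human."""
    return confidence < min_conf or notional > max_auto_notional
```

The constraints act as gates and the score only ranks candidates that pass all of them, so a high score can never rescue a wrong-side or out-of-window pair.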

3.3 Tier-3 unresolved breaks

Anything unresolved enters a break queue with explicit owner and SLA clock. No “silent pending” state.
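
One way to make "explicit owner and SLA clock" hard to bypass is to enforce it in the record type. A minimal sketch; `OpenBreak` and `breached` are hypothetical names, and the SLA durations are whatever the desk configures.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class OpenBreak:
    break_id: str
    owner: str           # team or person; empty owners are rejected
    opened_at: datetime  # UTC
    sla: timedelta

    def __post_init__(self):
        if not self.owner:
            raise ValueError("every break needs an explicit owner")

    def sla_remaining(self, now: datetime) -> timedelta:
        """Time left on the SLA clock; zero or negative means breached."""
        return self.opened_at + self.sla - now

def breached(queue, now):
    """Return ids of breaks whose SLA clock has run out."""
    return [b.break_id for b in queue if b.sla_remaining(now) <= timedelta(0)]
```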

4) Break taxonomy (so fixes are actionable)

Tag each break by type. One break can have multiple tags.

Economic breaks
- qty mismatch
- price mismatch
- fee/tax mismatch
- net amount mismatch
Lifecycle breaks
- missing cancel/correct chain
- duplicate execution
- out-of-order event application
Reference-data breaks
- symbol mapping/corporate-action drift
- account/book mismatch
- venue code mismatch
Timing breaks
- late drop copy arrival
- clock skew causing window miss
- settlement-date derivation mismatch

The taxonomy should map directly to routing: trading, middle office, reference data, or infra.
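
A sketch of taxonomy-driven routing for multi-tag breaks. The tag names and the category-to-team assignment below are illustrative assumptions; each desk will draw those lines differently.

```python
# Tag names, categories, and team names are illustrative assumptions.
TAG_CATEGORY = {
    "QTY_MISMATCH": "ECONOMIC",
    "FEE_MISMATCH": "ECONOMIC",
    "DUPLICATE_EXEC": "LIFECYCLE",
    "SYMBOL_MAPPING": "REFERENCE_DATA",
    "LATE_DROP_COPY": "TIMING",
    "CLOCK_SKEW": "TIMING",
}
CATEGORY_OWNER = {
    "ECONOMIC": "middle_office",
    "LIFECYCLE": "trading",
    "REFERENCE_DATA": "reference_data",
    "TIMING": "infra",
}

def route(tags):
    """Return the owning teams for a (possibly multi-tag) break, sorted for stable output."""
    unknown = [t for t in tags if t not in TAG_CATEGORY]
    if unknown:
        # An unmapped tag would create an ownerless break, so fail loudly.
        raise KeyError(f"unmapped break tags: {sorted(unknown)}")
    return sorted({CATEGORY_OWNER[TAG_CATEGORY[t]] for t in tags})
```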

5) Reconciliation state machine

Use an explicit state machine instead of ad-hoc status text.

NEW -> MATCHED_EXACT | MATCHED_FUZZY | BREAK_OPEN -> BREAK_ACKED -> RESOLVED | ESCALATED -> CLOSED

Recommended behavior:

MATCHED_FUZZY always requires periodic sampling review
BREAK_OPEN auto-assigns owner/team by taxonomy
ESCALATED is entered, and a notification is sent, when the SLA breach threshold is hit
CLOSED requires resolution reason code (DATA_FIX, COUNTERPARTY_CONFIRM, WAIVER_APPROVED, etc.)

No state transition without timestamp + actor + reason.
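
A minimal sketch of the state machine with the audit rule enforced in code. The `ALLOWED` table is one reading of the one-line diagram above (for example, it assumes a rejected fuzzy match can reopen as a break and that an open break can escalate before being acknowledged); adjust it to the desk's actual policy. `ReconItem` is a hypothetical name.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Assumed transition table; the source diagram leaves some edges open.
ALLOWED = {
    "NEW": {"MATCHED_EXACT", "MATCHED_FUZZY", "BREAK_OPEN"},
    "MATCHED_EXACT": {"CLOSED"},
    "MATCHED_FUZZY": {"CLOSED", "BREAK_OPEN"},
    "BREAK_OPEN": {"BREAK_ACKED", "ESCALATED"},
    "BREAK_ACKED": {"RESOLVED", "ESCALATED"},
    "ESCALATED": {"RESOLVED", "CLOSED"},
    "RESOLVED": {"CLOSED"},
    "CLOSED": set(),
}

@dataclass
class ReconItem:
    item_id: str
    state: str = "NEW"
    history: list = field(default_factory=list)

    def transition(self, new_state, *, actor, reason, at=None):
        """Move to new_state, recording timestamp, actor, and reason."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        if not actor or not reason:
            raise ValueError("actor and reason are required")
        at = at or datetime.now(timezone.utc)
        self.history.append((at, actor, self.state, new_state, reason))
        self.state = new_state
```

Usage: `item.transition("BREAK_OPEN", actor="matcher", reason="no key match")` appends one audit row and rejects anything not in the table.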

6) Tolerance policy (hard vs soft)

Define tolerances before incidents happen.

6.1 Hard-fail examples

side mismatch
symbol mismatch
duplicate exec_id with different economics
opposite-sign net amount

6.2 Soft-fail examples (review queue)

tiny fee rounding differences
micro price precision drift under configured tick tolerance
timestamp jitter without economic mismatch

Keep soft-fail thresholds versioned and change-controlled. If thresholds move, preserve historical break-rate comparability, for example by recording the threshold version in force on every break.
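
A sketch of a pairwise hard/soft classifier that stamps each result with the policy version. The threshold values are placeholders, not recommendations, and treating an out-of-tolerance price or fee as a hard fail is an assumption. Duplicate `exec_id` detection needs the whole event set and is not covered here.

```python
from decimal import Decimal

# Hypothetical versioned policy; values are placeholders.
POLICY = {"version": "tol-v1", "fee_abs": Decimal("0.01"), "price_ticks": Decimal("1")}

def classify(oms: dict, dc: dict, tick: Decimal, policy: dict = POLICY) -> tuple:
    """Return (severity, reasons, policy_version); severity is OK, SOFT, or HARD."""
    hard, soft = [], []
    if oms["side"] != dc["side"]:
        hard.append("side mismatch")
    if oms["symbol"] != dc["symbol"]:
        hard.append("symbol mismatch")
    if oms["net_amount"] * dc["net_amount"] < 0:
        hard.append("opposite-sign net amount")
    price_ticks = abs(oms["price"] - dc["price"]) / tick
    if price_ticks > policy["price_ticks"]:
        hard.append("price outside tick tolerance")
    elif price_ticks > 0:
        soft.append("price precision drift")
    fee_diff = abs(oms["fee"] - dc["fee"])
    if fee_diff > policy["fee_abs"]:
        hard.append("fee outside rounding tolerance")
    elif fee_diff > 0:
        soft.append("fee rounding difference")
    severity = "HARD" if hard else "SOFT" if soft else "OK"
    return severity, hard + soft, policy["version"]
```

Storing the returned version next to each break is what lets break rates be compared across threshold changes.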

7) Operational SLOs (starter set)

Example desk-level SLOs:

Real-time unreconciled notional (p95, 5-min window) below internal risk limit
EOD unresolved break ratio < 0.10% of executions
Economic break median resolution time < 20 minutes
Hard-break false-positive rate < 5%

Also track by symbol-liquidity bucket, venue, and broker. Aggregate-only dashboards hide concentrated fragility.
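
A small sketch of why the per-bucket view matters: the same break can look negligible in aggregate and severe for one broker. The record shapes (`broker`, `minutes_to_resolve`) are assumptions for illustration.

```python
from collections import Counter
from statistics import median

def unresolved_ratio_by(execs, open_breaks, key):
    """Unresolved-break ratio per bucket (for example per broker or venue)."""
    totals = Counter(e[key] for e in execs)
    opened = Counter(b[key] for b in open_breaks)
    return {bucket: opened[bucket] / n for bucket, n in totals.items()}

def median_resolution_minutes(resolved_breaks):
    """Median minutes from open to resolve; None when nothing was resolved."""
    minutes = [b["minutes_to_resolve"] for b in resolved_breaks]
    return median(minutes) if minutes else None

# 2,000 executions, one open break: 0.05% overall, but 10% for broker "B".
execs = [{"broker": "A"}] * 1990 + [{"broker": "B"}] * 10
open_breaks = [{"broker": "B"}]
ratios = unresolved_ratio_by(execs, open_breaks, "broker")
```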

8) Data architecture pattern (practical)

Use append-only ledgers plus materialized views:

raw_events (immutable ingestion)
normalized_exec_events (canonical fields)
recon_links (match candidates, confidence, rule id)
recon_breaks (open/resolved lifecycle)
recon_snapshot_eod (frozen daily close evidence)

Never overwrite the raw event history. Corrections should be additional events with lineage pointers.
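
A sketch of the append-only pattern: corrections and cancels are new rows that point at the original execution, and the "current" state is derived by folding the history. The `root_exec_id` lineage pointer and the `kind` values are hypothetical names.

```python
def current_view(events):
    """Fold append-only events into the latest live state per root execution."""
    live = {}
    # Sort by correction_seq so late-arriving corrections apply in the right order.
    ordered = sorted(events, key=lambda e: (e["root_exec_id"], e["correction_seq"]))
    for ev in ordered:
        if ev["kind"] == "CANCEL":
            live.pop(ev["root_exec_id"], None)
        else:  # "NEW" or "CORRECT" changes the derived view, never the history
            live[ev["root_exec_id"]] = ev
    return live

ledger = []  # append-only: rows are added, never updated or deleted
ledger.append({"event_id": 1, "root_exec_id": "E1", "correction_seq": 0, "kind": "NEW", "qty": 100})
ledger.append({"event_id": 2, "root_exec_id": "E2", "correction_seq": 0, "kind": "NEW", "qty": 50})
# The second correction for E1 arrives before the first one.
ledger.append({"event_id": 3, "root_exec_id": "E1", "correction_seq": 2, "kind": "CORRECT", "qty": 80})
ledger.append({"event_id": 4, "root_exec_id": "E1", "correction_seq": 1, "kind": "CORRECT", "qty": 90})
ledger.append({"event_id": 5, "root_exec_id": "E2", "correction_seq": 1, "kind": "CANCEL", "qty": 0})
view = current_view(ledger)
```

Because the ledger keeps all five rows, the same fold can be replayed later to show exactly what the desk believed at any point in the day.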

9) SQL-style skeleton

-- 1) Candidate exact matches
insert into recon_links (left_id, right_id, match_type, confidence, rule_id)
select a.id, b.id, 'EXACT', 1.0, 'broker_exec_id'
from normalized_exec_events a
join normalized_exec_events b
  on a.source = 'OMS'
 and b.source = 'DROP_COPY'
 and a.broker = b.broker
 and a.exec_id = b.exec_id
where a.trade_date = :trade_date
  and b.trade_date = :trade_date;

-- 2) Open breaks from unmatched records
insert into recon_breaks (...)
select ...
from unmatched_view
where trade_date = :trade_date;

Keep matching rules explicit and versioned (rule_id + config hash).
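
A config hash can be as simple as a digest over a canonical serialization of the rule's parameters. A minimal sketch; the parameter names in the usage line and the 16-character truncation are arbitrary choices.

```python
import hashlib
import json

def config_hash(rule_config: dict) -> str:
    """Stable short hash of a rule's parameters, independent of key order."""
    canonical = json.dumps(rule_config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
```

Usage: store `config_hash({"window_s": 3, "max_ticks": 1})` alongside `rule_id` on every row in recon_links, so any link can be traced to the exact parameters that produced it.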

10) 30-minute incident runbook (when break count spikes)

Contain
- pause noncritical parameter changes
- snapshot current break queue and ingestion lag
Classify quickly
- is this mostly timing, reference-data, or economics?
- identify top broker/venue/symbol concentration
Stabilize feed health
- check ingest lag, parser error rate, clock drift, schema version rollouts
Apply temporary guardrails
- widen only the timing window, and only if economically safe
- do not relax hard economic consistency checks without explicit approval
Communicate
- send concise status: blast radius, ETA, current risk posture
Post-incident hardening
- add a regression test from captured payloads
- backfill and re-run reconciliation for affected window
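
For the "classify quickly" step, a concentration report over the snapshotted queue is usually enough to tell a single-feed problem from a broad one. A sketch, assuming each break record carries `broker`, `venue`, and `symbol` fields.

```python
from collections import Counter

def concentration(breaks, dims=("broker", "venue", "symbol"), top_n=3):
    """For each dimension, return the top values with their share of all breaks."""
    total = len(breaks)
    report = {}
    for dim in dims:
        counts = Counter(b[dim] for b in breaks)
        report[dim] = [(value, n / total) for value, n in counts.most_common(top_n)]
    return report
```

A single broker or venue holding most of the spike points toward feed health or reference data; an even spread points toward something shared, such as clocks or a schema rollout.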

11) Common anti-patterns

“Fixing” breaks by mutating source-of-truth tables directly
Treating duplicate execs as harmless noise
Ignoring correction/cancel lineage and only storing latest state
Mixing local time and UTC in matching logic
No owner/SLA per break (queue becomes archaeology)

12) Implementation checklist (first 2 weeks)

Week 1:

Define canonical schema + field-level contracts
Implement deterministic matching on strongest IDs
Stand up break taxonomy + state machine
Build minimal dashboard (break count, unresolved age, top tags)

Week 2:

Add constrained fuzzy match with confidence scoring
Add SLA-based escalation notifications
Add daily immutable reconciliation snapshot export
Run one game-day: inject duplicate, late, and correction events

If the game-day fails, do not claim reconciliation is production-ready.

13) Bottom line

A desk without strong post-trade reconciliation is running invisible leverage. Execution alpha can be real and still be erased by operational drift.

Make reconciliation fast, explicit, and auditable:

canonical schema
deterministic-first matching
break taxonomy + SLA ownership
append-only evidence trail

That turns reconciliation from cleanup work into a real risk-control system.