Drop Copy + Post-Trade Reconciliation Break-Management Playbook

2026-02-26 · finance

Date: 2026-02-26 (KST)

TL;DR

Execution quality is not only about slippage at fill time. A desk can trade “well” intraday and still lose money (or create compliance risk) if post-trade records drift across OMS, broker drop copy, and clearing confirmations.

Use a three-layer reconciliation loop:

  1. Real-time sanity (seconds): catch obvious mismatches early
  2. Intraday break queue (minutes): classify and triage unresolved differences
  3. End-of-day hard close (hours): force economic truth + audit trail before settlement windows

Treat breaks as an operational risk budget, not back-office noise.


1) Why this matters more in a tighter settlement world

As settlement cycles shorten (for example, US equities moved to T+1 in May 2024), the tolerance for manual repair shrinks.

Common failure pattern:

Practical implication: reconciliation latency is now a trading-system metric.


2) Canonical event model (single source of operational truth)

Build a normalized internal schema before matching anything. Do not reconcile directly across raw vendor payloads.

2.1 Required fields (minimum)

2.2 Key normalization rules

If you skip normalization, your “break rate” mostly measures schema inconsistency.
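To make the idea concrete, here is a minimal sketch of a canonical record and a normalizer. The field set is assembled from identifiers used elsewhere in this playbook (source, broker, exec_id, qty, price); the raw payload key names, the side encodings, and the specific rules (trim and uppercase identifiers, exact decimals, UTC timestamps, reject naive timestamps) are illustrative assumptions, not a definitive schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class NormalizedExec:
    source: str          # e.g. "OMS" or "DROP_COPY"
    broker: str
    exec_id: str
    symbol: str
    side: str            # "BUY" or "SELL"
    qty: Decimal
    price: Decimal
    exec_time: datetime  # always timezone-aware, in UTC


# Hypothetical side encodings that different feeds might use.
_SIDE_MAP = {"1": "BUY", "2": "SELL", "B": "BUY", "S": "SELL",
             "BUY": "BUY", "SELL": "SELL"}


def normalize(source: str, raw: dict) -> NormalizedExec:
    """Map one raw payload (assumed key names) onto the canonical record."""
    ts = datetime.fromisoformat(raw["ts"])
    if ts.tzinfo is None:
        # Refuse to guess a timezone; a silent guess creates timing breaks.
        raise ValueError("naive timestamp in raw payload")
    return NormalizedExec(
        source=source,
        broker=raw["broker"].strip().upper(),
        exec_id=raw["exec_id"].strip(),
        symbol=raw["symbol"].strip().upper(),
        side=_SIDE_MAP[str(raw["side"]).strip().upper()],
        qty=Decimal(str(raw["qty"])),      # str() avoids binary-float artifacts
        price=Decimal(str(raw["price"])),
        exec_time=ts.astimezone(timezone.utc),
    )
```

With rules like these, two payloads that differ only in casing, whitespace, number formatting, or UTC offset normalize to equal records, so they never reach the break queue.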


3) Matching strategy: deterministic first, fuzzy second

3.1 Tier-1 deterministic match

Primary key (example priority):

  1. broker + exec_id
  2. venue_trade_id
  3. order_id + fill_seq

If an exact key match exists, reconcile the economics (qty, price, fees, notional) under the tolerance rules.
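The key priority above can be sketched as a small matcher that tries each key in order and only pairs records when the key is unambiguous. Events are plain dicts here for brevity, the rule identifiers are made-up names, and leaving ambiguous keys unmatched (so they surface as breaks) is an assumed design choice.

```python
# Priority order follows the list above; rule ids are illustrative names.
KEY_PRIORITY = (
    ("broker_exec_id", ("broker", "exec_id")),
    ("venue_trade_id", ("venue_trade_id",)),
    ("order_fill_seq", ("order_id", "fill_seq")),
)


def _key(event: dict, fields: tuple):
    """Return the key tuple, or None if any component is missing or empty."""
    values = tuple(event.get(f) for f in fields)
    return values if all(v not in (None, "") for v in values) else None


def exact_matches(left: list, right: list) -> list:
    """Return (left_index, right_index, rule_id) pairs, highest-priority key first."""
    pairs = []
    used_left, used_right = set(), set()
    for rule_id, fields in KEY_PRIORITY:
        # Index the still-unmatched right-hand records by this rule's key.
        index = {}
        for j, ev in enumerate(right):
            k = _key(ev, fields)
            if j not in used_right and k is not None:
                index.setdefault(k, []).append(j)
        for i, ev in enumerate(left):
            k = _key(ev, fields)
            if i in used_left or k is None:
                continue
            free = [j for j in index.get(k, []) if j not in used_right]
            if len(free) == 1:  # ambiguous keys are left for the break queue
                j = free[0]
                used_left.add(i)
                used_right.add(j)
                pairs.append((i, j, rule_id))
    return pairs
```

Recording the rule id alongside each pair keeps it clear which key produced the link when the economics are checked afterwards.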

3.2 Tier-2 constrained fuzzy match

When a deterministic key is missing (common in partial legacy integrations), match on:

Every fuzzy match must carry a confidence score, plus a human-review flag whenever the match exceeds a defined risk threshold.
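As an illustration only, a constrained fuzzy matcher might look like the sketch below. The matching attributes (symbol, side, qty as hard constraints; time skew and price gap as scored dimensions), the equal weights, and every default threshold are assumptions chosen for the example, not recommended values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Fill:
    symbol: str
    side: str
    qty: Decimal
    price: Decimal
    exec_time: datetime


@dataclass(frozen=True)
class FuzzyMatch:
    confidence: float
    needs_review: bool


def fuzzy_match(a: Fill, b: Fill, *,
                max_skew: timedelta = timedelta(seconds=5),
                price_tol: Decimal = Decimal("0.01"),
                review_notional: Decimal = Decimal("1000000")) -> Optional[FuzzyMatch]:
    # Hard constraints: never pair fills that disagree on instrument, side, or qty.
    if (a.symbol, a.side, a.qty) != (b.symbol, b.side, b.qty):
        return None
    skew = abs(a.exec_time - b.exec_time)
    price_gap = abs(a.price - b.price)
    if skew > max_skew or price_gap > price_tol:
        return None
    # Confidence decays linearly with time skew and price gap (equal weights).
    time_score = 1.0 - skew / max_skew
    price_score = 1.0 - float(price_gap / price_tol)
    confidence = round(0.5 * time_score + 0.5 * price_score, 4)
    # Risk threshold here is notional; large matches always get a human look.
    needs_review = a.qty * a.price >= review_notional
    return FuzzyMatch(confidence, needs_review)
```

Returning `None` rather than a low-confidence link keeps weak candidates out of `MATCHED_FUZZY` and pushes them into the break queue instead.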

3.3 Tier-3 unresolved breaks

Anything unresolved enters a break queue with explicit owner and SLA clock. No “silent pending” state.


4) Break taxonomy (so fixes are actionable)

Tag each break by type. One break can have multiple tags.

  1. Economic breaks

    • qty mismatch
    • price mismatch
    • fee/tax mismatch
    • net amount mismatch
  2. Lifecycle breaks

    • missing cancel/correct chain
    • duplicate execution
    • out-of-order event application
  3. Reference-data breaks

    • symbol mapping/corporate-action drift
    • account/book mismatch
    • venue code mismatch
  4. Timing breaks

    • late drop copy arrival
    • clock skew causing window miss
    • settlement-date derivation mismatch

The taxonomy should map directly to routing: trading, middle office, reference data, or infra.
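One way to encode the taxonomy so that routing falls out of the tags is sketched below. The tag names mirror the list above; which team owns which category is an assumption for illustration, since the right mapping depends on how a given desk is organized.

```python
from enum import Enum


class Category(Enum):
    ECONOMIC = "economic"
    LIFECYCLE = "lifecycle"
    REFERENCE_DATA = "reference_data"
    TIMING = "timing"


TAG_CATEGORY = {
    "qty_mismatch": Category.ECONOMIC,
    "price_mismatch": Category.ECONOMIC,
    "fee_tax_mismatch": Category.ECONOMIC,
    "net_amount_mismatch": Category.ECONOMIC,
    "missing_cancel_correct_chain": Category.LIFECYCLE,
    "duplicate_execution": Category.LIFECYCLE,
    "out_of_order_event": Category.LIFECYCLE,
    "symbol_mapping_drift": Category.REFERENCE_DATA,
    "account_book_mismatch": Category.REFERENCE_DATA,
    "venue_code_mismatch": Category.REFERENCE_DATA,
    "late_drop_copy": Category.TIMING,
    "clock_skew_window_miss": Category.TIMING,
    "settlement_date_mismatch": Category.TIMING,
}

# Assumed ownership; adjust to the desk's actual org chart.
ROUTING = {
    Category.ECONOMIC: "middle_office",
    Category.LIFECYCLE: "trading",
    Category.REFERENCE_DATA: "reference_data",
    Category.TIMING: "infra",
}


def route(tags: list) -> list:
    """Return the sorted set of teams that should see a break with these tags."""
    unknown = set(tags) - set(TAG_CATEGORY)
    if unknown:
        raise ValueError(f"unknown break tags: {sorted(unknown)}")
    return sorted({ROUTING[TAG_CATEGORY[t]] for t in tags})
```

Because a break can carry several tags, `route` returns every owning team rather than picking a single winner.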


5) Reconciliation state machine

Use an explicit state machine instead of ad-hoc status text.

NEW -> MATCHED_EXACT | MATCHED_FUZZY | BREAK_OPEN -> BREAK_ACKED -> RESOLVED | ESCALATED -> CLOSED

Recommended behavior:

No state transition without timestamp + actor + reason.
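A minimal sketch of such a state machine follows. The transition table is one reading of the diagram above: it assumes that matched items go straight to CLOSED and that both RESOLVED and ESCALATED can lead to CLOSED. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Allowed transitions (one interpretation of the diagram above).
ALLOWED = {
    "NEW": {"MATCHED_EXACT", "MATCHED_FUZZY", "BREAK_OPEN"},
    "MATCHED_EXACT": {"CLOSED"},
    "MATCHED_FUZZY": {"CLOSED"},
    "BREAK_OPEN": {"BREAK_ACKED"},
    "BREAK_ACKED": {"RESOLVED", "ESCALATED"},
    "RESOLVED": {"CLOSED"},
    "ESCALATED": {"CLOSED"},
    "CLOSED": set(),
}


@dataclass
class ReconItem:
    state: str = "NEW"
    history: list = field(default_factory=list)

    def transition(self, new_state: str, *, actor: str, reason: str, at=None) -> None:
        """Move to new_state, recording timestamp, actor, and reason."""
        if not actor or not reason:
            raise ValueError("actor and reason are mandatory")
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        at = at or datetime.now(timezone.utc)
        self.history.append((at, self.state, new_state, actor, reason))
        self.state = new_state
```

Rejecting illegal moves at the API boundary means the audit trail in `history` can be trusted without a separate consistency check.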


6) Tolerance policy (hard vs soft)

Define tolerances before incidents happen.

6.1 Hard-fail examples

6.2 Soft-fail examples (review queue)

Keep soft-fail thresholds versioned and change-controlled. If thresholds move, historical break-rate comparability must be preserved.
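A versioned tolerance policy could be sketched as follows. The specific rules (any qty difference is a hard fail; price gaps are graded in basis points) and the threshold fields are hypothetical examples, and the sketch assumes a positive OMS price.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class TolerancePolicy:
    version: str
    price_soft_bps: Decimal  # above this: review queue
    price_hard_bps: Decimal  # above this: hard fail


def classify(policy: TolerancePolicy, oms_qty: Decimal, dc_qty: Decimal,
             oms_px: Decimal, dc_px: Decimal) -> tuple:
    """Return (outcome, policy_version) for one matched pair."""
    if oms_qty != dc_qty:
        return ("HARD_FAIL", policy.version)
    gap_bps = abs(oms_px - dc_px) / oms_px * Decimal(10000)
    if gap_bps > policy.price_hard_bps:
        return ("HARD_FAIL", policy.version)
    if gap_bps > policy.price_soft_bps:
        return ("SOFT_FAIL", policy.version)
    return ("PASS", policy.version)
```

Storing the policy version with every outcome is what keeps break rates comparable across threshold changes: historical results can be grouped, or re-evaluated, per version.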


7) Operational SLOs (starter set)

Example desk-level SLOs:

Also track by symbol-liquidity bucket, venue, and broker. Aggregate-only dashboards hide concentrated fragility.
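To illustrate why the breakdown matters, the sketch below computes break rate per dimension from a list of reconciled records. The record shape (a dict with an `is_break` flag plus dimension keys such as `broker`) is an assumption made for the example.

```python
from collections import Counter


def break_rate_by(records: list, key: str) -> dict:
    """Break rate per distinct value of `key` (e.g. 'broker' or 'venue')."""
    totals, breaks = Counter(), Counter()
    for r in records:
        k = r[key]
        totals[k] += 1
        if r["is_break"]:
            breaks[k] += 1
    return {k: breaks[k] / totals[k] for k in totals}


def overall_break_rate(records: list) -> float:
    return sum(1 for r in records if r["is_break"]) / len(records)
```

For example, ten records with a single break show a 10% aggregate rate, but if both of one broker's records are involved and one of them broke, that broker's rate is 50%.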


8) Data architecture pattern (practical)

Use append-only ledgers plus materialized views:

  1. raw_events (immutable ingestion)
  2. normalized_exec_events (canonical fields)
  3. recon_links (match candidates, confidence, rule id)
  4. recon_breaks (open/resolved lifecycle)
  5. recon_snapshot_eod (frozen daily close evidence)

Never overwrite the raw event history. Corrections should be additional events with lineage pointers.
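The append-only rule with lineage pointers can be sketched in memory as below. The class, the `supersedes` field name, and the integer ids are hypothetical; a real ledger would live in durable storage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RawEvent:
    event_id: int
    payload: dict
    supersedes: Optional[int] = None  # lineage pointer to the corrected event


class RawLedger:
    """Append-only store: corrections are new events, never in-place updates."""

    def __init__(self) -> None:
        self._events = []

    def append(self, payload: dict, supersedes: Optional[int] = None) -> int:
        if supersedes is not None and not (0 <= supersedes < len(self._events)):
            raise KeyError(f"unknown event id {supersedes}")
        event = RawEvent(len(self._events), dict(payload), supersedes)
        self._events.append(event)
        return event.event_id

    def current_view(self) -> list:
        """Events that have not been superseded by a later correction."""
        superseded = {e.supersedes for e in self._events if e.supersedes is not None}
        return [e for e in self._events if e.event_id not in superseded]

    def history(self) -> tuple:
        """Full immutable history, including superseded events."""
        return tuple(self._events)
```

The materialized view is derived from the ledger, so an auditor can always replay from `history()` to see what was known at any point.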


9) SQL-style skeleton

-- 1) Candidate exact matches
insert into recon_links (left_id, right_id, match_type, confidence, rule_id)
select a.id, b.id, 'EXACT', 1.0, 'broker_exec_id'
from normalized_exec_events a
join normalized_exec_events b
  on a.source = 'OMS'
 and b.source = 'DROP_COPY'
 and a.broker = b.broker
 and a.exec_id = b.exec_id
where a.trade_date = :trade_date
  and b.trade_date = :trade_date;

-- 2) Open breaks from unmatched records
insert into recon_breaks (...)
select ...
from unmatched_view
where trade_date = :trade_date;

Keep matching rules explicit and versioned (rule_id + config hash).
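One simple way to derive a config hash is to hash a canonical JSON serialization of the rule configuration, as sketched below. The function name, the 12-character truncation, and the config keys in the usage note are illustrative assumptions.

```python
import hashlib
import json


def config_hash(config: dict) -> str:
    """Stable short hash of a JSON-serializable matching-rule config."""
    # sort_keys and fixed separators make the serialization order-independent.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
```

Storing `(rule_id, config_hash(cfg))` on each row of recon_links, for a config such as `{"max_skew_s": 5, "price_tol": "0.01"}`, makes it possible to tell later which exact rule settings produced a given match.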


10) 30-minute incident runbook (when break count spikes)

  1. Contain

    • pause noncritical parameter changes
    • snapshot current break queue and ingestion lag
  2. Classify quickly

    • is this mostly timing, reference-data, or economics?
    • identify top broker/venue/symbol concentration
  3. Stabilize feed health

    • check ingest lag, parser error rate, clock drift, schema version rollouts
  4. Apply temporary guardrails

    • widen only the timing window, and only if it is economically safe
    • do not relax hard economic consistency checks without explicit approval
  5. Communicate

    • send concise status: blast radius, ETA, current risk posture
  6. Post-incident hardening

    • add a regression test from captured payloads
    • backfill and re-run reconciliation for affected window

11) Common anti-patterns


12) Implementation checklist (first 2 weeks)

Week 1:

Week 2:

If game-day fails, do not claim reconciliation is production-ready.


13) Bottom line

A desk without strong post-trade reconciliation is running invisible leverage. Execution alpha can be real and still be erased by operational drift.

Make reconciliation fast, explicit, and auditable:

That turns reconciliation from cleanup work into a real risk-control system.