Spoofing & Layering Trade-Surveillance Playbook (Execution Desk Edition)
Date: 2026-03-07
Category: knowledge (market integrity / execution operations)
Why this playbook exists
Execution quality can look great on short windows while integrity risk quietly accumulates.
Typical failure pattern:
- strategy adds/leans size near touch,
- quickly cancels when market reacts,
- repeats across symbols/venues,
- desk frames it as “adaptive quoting,”
- surveillance later frames the same pattern as spoofing/layering intent.
If surveillance is post-hoc and manual, you discover risk only after alerts, inquiries, or account restrictions.
This playbook turns spoofing/layering risk into a real-time control loop with explicit data contracts, metrics, and action states.
Scope and non-goals
In scope
- real-time and T+1 monitoring for potentially manipulative order behavior,
- prevention controls in execution engines,
- investigation workflow and evidence retention.
Out of scope
- legal advice,
- replacing formal compliance policy,
- proving intent from a single metric.
Use this as an engineering + operations layer that supports compliance/legal review.
Regulatory context (high level)
Across major jurisdictions, spoofing/layering risk is typically tied to non-bona-fide order intent:
- placing visible orders to move price or signal false interest,
- canceling before likely execution,
- benefiting from fills on opposite-side child orders.
Practical implication for engineering teams:
- don’t optimize only for fill/cost,
- optimize for fill/cost under integrity constraints,
- retain replayable evidence for every suspicious episode.
Data contract (minimum viable surveillance)
At order-event granularity:
- identifiers:
  strategy_id, parent_id, child_id, account, trader, symbol, venue, side
- event timeline:
  new_ts, ack_ts, modify_ts, cancel_ts, cancel_ack_ts, fill_ts
- order details:
  price, qty, display_qty, order_type, tif, post_only, reduce_only
- market context:
  best_bid/ask, spread, imbalance, microprice, topN_depth, trade_rate
- queue context:
  est_queue_ahead, queue_age_ms, touch_distance_ticks
- outcome context:
  filled_qty, cancelled_qty, markout_100ms/1s/5s, opposite_side_fills
- infra context:
  gateway_latency_ms, exchange_ack_ms, reject_code
Without accurate cancel/fill sequencing and opposite-side linkage, surveillance quality collapses.
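As a minimal sketch of the event contract above, the fields can be modeled as an immutable record. This is illustrative only (trimmed to a representative subset; field names and the `OrderEvent` type are assumptions, not a prescribed schema):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class OrderEvent:
    """One row of the order-event stream; a trimmed, hypothetical
    subset of the full data contract described above."""
    strategy_id: str
    child_id: str
    symbol: str
    venue: str
    side: str                      # "buy" / "sell"
    new_ts: int                    # epoch milliseconds
    cancel_ts: Optional[int]       # None if never cancelled
    fill_ts: Optional[int]         # None if never filled
    price: float
    qty: int
    touch_distance_ticks: int      # distance from same-side touch
    filled_qty: int = 0
    cancelled_qty: int = 0
```

Freezing the record keeps replay deterministic: surveillance jobs can re-derive metrics from the same immutable stream.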
Core metrics to track
1) Near-touch cancel ratio (NTCR)
The share of near-touch order submissions that are cancelled within a short lifetime τ:
[ NTCR = \frac{\text{near-touch cancels within } \tau}{\text{near-touch order submissions}} ]
Compute by symbol × strategy × session.
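The NTCR formula can be computed per (symbol × strategy × session) bucket along these lines. The `tau_ms` and `touch_ticks` thresholds here are illustrative placeholders, not calibrated values:

```python
def ntcr(events, tau_ms=500, touch_ticks=2):
    """Near-touch cancel ratio: short-lifetime cancels over
    near-touch submissions. `events` are dicts carrying new_ts,
    cancel_ts (None if never cancelled), touch_distance_ticks."""
    near_touch = [e for e in events if e["touch_distance_ticks"] <= touch_ticks]
    if not near_touch:
        return 0.0
    fast_cancels = sum(
        1 for e in near_touch
        if e["cancel_ts"] is not None and e["cancel_ts"] - e["new_ts"] <= tau_ms
    )
    return fast_cancels / len(near_touch)
```

Orders outside the touch band are excluded from both numerator and denominator, matching the formula's near-touch conditioning.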
2) Order lifetime asymmetry (OLA)
Difference between lifetimes of large displayed orders and opposite-side fill orders.
Large imbalance can indicate “display to move, execute elsewhere.”
3) Layer concentration score (LCS)
Measures repeated multi-level quote stacking and rapid pull patterns.
Example features:
- number of price levels posted within short window,
- notional concentrated near top levels,
- synchronized cancellation burst timing.
4) Opposite-side benefit coupling (OBC)
How often suspicious-side cancellations are followed by favorable opposite-side fills.
[ OBC = P(\text{opposite fill} \mid \text{suspicious cancel episode}) ]
5) Spoofing episode severity index (SESI)
Composite score combining:
- cancel intensity,
- near-touch proximity,
- layering depth,
- opposite-side benefit,
- recurrence over rolling windows.
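One simple way to combine the five SESI components is a normalized weighted sum. The weights below are purely hypothetical; in practice they must be calibrated against labeled investigations:

```python
def sesi(components, weights=None):
    """Composite severity in [0, 1]. `components` maps component
    name to a score already normalised to [0, 1]; default weights
    are illustrative assumptions, not calibrated values."""
    weights = weights or {
        "cancel_intensity": 0.25,
        "near_touch_proximity": 0.15,
        "layering_depth": 0.20,
        "opposite_side_benefit": 0.25,
        "recurrence": 0.15,
    }
    total = sum(weights.values())
    score = sum(weights[k] * components.get(k, 0.0) for k in weights) / total
    return min(max(score, 0.0), 1.0)
```

Missing components default to zero, so a partial episode still scores but never spuriously maxes out.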
Detection architecture: rules + model hybrid
Pure rules are interpretable but noisy. Pure ML is powerful but hard to defend without explainability.
Use a two-stage pipeline:
- Rule gate for deterministic candidate episodes,
- Risk model for prioritization and false-positive reduction.
Stage A: deterministic episode extraction
Create an episode when all hold:
- burst of same-side adds within W1 ms,
- significant cancellation before likely fill within W2 ms,
- opposite-side execution within W3 ms,
- repeated pattern count over rolling window exceeds threshold.
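The four Stage A conditions can be expressed as a single deterministic gate. Every threshold below (`min_adds`, `min_cancel_ratio`, `w3_ms`, `min_recurrence`) is an assumed placeholder that would be calibrated per venue/session:

```python
def is_candidate_episode(adds_in_w1, cancel_ratio, opposite_fill_lag_ms,
                         recurrence, min_adds=5, min_cancel_ratio=0.8,
                         w3_ms=500, min_recurrence=3):
    """Stage A rule gate; all thresholds are illustrative.
    opposite_fill_lag_ms is None when no opposite-side fill followed."""
    return (
        adds_in_w1 >= min_adds                 # burst of same-side adds in W1
        and cancel_ratio >= min_cancel_ratio   # cancelled before likely fill in W2
        and opposite_fill_lag_ms is not None
        and opposite_fill_lag_ms <= w3_ms      # opposite-side execution within W3
        and recurrence >= min_recurrence       # repetition over rolling window
    )
```

Because the gate is a pure conjunction, every extracted episode is trivially explainable: each failed condition names exactly why a pattern was not flagged.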
Stage B: probabilistic risk scoring
Train on labeled historical investigations (or semi-supervised bootstraps) with features:
- NTCR, OLA, LCS, OBC,
- markout after suspected episodes,
- venue/session-specific baseline deviations,
- strategy behavior drift vs own historical profile.
Model output should be calibrated risk buckets, not opaque binary judgments.
Real-time control state machine
Use explicit operating states to prevent surveillance from becoming passive dashboards.
- GREEN: normal
- WATCH: elevated pattern density
- ALERT: high SESI episodes recurring
- RESTRICT: temporary strategy constraints
- SAFE: hard risk-off for strategy/account
Example actions by state
WATCH
- tighten max displayed size near touch,
- increase minimum order dwell time,
- log enriched evidence snapshots.
ALERT
- cap cancel rate,
- enforce minimum resting time for top-of-book placements,
- require human ack for high-risk strategy toggles.
RESTRICT/SAFE
- disable specific tactics (e.g., aggressive layering templates),
- route through conservative execution profile,
- escalate to compliance/on-call immediately.
Add hysteresis (entry/exit thresholds) so states do not flap in noisy periods.
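A minimal sketch of the hysteretic state machine, assuming SESI-like scores in [0, 1]. The entry/exit thresholds are invented for illustration; SAFE is deliberately excluded from automatic transitions because hard risk-off should be a human/compliance action:

```python
class SurveillanceState:
    """Hysteretic control states over a severity score. Thresholds
    are illustrative assumptions; EXIT < ENTER creates the
    anti-flapping band described above."""
    ORDER = ["GREEN", "WATCH", "ALERT", "RESTRICT"]
    ENTER = {"WATCH": 0.3, "ALERT": 0.6, "RESTRICT": 0.8}
    EXIT = {"WATCH": 0.2, "ALERT": 0.5, "RESTRICT": 0.7}

    def __init__(self):
        self.state = "GREEN"

    def update(self, score):
        idx = self.ORDER.index(self.state)
        # escalate through every entry threshold the score now clears
        while idx + 1 < len(self.ORDER) and score >= self.ENTER[self.ORDER[idx + 1]]:
            idx += 1
        # de-escalate only when the score drops below the current
        # state's exit threshold (hysteresis)
        while idx > 0 and score < self.EXIT[self.ORDER[idx]]:
            idx -= 1
        self.state = self.ORDER[idx]
        return self.state
```

A score of 0.55 keeps an ALERT desk in ALERT (above the 0.5 exit) even though it would not have triggered ALERT from GREEN, which is exactly the flap-prevention the hysteresis band buys.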
False-positive controls (critical)
Not all cancel-heavy behavior is manipulative. Legitimate reasons include:
- fast quote revisions during volatility spikes,
- stale quote protection under feed/latency shocks,
- venue-level microstructure artifacts.
Reduce false positives with:
- Regime conditioning: compare behavior against volatility/liquidity-matched baselines.
- Venue normalization: calibrate per venue/session; avoid one global threshold.
- Latency-aware interpretation: separate intentional pull from delayed cancel-ack races.
- Counterfactual checks: ask whether opposite-side benefit persists after regime controls.
- Analyst feedback loop: feed confirmed false positives back into model features.
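Regime conditioning can be as simple as scoring a metric against its volatility/liquidity-matched baseline distribution rather than a global threshold. This helper is a hypothetical sketch, not a prescribed method:

```python
import statistics

def regime_adjusted_score(value, baseline_samples):
    """Z-score of a metric (e.g. NTCR) against samples drawn from
    comparable volatility/liquidity regimes. Hypothetical helper;
    the baseline-matching step itself happens upstream."""
    mu = statistics.mean(baseline_samples)
    sd = statistics.pstdev(baseline_samples) or 1e-9  # avoid div-by-zero
    return (value - mu) / sd
```

An NTCR of 0.9 is alarming against a calm-market baseline but may be unremarkable against a baseline sampled from volatility spikes, which is the false-positive reduction regime conditioning aims for.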
Investigation workflow (T+1 and incident mode)
For each high-severity episode, produce a compact evidence packet:
- millisecond event timeline (new/modify/cancel/fill),
- L2 context snapshots before/after actions,
- opposite-side fill linkage,
- SESI component breakdown,
- prior 30-day strategy behavior baseline,
- replay artifact (deterministic reconstruction).
Triage levels
- L1: low confidence / monitor only
- L2: analyst review required within same day
- L3: immediate compliance escalation + temporary strategy restriction
This keeps alerts auditable and actionable, not just numerous.
Governance metrics (weekly)
Track surveillance quality itself:
- alert precision@k (confirmed-risk ratio among top alerts),
- median investigation time,
- repeat-episode rate after intervention,
- false-positive rate by regime/venue,
- evidence completeness score,
- time-to-restriction for L3 episodes.
If precision is falling while alert volume rises, your system is becoming noise.
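Alert precision@k, the first governance metric above, can be computed as follows (the `(score, confirmed)` alert shape is an assumption for illustration):

```python
def precision_at_k(alerts, k):
    """Fraction of confirmed-risk alerts among the top-k by score.
    `alerts` is a list of (score, confirmed_bool) pairs in any order."""
    top = sorted(alerts, key=lambda a: a[0], reverse=True)[:k]
    if not top:
        return 0.0
    return sum(1 for _, confirmed in top if confirmed) / len(top)
```

Tracking this weekly per regime/venue surfaces exactly the failure mode named below: rising alert volume with falling precision.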
Practical implementation roadmap
Phase 1 — Baseline instrumentation (1-2 weeks)
- enforce order-event schema consistency,
- add opposite-side linkage IDs,
- store deterministic replay inputs.
Phase 2 — Rule engine (1-2 weeks)
- ship episode extraction rules,
- build dashboards by strategy/venue/session,
- start human-label pipeline.
Phase 3 — Risk scoring + controls (2-4 weeks)
- calibrate SESI and risk buckets,
- connect state machine to execution guardrails,
- run shadow mode first, then limited enforcement canary.
Phase 4 — Production hardening (ongoing)
- weekly threshold re-calibration,
- drift detection on behavior/market regimes,
- post-incident rule/model refinement.
Common implementation mistakes
- Single global thresholds across symbols/venues.
  Reality is heterogeneous; this explodes false positives.
- Ignoring opposite-side benefit linkage.
  Cancels alone are weak evidence.
- No replay artifacts.
  Unreproducible alerts are operationally useless.
- Treating surveillance as compliance-only.
  Execution controls must react in real time.
- No feedback loop from investigations.
  Precision decays quickly without human-in-the-loop correction.
Implementation checklist
- Event schema includes full order lifecycle and latency stamps
- Opposite-side fill linkage available at child-order level
- Deterministic replay available for every L3 alert
- NTCR/OLA/LCS/OBC/SESI computed in streaming + batch
- State-machine controls wired to execution engine
- Hysteresis thresholds and escalation runbooks documented
- Weekly surveillance-quality review in place
Bottom line
Spoofing/layering risk is not just a legal afterthought; it is an execution-system design problem.
The winning pattern is:
high-fidelity event data → interpretable episode metrics → calibrated risk scoring → real-time control states → replayable investigations.
That loop keeps desks fast and defensible when market behavior gets messy.