Self-Exciting Order-Flow Cascade Slippage Playbook (Hawkes + Budget Controller)

Date: 2026-02-25 (KST)

TL;DR

Average-bps slippage models break exactly when you need them most: during clustered, self-reinforcing order-flow bursts.

This playbook adds a self-excitation layer (Hawkes-style intensity) on top of baseline slippage forecasts, then drives execution with a tail-budget controller:

Estimate real-time buy/sell event intensities (lambda+, lambda-)
Convert intensity imbalance + burst persistence into a Cascade Stress Score (CSS)
Use CSS + remaining slippage budget to switch execution states (Harvest -> Guard -> Brake)
Recalibrate daily, verify online with p95 breach alarms

The goal is simple: survive cascade regimes without giving up all fills in normal regimes.

1) Why this model layer matters

Most live controllers already include spread, volatility, depth, and maybe markout. Useful, but still incomplete.

What gets missed in practice:

Flow arrives in clusters, not IID events
One burst changes the conditional probability of the next burst
Queue quality degrades faster than displayed size suggests
A passive strategy can turn into a delayed-impact strategy during cascade windows

So the model should ask:

“Given the last burst, how likely is another burst before my next slice?”

That is a conditional-intensity question, not a static regression question.

2) Minimal model architecture

Use a two-layer stack:

2.1 Baseline slippage model (already familiar)

Predict quantiles from standard microstructure inputs:

spread, micro-volatility
top-of-book + local depth slope
queue imbalance / OFI-like features
participation rate + residual schedule pressure

Output:

q50_base, q90_base, q95_base in bps

2.2 Cascade overlay (new layer)

Model arrivals of market orders (or significant trades) as self-exciting point processes, one per side:

Buy intensity: lambda+(t)
Sell intensity: lambda-(t)

Practical starter (exponential kernel):

lambda_s(t) = mu_s + sum_{j: t_j < t} alpha_{s,s_j} * exp(-beta_{s,s_j}(t - t_j))

Where:

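s, s_j: side (+ or -) of the intensity being computed and of past event j at time t_j
mu_s: background flow (baseline arrival rate on side s)
alpha_{s,s_j}: excitation strength (jump in lambda_s after an event on side s_j)
beta_{s,s_j}: decay speed (memory); a larger beta means the excitation fades faster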
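
With an exponential kernel the sum over past events does not need to be recomputed at every tick: each term decays by the same factor, so a running state per (target side, source side) pair is enough. The sketch below is a minimal illustration of that recursion, not a production estimator. The names `HawkesParams` and `makeHawkes`, the 0/1 side indexing, and time in seconds are assumptions made for the example.

```typescript
// Sides are indexed 0 = buy (+), 1 = sell (-). All names here are illustrative.
interface HawkesParams {
  mu: number[];      // mu[s]: background rate of side s (events/sec)
  alpha: number[][]; // alpha[s][k]: jump in lambda_s caused by one side-k event
  beta: number[][];  // beta[s][k]: decay rate (1/sec) of that jump
}

function makeHawkes(p: HawkesParams) {
  // r[s][k] = sum over past side-k events of alpha[s][k] * exp(-beta[s][k] * (t - t_j))
  const r = [[0, 0], [0, 0]];
  let lastT: number | null = null;

  function decayTo(t: number): void {
    if (lastT !== null) {
      const dt = t - lastT;
      if (dt <= 0) return; // same or out-of-order timestamp: skip decay
      for (let s = 0; s < 2; s++) {
        for (let k = 0; k < 2; k++) r[s][k] *= Math.exp(-p.beta[s][k] * dt);
      }
    }
    lastT = t;
  }

  return {
    // Register an arrival on `side` at time t (seconds).
    onEvent(t: number, side: number): void {
      decayTo(t);
      for (let s = 0; s < 2; s++) r[s][side] += p.alpha[s][side];
    },
    // Returns [lambdaBuy, lambdaSell] at time t, counting all events registered so far.
    intensity(t: number): number[] {
      decayTo(t);
      return [0, 1].map((s) => p.mu[s] + r[s][0] + r[s][1]);
    },
  };
}

// Usage: one buy event at t = 0, intensities read half a second later.
const hawkes = makeHawkes({ mu: [1, 1], alpha: [[0.5, 0.1], [0.1, 0.5]], beta: [[2, 2], [2, 2]] });
hawkes.onEvent(0, 0);
const [lamBuy, lamSell] = hawkes.intensity(0.5);
```

A commonly cited stability condition for this kernel family is that the spectral radius of the matrix with entries alpha/beta stays below 1. Checking it after each daily fit is a cheap sanity test.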

From this, compute Cascade Stress Score (CSS):

CSS = w1 * burst_ratio + w2 * same_side_persistence + w3 * cross_side_instability

Example components:

burst_ratio = (lambda+ + lambda-) / rolling_median(lambda_total)
same_side_persistence = max(lambda+, lambda-) / (min(lambda+, lambda-) + eps)
cross_side_instability = short_horizon_flip_rate * lambda_total
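
The three example components and the weighted sum can be sketched directly from the formulas above. The input shape `CssInputs`, the function name, and the way the rolling median and flip rate are supplied are assumptions for illustration; how those two inputs are computed is left open here.

```typescript
interface CssInputs {
  lambdaBuy: number;         // lambda+ at decision time
  lambdaSell: number;        // lambda- at decision time
  medianLambdaTotal: number; // rolling median of (lambda+ + lambda-), supplied by the caller
  flipRate: number;          // short-horizon flip rate, supplied by the caller
}

// w = [w1, w2, w3]; eps guards against division by zero.
function cascadeStressScore(x: CssInputs, w: number[], eps: number = 1e-9) {
  const lambdaTotal = x.lambdaBuy + x.lambdaSell;
  const burstRatio = lambdaTotal / Math.max(x.medianLambdaTotal, eps);
  const persistence =
    Math.max(x.lambdaBuy, x.lambdaSell) / (Math.min(x.lambdaBuy, x.lambdaSell) + eps);
  const instability = x.flipRate * lambdaTotal;
  const css = w[0] * burstRatio + w[1] * persistence + w[2] * instability;
  return { burstRatio, persistence, instability, css };
}

// Usage with made-up numbers.
const example = cascadeStressScore(
  { lambdaBuy: 3, lambdaSell: 1, medianLambdaTotal: 2, flipRate: 0.5 },
  [1, 1, 1],
);
```

The components sit on different scales (two are ratios, one is a rate), so in this sketch the weights also act as unit conversions.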

Then augment tail forecast:

q95_live = q95_base + g(CSS, spread, vol, residual_qty)

Keep g(.) monotone non-decreasing in CSS, so a higher stress score never lowers the tail forecast.
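
One simple way to get monotonicity by construction is to multiply a non-negative scale by the CSS excess over a floor. The functional form below is an assumption chosen for illustration, not the fitted g(.); `TailAddOnParams`, its coefficients, and `residualFrac` (remaining quantity as a fraction of the parent order) are hypothetical names.

```typescript
interface TailAddOnParams {
  cssFloor: number;  // CSS level below which no add-on is applied
  kSpread: number;   // >= 0, add-on bps per unit of CSS excess per bp of spread
  kVol: number;      // >= 0, same for micro-volatility
  kResidual: number; // >= 0, extra scaling for a large remaining quantity
}

// Returns q95_base plus an add-on that cannot decrease as css increases.
function q95Live(
  q95BaseBps: number,
  css: number,
  spreadBps: number,
  volBps: number,
  residualFrac: number,
  p: TailAddOnParams,
): number {
  const excess = Math.max(0, css - p.cssFloor);
  const scale = (p.kSpread * spreadBps + p.kVol * volBps) * (1 + p.kResidual * residualFrac);
  return q95BaseBps + excess * Math.max(0, scale);
}
```

Because `excess` is non-decreasing in CSS and the scale is clamped at zero, the add-on can only grow as stress rises, whatever values the other inputs take.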

3) Control policy: 3-state execution machine

State A: Harvest (CSS low, budget healthy)

Normal participation bounds
Passive-first when queue quality supports it
Standard child-order spacing

State B: Guard (CSS medium or budget tightening)

Reduce passive quote lifetime
Increase cancel discipline
Lower the max child-order size
Raise fill-quality threshold (don’t chase toxic prints)

State C: Brake (CSS high or p95 budget near breach)

Hard cap aggressive participation
Optional temporary symbol cooldown
Force residual plan re-optimization
Escalate if consecutive tail breaches exceed threshold

Transitions should include hysteresis (separate enter and exit thresholds) to avoid flapping between states.
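
A minimal sketch of such a transition rule follows, assuming two inputs: the CSS and the fraction of the p95 slippage budget already consumed. Each signal gets its own enter/exit band, and the next state is the more severe of the two. `Band`, `level`, and `nextState` are illustrative names, and any threshold values are placeholders to be tuned in shadow mode.

```typescript
type ExecState = 'HARVEST' | 'GUARD' | 'BRAKE';

// Enter/exit thresholds for one signal; exits sit below enters to create hysteresis.
interface Band {
  guardEnter: number;
  guardExit: number; // < guardEnter
  brakeEnter: number;
  brakeExit: number; // < brakeEnter
}

const STATES: ExecState[] = ['HARVEST', 'GUARD', 'BRAKE'];

// Severity level (0..2) implied by one signal, given the previous state's rank.
function level(x: number, prevRank: number, b: Band): number {
  if (x >= b.brakeEnter) return 2;
  if (prevRank === 2 && x > b.brakeExit) return 2; // stay in Brake until clearly below
  if (x >= b.guardEnter) return 1;
  if (prevRank >= 1 && x > b.guardExit) return 1; // stay in Guard until clearly below
  return 0;
}

// css: Cascade Stress Score; budgetUsed: fraction of the p95 slippage budget consumed.
function nextState(
  prev: ExecState,
  css: number,
  budgetUsed: number,
  cssBand: Band,
  budgetBand: Band,
): ExecState {
  const prevRank = STATES.indexOf(prev);
  const rank = Math.max(level(css, prevRank, cssBand), level(budgetUsed, prevRank, budgetBand));
  return STATES[rank];
}
```

In this sketch the previous state applies to both signals, so a Guard state entered on budget grounds also makes the CSS signal slower to release. That is a deliberately conservative choice and could be relaxed by tracking a separate level per signal.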

4) Online metrics that actually matter

Track these every 1–5 minutes:

Tail coverage: share of slices with realized slippage <= predicted q95 (target ~95%)
Budget burn velocity: slippage budget used (bps) / fraction of schedule elapsed
Cascade false-negative rate: severe slippage with low CSS (bad)
Over-defensive rate: high CSS but benign realized cost (also bad)
State dwell profile: time spent in Harvest/Guard/Brake

If coverage drops and false-negatives rise together, prioritize recalibration immediately.
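
Coverage and the two error rates can be computed from per-slice outcome records. The sketch below assumes a hypothetical `SliceOutcome` record and caller-supplied cutoffs for "low CSS", "high CSS", "severe", and "benign". The rates are expressed as a share of all slices in the window, which is one possible convention among several.

```typescript
interface SliceOutcome {
  realizedBps: number; // realized slippage of the slice
  q95PredBps: number;  // q95 forecast made before the slice
  css: number;         // CSS at decision time
}

function onlineMetrics(
  rows: SliceOutcome[],
  cssLow: number,    // below this, CSS counts as "low"
  cssHigh: number,   // at or above this, CSS counts as "high"
  severeBps: number, // at or above this, slippage counts as "severe"
  benignBps: number, // at or below this, slippage counts as "benign"
) {
  const n = rows.length;
  if (n === 0) return { coverage: NaN, falseNegativeRate: NaN, overDefensiveRate: NaN };
  const frac = (pred: (r: SliceOutcome) => boolean) => rows.filter(pred).length / n;
  return {
    coverage: frac((r) => r.realizedBps <= r.q95PredBps),
    falseNegativeRate: frac((r) => r.realizedBps >= severeBps && r.css < cssLow),
    overDefensiveRate: frac((r) => r.css >= cssHigh && r.realizedBps <= benignBps),
  };
}
```

With a 1 to 5 minute window the slice count can be small, so these ratios are noisy; comparing them against a longer rolling window before raising an alarm is a reasonable precaution.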

5) Data contract (implementation-ready)

interface SliceEvent {
  ts: string
  symbol: string
  side: 'BUY' | 'SELL'
  qty: number
  px: number
  eventType: 'trade' | 'book_update' | 'own_fill' | 'own_cancel'
}

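interface CascadeFeatures {
  lambdaBuy: number    // lambda+(t), buy-side event intensity
  lambdaSell: number   // lambda-(t), sell-side event intensity
  burstRatio: number   // burst_ratio from section 2.2
  persistence: number  // same_side_persistence from section 2.2
  instability: number  // cross_side_instability from section 2.2
  css: number          // Cascade Stress Score (weighted sum of the three components)
}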

interface ExecutionDecision {
  state: 'HARVEST' | 'GUARD' | 'BRAKE'
  maxParticipation: number
  maxChildQty: number
  passiveTtlMs: number
  aggressionCap: number
  reasonCodes: string[]
}

6) Calibration loop (daily + intraday)

6.1 Daily batch

Fit/refresh Hawkes parameters per symbol-liquidity bucket
Shrink per-symbol estimates toward bucket priors for sparse names
Backtest on rolling windows with coverage-first objective
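
Shrinkage toward a bucket prior can be as simple as a sample-size-weighted average. The sketch below uses weight n / (n + k), where n is the number of events behind the per-symbol fit and k is a pseudo-count. Both the weighting rule and the name `priorStrength` are assumptions for illustration, not the only reasonable choice.

```typescript
// Blend per-symbol parameter estimates with bucket-level priors.
// nEvents: events used in the symbol-level fit; priorStrength (> 0): pseudo-count k.
function shrinkToPrior(
  symbolEst: Record<string, number>,
  bucketPrior: Record<string, number>,
  nEvents: number,
  priorStrength: number,
): Record<string, number> {
  const w = nEvents / (nEvents + priorStrength); // 0 = all prior, near 1 = all symbol
  const out: Record<string, number> = {};
  for (const key of Object.keys(bucketPrior)) {
    // A parameter missing from the symbol fit falls back to the prior.
    const est = key in symbolEst ? symbolEst[key] : bucketPrior[key];
    out[key] = w * est + (1 - w) * bucketPrior[key];
  }
  return out;
}

// Usage: a sparse name with 100 events and a prior worth 300 events.
const shrunk = shrinkToPrior({ alpha: 0.8, beta: 2 }, { alpha: 0.4, beta: 4 }, 100, 300);
```

A sparse name stays close to its bucket, while a heavily traded name is driven almost entirely by its own fit.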

6.2 Intraday lightweight update

Recompute CSS mapping coefficients (w1..w3) using recent residual errors
Keep strict bounds on coefficient drift per hour
Freeze updates when data quality checks fail
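
The drift bound and the freeze rule can be enforced in a single clamped update. This is a sketch under stated assumptions: `maxStep` is the largest change allowed per call (the caller would scale it to match the hourly limit and update cadence), and `dataOk` is the result of whatever data-quality checks are in place.

```typescript
// Move weights toward a proposal, but never by more than maxStep per call,
// never outside [lo, hi], and not at all when data quality checks fail.
function boundedUpdate(
  current: number[],
  proposed: number[],
  maxStep: number,
  lo: number,
  hi: number,
  dataOk: boolean,
): number[] {
  if (!dataOk) return current.slice(); // freeze: keep the last good coefficients
  return current.map((c, i) => {
    const step = Math.max(-maxStep, Math.min(maxStep, proposed[i] - c));
    return Math.max(lo, Math.min(hi, c + step));
  });
}

// Usage: w1 wants to jump by +1.0 but is limited to +0.1.
const nextW = boundedUpdate([1, 0.5, 0.2], [2, 0.45, -1], 0.1, 0, 5, true);
```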

7) Failure modes and guardrails

Failure mode 1: Model overreacts to benign bursts

Guardrail:

Cap CSS contribution from one feature
Require multi-signal confirmation for Brake state
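
Both guardrails fit in a few lines. The sketch below caps each weighted term before summing and separately counts how many raw features exceed their own thresholds. The cap value, the thresholds, and the default of two confirming signals are placeholders, and the function names are illustrative.

```typescript
// CSS with each weighted feature contribution capped at capPerFeature.
function cappedCss(features: number[], weights: number[], capPerFeature: number): number {
  return features.reduce((acc, f, i) => acc + Math.min(weights[i] * f, capPerFeature), 0);
}

// Brake is allowed only when at least minSignals features exceed their own thresholds.
function brakeConfirmed(features: number[], thresholds: number[], minSignals: number = 2): boolean {
  const hits = features.filter((f, i) => f >= thresholds[i]).length;
  return hits >= minSignals;
}

// Usage: one extreme feature (50) cannot dominate the score or trigger Brake alone.
const score = cappedCss([2, 50, 1], [1, 1, 1], 5);
const allowBrake = brakeConfirmed([2, 50, 1], [3, 10, 3]);
```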

Failure mode 2: Model underreacts in fast toxic cascades

Guardrail:

Add an emergency trigger independent of the fitted model: sudden spread expansion + cancel surge + adverse markout spike
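
A model-free trigger of this kind can be a plain conjunction of three threshold checks against recent baselines. The field names, the use of multiplicative limits for spread and cancels, and the reason-code strings below are assumptions for illustration.

```typescript
interface MicroSnapshot {
  spreadBps: number;
  baselineSpreadBps: number;     // e.g. a rolling median, supplied by the caller
  cancelsPerSec: number;
  baselineCancelsPerSec: number; // same idea for cancel activity
  adverseMarkoutBps: number;     // positive = price moving against our fills
}

interface EmergencyLimits {
  spreadMult: number; // spread must reach this multiple of its baseline
  cancelMult: number; // cancel rate must reach this multiple of its baseline
  markoutBps: number; // adverse markout must reach this level
}

// Lists which of the three conditions currently hold.
function emergencyReasons(s: MicroSnapshot, lim: EmergencyLimits): string[] {
  const reasons: string[] = [];
  if (s.spreadBps >= lim.spreadMult * s.baselineSpreadBps) reasons.push('SPREAD_EXPANSION');
  if (s.cancelsPerSec >= lim.cancelMult * s.baselineCancelsPerSec) reasons.push('CANCEL_SURGE');
  if (s.adverseMarkoutBps >= lim.markoutBps) reasons.push('ADVERSE_MARKOUT');
  return reasons;
}

// The trigger fires only when all three conditions hold at once.
function emergencyBrake(s: MicroSnapshot, lim: EmergencyLimits): boolean {
  return emergencyReasons(s, lim).length === 3;
}
```

The reason strings can be passed straight into `reasonCodes` on the execution decision, which keeps the override auditable after the fact.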

Failure mode 3: Hidden latency invalidates decisions

Guardrail:

Include end-to-end decision latency in features
Auto-tighten aggression caps when latency > threshold
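
One possible form of the auto-tighten rule scales the aggression cap by the ratio of the latency threshold to the measured latency, with a floor so the cap never reaches zero. The inverse-proportional rule and the parameter names are assumptions; a step function would serve equally well.

```typescript
// Shrinks the aggression cap when end-to-end decision latency exceeds the threshold.
// The result never exceeds the incoming cap and never drops below minCap,
// unless the incoming cap is already lower than minCap.
function tightenForLatency(
  aggressionCap: number,
  latencyMs: number,
  thresholdMs: number,
  minCap: number,
): number {
  if (latencyMs <= thresholdMs) return aggressionCap;
  const scaled = aggressionCap * (thresholdMs / latencyMs);
  return Math.min(aggressionCap, Math.max(minCap, scaled));
}
```

For example, with a 20 ms threshold a 40 ms measured latency halves the cap, and a 400 ms latency pins it at the floor.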

8) Fast rollout sequence (practical)

Shadow mode: compute CSS + suggested state, no live control
Paper control mode: apply state machine in simulator/replay
Canary symbols: 5–10 liquid names only
Progressive rollout by ADV buckets
Weekly postmortem on tail misses + over-defensive misses

Do not skip shadow diagnostics; they catch most threshold mistakes cheaply.

9) Reference reading (for deeper theory)

Bacry, Mastromatteo, Muzy (2015), Hawkes processes in finance
Bouchaud et al., market impact & propagator literature
Cartea, Jaimungal, Penalva, Algorithmic and High-Frequency Trading (execution foundations)

One-line takeaway

Execution risk is not just “how expensive now,” but “how likely this burst is to trigger the next burst before we finish.” Model that conditional cascade, then tie policy to an explicit tail budget.