Self-Exciting Order-Flow Cascade Slippage Playbook (Hawkes + Budget Controller)
Date: 2026-02-25 (KST)
TL;DR
Average-bps slippage models break exactly when you need them most: during clustered, self-reinforcing order-flow bursts.
This playbook adds a self-excitation layer (Hawkes-style intensity) on top of baseline slippage forecasts, then drives execution with a tail-budget controller:
- Estimate real-time buy/sell event intensities (lambda+, lambda-)
- Convert intensity imbalance + burst persistence into a Cascade Stress Score (CSS)
- Use CSS + remaining slippage budget to switch execution states (Harvest -> Guard -> Brake)
- Recalibrate daily, verify online with p95 breach alarms
The goal is simple: survive cascade regimes without giving up all fills in normal regimes.
1) Why this model layer matters
Most live controllers already include spread, volatility, depth, and maybe markout. Useful, but still incomplete.
What gets missed in practice:
- Flow arrives in clusters, not IID events
- One burst changes the conditional probability of the next burst
- Queue quality degrades faster than displayed size suggests
- Passive strategy can become a delayed-impact strategy during cascade windows
So the model should ask:
“Given the last burst, how likely is another burst before my next slice?”
That is a conditional-intensity question, not a static regression question.
2) Minimal model architecture
Use a two-layer stack:
2.1 Baseline slippage model (already familiar)
Predict quantiles from standard microstructure inputs:
- spread, micro-volatility
- top-of-book + local depth slope
- queue imbalance / OFI-like features
- participation rate + residual schedule pressure
Output:
q50_base, q90_base, q95_base in bps
2.2 Cascade overlay (new layer)
Model market-order/significant trade arrivals as self-exciting processes:
- Buy intensity: lambda+(t)
- Sell intensity: lambda-(t)
Practical starter (exponential kernel):
lambda_s(t) = mu_s + sum_{j: t_j < t} alpha_{s,s_j} * exp(-beta_{s,s_j}(t - t_j))
Where:
- mu_s: background flow
- alpha: excitation strength
- beta: decay speed (memory)
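One property of the exponential kernel is worth making concrete: the sum over past events can be carried forward recursively, so each update is O(1) and no event history needs to be stored. The sketch below is a minimal illustration, not a production estimator. The index convention (0 = buy, 1 = sell), the use of seconds as the time unit, and all type and function names are assumptions made for this example.

```typescript
// Sketch only: bivariate exponential-kernel Hawkes intensity tracker.
// Index convention (assumed here): 0 = buy, 1 = sell. Time unit (assumed): seconds.
type Side = 0 | 1;
type Pair = [number, number];
type Matrix2 = [Pair, Pair];

interface HawkesParams {
  mu: Pair;       // background rate per side
  alpha: Matrix2; // alpha[s][k]: jump in side-s intensity after a side-k event
  beta: Matrix2;  // beta[s][k]: exponential decay rate of that jump
}

interface HawkesState {
  t: number;  // time of the last update
  r: Matrix2; // r[s][k]: running sum of alpha[s][k] * exp(-beta[s][k] * (t - t_j))
}

const SIDES: Side[] = [0, 1];

function newHawkesState(t0: number): HawkesState {
  return { t: t0, r: [[0, 0], [0, 0]] };
}

// Decay every excitation term from state.t forward to t.
function decayTo(state: HawkesState, p: HawkesParams, t: number): void {
  const dt = t - state.t;
  if (dt < 0) throw new Error("timestamps must be non-decreasing");
  for (const s of SIDES) {
    for (const k of SIDES) {
      state.r[s][k] *= Math.exp(-p.beta[s][k] * dt);
    }
  }
  state.t = t;
}

// Register an arrival on `side` at time t: each side's intensity jumps by alpha[s][side].
function onArrival(state: HawkesState, p: HawkesParams, t: number, side: Side): void {
  decayTo(state, p, t);
  for (const s of SIDES) {
    state.r[s][side] += p.alpha[s][side];
  }
}

// Returns [lambdaBuy, lambdaSell] at time t and advances the state clock to t.
// Call this before onArrival for the same timestamp to get the pre-event value
// (the formula sums over t_j < t only).
function intensityAt(state: HawkesState, p: HawkesParams, t: number): Pair {
  decayTo(state, p, t);
  return [
    p.mu[0] + state.r[0][0] + state.r[0][1],
    p.mu[1] + state.r[1][0] + state.r[1][1],
  ];
}
```

In use, the feed handler would call onArrival for each qualifying trade and intensityAt just before each slice decision.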
From this, compute Cascade Stress Score (CSS):
CSS = w1 * burst_ratio + w2 * same_side_persistence + w3 * cross_side_instability
Example components:
- burst_ratio = (lambda+ + lambda-) / rolling_median(lambda_total)
- same_side_persistence = max(lambda+, lambda-) / (min(lambda+, lambda-) + eps)
- cross_side_instability = short_horizon_flip_rate * lambda_total
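The three example components can be combined as in the sketch below. It follows the formulas above directly; the function names, the shape of the recentLambdaTotals window, and the choice to treat the burst ratio as neutral (1) when no history is available are assumptions of this example.

```typescript
// Sketch only: CSS from the three example components.
interface CascadeFeatures {
  lambdaBuy: number;
  lambdaSell: number;
  burstRatio: number;
  persistence: number;
  instability: number;
  css: number;
}

interface CssWeights {
  w1: number; // weight on burst_ratio
  w2: number; // weight on same_side_persistence
  w3: number; // weight on cross_side_instability
}

// Median of a window; NaN when the window is empty.
function median(values: number[]): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function cascadeFeatures(
  lambdaBuy: number,
  lambdaSell: number,
  recentLambdaTotals: number[], // rolling window of past lambda_total values (assumed input)
  flipRate: number,             // short-horizon flip rate of the dominant side (assumed input)
  w: CssWeights,
  eps: number = 1e-9
): CascadeFeatures {
  const total = lambdaBuy + lambdaSell;
  const med = median(recentLambdaTotals);
  // Assumption: with no usable history, report a neutral burst ratio of 1.
  const burstRatio = med > 0 ? total / med : 1;
  const persistence = Math.max(lambdaBuy, lambdaSell) / (Math.min(lambdaBuy, lambdaSell) + eps);
  const instability = flipRate * total;
  const css = w.w1 * burstRatio + w.w2 * persistence + w.w3 * instability;
  return { lambdaBuy, lambdaSell, burstRatio, persistence, instability, css };
}
```

Note that same_side_persistence is unbounded when one side's intensity approaches zero, so a cap on each term before weighting may be worth considering.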
Then augment tail forecast:
q95_live = q95_base + g(CSS, spread, vol, residual_qty)
Keep g(.) monotone in CSS.
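One simple way to get a g(.) that is monotone in CSS is a capped, piecewise-linear add-on, sketched below. The functional form, the parameter names, and the use of a residual fraction (unfilled share of the parent order, 0 to 1) in place of residual_qty are all assumptions of this example, not a fitted model.

```typescript
// Sketch only: a tail add-on that is non-decreasing in CSS by construction,
// provided the coefficients and the spread/vol inputs are non-negative.
interface TailAddOnParams {
  cssFloor: number;     // CSS level below which no add-on applies
  perSpread: number;    // add-on per unit of excess CSS per bps of spread (>= 0)
  perVol: number;       // add-on per unit of excess CSS per bps of volatility (>= 0)
  residualTilt: number; // extra scaling as the unfilled fraction grows (>= 0)
  maxAddOnBps: number;  // hard cap on the add-on
}

function tailAddOnBps(
  css: number,
  spreadBps: number,
  volBps: number,
  residualFrac: number,
  p: TailAddOnParams
): number {
  const excess = Math.max(0, css - p.cssFloor);
  const scale = p.perSpread * spreadBps + p.perVol * volBps;
  const tilt = 1 + p.residualTilt * Math.min(Math.max(residualFrac, 0), 1);
  return Math.min(p.maxAddOnBps, excess * scale * tilt);
}

// q95_live = q95_base + g(CSS, spread, vol, residual)
function q95Live(
  q95BaseBps: number,
  css: number,
  spreadBps: number,
  volBps: number,
  residualFrac: number,
  p: TailAddOnParams
): number {
  return q95BaseBps + tailAddOnBps(css, spreadBps, volBps, residualFrac, p);
}
```

Because the add-on is zero below cssFloor, the overlay leaves the baseline forecast untouched in calm regimes.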
3) Control policy: 3-state execution machine
State A: Harvest (CSS low, budget healthy)
- Normal participation bounds
- Passive-first when queue quality supports it
- Standard child-order spacing
State B: Guard (CSS medium or budget tightening)
- Reduce passive quote lifetime
- Increase cancel discipline
- Narrow max-child-size
- Raise fill-quality threshold (don’t chase toxic prints)
State C: Brake (CSS high or p95 budget near breach)
- Hard cap aggressive participation
- Optional temporary symbol cooldown
- Force residual plan re-optimization
- Escalate if consecutive tail breaches exceed threshold
Transition should include hysteresis to avoid flapping.
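Hysteresis here means using a higher CSS threshold to enter a defensive state than to leave it. The sketch below shows one possible transition function. The threshold names, the budgetUsedFrac input (share of the p95 slippage budget already consumed), and the choice to apply hysteresis to CSS only are assumptions of this example.

```typescript
// Sketch only: three-state transition with separate enter/exit CSS thresholds.
// Expected ordering: guardExit < guardEnter <= brakeExit < brakeEnter.
type ExecState = 'HARVEST' | 'GUARD' | 'BRAKE';

interface StateThresholds {
  guardEnter: number;   // CSS at or above this escalates to GUARD
  guardExit: number;    // CSS must fall below this to return to HARVEST
  brakeEnter: number;   // CSS at or above this escalates to BRAKE
  brakeExit: number;    // CSS must fall below this to leave BRAKE
  budgetTight: number;  // budget fraction used that forces at least GUARD
  budgetBreach: number; // budget fraction used that forces BRAKE
}

function nextState(
  prev: ExecState,
  css: number,
  budgetUsedFrac: number,
  th: StateThresholds
): ExecState {
  // Escalation to BRAKE: high CSS or budget near breach.
  if (css >= th.brakeEnter || budgetUsedFrac >= th.budgetBreach) return 'BRAKE';
  // Stay in BRAKE until CSS drops below the lower exit threshold.
  if (prev === 'BRAKE' && css >= th.brakeExit) return 'BRAKE';
  // Escalation to GUARD: medium CSS or budget tightening.
  if (css >= th.guardEnter || budgetUsedFrac >= th.budgetTight) return 'GUARD';
  // Stay defensive until CSS drops below the GUARD exit threshold.
  if (prev !== 'HARVEST' && css >= th.guardExit) return 'GUARD';
  return 'HARVEST';
}
```

With guardEnter = 2.0 and guardExit = 1.5, a CSS reading of 1.8 keeps a GUARD session in GUARD but does not move a HARVEST session out of HARVEST, which is the anti-flapping behavior intended.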
4) Online metrics that actually matter
Track these every 1–5 minutes:
- Tail coverage: share of slices with realized slippage <= predicted q95 (target ~95%)
- Budget burn velocity: used_bps / elapsed_schedule
- Cascade false-negative rate: severe slippage with low CSS (bad)
- Over-defensive rate: high CSS but benign realized cost (also bad)
- State dwell profile: time spent in Harvest/Guard/Brake
If coverage drops and false-negatives rise together, prioritize recalibration immediately.
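The coverage and miss-rate metrics can be computed from per-slice outcomes as sketched below. The definitions are one reasonable reading of the list above and are assumptions of this example: the false-negative rate is taken as the share of q95 breaches that occurred with low CSS, and the over-defensive rate as the share of high-CSS slices whose realized cost stayed under a hypothetical benignBps threshold.

```typescript
// Sketch only: window-level metrics from per-slice outcomes.
interface SliceOutcome {
  realizedBps: number; // realized slippage for the slice
  q95Bps: number;      // q95 forecast that was live for the slice
  css: number;         // CSS at decision time
}

interface MetricThresholds {
  cssLow: number;    // below this, CSS counts as "low"
  cssHigh: number;   // at or above this, CSS counts as "high"
  benignBps: number; // realized cost at or below this counts as benign
}

interface OnlineMetrics {
  tailCoverage: number;      // share of slices with realized <= q95
  falseNegativeRate: number; // share of breaches that had low CSS
  overDefensiveRate: number; // share of high-CSS slices with benign cost
}

// NaN signals "no observations" rather than a misleading zero.
function safeRatio(numerator: number, denominator: number): number {
  return denominator === 0 ? NaN : numerator / denominator;
}

function onlineMetrics(slices: SliceOutcome[], th: MetricThresholds): OnlineMetrics {
  let covered = 0;
  let breaches = 0;
  let lowCssBreaches = 0;
  let highCss = 0;
  let highCssBenign = 0;
  for (const s of slices) {
    if (s.realizedBps > s.q95Bps) {
      breaches++;
      if (s.css < th.cssLow) lowCssBreaches++;
    } else {
      covered++;
    }
    if (s.css >= th.cssHigh) {
      highCss++;
      if (s.realizedBps <= th.benignBps) highCssBenign++;
    }
  }
  return {
    tailCoverage: safeRatio(covered, slices.length),
    falseNegativeRate: safeRatio(lowCssBreaches, breaches),
    overDefensiveRate: safeRatio(highCssBenign, highCss),
  };
}
```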
5) Data contract (implementation-ready)
interface SliceEvent {
  ts: string
  symbol: string
  side: 'BUY' | 'SELL'
  qty: number
  px: number
  eventType: 'trade' | 'book_update' | 'own_fill' | 'own_cancel'
}

interface CascadeFeatures {
  lambdaBuy: number
  lambdaSell: number
  burstRatio: number
  persistence: number
  instability: number
  css: number
}

interface ExecutionDecision {
  state: 'HARVEST' | 'GUARD' | 'BRAKE'
  maxParticipation: number
  maxChildQty: number
  passiveTtlMs: number
  aggressionCap: number
  reasonCodes: string[]
}
6) Calibration loop (daily + intraday)
6.1 Daily batch
- Fit/refresh Hawkes parameters per symbol-liquidity bucket
- Robustify with shrinkage to bucket priors for sparse names
- Backtest on rolling windows with coverage-first objective
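Shrinkage toward bucket priors can be as simple as a sample-size-weighted blend, sketched below. The weighting rule w = n / (n + k), the priorStrength parameter k, and the single-kernel parameter shape are assumptions of this example; a real fit may prefer a penalized likelihood or a hierarchical model.

```typescript
// Sketch only: blend a per-symbol fit with its liquidity-bucket prior.
interface KernelParams {
  mu: number;    // background rate
  alpha: number; // excitation strength
  beta: number;  // decay speed
}

function shrinkToPrior(
  fit: KernelParams,
  prior: KernelParams,
  nEvents: number,       // events available for the symbol-level fit
  priorStrength: number  // pseudo-count k: larger means heavier shrinkage (hypothetical knob)
): KernelParams {
  if (nEvents < 0 || priorStrength <= 0) {
    throw new Error("nEvents must be >= 0 and priorStrength > 0");
  }
  // Weight on the symbol fit: 0 with no data, approaching 1 as data accumulates.
  const w = nEvents / (nEvents + priorStrength);
  const mix = (a: number, b: number): number => w * a + (1 - w) * b;
  return {
    mu: mix(fit.mu, prior.mu),
    alpha: mix(fit.alpha, prior.alpha),
    beta: mix(fit.beta, prior.beta),
  };
}
```

A sparse name with few events ends up close to its bucket prior, while a liquid name keeps most of its own fit.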
6.2 Intraday lightweight update
- Recompute CSS mapping coefficients (w1..w3) using recent residual errors
- Keep strict bounds on coefficient drift per hour
- Freeze updates when data quality checks fail
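The drift bound and the freeze rule can be enforced in a single update step, as in the sketch below. The limit names, the per-hour step budget, and the dataQualityOk flag are assumptions of this example; how the proposed weights are estimated from residual errors is left out.

```typescript
// Sketch only: apply a proposed weight update under drift and range limits.
interface DriftLimits {
  maxStepPerHour: number; // largest allowed absolute change per weight per hour
  min: number;            // lower bound for any weight
  max: number;            // upper bound for any weight
}

function boundedWeightUpdate(
  current: number[],
  proposed: number[],
  limits: DriftLimits,
  hoursElapsed: number,
  dataQualityOk: boolean
): number[] {
  if (current.length !== proposed.length) {
    throw new Error("weight vectors must have the same length");
  }
  // Freeze: keep the current weights when data quality checks fail.
  if (!dataQualityOk) return current.slice();
  const maxStep = limits.maxStepPerHour * hoursElapsed;
  return current.map((w, i) => {
    // Clip the step first, then clip the result to the allowed range.
    const step = Math.min(maxStep, Math.max(-maxStep, proposed[i] - w));
    return Math.min(limits.max, Math.max(limits.min, w + step));
  });
}
```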
7) Failure modes and guardrails
Failure mode 1: Model overreacts to benign bursts
Guardrail:
- Cap CSS contribution from one feature
- Require multi-signal confirmation for Brake state
Failure mode 2: Model underreacts in fast toxic cascades
Guardrail:
- Add emergency trigger independent of fitted model:
- sudden spread expansion + cancel surge + adverse markout spike
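A model-independent trigger of this kind can be a plain conjunction of three threshold checks, sketched below. The signal fields, the baseline comparisons, the sign convention for markout (negative = adverse), and the reason-code strings are all assumptions of this example.

```typescript
// Sketch only: emergency brake that does not depend on the fitted Hawkes model.
interface EmergencySignals {
  spreadBps: number;
  baselineSpreadBps: number;  // e.g. a rolling median (assumed input)
  cancelRate: number;         // cancels per second
  baselineCancelRate: number; // rolling baseline of the same quantity
  markoutBps: number;         // signed short-horizon markout; negative = adverse
}

interface EmergencyThresholds {
  spreadRatio: number;       // spread / baseline above this counts as expansion
  cancelRatio: number;       // cancel rate / baseline above this counts as a surge
  adverseMarkoutBps: number; // markout at or below -this counts as a spike
}

function emergencyBrake(
  sig: EmergencySignals,
  th: EmergencyThresholds
): { trigger: boolean; reasonCodes: string[] } {
  const reasonCodes: string[] = [];
  // Multiplicative comparisons avoid dividing by a zero baseline.
  if (sig.spreadBps > th.spreadRatio * sig.baselineSpreadBps) reasonCodes.push('SPREAD_EXPANSION');
  if (sig.cancelRate > th.cancelRatio * sig.baselineCancelRate) reasonCodes.push('CANCEL_SURGE');
  if (sig.markoutBps <= -th.adverseMarkoutBps) reasonCodes.push('ADVERSE_MARKOUT');
  // All three signals must fire together.
  return { trigger: reasonCodes.length === 3, reasonCodes };
}
```

The returned reasonCodes can be copied into ExecutionDecision.reasonCodes so that a forced Brake is distinguishable from a CSS-driven one in postmortems.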
Failure mode 3: Hidden latency invalidates decisions
Guardrail:
- Include end-to-end decision latency in features
- Auto-tighten aggression caps when latency > threshold
8) Fast rollout sequence (practical)
- Shadow mode: compute CSS + suggested state, no live control
- Paper control mode: apply state machine in simulator/replay
- Canary symbols: 5–10 liquid names only
- Progressive rollout by ADV buckets
- Weekly postmortem on tail misses + over-defensive misses
Do not skip shadow diagnostics; they catch most threshold mistakes cheaply.
9) Reference reading (for deeper theory)
- Bacry, Mastromatteo, Muzy (2015), Hawkes processes in finance
- Bouchaud et al., market impact & propagator literature
- Cartea, Jaimungal, Penalva, Algorithmic and High-Frequency Trading (execution foundations)
One-line takeaway
Execution risk is not just “how expensive now,” but “how likely this burst is to trigger the next burst before we finish”; model that conditional cascade, then tie policy to an explicit tail budget.