Self-Exciting Order-Flow (Hawkes) + Propagator Slippage Playbook
Date: 2026-03-02
Category: research (quant execution / slippage modeling)
Why this playbook exists
In live execution, slippage spikes are usually not random noise. They come in clusters:
- one-sided market order bursts,
- cancel cascades at touch,
- short refill droughts,
- then forced aggressive chasing near the deadline.
A static impact curve misses this because it assumes independent flow. This playbook combines:
- Hawkes-style order-flow intensity (to model clustered toxicity),
- Transient impact propagator (to model footprint + decay),
- Multi-horizon markout surface (to separate immediate fill cost vs later adverse drift),
- Tail-aware controller (to keep q95 under budget, not only mean bps).
1) Model stack (production view)
At each decision step (e.g., 250ms–1s), score candidate actions:
- POST_PASSIVE
- JOIN_TOUCH
- IMPROVE_1T
- TAKE_SMALL
- TAKE_LARGE
For each action a, forecast:
- E[cost | a] (mean slippage),
- Q95[cost | a] (tail),
- P(incomplete by deadline | a).
Then optimize:
[ J(a)=\mathbb{E}[C\mid a] + \lambda_{95}Q_{0.95}(C\mid a) + \lambda_{sla}R_{deadline}(a) ]
subject to hard constraints (POV, venue/risk limits, reject guardrails).
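A minimal sketch of how the objective above could be evaluated over a candidate set. The names `Forecast`, `score`, and `choose_action`, the example numbers, and the use of P(incomplete by deadline | a) as the deadline-risk term R_deadline(a) are illustrative assumptions, not a prescribed implementation. Hard constraints are represented by a single `feasible` flag that an upstream risk check is assumed to set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Forecast:
    mean_cost_bps: float   # E[C | a]
    q95_cost_bps: float    # Q_0.95(C | a)
    p_incomplete: float    # assumed stand-in for R_deadline(a)
    feasible: bool = True  # False if a hard constraint (POV, venue/risk limit) is violated


def score(f: Forecast, lam95: float, lam_sla: float) -> float:
    """J(a) = E[C|a] + lam95 * Q95(C|a) + lam_sla * R_deadline(a)."""
    return f.mean_cost_bps + lam95 * f.q95_cost_bps + lam_sla * f.p_incomplete


def choose_action(forecasts: dict, lam95: float, lam_sla: float) -> str:
    """Return the feasible action with the lowest J(a)."""
    feasible = {a: f for a, f in forecasts.items() if f.feasible}
    if not feasible:
        raise ValueError("no feasible action; caller should fall back to a safe default")
    return min(feasible, key=lambda a: score(feasible[a], lam95, lam_sla))


# Hypothetical forecasts for one decision step.
forecasts = {
    "POST_PASSIVE": Forecast(0.5, 6.0, 0.30),
    "JOIN_TOUCH": Forecast(0.8, 4.0, 0.15),
    "TAKE_SMALL": Forecast(1.5, 3.0, 0.05),
    "TAKE_LARGE": Forecast(3.0, 5.0, 0.01, feasible=False),
}
best = choose_action(forecasts, lam95=0.5, lam_sla=10.0)
```

With these made-up inputs the passive action has the lowest mean cost but loses once the tail and deadline terms are weighted in, which is the behavior the objective is meant to produce.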
2) Hawkes layer: detect clustered order-flow toxicity
Let buy/sell aggressive flow counts be point processes N^+, N^-.
Use a marked bi-variate Hawkes intensity:
[ \lambda_t^{\pm}=\mu^{\pm}(x_t)+\sum_{\tau<t}\phi_{\pm\pm}(t-\tau)\,dN_\tau^{\pm}+\sum_{\tau<t}\phi_{\pm\mp}(t-\tau)\,dN_\tau^{\mp} ]
- x_t: microstructure covariates (spread, touch depth, imbalance, quote age, volatility bucket, auction flag)
- marks: event size bucket, venue, price-moving vs non-price-moving trade
Key derived signals for the controller:
- flow pressure: lambda_plus - lambda_minus
- toxicity persistence: integrated kernel mass over a short horizon
- burst probability: P(k or more aggressive prints in next H seconds)
Operational point: this layer predicts when waiting becomes expensive because adverse bursts are likely.
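As an illustration of the intensity recursion, here is a sketch that assumes exponential kernels phi_ij(u) = alpha_ij * beta * exp(-beta * u) with one shared decay rate, a constant baseline in place of mu(x_t), and no marks. The class name `BivariateExpHawkes` and all parameter values are hypothetical. With exponential kernels the sum over past events collapses into a state that decays between events and jumps at each event, so each update is O(1).

```python
import math


class BivariateExpHawkes:
    """Index 0 = buy-side aggressive flow (+), index 1 = sell-side (-)."""

    def __init__(self, mu, alpha, beta):
        self.mu = list(mu)        # baseline intensities (assumed constant here)
        self.alpha = alpha        # alpha[i][j]: kernel mass, type-j events exciting type i
        self.beta = beta          # shared decay rate in 1/seconds (simplifying assumption)
        self.excite = [0.0, 0.0]  # decayed excitation carried forward in time
        self.t = 0.0

    def _decay_to(self, t):
        if t < self.t:
            raise ValueError("events and queries must arrive in time order")
        decay = math.exp(-self.beta * (t - self.t))
        self.excite = [e * decay for e in self.excite]
        self.t = t

    def on_event(self, t, j):
        """Register one aggressive print of type j at time t."""
        self._decay_to(t)
        for i in range(2):
            self.excite[i] += self.alpha[i][j] * self.beta

    def intensity(self, t):
        """Intensities just after time t; advances the internal clock to t."""
        self._decay_to(t)
        return [self.mu[i] + self.excite[i] for i in range(2)]

    def flow_pressure(self, t):
        lam_plus, lam_minus = self.intensity(t)
        return lam_plus - lam_minus


hawkes = BivariateExpHawkes(mu=[1.0, 1.0], alpha=[[0.5, 0.1], [0.1, 0.5]], beta=2.0)
hawkes.on_event(0.0, 0)                   # one buy-side print at t = 0
pressure_now = hawkes.flow_pressure(0.0)  # jump of (0.5 - 0.1) * beta
pressure_later = hawkes.flow_pressure(1.0)
```

A production version would replace the constant baseline with a covariate-driven one and scale the jump by the event's mark; both are left out here to keep the recursion visible.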
3) Propagator layer: convert flow into transient impact
Model short-term impact as convolution of signed flow:
[ I_t = \sum_{\tau\le t} G_{s_t}(t-\tau)\,q_\tau ]
with regime-conditioned kernel:
[ G_s(\Delta)=a_s e^{-\Delta/\tau_{f,s}} + b_s e^{-\Delta/\tau_{s,s}} ]
- fast component: immediate sweep + partial refill
- slow component: lingering metaorder footprint
- regime s: tight/normal/wide spread × thick/thin depth × open/continuous/close session
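The two-exponential kernel lends itself to a running update instead of a full convolution. The sketch below assumes a single fixed regime (one set of a, b, tau_f, tau_s); the class `TwoExpPropagator`, the helper `brute_force_impact`, and the parameter values are hypothetical and only meant to show that the recursive state matches the direct sum.

```python
import math


class TwoExpPropagator:
    """Tracks I_t = sum_{tau<=t} G(t - tau) * q_tau for
    G(d) = a * exp(-d / tau_fast) + b * exp(-d / tau_slow)."""

    def __init__(self, a, b, tau_fast, tau_slow):
        self.a, self.b = a, b
        self.tau_fast, self.tau_slow = tau_fast, tau_slow
        self.fast = 0.0   # fast component: sweep + partial refill
        self.slow = 0.0   # slow component: lingering footprint
        self.t = None

    def update(self, t, signed_qty=0.0):
        """Decay both components to time t, add new signed flow, return I_t."""
        if self.t is not None:
            dt = t - self.t
            self.fast *= math.exp(-dt / self.tau_fast)
            self.slow *= math.exp(-dt / self.tau_slow)
        self.t = t
        self.fast += self.a * signed_qty
        self.slow += self.b * signed_qty
        return self.fast + self.slow


def brute_force_impact(t, trades, a, b, tau_fast, tau_slow):
    """Direct evaluation of the convolution, used only as a cross-check."""
    return sum(
        q * (a * math.exp(-(t - ts) / tau_fast) + b * math.exp(-(t - ts) / tau_slow))
        for ts, q in trades
        if ts <= t
    )


params = dict(a=0.02, b=0.005, tau_fast=0.5, tau_slow=30.0)
trades = [(0.0, 100.0), (0.5, -40.0), (2.0, 60.0)]  # (time in s, signed quantity)
prop = TwoExpPropagator(**params)
for ts, q in trades:
    impact_at_last_trade = prop.update(ts, q)
impact_later = prop.update(5.0)  # no new flow, pure decay
```

If the regime changes mid-order, the stored state was built with the previous regime's amplitudes, so an exact G_{s_t} would require re-deriving the state from the flow history; this sketch does not attempt that.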
Why pair with Hawkes?
- Hawkes says event intensity is about to cluster.
- Propagator says whether the impact of trading through that cluster decays quickly or slowly.
- Combined, we can decide whether to cross now, post now, or split adaptively.
4) Markout surface: measure “cheap fill, expensive aftermath”
Build a surface over horizons h ∈ {1s, 5s, 30s, 120s}:
[ M(h, x_t, a)=\mathbb{E}[\text{side}\cdot(\text{fillPrice}_t-\text{mid}_{t+h}) \mid x_t, a] ]
Here side is +1 for buys and -1 for sells, so positive markout is bad on either side (adverse selection): by horizon h the mid has moved against the fill.
Use this to decompose total cost:
[ C = C_{instant} + w_1 M(1s)+w_5 M(5s)+w_{30}M(30s)+w_{120}M(120s) ]
This prevents a known failure mode: optimizer over-fits immediate spread capture while ignoring post-fill drift.
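A small sketch of the offline labeling step and the weighted cost sum. It assumes mids are stored as a time-sorted tape and that the forward mark at t+h is the last mid observed at or before t+h; `mid_at`, `forward_mids`, `total_cost`, and the sample tape are hypothetical. The markout values passed to `total_cost` are taken as already computed, so the sketch makes no assumption about their sign convention.

```python
from bisect import bisect_right

HORIZONS_S = (1.0, 5.0, 30.0, 120.0)


def mid_at(times, mids, t):
    """Last observed mid at or before time t (times sorted ascending)."""
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("no quote at or before t")
    return mids[i]


def forward_mids(times, mids, fill_time, horizons=HORIZONS_S):
    """Forward marks mid_{t+h}; None where the tape ends before t+h."""
    last = times[-1]
    return {
        h: mid_at(times, mids, fill_time + h) if fill_time + h <= last else None
        for h in horizons
    }


def total_cost(c_instant, markouts, weights):
    """C = C_instant + sum over horizons of w_h * M(h)."""
    return c_instant + sum(weights[h] * markouts[h] for h in weights)


# Hypothetical tape: (seconds, mid price).
times = [0.0, 1.0, 2.0, 6.0, 40.0]
mids = [100.00, 100.01, 100.02, 100.05, 100.03]
marks = forward_mids(times, mids, fill_time=1.0)
cost = total_cost(1.0, {1.0: 0.5, 5.0: 1.0}, {1.0: 0.5, 5.0: 0.25})
```

Forward marks are labels for training and evaluation only; they must never be fed back as features at decision time.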
5) Data schema (minimum viable event log)
Capture one record per decision event:
- timestamp (exchange + local)
- symbol, venue, side, residual qty, deadline
- top-of-book snapshot (spread, depth1..k, imbalance)
- queue state (estimated queue percentile ahead)
- Hawkes features (lambda+, lambda-, burst score)
- propagator state (recent signed flow convolution terms)
- candidate action set + chosen action
- realized fill path (partials, latency, rejects, cancels)
- benchmark prices (arrival/decision/schedule)
- forward marks (1s/5s/30s/120s)
Strict point-in-time discipline is mandatory; no leakage from future book updates.
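One way the record could be typed, with a simple point-in-time check. Every field name here is a hypothetical choice, the record covers only part of the list above (fill path and benchmark prices are omitted for brevity), and `newest_input_ts_ns` is an assumed extra field that stores the timestamp of the most recent market-data update used to build the features.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DecisionRecord:
    ts_exchange_ns: int
    ts_local_ns: int
    symbol: str
    venue: str
    side: str                          # "BUY" or "SELL"
    residual_qty: int
    deadline_ns: int
    spread_ticks: float
    depth: list                        # depth1..k at the touch and behind
    imbalance: float
    queue_pct_ahead: Optional[float]   # estimated queue percentile ahead
    lambda_plus: float
    lambda_minus: float
    burst_score: float
    impact_state: list                 # propagator convolution terms
    candidate_actions: list
    chosen_action: str
    newest_input_ts_ns: int            # assumed field: latest input used for features
    forward_marks: dict = field(default_factory=dict)  # filled offline, never at decision time


def is_point_in_time(rec: DecisionRecord) -> bool:
    """True if no feature input is newer than the local decision timestamp."""
    return rec.newest_input_ts_ns <= rec.ts_local_ns


rec = DecisionRecord(
    ts_exchange_ns=1_000, ts_local_ns=1_200, symbol="XYZ", venue="V1", side="BUY",
    residual_qty=5_000, deadline_ns=9_000_000, spread_ticks=1.0, depth=[300.0, 450.0],
    imbalance=0.2, queue_pct_ahead=0.6, lambda_plus=2.0, lambda_minus=1.2,
    burst_score=0.1, impact_state=[0.4, 0.1],
    candidate_actions=["POST_PASSIVE", "TAKE_SMALL"], chosen_action="POST_PASSIVE",
    newest_input_ts_ns=1_150,
)
leaky = DecisionRecord(**{**rec.__dict__, "newest_input_ts_ns": 1_300})
```

Running a check like this over the whole replay log is a cheap way to catch leakage before any model is trained on it.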
6) Training blueprint
A. Hawkes calibration
- start with exponential-basis kernels for stability and speed
- calibrate per symbol cluster; shrink sparse names to sector prior
- monitor branching ratio n (self-excitation mass)
- high n often signals bursty toxicity regime
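For a bivariate model, one common generalization of the branching ratio is the spectral radius of the 2x2 matrix of integrated kernel masses; a linear Hawkes process is stationary only when that radius is below 1. The sketch below uses the closed form for a non-negative 2x2 matrix. The function names and the warning threshold `warn_at` are hypothetical and would need tuning per symbol cluster.

```python
import math


def spectral_radius_2x2(k):
    """Largest eigenvalue of a non-negative 2x2 kernel-mass matrix.

    k[i][j] is the integral of phi_ij, i.e. the expected number of type-i
    events triggered directly by one type-j event.
    """
    (a, b), (c, d) = k
    if min(a, b, c, d) < 0:
        raise ValueError("kernel masses must be non-negative")
    disc = (a - d) ** 2 + 4.0 * b * c   # always >= 0 for non-negative entries
    return 0.5 * (a + d + math.sqrt(disc))


def branching_flag(k, warn_at=0.8):
    """Classify the fit; warn_at is an assumed, untuned threshold."""
    n = spectral_radius_2x2(k)
    if n >= 1.0:
        return "UNSTABLE"   # non-stationary fit: reject or re-regularize
    return "BURSTY" if n >= warn_at else "NORMAL"


n_calm = spectral_radius_2x2([[0.5, 0.1], [0.1, 0.5]])
flag_bursty = branching_flag([[0.7, 0.2], [0.2, 0.7]])
```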
B. Propagator estimation
- regress returns on lagged signed-flow basis by regime
- constrain kernel shape to avoid pathological oscillation
- include event type split: price-changing vs non-price-changing prints
C. Markout model
- gradient-boosted quantile model or distributional network
- output q50/q90/q95 per action and horizon
- recalibrate daily with PIT histograms / calibration curves
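A quick calibration check for a single quantile could look like the sketch below: empirical coverage (the share of realized costs at or below the predicted quantile, which should sit near the nominal level) and the pinball loss used to score quantile forecasts. The function names and sample data are hypothetical, and this covers one point of a calibration curve rather than a full PIT histogram.

```python
def quantile_coverage(predicted_q, realized):
    """Fraction of realized values at or below the predicted quantile."""
    if len(predicted_q) != len(realized) or not realized:
        raise ValueError("inputs must be non-empty and the same length")
    hits = sum(r <= p for p, r in zip(predicted_q, realized))
    return hits / len(realized)


def pinball_loss(predicted_q, realized, q):
    """Mean pinball (quantile) loss at level q; lower is better."""
    total = 0.0
    for p, r in zip(predicted_q, realized):
        diff = r - p
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(realized)


# Hypothetical daily sample: predicted q95 slippage vs realized slippage (bps).
predicted = [5.0] * 20
realized = [1.0] * 18 + [6.0, 9.0]
coverage = quantile_coverage(predicted, realized)   # nominal target is 0.95
coverage_gap = coverage - 0.95                      # negative: tail is under-predicted
loss = pinball_loss([5.0, 5.0], [7.0, 3.0], q=0.95)
```

A persistent negative coverage gap is the kind of signal that would justify inflating the tail forecast before the next session.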
D. Joint scorer
- combine components into unified J(a)
- add uncertainty penalty when model disagreement grows
7) Online adaptation and safeguards
Intraday adaptive knobs
Update only low-dimensional multipliers intraday:
- kappa_flow for Hawkes intensity scaling,
- alpha_impact for kernel amplitude,
- psi_tail for q95 inflation.
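The text does not specify an update rule for these multipliers. One possible scheme, shown purely as an assumption, is a damped move toward the realized-to-predicted ratio with hard clamps so a single bad window cannot swing the model. The function name, step size, and bounds are all hypothetical.

```python
def update_multiplier(current, realized, predicted, step=0.1, lo=0.5, hi=2.0):
    """Move a scalar multiplier a fraction `step` toward realized / predicted.

    Assumed rule: exponential smoothing toward the observed ratio, clamped to
    [lo, hi]. If the prediction is not positive, the multiplier is left unchanged.
    """
    if predicted <= 0:
        return current
    target = realized / predicted
    proposed = (1.0 - step) * current + step * target
    return min(hi, max(lo, proposed))


# Example: 150 aggressive prints observed where the Hawkes layer expected 100.
kappa_flow = update_multiplier(1.0, realized=150.0, predicted=100.0)
# An extreme window is capped by the upper clamp.
kappa_capped = update_multiplier(1.95, realized=1000.0, predicted=100.0, step=0.5)
```

Keeping the adaptation to a few clamped scalars means the intraday loop cannot drift far from the overnight calibration.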
Regime ladder
- GREEN: normal policy
- AMBER: reduce max slice, raise passive threshold
- RED: cap aggression hard, increase tail weight
- SAFE: deterministic completion mode + operator alert
Hard rollback conditions
- q95 slippage breach for N consecutive windows
- reject burst above threshold
- completion SLA drop below floor
- calibration collapse (predicted vs realized quantiles diverge)
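The first rollback condition can be sketched as a streak counter. The class name, the budget, and the window count are placeholders; N is left to the desk to choose.

```python
class ConsecutiveBreachMonitor:
    """Fires once the realized q95 exceeds budget for n_windows windows in a row."""

    def __init__(self, budget_bps, n_windows):
        if n_windows < 1:
            raise ValueError("n_windows must be at least 1")
        self.budget_bps = budget_bps
        self.n_windows = n_windows
        self.streak = 0

    def update(self, realized_q95_bps):
        """Feed one window; returns True when rollback should trigger."""
        if realized_q95_bps > self.budget_bps:
            self.streak += 1
        else:
            self.streak = 0   # any clean window resets the streak
        return self.streak >= self.n_windows


monitor = ConsecutiveBreachMonitor(budget_bps=10.0, n_windows=3)
signals = [monitor.update(x) for x in [12.0, 11.0, 9.0, 12.0, 13.0, 14.0]]
```

The same pattern could be reused for the reject-burst and SLA conditions with their own thresholds.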
8) Validation protocol before promotion
Offline walk-forward
Must beat baseline on all three:
- mean shortfall,
- q95 shortfall,
- completion within deadline.
Also check policy stability:
- action churn rate,
- cancel-replace loop frequency,
- sensitivity around open/close.
Shadow live
Emit (without routing control):
- top-1 recommended action,
- score decomposition,
- confidence / uncertainty,
- counterfactual expected delta vs current router.
Canary rollout
- 5% → 15% → 30% flow ramp
- automatic rollback triggers wired to risk controls
9) Practical implementation roadmap (4-week slice)
Week 1
- finalize event schema and replayable decision log
- add horizon markout labels
Week 2
- deploy Hawkes feature service (/flow-intensity)
- deploy propagator feature service (/impact-state)
Week 3
- train quantile markout surface
- wire unified scorer (/execution-score)
Week 4
- shadow + canary with strict kill-switches
- produce daily calibration report and drift dashboard
Deliverables:
- reproducible backtest notebook,
- model card with assumptions/limits,
- runbook for AMBER/RED/SAFE transitions.
10) Common failure modes (and what to do)
Good backtest, bad live open
Cause: open auction spillover and stale depth assumptions.
Fix: open-specific regime + wider uncertainty penalty for first X minutes.
Passive overuse, deadline misses
Cause: fill model optimistic under cancel cascades.
Fix: raise non-fill chase penalty + intraday kappa_flow update.
Mean improves, tail worsens
Cause: optimizer over-targets spread capture.
Fix: increase lambda_95, enforce q95 hard constraint.
Action twitching
Cause: score ties + noisy features.
Fix: hysteresis band and minimum hold time between tactic switches.
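The hysteresis fix for action twitching can be sketched as follows. The class `TacticSwitcher`, the band width, and the hold time are hypothetical; scores are J(a) values where lower is better. A switch happens only if the current tactic has been held long enough and the challenger beats it by more than the band.

```python
class TacticSwitcher:
    def __init__(self, band, min_hold_s):
        self.band = band              # required score improvement before switching
        self.min_hold_s = min_hold_s  # minimum time to hold a tactic
        self.current = None
        self.since = None

    def decide(self, t, scores):
        """scores: dict of action -> J(a). Returns the tactic to use at time t."""
        best = min(scores, key=scores.get)
        if self.current is None or self.current not in scores:
            # First decision, or the held tactic is no longer a candidate.
            self.current, self.since = best, t
        elif (
            best != self.current
            and t - self.since >= self.min_hold_s
            and scores[self.current] - scores[best] > self.band
        ):
            self.current, self.since = best, t
        return self.current


sw = TacticSwitcher(band=0.2, min_hold_s=1.0)
path = [
    sw.decide(0.0, {"POST_PASSIVE": 1.0, "TAKE_SMALL": 1.5}),  # initial pick
    sw.decide(0.5, {"POST_PASSIVE": 1.5, "TAKE_SMALL": 1.0}),  # blocked by hold time
    sw.decide(1.2, {"POST_PASSIVE": 1.1, "TAKE_SMALL": 1.0}),  # gap inside the band
    sw.decide(1.5, {"POST_PASSIVE": 1.5, "TAKE_SMALL": 1.0}),  # switch allowed
]
```

A hard constraint or SAFE-mode transition would presumably need to bypass the hold time; that override is not modeled here.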
11) Reference papers worth revisiting
- Huang, Lehalle, Rosenbaum — The Queue-Reactive Model (arXiv:1312.0563)
- Taranto et al. — Propagators: Transient vs History-Dependent Impact (arXiv:1602.02735)
- Alfonsi, Blanc, Schied — Mixed-Impact Hawkes Price Model for Optimal Execution (arXiv:1404.0648)
TL;DR
Treat slippage as a stateful control problem, not a static impact lookup. A robust production stack should jointly model:
- clustered flow risk (Hawkes),
- transient footprint (propagator),
- multi-horizon adverse selection (markout surface),
- and tail-constrained decisioning (q95 + SLA).
That is the difference between “good average bps” and “survives ugly sessions.”