Self-Exciting Order-Flow (Hawkes) + Propagator Slippage Playbook

2026-03-02 · finance

Category: research (quant execution / slippage modeling)

Why this playbook exists

In live execution, slippage spikes are usually not random noise. They come in clusters: one burst of aggressive flow raises the odds of the next.

A static impact curve misses this because it assumes independent flow. This playbook combines:

  1. Hawkes-style order-flow intensity (to model clustered toxicity),
  2. Transient impact propagator (to model footprint + decay),
  3. Multi-horizon markout surface (to separate immediate fill cost vs later adverse drift),
  4. Tail-aware controller (to keep q95 under budget, not only mean bps).

1) Model stack (production view)

At each decision step (e.g., 250ms–1s), score candidate actions:

For each action a, forecast the conditional cost distribution C (at minimum its mean and 95th percentile) and the risk of missing the completion deadline.

Then optimize:

[ J(a)=\mathbb{E}[C\mid a] + \lambda_{95}\,Q_{0.95}(C\mid a) + \lambda_{\mathrm{sla}}\,R_{\mathrm{deadline}}(a) ]

subject to hard constraints (POV, venue/risk limits, reject guardrails).
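The objective above can be sketched as a scoring function. `Forecast`, its field names, and the λ defaults are illustrative assumptions, not a fixed API:

```python
from dataclasses import dataclass

# Hypothetical containers; field names and lambda defaults are assumptions.
@dataclass
class Forecast:
    mean_cost_bps: float   # E[C | a]
    q95_cost_bps: float    # Q_0.95(C | a)
    deadline_risk: float   # R_deadline(a), probability-like in [0, 1]

def score(f: Forecast, lam95: float = 0.5, lam_sla: float = 10.0) -> float:
    """J(a) = E[C|a] + lambda_95 * Q_0.95(C|a) + lambda_sla * R_deadline(a)."""
    return f.mean_cost_bps + lam95 * f.q95_cost_bps + lam_sla * f.deadline_risk

def choose(candidates: dict, feasible) -> str:
    """Apply hard constraints first (POV, venue/risk limits), then minimize J."""
    allowed = {a: f for a, f in candidates.items() if feasible(a)}
    return min(allowed, key=lambda a: score(allowed[a]))
```

Note the order of operations: hard constraints filter the action set before any scoring, so a constraint breach can never be "bought back" by a good J.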


2) Hawkes layer: detect clustered order-flow toxicity

Let buy/sell aggressive-flow counts be point processes N^+, N^-. Use a marked bivariate Hawkes intensity:

[ \lambda_t^{\pm}=\mu^{\pm}(x_t)+\sum_{\tau<t}\phi_{\pm\pm}(t-\tau)\,dN_\tau^{\pm}+\sum_{\tau<t}\phi_{\pm\mp}(t-\tau)\,dN_\tau^{\mp} ]

Key derived signals for the controller: the current intensities λ_t^±, their buy/sell imbalance, and the implied short-horizon probability of an adverse aggressive burst.

Operational point: this layer predicts when waiting becomes expensive because adverse bursts are likely.
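With exponential kernels φ(Δ) = α e^{−βΔ}, the bivariate intensity above admits an O(1)-per-event recursive update. The parameter values below are illustrative assumptions, chosen subcritical (α/β < 1 per kernel):

```python
import math

class BivariateExpHawkes:
    """Recursive intensities for buy/sell aggressive flow, exponential kernels
    phi(d) = alpha * exp(-beta * d); alpha[i][j] is stream j's push on lambda_i.
    Parameter values are illustrative and subcritical."""
    def __init__(self, mu=(0.5, 0.5), alpha=((0.4, 0.2), (0.2, 0.4)), beta=1.5):
        self.mu, self.alpha, self.beta = mu, alpha, beta
        self.exc = [[0.0, 0.0], [0.0, 0.0]]  # current excitation terms
        self.t = 0.0

    def intensity(self):
        """Current (lambda_buy, lambda_sell)."""
        return tuple(self.mu[i] + self.exc[i][0] + self.exc[i][1] for i in range(2))

    def on_event(self, t, stream):
        """Decay all excitation to time t, then add this event's jump.
        stream: 0 = buy-aggressor event, 1 = sell-aggressor event."""
        decay = math.exp(-self.beta * (t - self.t))
        for i in range(2):
            for j in range(2):
                self.exc[i][j] *= decay
            self.exc[i][stream] += self.alpha[i][stream]
        self.t = t
```

The same decay trick extends to the marked version by scaling each jump with a function of event size.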


3) Propagator layer: convert flow into transient impact

Model short-term impact as convolution of signed flow:

[ I_t = \sum_{\tau\le t} G_{s_t}(t-\tau)\,q_\tau ]

with a regime-conditioned, two-timescale kernel (fast decay \tau_{f,s} and slow decay \tau_{s,s} in regime s):

[ G_s(\Delta)=a_s e^{-\Delta/\tau_{f,s}} + b_s e^{-\Delta/\tau_{s,s}} ]
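The convolution can be discretized directly; this is a single-regime sketch, and the a, b, τ_f, τ_s values are illustrative assumptions:

```python
import math

def propagator_impact(times, flows, t_now, a=0.8, tau_f=2.0, b=0.2, tau_s=60.0):
    """I_t = sum_{tau <= t} G(t - tau) * q_tau with
    G(d) = a * exp(-d / tau_f) + b * exp(-d / tau_s).
    Single-regime sketch; parameter values are illustrative assumptions."""
    total = 0.0
    for tau, q in zip(times, flows):
        if tau <= t_now:
            d = t_now - tau
            total += (a * math.exp(-d / tau_f) + b * math.exp(-d / tau_s)) * q
    return total
```

In production the sum would not be recomputed from history: because each term is exponential, I_t can be carried as two decaying accumulators per regime, exactly like the Hawkes recursion.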

Why pair with Hawkes? Because the propagator consumes the signed flow q_τ, and the Hawkes layer forecasts exactly that: with predicted intensities you can price in impact from flow that has not printed yet, instead of only accounting for it after the fact.


4) Markout surface: measure “cheap fill, expensive aftermath”

Build a surface over horizons h ∈ {1s, 5s, 30s, 120s}:

[ M(h, x_t, a)=\mathbb{E}[\text{mid}_{t+h}-\text{fillPrice}_t \mid x_t, a] ]

For buys, a positive markout (mid drifting up after the fill) is bad: the fill itself may have looked cheap, but adverse selection means the remaining parent quantity just got more expensive. For sells, the sign flips.

Use this to decompose total cost:

[ C = C_{instant} + w_1 M(1s)+w_5 M(5s)+w_{30}M(30s)+w_{120}M(120s) ]

This prevents a known failure mode: optimizer over-fits immediate spread capture while ignoring post-fill drift.
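The decomposition can be computed directly from fills and later mids. The sign convention (positive = adverse for the parent order) follows the text; the weight values and function names are illustrative assumptions:

```python
def markout_bps(fill_px: float, mid_later: float, side: int) -> float:
    """Signed markout in bps; side = +1 for buys, -1 for sells.
    Positive means adverse post-fill drift for the parent order."""
    return side * (mid_later - fill_px) / fill_px * 1e4

def total_cost_bps(instant_bps: float, fill_px: float, mids_by_h: dict, side: int,
                   w={1: 0.4, 5: 0.3, 30: 0.2, 120: 0.1}) -> float:
    """C = C_instant + sum_h w_h * M(h); horizons keyed in seconds.
    The weights w are assumed, not calibrated."""
    return instant_bps + sum(w[h] * markout_bps(fill_px, mids_by_h[h], side)
                             for h in w)
```

A scorer trained on `total_cost_bps` rather than `instant_bps` alone is what prevents the spread-capture overfit described above.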


5) Data schema (minimum viable event log)

Capture one record per decision event: the point-in-time feature state x_t, the Hawkes intensities, the propagator impact state, the action taken, fill outcomes, and the realized markouts per horizon.

Strict point-in-time discipline is mandatory; no leakage from future book updates.
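A minimal sketch of such a record; the field names are assumptions, not a standard schema:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DecisionRecord:
    ts_ns: int                    # decision timestamp, strictly point-in-time
    symbol: str
    regime: str                   # regime label s used by the propagator kernel
    lambda_buy: float             # Hawkes lambda^+ at decision time
    lambda_sell: float            # Hawkes lambda^-
    impact_state: float           # propagator I_t
    action: str                   # chosen tactic
    fill_px: Optional[float]      # None if the decision produced no fill
    mid_at_fill: Optional[float]
    markouts_bps: dict = field(default_factory=dict)  # h (s) -> M(h), backfilled later
```

Only `markouts_bps` is written after the fact, by a separate backfill job keyed on `ts_ns`; everything else must be frozen at decision time to honor the no-leakage rule.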


6) Training blueprint

A. Hawkes calibration
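A sketch of step A under the common exponential-kernel assumption: the exact log-likelihood of a univariate exponential Hawkes process, which a standard optimizer can maximize. The bivariate case adds cross-kernels but follows the same recursion:

```python
import math

def hawkes_loglik(params, times, T):
    """Exact log-likelihood of a univariate exponential Hawkes on [0, T]:
    lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)).
    The recursion A_i = exp(-beta*(t_i - t_{i-1})) * (A_{i-1} + 1) keeps it O(n)."""
    mu, alpha, beta = params
    ll, A, prev = 0.0, 0.0, None
    for t in times:
        if prev is not None:
            A = math.exp(-beta * (t - prev)) * (A + 1.0)
        ll += math.log(mu + alpha * A)   # event term: sum_i log lambda(t_i)
        prev = t
    # Compensator: integral of lambda(t) dt over [0, T]
    comp = mu * T + (alpha / beta) * sum(1.0 - math.exp(-beta * (T - t)) for t in times)
    return ll - comp
```

Maximizing this (e.g. with `scipy.optimize.minimize` on the negative, constraining μ, α, β > 0 and α/β < 1 for stability) calibrates one stream; repeat per target stream for the bivariate model.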

B. Propagator estimation
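For step B, one non-parametric starting point: estimate kernel values g_k at discrete lags by ordinary least squares, then fit the two-exponential form to them in a second pass. A stdlib-only sketch on synthetic inputs; `estimate_kernel` and `solve` are hypothetical names:

```python
def solve(A, b):
    """Tiny Gauss-Jordan elimination with partial pivoting (normal equations are small)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[c][c]:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * x for a, x in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def estimate_kernel(impact, flow, n_lags):
    """OLS estimate of g_0..g_{n_lags-1} in I_t ~= sum_k g_k * q_{t-k},
    via the normal equations (X^T X) g = X^T y."""
    rows = [[flow[t - k] for k in range(n_lags)] for t in range(n_lags - 1, len(flow))]
    y = impact[n_lags - 1:]
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(n_lags)] for i in range(n_lags)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(n_lags)]
    return solve(XtX, Xty)
```

Splitting the data by regime label s before the regression yields the regime-conditioned kernels the model calls for.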

C. Markout model

D. Joint scorer


7) Online adaptation and safeguards

Intraday adaptive knobs

Update only low-dimensional multipliers intraday:
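One sketch of such a knob: a single bounded scalar (for example the `kappa_flow` multiplier referenced in section 10), moved by an EWMA of the realized-vs-predicted ratio. The half-life and bounds are assumptions:

```python
class IntradayMultiplier:
    """EWMA update of one scalar multiplier, hard-bounded so a single bad
    print cannot blow it up. Half-life and bounds are illustrative."""
    def __init__(self, half_life_obs=50.0, lo=0.5, hi=2.0):
        self.decay = 0.5 ** (1.0 / half_life_obs)
        self.lo, self.hi = lo, hi
        self.value = 1.0

    def update(self, realized: float, predicted: float) -> float:
        ratio = realized / predicted if predicted > 0 else 1.0
        raw = self.decay * self.value + (1 - self.decay) * ratio
        self.value = min(self.hi, max(self.lo, raw))
        return self.value
```

Keeping the adaptation this low-dimensional is the point: the heavy models stay frozen intraday, and only interpretable scale factors drift.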

Regime ladder

Hard rollback conditions


8) Validation protocol before promotion

Offline walk-forward

Must beat baseline on all three:

  1. mean shortfall,
  2. q95 shortfall,
  3. completion within deadline.
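The gate can be encoded as a single predicate; the metric keys are illustrative:

```python
def promote(candidate: dict, baseline: dict) -> bool:
    """Promotion gate: the candidate must win on all three criteria at once.
    Shortfalls in bps (lower is better); completion is the fraction of parent
    orders finished by deadline (higher is better). Keys are assumed names."""
    return (candidate["mean_bps"] < baseline["mean_bps"]
            and candidate["q95_bps"] < baseline["q95_bps"]
            and candidate["completion"] > baseline["completion"])
```

The conjunction matters: a mean improvement that degrades q95 or completion must block promotion, per failure mode 3 below.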

Also check policy stability: small perturbations of input features should not flip the chosen tactic (see the action-twitching failure mode in section 10).

Shadow live

Emit (without routing control):

Canary rollout


9) Practical implementation roadmap (4-week slice)

Week 1

Week 2

Week 3

Week 4

Deliverables:


10) Common failure modes (and what to do)

  1. Good backtest, bad live open
    Cause: open auction spillover and stale depth assumptions.
    Fix: open-specific regime + wider uncertainty penalty for first X minutes.

  2. Passive overuse, deadline misses
    Cause: fill model optimistic under cancel cascades.
    Fix: raise non-fill chase penalty + intraday kappa_flow update.

  3. Mean improves, tail worsens
    Cause: optimizer over-targets spread capture.
    Fix: increase lambda_95, enforce q95 hard constraint.

  4. Action twitching
    Cause: score ties + noisy features.
    Fix: hysteresis band and minimum hold time between tactic switches.
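Fix 4 can be sketched as a tiny selector with a hysteresis margin and a minimum hold time; the margin and hold values are assumptions:

```python
class TacticSelector:
    """Switch tactics only if the challenger beats the incumbent's score by a
    margin, and not before min_hold_s seconds have elapsed since the last switch."""
    def __init__(self, margin=0.25, min_hold_s=2.0):
        self.margin, self.min_hold = margin, min_hold_s
        self.current, self.since = None, -1e18

    def select(self, t: float, scores: dict) -> str:
        """scores: {tactic: J(a)}, lower is better."""
        best = min(scores, key=scores.get)
        if self.current is None:
            self.current, self.since = best, t
        elif (best != self.current
              and t - self.since >= self.min_hold
              and scores[self.current] - scores[best] > self.margin):
            self.current, self.since = best, t
        return self.current
```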



TL;DR

Treat slippage as a stateful control problem, not a static impact lookup. A robust production stack jointly models clustered order flow (Hawkes), transient impact and its decay (propagator), multi-horizon markouts, and tail risk against a completion deadline.

That is the difference between “good average bps” and “survives ugly sessions.”