Slippage Modeling: Regime-Switching Impact Calibration Playbook
Stop Using One Global Impact Curve — Fit by Liquidity Regime and Update Online
Why this note: A single slippage model is usually “right on average, wrong when it hurts.” Production execution should calibrate impact by intraday/liquidity regime and switch policies when the regime changes.
1) Failure Mode in One Sentence
If you fit one global impact curve across all sessions and volatility states, you will underprice tails in stressed regimes and overtrade exactly when liquidity is fragile.
2) Practical Model Stack (Structural + Data-Driven)
At decision time (t), the expected implementation shortfall (IS) for action (a) is:
[ \mathbb{E}[IS_t(a)] = C_{spread} + C_{temp\ impact} + C_{perm\ impact} + C_{timing} + C_{fees} ]
A production-friendly decomposition:
- Spread term: crossing + queue-loss penalty
- Temporary impact: participation/urgency-sensitive
- Permanent impact proxy: adverse selection / information footprint
- Timing/opportunity term: alpha decay while waiting
- Fee/rebate term: venue + order-type economics
Use the structural terms for interpretability, and fit ML models on their residuals for regime-specific corrections.
3) Core Equations You Actually Need
A) Baseline impact prior (square-root family)
[ I_{prior}(Q) = Y \, \sigma \, \sqrt{\frac{Q}{V}} ]
- (Q): parent size, (V): expected session volume (or bucket ADV)
- (\sigma): horizon volatility
- (Y): asset/regime coefficient
This is a useful prior, not a production truth.
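As a quick numeric check of the prior, here is a minimal sketch. The function name `sqrt_impact_prior` and every input value (Y, sigma, Q, V) are illustrative assumptions, not calibrated numbers.

```python
import math

def sqrt_impact_prior(q: float, v: float, sigma_bps: float, y: float) -> float:
    """Square-root impact prior I = Y * sigma * sqrt(Q / V), in the units of sigma."""
    if q < 0 or v <= 0:
        raise ValueError("q must be non-negative and v must be positive")
    return y * sigma_bps * math.sqrt(q / v)

# Hypothetical inputs: parent is 1% of expected volume, 100 bps horizon vol, Y = 0.8.
impact_bps = sqrt_impact_prior(q=10_000, v=1_000_000, sigma_bps=100.0, y=0.8)
print(f"{impact_bps:.1f} bps")  # prints "8.0 bps"
```

Because impact grows with the square root of Q/V, quadrupling the parent size only doubles the prior cost, which is why a single Y fitted across regimes can look acceptable on average.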
B) Participation form for child decisions
[ I_{temp}(u) = \eta_r \cdot u^{\beta_r}, \quad u = \frac{\text{participation rate}}{\text{target liquidity}} ]
- Regime index (r) allows different ((\eta,\beta)) by state.
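To make the regime dependence concrete, the sketch below evaluates the participation form for the same normalized participation (u) under three regimes. The (eta, beta) pairs are placeholders chosen for illustration; real values come out of the calibration step.

```python
# Hypothetical (eta, beta) pairs per regime, eta in bps. Not calibrated values.
REGIME_PARAMS = {
    "R1": (5.0, 0.5),
    "R2": (8.0, 0.6),
    "R3": (20.0, 0.9),
}

def temp_impact_bps(u: float, regime: str) -> float:
    """Temporary impact I_temp(u) = eta_r * u ** beta_r for normalized participation u."""
    if u < 0:
        raise ValueError("u must be non-negative")
    eta, beta = REGIME_PARAMS[regime]
    return eta * u ** beta

for regime in REGIME_PARAMS:
    print(regime, round(temp_impact_bps(0.25, regime), 2))
```

With these placeholder numbers the same child decision costs more than twice as much in R3 as in R1, which is the gap a single global curve averages away.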
C) Transient decay (propagator intuition)
[ \Delta p_t = \sum_{k \le t} G_r(t-k) \, q_k + \epsilon_t ]
Here (q_k) is the signed child order flow at step (k), and the kernel (G_r) decays faster or slower depending on the regime (normal versus stressed refill speed).
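The propagator sum can be sketched directly. The note does not fix a kernel shape, so the exponential (half-life) kernel below is an assumption, as are the function names and the half-life values used to stand in for fast and slow refill.

```python
def exp_kernel(lag: int, g0: float, half_life: float) -> float:
    """Assumed kernel G(lag) = g0 * 0.5 ** (lag / half_life)."""
    return g0 * 0.5 ** (lag / half_life)

def transient_impact(flows: list[float], g0: float, half_life: float) -> list[float]:
    """Noise-free price displacement: sum over k <= t of G(t - k) * q_k, for each t."""
    path = []
    for t in range(len(flows)):
        path.append(sum(exp_kernel(t - k, g0, half_life) * flows[k] for k in range(t + 1)))
    return path

flows = [1.0, 1.0, 0.0, 0.0]  # two signed child fills, then nothing
fast_refill = transient_impact(flows, g0=1.0, half_life=1.0)  # quick book refill
slow_refill = transient_impact(flows, g0=1.0, half_life=4.0)  # slow book refill
print(fast_refill)  # [1.0, 1.5, 0.75, 0.375]
```

With a longer half-life the displacement left by earlier children is still largely present when the next child arrives, so trading at the same pace costs more.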
4) Regime Definition (Simple and Robust)
Do not overcomplicate with fragile HMMs on day one. Start with deterministic buckets:
- R1 Calm: tight spread, deep book, low short-horizon vol
- R2 Busy: medium spread/imbalance, normal refill
- R3 Stressed: wide spread, shallow depth, high cancel rate, elevated volatility
Suggested online signals:
- spread_bps
- depth_l1_usd, depth_l5_usd
- cancel_to_trade_ratio
- microprice_drift_1s
- vol_1s, vol_30s
- queue_age_ms / refill half-life proxy
Promote to probabilistic regime labeling only after telemetry quality is stable.
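A deterministic bucket rule over these signals might look like the sketch below. Every threshold is a placeholder to be tuned per symbol bucket, and the "three of four stress votes" rule is one possible design, not something this note prescribes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BookSnapshot:
    spread_bps: float
    depth_l1_usd: float
    cancel_to_trade_ratio: float
    vol_1s: float  # short-horizon volatility, in bps

def classify_regime(s: BookSnapshot) -> str:
    """Return "R1", "R2" or "R3" from fixed thresholds (all values are assumptions)."""
    stress_votes = sum([
        s.spread_bps > 8.0,
        s.depth_l1_usd < 20_000,
        s.cancel_to_trade_ratio > 15.0,
        s.vol_1s > 5.0,
    ])
    if stress_votes >= 3:
        return "R3"
    calm = s.spread_bps <= 2.0 and s.depth_l1_usd >= 100_000 and s.vol_1s <= 1.0
    return "R1" if calm else "R2"

print(classify_regime(BookSnapshot(12.0, 10_000, 25.0, 8.0)))  # R3
```

Requiring several signals to agree keeps a single noisy field from flipping the label, which matters while telemetry quality is still being established.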
5) Calibration Pipeline (Daily + Intraday Online Updates)
Step 1 — Offline robust fit (per symbol bucket × regime)
- Fit (\eta_r, \beta_r) using robust regression (Huber/quantile).
- Winsorize extreme events; store raw tails separately for risk overlays.
- Estimate spread/fee components directly from realized fills.
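One dependency-free way to run the Step 1 fit is to linearize the participation form as log(cost) = log(eta) + beta * log(u) and apply Huber weights through iteratively reweighted least squares. The sketch assumes strictly positive cost observations (negative slippage would need a fit in levels instead), and the function name, `delta`, and the iteration count are illustrative choices.

```python
import math

def huber_fit_loglog(u, cost, delta=1.0, iters=50):
    """Fit log(cost) = log(eta) + beta * log(u) by Huber-weighted least squares (IRLS)."""
    x = [math.log(a) for a in u]
    y = [math.log(c) for c in cost]
    w = [1.0] * len(x)
    a = b = 0.0
    for _ in range(iters):
        # Weighted simple linear regression with the current weights.
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        b = sxy / sxx
        a = my - b * mx
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale from the median absolute residual, floored to avoid division by zero.
        scale = max(1.4826 * sorted(abs(r) for r in resid)[len(resid) // 2], 1e-9)
        # Huber weights: 1 inside the threshold, shrinking as 1/|r| outside it.
        w = [1.0 if abs(r) <= delta * scale else delta * scale / abs(r) for r in resid]
    return math.exp(a), b  # (eta, beta)

u = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8]
clean = [10.0 * a ** 0.6 for a in u]   # synthetic data with eta = 10, beta = 0.6
eta_hat, beta_hat = huber_fit_loglog(u, clean)
```

Calling the function with `iters=1` returns the ordinary least-squares fit, which gives a convenient baseline for checking how far a few extreme fills move the coefficients.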
Step 2 — Tail model
- Fit conditional quantiles (P50/P90/P97.5) of slippage, not only mean.
- Track exceedance rate by bucket (coverage control).
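Coverage control reduces to comparing the observed breach rate with the target tail mass. The sketch below uses an unconditional nearest-rank quantile as a stand-in for a fitted conditional quantile model, and all data are placeholders.

```python
import math

def empirical_quantile(values, q):
    """Nearest-rank quantile: smallest value with at least a q-fraction of samples at or below it."""
    ordered = sorted(values)
    # The small tolerance guards against floating-point overshoot in q * n.
    rank = max(1, math.ceil(q * len(ordered) - 1e-9))
    return ordered[rank - 1]

def exceedance_rate(realized, predicted_quantile):
    """Share of fills whose realized slippage breached the predicted quantile."""
    breaches = sum(r > p for r, p in zip(realized, predicted_quantile))
    return breaches / len(realized)

history_bps = [float(i) for i in range(1, 101)]  # placeholder history: 1..100 bps
p90 = empirical_quantile(history_bps, 0.90)      # 90.0
live_bps = [50.0, 95.0, 20.0, 99.0, 10.0]        # placeholder live outcomes
rate = exceedance_rate(live_bps, [p90] * len(live_bps))
coverage_error = rate - (1 - 0.90)               # positive means tails are underpriced
```

Here two of five live fills breach the P90, so the observed rate is 0.4 against a 0.1 target. In production the same calculation would run per symbol bucket and regime.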
Step 3 — Online coefficient refresh
- Exponential forgetting update for (\eta_r) and residual bias.
- Hard guardrail: freeze updates when sample size is too small or data quality degrades.
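A minimal sketch of the online refresh, assuming each fill yields an implied coefficient (for example realized temporary cost divided by u ** beta_r) and that a separate data-quality flag is available. The class name, forgetting factor, and minimum sample count are assumptions.

```python
from dataclasses import dataclass

@dataclass
class OnlineEta:
    eta: float              # current regime coefficient eta_r
    lam: float = 0.98       # forgetting factor (assumed); closer to 1 means slower updates
    min_samples: int = 30   # sample-size guardrail (assumed)
    n_seen: int = 0

    def update(self, implied_eta: float, data_ok: bool = True) -> float:
        """Blend in one observation; freeze if data is bad or the sample is still thin."""
        if not data_ok:
            return self.eta  # data-quality freeze: ignore the observation entirely
        self.n_seen += 1
        if self.n_seen < self.min_samples:
            return self.eta  # warm-up: count the sample but do not move the coefficient
        self.eta = self.lam * self.eta + (1.0 - self.lam) * implied_eta
        return self.eta

est = OnlineEta(eta=10.0, lam=0.9, min_samples=2)
est.update(20.0)                 # warm-up, eta stays 10.0
est.update(20.0, data_ok=False)  # frozen, eta stays 10.0
est.update(20.0)                 # eta moves to about 11.0
```

Keeping one estimator per symbol bucket and regime means a stressed session cannot contaminate the calm-regime coefficient.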
Step 4 — Policy linkage
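Select the action by expected cost plus a tail penalty, with CVaR at level (q) taken from the tail model and (\lambda) setting the risk aversion: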
[ a_t^* = \arg\min_a \left\{ \mathbb{E}[IS_t(a)] + \lambda \, \mathrm{CVaR}_{q}(IS_t(a)) \right\} ]
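The selection rule can be sketched over sampled cost scenarios per action. The scenario values, tactic names, and the empirical CVaR estimator (mean of the worst 1 - q share of samples) are assumptions for illustration.

```python
import math

def cvar(samples, q):
    """Empirical CVaR: mean of the worst (1 - q) share of cost samples, at least one sample."""
    ordered = sorted(samples, reverse=True)
    k = max(1, math.ceil((1.0 - q) * len(ordered) - 1e-9))
    return sum(ordered[:k]) / k

def pick_action(cost_samples_by_action, lam, q):
    """Return the action minimizing mean cost + lam * CVaR_q over the sampled costs."""
    def score(action):
        s = cost_samples_by_action[action]
        return sum(s) / len(s) + lam * cvar(s, q)
    return min(cost_samples_by_action, key=score)

# Placeholder cost scenarios in bps for two hypothetical tactics.
scenarios = {
    "passive": [1.0] * 9 + [31.0],   # cheap on average (4.0), fat tail
    "aggressive": [5.0] * 10,        # dearer on average (5.0), no tail
}
print(pick_action(scenarios, lam=0.0, q=0.9))  # passive
print(pick_action(scenarios, lam=0.5, q=0.9))  # aggressive
```

With no tail penalty the passive tactic wins on mean cost. Once the penalty is switched on, its single 31 bps scenario dominates the score and the choice flips.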
6) Telemetry Contract (Must-Have Fields)
Decision features
- decision_ts, symbol, side, parent_id, child_id
- remaining_qty, time_to_deadline_ms, schedule_progress
- regime_label, regime_prob
Market features
- mid, spread_bps, depth_l1/l5, imbalance
- trade_rate, cancel_rate, vol_1s/30s
Execution outcomes
- child_qty, limit_offset_ticks, tactic
- fill_qty, fill_px, fill_latency_ms, reject_count
- effective_fee_bps, rebate_bps
Labels
- realized_is_bps
- markout_1s/5s/30s
- tail_event_flag (e.g., P97.5 breach)
If the regime label at decision time is missing, the calibration cannot be trusted, because fills can no longer be assigned to the regime in which they were decided.
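A small gate in the ingestion path can enforce this before a record reaches calibration. The choice of which fields count as required is an assumption here; the field names come from the contract above.

```python
# Assumed minimal subset of the contract that calibration cannot do without.
REQUIRED_FIELDS = (
    "decision_ts", "symbol", "side", "parent_id", "child_id",
    "regime_label", "realized_is_bps",
)

def usable_for_calibration(record: dict) -> bool:
    """True only if every required field is present and not None."""
    return all(record.get(field) is not None for field in REQUIRED_FIELDS)

record = {
    "decision_ts": 1_700_000_000_000, "symbol": "XYZ", "side": "BUY",
    "parent_id": "p1", "child_id": "c1",
    "regime_label": None, "realized_is_bps": 3.2,
}
print(usable_for_calibration(record))  # False: regime_label is missing
```

Counting rejected records per symbol is also a cheap data-quality signal for the online-update guardrail.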
7) Control States for Live Routing
- NORMAL: optimize mean+tail cost as configured
- GUARD: triggered by rising tail undercoverage or regime shift to R3
  - cap aggression step size
  - shorten passive timeout
  - tighten max queue age
- RECOVERY: after repeated non-fill or deadline risk
  - raise participation with bounded sweep size
- SAFE_EXIT: near deadline breach
  - prioritize completion certainty, log explicit exception reason
Use hysteresis + minimum dwell time to avoid oscillation.
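Hysteresis and dwell time can be sketched for the NORMAL/GUARD pair alone; RECOVERY and SAFE_EXIT are left out to keep the example short. The scalar `stress` input, the thresholds, and the choice to let entry into GUARD bypass the dwell timer are assumptions.

```python
class GuardSwitch:
    """Two-state NORMAL/GUARD switch with hysteresis and a minimum dwell time."""

    def __init__(self, enter: float, exit_: float, min_dwell_ms: int):
        if not exit_ < enter:
            raise ValueError("exit threshold must sit below the entry threshold")
        self.enter, self.exit_, self.min_dwell_ms = enter, exit_, min_dwell_ms
        self.state = "NORMAL"
        self.since_ms = 0

    def step(self, now_ms: int, stress: float) -> str:
        dwell_ok = now_ms - self.since_ms >= self.min_dwell_ms
        if self.state == "NORMAL" and stress >= self.enter:
            self._switch("GUARD", now_ms)    # entering GUARD is never delayed
        elif self.state == "GUARD" and stress <= self.exit_ and dwell_ok:
            self._switch("NORMAL", now_ms)   # leaving needs low stress and elapsed dwell
        return self.state

    def _switch(self, state: str, now_ms: int) -> None:
        self.state, self.since_ms = state, now_ms

sw = GuardSwitch(enter=0.7, exit_=0.4, min_dwell_ms=1000)
trace = [sw.step(t, s) for t, s in [(0, 0.5), (100, 0.8), (200, 0.3), (1200, 0.5), (1300, 0.3)]]
print(trace)  # ['NORMAL', 'GUARD', 'GUARD', 'GUARD', 'NORMAL']
```

The gap between the two thresholds stops a stress score hovering near one level from toggling the state, and the dwell timer stops a brief dip from releasing the guard too early.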
8) KPIs That Catch Model Drift Early
- Coverage Error @ q: observed tail exceedance minus target
- Regime Confusion Rate: post-hoc regime mismatch vs decision-time label
- Tail Cost Inflation: P95/P50 ratio drift
- Late Catch-Up Share: cost paid in final bucket
- Venue Fee Drift: predicted vs realized fee/rebate gap
A common anti-pattern: average IS improves while Coverage Error worsens.
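Two of these KPIs are simple enough to sketch with the standard library. The function names are hypothetical, the P95/P50 ratio assumes a positive median, and the data are placeholders.

```python
import statistics

def tail_cost_inflation(slippage_bps):
    """P95 / P50 ratio of realized slippage for one window (median assumed positive)."""
    cuts = statistics.quantiles(slippage_bps, n=20, method="inclusive")
    return cuts[18] / statistics.median(slippage_bps)  # cuts[18] is the 95th percentile

def regime_confusion_rate(decision_labels, posthoc_labels):
    """Share of decisions whose post-hoc regime differs from the decision-time label."""
    mismatches = sum(d != p for d, p in zip(decision_labels, posthoc_labels))
    return mismatches / len(decision_labels)

window = [float(i) for i in range(1, 102)]  # placeholder slippage: 1..101 bps
ratio = tail_cost_inflation(window)         # 96 / 51, about 1.88
confusion = regime_confusion_rate(["R1", "R2", "R3", "R1"], ["R1", "R3", "R3", "R1"])  # 0.25
```

Drift is then the change in these values between a baseline window and the current one, tracked per symbol bucket and regime.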
9) Rollout Plan (Production-Safe)
- Shadow predictions for 1–2 weeks (no policy impact)
- Compare old vs new by symbol-liquidity deciles
- Canary rollout with notional caps and kill-switch
- Promote only if:
  - tail coverage improves,
  - deadline breaches do not increase,
  - fee drift remains bounded
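The promotion criteria can be encoded as an explicit gate so the decision is reproducible. The dictionary keys and the fee-drift bound below are assumptions; "improves" is read here as a smaller absolute coverage error.

```python
def should_promote(old: dict, new: dict, max_fee_drift_bps: float) -> bool:
    """All three rollout gates must pass for the new model to be promoted."""
    return (
        abs(new["coverage_error"]) < abs(old["coverage_error"])
        and new["deadline_breaches"] <= old["deadline_breaches"]
        and abs(new["fee_drift_bps"]) <= max_fee_drift_bps
    )

old_stats = {"coverage_error": 0.06, "deadline_breaches": 4, "fee_drift_bps": 0.1}
new_stats = {"coverage_error": 0.02, "deadline_breaches": 4, "fee_drift_bps": 0.2}
print(should_promote(old_stats, new_stats, max_fee_drift_bps=0.5))  # True
```

Mean slippage is deliberately absent from the gate: a model that lowers the average while widening coverage error would fail the first condition.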
10) Fast Checklist
- [ ] Define 3 deterministic liquidity regimes from live microstructure signals
- [ ] Fit impact coefficients per regime (not global)
- [ ] Model tail quantiles and monitor coverage
- [ ] Add online update with sample-size and data-quality guardrails
- [ ] Wire model output to NORMAL/GUARD/RECOVERY/SAFE_EXIT state machine
- [ ] Gate promotion on tail stability, not mean slippage alone
References
- Almgren, R., Chriss, N. (2000), Optimal Execution of Portfolio Transactions.
- Kyle, A. (1985), Continuous Auctions and Insider Trading.
- Gatheral, J. (2010), No-Dynamic-Arbitrage and Market Impact.
- Obizhaeva, A., Wang, J. (2013), Optimal Trading Strategy and Supply/Demand Dynamics.
- Cartea, Á., Jaimungal, S., Penalva, J. (2015), Algorithmic and High-Frequency Trading.
TL;DR
Use a structural impact prior, calibrate by liquidity regime, monitor tail coverage continuously, and switch routing behavior when regime stress appears. One global slippage curve is convenient—but expensive.