Conformal Slippage Control Playbook (Online Calibration for p95 Survival)

2026-02-24 · finance

Why this playbook

Most slippage models fail in production for one boring reason: calibration drift.

This playbook combines:

  1. A base slippage predictor (feature-rich, fast inference)
  2. Online conformal calibration (distribution-free interval control)
  3. A budget-aware execution controller

Goal: keep realized slippage tail risk inside explicit limits, even when market microstructure changes intraday.


Core idea in one line

Don’t trust point estimates; enforce calibrated upper bounds and drive execution urgency from remaining tail budget.


System design

1) Base predictor (fast, stable)

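Predict per-child-order expected slippage (implementation shortfall) in bps with a lightweight model (e.g., gradient boosting, or a linear model with interaction terms).

Output: mu_hat_t, the point prediction in bps that the conformal wrapper later turns into an upper bound.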

Design constraint

Inference must stay cheap enough for per-slice routing decisions (sub-second budget end-to-end).


2) Nonconformity score

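For each completed child fill i, let y_i be the realized slippage and mu_hat_i the base model's prediction at decision time, and define the residual:

r_i = y_i - mu_hat_i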

Use one-sided upper-tail nonconformity:

a_i = max(0, r_i)

Why one-sided: execution control mainly cares about not underestimating the bad-cost tail. Fills that come in cheaper than predicted score zero.
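A minimal sketch of the score in Python, assuming the residual is realized slippage minus the decision-time prediction (as in the fill-completion step of the production algorithm). The function name is illustrative, not part of any existing stack.

```python
def nonconformity(realized_bps: float, predicted_bps: float) -> float:
    """One-sided upper-tail score: only under-predicted cost counts."""
    residual = realized_bps - predicted_bps  # r_i, in bps
    return max(0.0, residual)                # a_i = max(0, r_i)


# (realized, predicted) pairs in bps: worse than predicted, better, exact.
fills = [(3.0, 2.0), (1.0, 2.5), (4.5, 4.5)]
scores = [nonconformity(y, mu) for y, mu in fills]
print(scores)
```

Only the first fill contributes a positive score; the other two were at or below the prediction and do not widen the bound.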


3) Online conformal wrapper (rolling)

Maintain rolling buffer A_t of recent nonconformity scores (e.g., last 1,000–5,000 fills, optionally regime-weighted).

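For target miscoverage alpha, take the empirical (1 - alpha) quantile of the buffer and add it to the base prediction mu_hat_t:

q_t = Quantile_{1-alpha}(A_t)
U_t = mu_hat_t + q_t

Interpretation: if recent scores are representative of the next fill, realized slippage should exceed U_t on roughly an alpha fraction of fills.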

Drift adaptation

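Use weighted or segmented buffers rather than a single global memory, for example per-regime buffers (including a dedicated buffer around the open/close auctions) and recency weighting with decay.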

This keeps calibration responsive without throwing away all history.
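One way to sketch the rolling wrapper in Python, assuming one fixed-length buffer per regime and the standard finite-sample rank ceil((n + 1)(1 - alpha)). The class name, the regime labels, the `min_fills` warm-up, and the choice to return an infinite bound when data is thin are all illustrative assumptions. Under drift the exchangeability assumption behind conformal guarantees does not hold exactly, so coverage is approximate.

```python
import math
from collections import defaultdict, deque


class RegimeConformal:
    """Rolling conformal upper quantile with one score buffer per regime."""

    def __init__(self, alpha=0.05, maxlen=2000, min_fills=50):
        self.alpha = alpha
        self.min_fills = min_fills
        # deque(maxlen=...) drops the oldest score once the buffer is full.
        self.buffers = defaultdict(lambda: deque(maxlen=maxlen))

    def update(self, regime, score):
        self.buffers[regime].append(score)

    def quantile(self, regime):
        scores = sorted(self.buffers[regime])
        n = len(scores)
        if n < self.min_fills:
            return math.inf  # too little data: force the most defensive bound
        k = math.ceil((n + 1) * (1 - self.alpha))  # finite-sample rank
        if k > n:
            return math.inf
        return scores[k - 1]

    def upper_bound(self, regime, mu_hat):
        return mu_hat + self.quantile(regime)


cal = RegimeConformal(alpha=0.05, maxlen=200, min_fills=50)
for s in range(1, 101):
    cal.update("continuous", float(s))
print(cal.upper_bound("continuous", mu_hat=2.0))
```

Sorting on every call is fine for buffers of a few thousand scores. A sorted container or an incremental quantile estimate would be the natural next step if this sits on a hot path.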


4) Budget-aware execution controller

Define a per-parent-order slippage budget in bps, and track how much has been consumed (B_used) and how much remains (B_left).

Use calibrated upper bound U_t to estimate worst-case incremental burn.

Control states

Harvest, Balance, Salvage (in order of increasing stress).

A simple gating score:

stress = U_t / max(eps, target_slice_budget)

with hysteresis bands to avoid flip-flop.
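A sketch of the gating score with hysteresis bands, in Python. The enter thresholds 0.8 and 1.2 match the production algorithm; the lower exit thresholds 0.7 and 1.1 are assumptions chosen only to illustrate the band, and the class name is hypothetical. Smoothing and minimum dwell time are left out.

```python
from dataclasses import dataclass

STATES = ("HARVEST", "BALANCE", "SALVAGE")


@dataclass
class HysteresisGate:
    up: tuple = (0.8, 1.2)    # stress needed to escalate out of state 0, 1
    down: tuple = (0.7, 1.1)  # stress must fall below these to step back
    level: int = 0            # index into STATES

    def step(self, upper_bound_bps, slice_budget_bps, eps=1e-9):
        stress = upper_bound_bps / max(eps, slice_budget_bps)
        # Escalate while stress clears the enter threshold of the next state.
        while self.level < 2 and stress >= self.up[self.level]:
            self.level += 1
        # De-escalate only once stress drops below the lower exit threshold.
        while self.level > 0 and stress < self.down[self.level - 1]:
            self.level -= 1
        return STATES[self.level]


gate = HysteresisGate()
path = [gate.step(u, 10.0) for u in (5.0, 8.5, 7.5, 6.5, 13.0, 11.5, 10.5)]
print(path)
```

With a 10 bps slice budget, the third tick (stress 0.75) stays in BALANCE even though it is below the 0.8 enter threshold. That gap between enter and exit levels is what prevents flip-flop around a single cutoff.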


Production algorithm (minimal)

for each decision tick t:
  x_t <- build microstructure/features
  mu_hat <- base_model.predict(x_t)

  q95 <- conformal_quantile(buffer=A_regime(t), level=0.95)
  U95 <- mu_hat + q95

  budget_slice <- B_left / max(1, slices_left)
  stress <- U95 / max(eps, budget_slice)

  if stress < 0.8: state = HARVEST
  else if stress < 1.2: state = BALANCE
  else: state = SALVAGE

  policy <- map_state_to_execution(state)
  place/modify/cancel orders via policy

on fill completion:
  y <- realized slippage(fill)
  a <- max(0, y - mu_hat_at_decision)
  update conformal buffer by regime
  update B_used
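The decision-tick arithmetic above can be checked with a small Python function. This is a direct transcription of the pseudocode's thresholds, not a full controller: the function name and the example numbers are illustrative, and the base model, regime lookup, and order placement are out of scope.

```python
def decide(mu_hat, q95, b_left, slices_left, eps=1e-9):
    """One decision tick: calibrated bound, per-slice budget, stress, state."""
    u95 = mu_hat + q95                            # calibrated upper bound, bps
    budget_slice = b_left / max(1, slices_left)   # even split of what is left
    stress = u95 / max(eps, budget_slice)
    if stress < 0.8:
        state = "HARVEST"
    elif stress < 1.2:
        state = "BALANCE"
    else:
        state = "SALVAGE"
    return u95, stress, state


# 12 bps left over 8 slices gives 1.5 bps per slice; the bound is also 1.5 bps.
print(decide(mu_hat=0.5, q95=1.0, b_left=12.0, slices_left=8))
# Same bound with only 6 bps left: the per-slice budget halves, stress doubles.
print(decide(mu_hat=0.5, q95=1.0, b_left=6.0, slices_left=8))
```

Note that an exhausted budget (B_left at or below zero) drives the denominator to eps, so stress becomes very large and the state falls through to SALVAGE.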

Monitoring (what actually matters)

Track these online (5m/30m/day):

  1. Coverage error
    • realized(y <= U95) vs target 95%
  2. Tail exceedance magnitude
    • mean/median of (y - U95)_+
  3. Budget burn velocity
    • bps consumed per elapsed participation/time
  4. State occupancy
    • Harvest/Balance/Salvage dwell time
  5. Opportunity-cost companion metric
    • underfill and alpha decay penalty (don’t “solve” slippage by never trading)

If coverage drops materially (e.g., below 90% for a sustained window), auto-tighten the controller and trigger a recalibration alarm.
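A sketch of the first two metrics and the alarm rule in Python, assuming a fixed-length window of recent fills stands in for the "sustained window". The class name, window size, and `min_obs` warm-up are illustrative assumptions; the 95% target and 90% alarm level come from the text above.

```python
from collections import deque


class CoverageMonitor:
    """Rolling coverage of y <= U95 plus the size of the breaches."""

    def __init__(self, window=500, alarm_below=0.90, min_obs=100):
        self.hits = deque(maxlen=window)    # True when the bound held
        self.excess = deque(maxlen=window)  # (y - U95)_+ per fill, in bps
        self.alarm_below = alarm_below
        self.min_obs = min_obs

    def record(self, realized_bps, upper_bound_bps):
        self.hits.append(realized_bps <= upper_bound_bps)
        self.excess.append(max(0.0, realized_bps - upper_bound_bps))

    def coverage(self):
        return sum(self.hits) / len(self.hits) if self.hits else float("nan")

    def mean_exceedance(self):
        # Average overshoot across breaching fills only.
        breaches = [e for e in self.excess if e > 0]
        return sum(breaches) / len(breaches) if breaches else 0.0

    def alarm(self):
        # Stay quiet until the window holds enough fills to be meaningful.
        return len(self.hits) >= self.min_obs and self.coverage() < self.alarm_below


mon = CoverageMonitor(window=100, min_obs=50)
for _ in range(85):
    mon.record(realized_bps=1.0, upper_bound_bps=2.0)  # bound held
for _ in range(15):
    mon.record(realized_bps=4.0, upper_bound_bps=2.0)  # breached by 2 bps
print(mon.coverage(), mon.mean_exceedance(), mon.alarm())
```

Running one monitor per horizon (5m/30m/day) and per regime keeps an auction-time breach from being averaged away by quiet midday fills.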


Practical parameter defaults (starting point)

These are starter values; production values must be tuned from live paper-trading logs.


Backtest-to-live validation ladder

  1. Historical replay
    • Compare raw model vs conformal-wrapped coverage stability.
  2. Paper trading (shadow)
    • Log predicted U95, realized slippage, and state transitions.
  3. Tiny capital canary
    • Hard caps on notional + automatic freeze on coverage breach.
  4. Progressive scale-up
    • Increase size only if coverage and budget-burn SLOs hold for N sessions.

Failure modes and fixes

A) Coverage looks good, PnL still weak

Cause: controller too conservative, opportunity cost too high. Fix: jointly optimize slippage + completion/alpha metrics; add explicit tradeoff coefficient.

B) Coverage collapses at open/close

Cause: non-stationary microstructure around auctions. Fix: dedicated auction regime buffers and separate conformal quantiles.

C) State oscillation (thrashing)

Cause: no hysteresis / noisy stress score. Fix: smoothing + minimum dwell + asymmetric enter/exit thresholds.

D) “Calibrated” but lagging on regime break

Cause: buffer memory too long. Fix: stronger decay, drift detector (CUSUM/Page-Hinkley), temporary defensive multiplier.
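For failure mode D, a minimal Page-Hinkley sketch in Python for detecting an upward shift in a stream. Feeding it the breach indicator (1 when y > U95, else 0) is an assumption, as are the `delta` and `threshold` values and the class name; the nonconformity score itself would also work as input after rescaling the parameters.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated drift per observation
        self.threshold = threshold  # alarm level for the test statistic
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # cumulative deviation from the running mean
        self.cum_min = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return (self.cum - self.cum_min) > self.threshold


ph = PageHinkley(delta=0.05, threshold=5.0)
calm = [ph.update(0.0) for _ in range(200)]   # no breaches: no alarm expected
shock = [ph.update(1.0) for _ in range(20)]   # every fill breaches the bound
print(any(calm), any(shock))
```

When the detector fires, the fixes above apply: shorten or reweight the buffer and scale the bound by a temporary defensive multiplier until coverage recovers.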


Integration notes for Vellab execution stack


Bottom line

A slippage model is only useful if it stays honest under drift.

Online conformal calibration gives a practical honesty layer: “How bad can this slice get with controlled error rate?”

Once that bound exists, execution policy becomes a disciplined control problem, not a vibes-based urgency argument.