Bayesian State-Space Slippage Nowcasting Playbook

2026-02-25 · finance

Date: 2026-02-25
Category: research (quant execution)

TL;DR

If your slippage model is calibrated once per day, it is already stale by lunchtime.
Use a Bayesian state-space model to continuously nowcast the slippage distribution (not just its mean), then drive a risk-aware participation controller that adapts speed and aggressiveness in real time.


1) Problem Framing

Real execution costs drift intraday because:

A static model gives false confidence. In production, we need:

  1. Online-updating coefficients
  2. Uncertainty-aware forecasts (p50/p90/p95)
  3. Guardrails for tail-cost events

2) Model Structure

Model short-horizon slippage (bps) for child order \(t\):

\[ y_t = x_t^\top \beta_t + \epsilon_t, \quad \epsilon_t \sim t_\nu(0, \sigma_t^2) \]

Time-varying coefficients:

\[ \beta_t = \beta_{t-1} + \eta_t, \quad \eta_t \sim \mathcal{N}(0, Q) \]

Where:

Feature blocks (practical)


3) Hierarchical Priors (for sparse symbols)

For thinly traded names, borrow strength via hierarchical shrinkage:

\[ \beta_{i,t} \sim \mathcal{N}(\mu_{g(i),t}, \Sigma_{g(i)}) \]

This stabilizes estimates early in session and reduces overreaction from tiny sample windows.
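To make the shrinkage effect concrete, here is a minimal sketch for a single scalar coefficient under a conjugate normal prior. It is a static, one-dimensional simplification of the hierarchical prior above, and the names (`shrink_to_group`, `se2`, `tau2`) are hypothetical, not part of any existing codebase.

```python
def shrink_to_group(b_hat, se2, mu_group, tau2):
    """Precision-weighted blend of a symbol-level estimate and its group mean.

    b_hat    : symbol-level coefficient estimate (assumed roughly normal)
    se2      : sampling variance of b_hat (large when the symbol has few fills)
    mu_group : group-level mean coefficient
    tau2     : prior variance of symbol coefficients around the group mean
    """
    if se2 <= 0 or tau2 <= 0:
        raise ValueError("variances must be positive")
    w = tau2 / (tau2 + se2)            # weight on the symbol's own data
    post_mean = w * b_hat + (1.0 - w) * mu_group
    post_var = w * se2                 # equals 1 / (1/tau2 + 1/se2)
    return post_mean, post_var


# Few fills: noisy estimate, so the posterior sits close to the group mean.
sparse_mean, sparse_var = shrink_to_group(b_hat=10.0, se2=4.0, mu_group=2.0, tau2=1.0)
# Many fills: tight estimate, so the posterior stays close to the symbol's own value.
dense_mean, dense_var = shrink_to_group(b_hat=10.0, se2=0.25, mu_group=2.0, tau2=1.0)
```

With a noisy estimate the weight on the symbol's own data is small, which is the mechanism behind the reduced overreaction early in the session.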


4) Online Update Loop

At each fill / micro-batch (e.g., every 5–15s):

  1. Ingest latest features and realized fill outcomes
  2. Predict posterior slippage distribution for next action
  3. Update posterior \((\beta_t, \Sigma_t)\) with Bayesian filter step
  4. Recompute decision thresholds (p90/p95 and tail budget usage)
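The predict and update steps above can be sketched as a Kalman filter step for the random-walk regression. This is a simplification, not the full model: it treats the observation noise as Gaussian with an assumed variance `R` instead of Student-t, and the function names are hypothetical.

```python
from statistics import NormalDist


def kalman_step(beta, P, x, y, Q, R):
    """One predict/update step for y = x'beta + noise with random-walk beta.

    beta, x : lists of length n; P, Q : n-by-n lists of lists; y, R : floats.
    Returns the updated (beta, P) plus the one-step predictive mean and variance.
    """
    n = len(beta)
    # Predict: the random walk leaves the mean unchanged and adds Q to the covariance.
    P_pred = [[P[i][j] + Q[i][j] for j in range(n)] for i in range(n)]
    Px = [sum(P_pred[i][j] * x[j] for j in range(n)) for i in range(n)]
    y_hat = sum(b * xi for b, xi in zip(beta, x))        # predictive mean
    S = sum(xi * pxi for xi, pxi in zip(x, Px)) + R      # predictive variance
    # Update: scalar observation, so the gain is a vector.
    K = [pxi / S for pxi in Px]
    resid = y - y_hat
    beta_new = [b + k * resid for b, k in zip(beta, K)]
    P_new = [[P_pred[i][j] - K[i] * Px[j] for j in range(n)] for i in range(n)]
    return beta_new, P_new, y_hat, S


def predictive_quantile(y_hat, S, q):
    """Quantile of the Gaussian predictive distribution, e.g. q=0.95 for p95."""
    return NormalDist(mu=y_hat, sigma=S ** 0.5).inv_cdf(q)


beta1, P1, y_hat1, S1 = kalman_step(
    beta=[0.0, 0.0], P=[[1.0, 0.0], [0.0, 1.0]],
    x=[1.0, 0.0], y=2.0,
    Q=[[0.0, 0.0], [0.0, 0.0]], R=1.0,
)
p95 = predictive_quantile(0.0, 4.0, 0.95)
```

A Student-t likelihood would need a robust variant (for example, down-weighting large residuals), which this sketch does not attempt.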

Use forgetting or adaptive process noise \(Q\):
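As an illustration of both options, the sketch below shows a forgetting factor applied as covariance inflation, and a process-noise matrix scaled by a regime score. The helper names, the assumption that the regime score lies in [0, 1], and the `max_scale` default are all hypothetical choices for the example.

```python
def inflate_covariance(P, forget=0.98):
    """Forgetting: divide the covariance by a factor in (0, 1] before each update.

    Smaller values discount old data faster; forget=1.0 means no forgetting.
    """
    if not 0.0 < forget <= 1.0:
        raise ValueError("forget must be in (0, 1]")
    return [[p / forget for p in row] for row in P]


def adaptive_q(q_base, regime_score, max_scale=10.0):
    """Scale a baseline Q up as the regime score rises (score assumed in [0, 1])."""
    score = min(max(regime_score, 0.0), 1.0)   # clamp out-of-range scores
    scale = 1.0 + (max_scale - 1.0) * score
    return [[q * scale for q in row] for row in q_base]


inflated = inflate_covariance([[1.0]], forget=0.5)
calm_q = adaptive_q([[0.1]], regime_score=0.0)
stressed_q = adaptive_q([[0.1]], regime_score=1.0)
```

Either knob widens the posterior so the coefficients can move faster when conditions change; using both at once usually needs care to avoid double counting.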


5) Decision Policy (cost + risk)

Select participation/aggressiveness action \(a_t\) by minimizing:

\[ \min_{a_t} \; \mathbb{E}[C_t(a_t)] + \lambda \cdot \mathrm{Var}(C_t(a_t)) + \gamma \cdot \text{AlphaDecayPenalty}(a_t) \]

Subject to tail constraint:

\[ \Pr(C_t(a_t) > B_t) \le \delta_t \]
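The objective and tail constraint can be sketched over a small discrete set of candidate actions. This example assumes each candidate's cost is approximately normal, so the tail probability comes from a normal CDF; the `Candidate` fields, the `choose_action` name, and the choice to return `None` when nothing is feasible are assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import NormalDist


@dataclass
class Candidate:
    name: str
    mean_cost: float            # predicted E[C_t(a)] in bps
    std_cost: float             # predicted standard deviation of cost, must be > 0
    alpha_decay_penalty: float  # cost of trading too slowly under this action


def choose_action(cands, lam, gamma, budget, delta):
    """Return the feasible candidate with the lowest objective, or None."""
    def tail_prob(c):
        # Pr(C > budget) under the normal approximation.
        return 1.0 - NormalDist(c.mean_cost, c.std_cost).cdf(budget)

    def objective(c):
        return c.mean_cost + lam * c.std_cost ** 2 + gamma * c.alpha_decay_penalty

    feasible = [c for c in cands if tail_prob(c) <= delta]
    if not feasible:
        return None  # caller should drop to a safe fallback mode
    return min(feasible, key=objective)


cands = [
    Candidate("passive", mean_cost=2.0, std_cost=1.0, alpha_decay_penalty=3.0),
    Candidate("neutral", mean_cost=3.0, std_cost=2.0, alpha_decay_penalty=1.0),
    Candidate("aggressive", mean_cost=5.0, std_cost=4.0, alpha_decay_penalty=0.0),
]
best = choose_action(cands, lam=0.1, gamma=1.0, budget=8.0, delta=0.05)
strict = choose_action(cands, lam=0.1, gamma=1.0, budget=8.0, delta=0.001)
none_ok = choose_action(cands, lam=0.1, gamma=1.0, budget=0.0, delta=0.001)
```

Tightening `delta` pushes the choice toward the slower action even when its objective value is worse, which is the intended effect of the tail constraint.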

Where:

Action knobs


6) Guardrails for Live Trading

Minimum production controls:

  1. Data freshness gate: stale book/quote → safe fallback mode
  2. Tail breach gate: if p95 exceeds threshold N times, reduce POV / pause strategy
  3. Drift monitor: posterior residual CUSUM triggers regime reset
  4. Hard risk rails: max order size, max participation, max spread-cross count
  5. Kill-switch ladder: auto downgrade before full stop
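The drift monitor in item 3 can be sketched as a two-sided CUSUM on standardized one-step residuals (residual divided by predictive standard deviation). The class name and the values of the slack `k` and threshold `h` are illustrative assumptions; both would need tuning per strategy.

```python
class ResidualCusum:
    """Two-sided CUSUM over standardized residuals.

    k : slack per observation (drift smaller than this is ignored)
    h : alarm threshold on the cumulative sum
    """

    def __init__(self, k=0.5, h=5.0):
        self.k = k
        self.h = h
        self.pos = 0.0  # accumulates evidence that costs are above forecast
        self.neg = 0.0  # accumulates evidence that costs are below forecast

    def update(self, z):
        """Feed one standardized residual; return True if a regime reset should fire."""
        self.pos = max(0.0, self.pos + z - self.k)
        self.neg = max(0.0, self.neg - z - self.k)
        if self.pos > self.h or self.neg > self.h:
            self.pos = 0.0
            self.neg = 0.0
            return True
        return False


quiet = ResidualCusum()
quiet_alarms = [quiet.update(0.0) for _ in range(20)]

drifting = ResidualCusum()
drift_alarms = [drifting.update(1.5) for _ in range(6)]
```

With `k=0.5` and `h=5.0`, a persistent residual of 1.5 standard deviations adds 1.0 per step, so the alarm fires on the sixth observation and the sums reset.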

7) Vellab + KIS Integration Blueprint

Services

Storage

Operational cadence


8) Evaluation Protocol

Do not evaluate only by average slippage.

Track:

A/B test against baseline static model with identical risk rails.
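As one way to go beyond the average, the sketch below summarizes a test arm by mean slippage, realized p95, and the rate at which realized slippage exceeded the forecast p95. The function name and the choice of a nearest-rank quantile are assumptions for the example; the same function would be run on both arms of the A/B test.

```python
def tail_report(realized, p95_forecast):
    """Summarize realized slippage (bps) against per-order p95 forecasts."""
    if not realized or len(realized) != len(p95_forecast):
        raise ValueError("inputs must be non-empty and the same length")
    n = len(realized)
    # Breach: realized slippage exceeded the forecast p95 for that order.
    breaches = sum(1 for y, q in zip(realized, p95_forecast) if y > q)
    ordered = sorted(realized)
    idx = (95 * n + 99) // 100 - 1     # nearest-rank p95, integer arithmetic
    return {
        "mean": sum(realized) / n,
        "p95_realized": ordered[idx],
        "breach_rate": breaches / n,   # a calibrated p95 should breach about 5% of the time
    }


report = tail_report(
    realized=[float(v) for v in range(1, 21)],
    p95_forecast=[18.0] * 20,
)
```

A breach rate well above 5% suggests the forecast tail is too optimistic, even if the mean looks fine.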


9) Implementation Skeleton (pseudo)

state = load_prior(symbol)
for t in stream(child_order_events):
    x = build_features(t)
    pred = posterior_predict(state, x)   # predictive mean + quantiles (p50/p90/p95)

    action = solve_policy(
        pred_dist=pred,
        alpha_decay=t.alpha_decay,
        tail_budget=t.tail_budget,
        constraints=risk_limits,
    )

    # Apply guardrails before sending, so a breach or drift signal
    # can still downgrade this action
    if breach_or_drift(state, pred, t):
        action = safe_mode(action)
    send_order(action)

    # Update the posterior once a realized fill is available
    if t.has_realized_fill:
        y = realized_slippage(t)
        state = bayes_update(state, x, y, adaptive_Q=t.regime_score)

10) Common Failure Modes


11) Practical Rollout Plan (2 weeks)

  1. Week 1: shadow mode nowcasting + risk telemetry only (no execution control)
  2. Week 2: limited notional pilot with strict tail caps and automatic downgrade
  3. Expand universe only if p95 + breach KPIs beat baseline consistently

The practical edge is not one “perfect model”; it is the closed loop: forecast distribution → risk-aware action → online update → guardrail enforcement.