Censoring-Aware Slippage Modeling Playbook

2026-02-26 · finance

(Partial Fills, Cancels, and Survivorship Bias in Live Execution)

Category: Research (Execution / Slippage Modeling)
Scope: Intraday live execution for single-name and baskets (KRX/NXT portable)


Why this model exists

Most slippage models are trained on filled child orders only. That is convenient—and wrong.

In production, a meaningful share of child intents are:

  - never filled at all,
  - only partially filled, or
  - canceled (by us or by risk rules) before completion.

If we ignore these censored outcomes, the model learns from survivors (orders that did get fills) and systematically underestimates true execution drag.

This playbook adds a censoring-aware layer so expected cost reflects the real question:

“What is expected implementation cost for an intent, not only for already-filled prints?”


Core problem: MNAR labels in execution data

For each child intent at decision time (t), define:

  - (x_t): features observable at decision time,
  - (F_t \in [0,1]): realized fill ratio of the intent,
  - (S_t): realized slippage, observed only on the filled portion ((F_t>0)).
A naive model trains (S_t \sim f(x_t)) only where (F_t>0). But fills are not random; they depend on queue state, urgency, toxicity, and your own policy. So the missingness is MNAR (missing not at random).

Result:

  - backtest cost looks better than live cost,
  - the gap is widest exactly in the regimes (toxic flow, thin books) where fills are hardest to get.
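The survivorship effect is easy to demonstrate with a toy simulation (all parameters below are hypothetical): when fill probability falls with toxicity while slippage rises with it, the filled-only average is systematically optimistic.

```python
import numpy as np

# Toy MNAR demonstration (hypothetical parameters): toxic intents slip
# more AND fill less often, so averaging over filled events only
# understates the true expected slippage across all intents.
rng = np.random.default_rng(0)
n = 100_000
toxicity = rng.uniform(0.0, 1.0, n)                       # latent difficulty
slippage = 2.0 + 6.0 * toxicity + rng.normal(0, 1, n)     # bps, worse when toxic
filled = rng.uniform(0.0, 1.0, n) > toxicity              # P(fill) = 1 - toxicity

true_mean = slippage.mean()              # over all intents (analytically ~5 bps)
survivor_mean = slippage[filled].mean()  # toxic tail under-sampled: optimistic
print(f"true {true_mean:.2f} bps vs filled-only {survivor_mean:.2f} bps")
```

The gap here is roughly 1 bp by construction; in live data its size depends on how strongly fill probability and cost share drivers.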

Target quantity (what to predict)

For each intent, predict expected all-in execution tax:

[ \mathbb{E}[C_t \mid x_t] = \mathbb{E}[F_t \cdot S_t \mid x_t] + \mathbb{E}[(1-F_t) \cdot O_t \mid x_t] ]

Where:

  - (F_t): fill ratio of the intent,
  - (S_t): slippage on the filled portion,
  - (O_t): opportunity cost attributed to the unfilled residue (adverse drift plus the cost of completing later).
Operationally, this decomposes into three heads:

  1. Fill head: (\hat{p}_t = P(F_t>0\mid x_t)), plus expected fill ratio (\widehat{\phi}_t=\mathbb{E}[F_t\mid x_t])
  2. Price head: (\hat{s}_t = \mathbb{E}[S_t\mid x_t, F_t>0])
  3. Opportunity head: (\hat{o}_t = \mathbb{E}[O_t\mid x_t, F_t=0]) (or from replay)

Then:

[ \widehat{C}_t = \widehat{\phi}_t \cdot \hat{s}_t + (1-\widehat{\phi}_t) \cdot \hat{o}_t ]
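A quick worked example of the blend (hypothetical numbers): even a modest no-fill probability can dominate the estimate when the opportunity cost on the residue is large.

```python
# Hypothetical intent: 70% expected fill ratio, 3 bps conditional slippage,
# 8 bps opportunity cost on the unfilled residue.
phi, s_hat, o_hat = 0.7, 3.0, 8.0
c_hat = phi * s_hat + (1 - phi) * o_hat   # blended all-in cost in bps
print(c_hat)  # 4.5 bps: the 30% no-fill branch adds more than the filled branch's 2.1
```

A filled-only model would report 3 bps for this intent; the censoring-aware estimate is half again larger.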


Modeling blueprint

1) Fill/Survival model

Use a discrete-time survival (hazard) model over short time buckets (e.g., 100ms–1s):

[ h_{t,k} = P(\text{fill in bucket }k \mid \text{not filled before}, x_{t,k}) ]

Derive:

  - fill probability within the horizon: (\hat{p}_t = 1 - \prod_k (1-h_{t,k})),
  - expected fill ratio (\widehat{\phi}_t) and expected time-to-fill from the same hazard curve.

Useful features:

  - queue position and depth ahead at the quoted level,
  - spread, book imbalance, short-horizon volatility,
  - flow-toxicity proxies,
  - urgency / time remaining to deadline,
  - own policy state and router/policy version.
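Deriving the fill curve from per-bucket hazards is a one-liner; the sketch below uses made-up bucket hazards.

```python
import numpy as np

def fill_curve(hazards):
    """P(filled by end of bucket k) from per-bucket hazards h_{t,k}."""
    # Survival probability: P(still unfilled after bucket k)
    survival = np.cumprod(1.0 - np.asarray(hazards, dtype=float))
    return 1.0 - survival

h = [0.05, 0.08, 0.12, 0.15]        # hypothetical sub-second bucket hazards
p_by_bucket = fill_curve(h)
p_horizon = p_by_bucket[-1]         # \hat{p}_t: fill probability within the horizon
```

The same curve gives expected time-to-fill (sum of survival probabilities times bucket width) without a separate model.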

2) Conditional slippage model (filled branch)

Model (S_t) on filled events only, but correct the selection bias via weighting.

Two practical choices:

  - inverse-probability weighting (IPW): weight each filled event by (1/\hat{p}_t) from the fill head, clipped to control variance,
  - a Heckman-style two-stage correction that adds a selection term derived from the fill model.
Train with robust loss (Huber/quantile) because tails matter more than mean.
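A sketch of the bias-corrected training loss, assuming the IPW route with weights from the fill head; the clipping threshold is a tunable assumption, not a prescribed value.

```python
import numpy as np

def ipw_weights(p_fill, clip=0.05):
    """Inverse-probability-of-fill weights; clip small probabilities so
    rarely-filled (toxic) events don't get unbounded influence."""
    return 1.0 / np.clip(p_fill, clip, 1.0)

def weighted_pinball(y, y_hat, q, w):
    """IPW-weighted quantile (pinball) loss at quantile q in (0, 1)."""
    e = y - y_hat
    loss = np.maximum(q * e, (q - 1.0) * e)
    return np.average(loss, weights=w)
```

Any quantile learner that accepts sample weights can consume `ipw_weights` directly; the pinball loss above is the offline check that the weighting actually moved the quantile fits.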


3) Opportunity-cost model (unfilled branch)

Estimate what happens to the leftover quantity when the first intent fails:

  - adverse drift between cancel and re-entry,
  - the extra cost of completing later (re-quoting or crossing the spread),
  - deadline-forced completion at whatever price is available.

Best source: counterfactual/replay engine with policy logs. If unavailable, approximate with:

  - mark-outs of the reference price over the remaining horizon, plus
  - the expected spread-crossing cost of finishing the residue at the deadline.

4) Joint estimator and uncertainty

Return not just point estimate but interval:

[ \widehat{C}_t^{q50}, \widehat{C}_t^{q90}, \widehat{C}_t^{q99} ]

Use conformal or quantile calibration by symbol-liquidity bucket. Controllers should consume q90+ for guardrails, not the mean alone.
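One concrete way to do the calibration step is split conformal per bucket. The sketch below (an assumed setup, not the only option) computes the additive offset that makes nominal q90 predictions achieve 90% empirical coverage on a held-out calibration set.

```python
import numpy as np

def conformal_offset(calib_resid, alpha=0.10):
    """Split conformal: return the ceil((n+1)(1-alpha))-th smallest
    calibration residual (y - y_hat_q90). Adding it to future q90
    predictions targets (1 - alpha) coverage per bucket."""
    r = np.sort(np.asarray(calib_resid, dtype=float))
    n = len(r)
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    return r[min(k, n) - 1]
```

Run it separately per symbol-liquidity bucket; a single pooled offset hides exactly the per-bucket miscalibration the guardrail is meant to catch.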


Training dataset design (critical)

Unit of analysis = intent event, not trade print.

Include for each intent:

  - decision-time features (x_t) (as-of snapshot, no revisions),
  - order parameters: side, size, limit price, time-in-force,
  - outcome labels: fill ratio, fill timestamps, realized slippage, cancel reason,
  - final disposition of any unfilled residue,
  - policy/router version active at the time.

Avoid leakage:

  - features must be frozen at decision time; never recompute them from later data,
  - labels use only post-decision information,
  - split train/validation by time, not randomly.

Evaluation scorecard

Primary offline metrics:

  - fill-head calibration (reliability of (\hat{p}_t), Brier score),
  - pinball loss on slippage quantiles (q50/q90/q99),
  - interval coverage of (\widehat{C}_t^{q90}) and (\widehat{C}_t^{q99}) by liquidity bucket.

Policy-facing metrics:

  - realized vs predicted cost at the intent level,
  - q90 exceedance rate (should sit near 10%),
  - completion rate at deadline.
Ablation to prove value:

  1. filled-only baseline,
  2. fill+price two-head,
  3. full censoring-aware (fill+price+opportunity).

If (3) doesn’t improve tail control, calibration is not production-ready.


Online controller integration

At each decision tick, compute for each candidate action (a) (stay passive, reprice, cross):

  - (\widehat{C}_t(a)): expected all-in cost of the action,
  - (\text{CompletionRisk}_t(a)): probability of reaching the deadline with residue remaining.
Choose action minimizing risk-adjusted objective:

[ \arg\min_a \; \widehat{C}_t(a) + \lambda \cdot \text{CompletionRisk}_t(a) ]

with (\lambda) increasing as deadline approaches.
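The schedule for (\lambda) is a policy choice; one simple option (illustrative bounds, not a recommendation) is a convex ramp in the elapsed fraction of the horizon, so patience is cheap early and expensive late.

```python
def urgency_lambda(time_left_s, horizon_s, lam_min=0.1, lam_max=5.0):
    """Completion-risk weight: near lam_min early in the horizon, rising
    quadratically toward lam_max as the deadline approaches.
    Bounds and the quadratic shape are illustrative assumptions."""
    frac_used = max(0.0, min(1.0, 1.0 - time_left_s / horizon_s))
    return lam_min + (lam_max - lam_min) * frac_used ** 2
```

The quadratic keeps the controller passive for most of the horizon and concentrates the panic premium in the final stretch; a linear or exponential ramp slots in the same way.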

Simple tiering:

  - low (\widehat{C}^{q90}) and ample time: stay passive,
  - rising cost or shrinking time: reprice / join nearer the touch,
  - high completion risk near deadline: cross the spread and pay up.

Pseudocode (intent-centric)

for intent in decision_stream:
    x = build_features(intent)

    phi = fill_ratio_model.predict(x)          # E[F|x]
    s_q50, s_q90 = slippage_model.predict(x)   # conditional on fill, bias-corrected
    o_q50, o_q90 = opp_model.predict(x)        # unfilled residue penalty

    c_q50 = phi * s_q50 + (1 - phi) * o_q50
    c_q90 = phi * s_q90 + (1 - phi) * o_q90

    action = policy_min_cost_under_budget(c_q90, completion_risk(intent))
    send(action)

Common failure modes

  1. Print-level training only
    Ignores canceled intents; produces chronic optimism.

  2. No policy-version feature
    Model drift appears as market drift.

  3. Opportunity cost set to zero
    Encourages fake patience and deadline panic later.

  4. Mean-only optimization
    Tail blowups continue even when average improves.

  5. No closed-loop monitoring
    Calibration decays silently after router/risk rule changes.


Minimal production monitoring

Run weekly champion-challenger:

  - champion = current production model; challenger = retrained censoring-aware model,
  - compare realized intent-level cost and q90 exceedance by symbol-liquidity bucket,
  - alert on calibration drift after any router or risk-rule change,
  - promote the challenger only when tail coverage improves, not just the mean.
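A minimal drift check for the q90 guardrail, assuming a weekly count of exceedances is available: flag when the realized exceedance rate leaves a binomial tolerance band around the 10% nominal target.

```python
import math

def q90_drift_alert(n_intents, n_exceed, target=0.10, z=3.0):
    """True if the realized q90 exceedance rate sits more than z binomial
    standard errors from the nominal target; z is an assumed threshold."""
    se = math.sqrt(target * (1.0 - target) / n_intents)
    return abs(n_exceed / n_intents - target) > z * se
```

Run it per symbol-liquidity bucket; pooled counts can net out an under-covering bucket against an over-covering one and stay silent through exactly the decay this section is about.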

One-line takeaway

A slippage model trained only on fills is a survivorship trap; model costs at the intent level (fill + no-fill opportunity branch) to control real-world execution tails instead of paper averages.