Slippage Modeling with Point-in-Time Feature Integrity

2026-03-27 · finance

A Practical Playbook to Kill Lookahead Leakage Before It Hits Live PnL

Why this note: Many slippage models look great offline not because they are smart, but because they accidentally peek into the future (late prints, corrected books, revised reference data). This note is about making execution-cost modeling time-honest.


1) The Hidden Failure Mode

A model leaks when feature values at decision time t_d are built from information that only became available at some t > t_d.

In slippage pipelines this happens more often than teams think: late prints get backfilled into bars, corrected books replace the state the strategy actually saw, and revised reference data silently rewrites history (Section 3 catalogs the common sources).

Result: offline MAE improves, live markout worsens.


2) Time Model You Need (No Exceptions)

For every signal and every label, store at least:

  1. event_time: when the event occurred at its source.
  2. ingest_time: when your system actually received it.
  3. decision_time: when the model had to act.

Rule: feature eligibility is based on ingest_time <= decision_time, not event time.
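The eligibility rule can be enforced in pandas with a backward as-of join on ingest_time. This is a minimal sketch with illustrative column names, not a production feature store:

```python
import pandas as pd

decisions = pd.DataFrame({
    "order_id": [1, 2],
    "decision_time": pd.to_datetime(["2026-03-27 10:00:00",
                                     "2026-03-27 10:00:05"]),
})
features = pd.DataFrame({
    "ingest_time": pd.to_datetime(["2026-03-27 09:59:58",
                                   "2026-03-27 10:00:03"]),
    "feature_value": [0.5, 0.7],
})

# For each decision, take the latest feature row with
# ingest_time <= decision_time -- rows ingested later are invisible.
joined = pd.merge_asof(
    decisions.sort_values("decision_time"),
    features.sort_values("ingest_time"),
    left_on="decision_time",
    right_on="ingest_time",
    direction="backward",
)
```

Every row in `joined` satisfies the eligibility rule by construction; event_time plays no role in the join.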


3) Common Leak Sources in Execution Research

  1. Book-state leakage
    • Using reconstructed L2 that includes packets received after action.
  2. Benchmark leakage
    • Arrival/VWAP benchmark computed from cleaned bars unavailable in real-time.
  3. Reference data leakage
    • Tick-size, lot-size, fee tier, SSR flags, corporate-action state applied with revision hindsight.
  4. Transport-state leakage
    • Using final ACK/cancel status in pre-trade features.
  5. Cross-venue synchronization leakage
    • Assuming zero skew between venues/feeds in historical replay.
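As one concrete instance of leak source 1, book-state leakage can be audited by comparing packet receive times against the action timestamp. A sketch with hypothetical names:

```python
from datetime import datetime, timedelta

def count_future_packets(packet_recv_times, action_time):
    """Count packets baked into a reconstructed L2 snapshot that were
    actually received after the action they supposedly preceded."""
    return sum(1 for t in packet_recv_times if t > action_time)

action = datetime(2026, 3, 27, 10, 0, 0)
recv = [
    action - timedelta(milliseconds=5),
    action + timedelta(milliseconds=2),  # arrived after the action: leaked
    action - timedelta(milliseconds=1),
]
leaks = count_future_packets(recv, action)
```

A nonzero count means the replayed book is not the book the strategy could have seen.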

4) Point-in-Time Join Contract (Minimal SQL Pattern)

-- For each decision row d, pick the latest feature snapshot f
-- that was actually ingested before the decision.
SELECT d.order_id,
       d.decision_time,
       f.feature_value,
       f.ingest_time AS feature_ingest_time
FROM decisions d
LEFT JOIN LATERAL (
  SELECT feature_value, ingest_time
  FROM feature_store f
  WHERE f.symbol = d.symbol
    AND f.ingest_time <= d.decision_time
  ORDER BY f.ingest_time DESC
  LIMIT 1
) f ON TRUE;

If this constraint is missing, assume leakage until proven otherwise.
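To act on that default, any existing training table can be audited post hoc. A sketch assuming the column names from the SQL above:

```python
import pandas as pd

def assert_pit_contract(df):
    """Raise if any feature snapshot was ingested after its decision."""
    bad = df[df["feature_ingest_time"] > df["decision_time"]]
    if not bad.empty:
        raise ValueError(f"PIT contract violated on {len(bad)} rows")
    return True

df = pd.DataFrame({
    "decision_time": pd.to_datetime(["2026-03-27 10:00:00"]),
    "feature_ingest_time": pd.to_datetime(["2026-03-27 09:59:59"]),
})
ok = assert_pit_contract(df)
```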


5) Leakage Diagnostics (Production KPIs)

Track these continuously: FAV, LG, SLD, PCR, RIM (the PIT diagnostics listed again in the Section 9 checklist).

Red line: FAV > 0.1% in active regimes should block promotion.
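A monitoring sketch under an assumed reading of FAV (the rate of future-availability violations, i.e. the fraction of joined rows whose feature ingest_time exceeds decision_time; this is my interpretation of the acronym, not a stated definition):

```python
import pandas as pd

FAV_RED_LINE = 0.001  # 0.1%: breach blocks model promotion

def fav_rate(df):
    """Fraction of rows whose feature was ingested after the decision
    (assumed definition of FAV; rename to match your KPI glossary)."""
    return float((df["feature_ingest_time"] > df["decision_time"]).mean())

df = pd.DataFrame({
    "decision_time": pd.to_datetime(["2026-03-27 10:00:00"] * 4),
    "feature_ingest_time": pd.to_datetime([
        "2026-03-27 09:59:59",
        "2026-03-27 09:59:58",
        "2026-03-27 10:00:01",  # ingested after the decision: a violation
        "2026-03-27 09:59:57",
    ]),
})
blocked = fav_rate(df) > FAV_RED_LINE
```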


6) Training/Evaluation Setup that Resists Leakage

  1. Purged walk-forward splits (Lopez de Prado style): prevent temporal overlap contamination.
  2. Embargo window around split boundaries: drop samples near boundaries where latent leakage is highest.
  3. Decision-time feature freeze: train and infer from same feature materialization logic.
  4. Dual backtests:
    • Idealized (clean final data)
    • Deployable PIT (realistically delayed/partial data)

Promote only when deployable PIT metrics pass.
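A simplified split generator in the spirit of steps 1-2, collapsing purge and embargo into a single gap before each test fold (chronologically ordered samples assumed):

```python
def purged_walk_forward(n_samples, n_folds=3, embargo=2):
    """Yield (train, test) index lists for walk-forward evaluation.
    An embargo of `embargo` samples is dropped just before each test
    fold so labels overlapping the boundary cannot leak into training."""
    fold = n_samples // n_folds
    for k in range(1, n_folds):
        test_start = k * fold
        test_end = min(test_start + fold, n_samples)
        train = list(range(max(test_start - embargo, 0)))
        test = list(range(test_start, test_end))
        yield train, test

splits = list(purged_walk_forward(12, n_folds=3, embargo=2))
```

Each successive split trains strictly on the past, with the embargo gap excluded from training.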


7) Model Design Implications

Leakage-safe slippage models should explicitly consume data-freshness state: per-feature age at decision time (decision_time minus ingest_time), staleness flags, and which inputs were missing or delayed when the decision was made.

Aging/freshness is not metadata; it is predictive microstructure context.
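One way to expose freshness to the model; the feature names and the 500 ms staleness cutoff are illustrative assumptions:

```python
import pandas as pd

def add_freshness_features(df, stale_after_ms=500.0):
    """Derive freshness inputs directly from the PIT clocks."""
    out = df.copy()
    age_ms = (out["decision_time"]
              - out["feature_ingest_time"]).dt.total_seconds() * 1000.0
    out["feature_age_ms"] = age_ms            # snapshot age at decision time
    out["feature_is_stale"] = age_ms > stale_after_ms
    return out

df = pd.DataFrame({
    "decision_time": pd.to_datetime(["2026-03-27 10:00:00.600"]),
    "feature_ingest_time": pd.to_datetime(["2026-03-27 10:00:00.000"]),
})
out = add_freshness_features(df)
```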


8) Safe Rollout Pattern

  1. PIT linter in CI
    • fail build if schema lacks event/ingest/decision clocks.
  2. Shadow mode with frozen features (2+ weeks)
    • compare live-shadow vs offline-shadow residuals.
  3. Canary by symbol-liquidity buckets
    • start with liquid names + low volatility sessions.
  4. Leakage circuit breaker
    • auto-fallback if FAV/LG/SLD breaches threshold.
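The step-1 linter can start as a plain schema check wired into the build; a sketch with an assumed list-of-columns schema representation:

```python
REQUIRED_CLOCKS = {"event_time", "ingest_time", "decision_time"}

def lint_pit_schema(columns):
    """Return True if all three PIT clocks are present; raise otherwise,
    which fails the build when run as a CI step."""
    missing = REQUIRED_CLOCKS - set(columns)
    if missing:
        raise SystemExit(f"PIT lint failed, missing clocks: {sorted(missing)}")
    return True

ok = lint_pit_schema(["symbol", "event_time", "ingest_time",
                      "decision_time", "mid_px"])
```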

9) Fast Implementation Checklist

[ ] Add event_time + ingest_time + decision_time to all datasets
[ ] Enforce ingest_time <= decision_time in feature joins
[ ] Version reference data (fee tiers, lot/tick, SSR, corp actions)
[ ] Add PIT diagnostics (FAV, LG, SLD, PCR, RIM)
[ ] Run purged walk-forward + embargo evaluation
[ ] Gate release on deployable-PIT metrics, not clean-replay metrics

10) References

  • López de Prado, M., Advances in Financial Machine Learning, Wiley, 2018 (purged cross-validation and embargo, cited in Section 6).


TL;DR

If your slippage model can “see” data that was not available at decision time, your backtest edge is fake. Use strict point-in-time joins, version every mutable reference source, monitor leakage KPIs in production, and only ship models that survive deployable-PIT evaluation.