Markout-Horizon Mismatch Slippage Playbook

Focus: prevent execution policies from overfitting to short-horizon markouts (milliseconds/seconds) while the real PnL objective lives on longer horizons (seconds/minutes), where impact decay, drift, and completion risk behave differently.

1) Why this matters in production

A common failure mode in live execution:

policy/tactic selection is tuned on very short markouts (e.g., 100ms-1s),
but desk success is judged by implementation shortfall + completion quality over longer windows (e.g., 30s-15m).

This mismatch can produce systematic errors:

False confidence in aggressive actions (great 200ms markout, poor 30s outcome),
Maker under-utilization (short-horizon adverse prints hide medium-horizon price improvement),
Late panic catch-up when short-horizon optimization starves completion,
Regime-fragile behavior (works in stable tape, fails in transition/liquidity-shock windows).

A short horizon is not wrong; it is just incomplete as a standalone objective.

2) Core definitions

For child fill (i) at time (t_i), side (s_i \in \{+1,-1\}) (+1 for a buy, -1 for a sell), execution price (p_i), and midprice (m_t):

[ \text{Markout}_i(\tau) = s_i \cdot (m_{t_i+\tau} - p_i) ]

Use a horizon set such as:

[ \tau \in \{100\text{ms}, 500\text{ms}, 1\text{s}, 5\text{s}, 30\text{s}, 120\text{s}\} ]
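
A minimal sketch of the markout definition evaluated over the horizon set above. The helper names (mid_at, markouts), the seconds-based timestamps, and the last-observation lookup for the midprice are assumptions made for illustration; a production system would read mids from its own market-data store.

```python
from bisect import bisect_right

# Horizon set from the text, expressed in seconds.
HORIZONS_S = [0.1, 0.5, 1.0, 5.0, 30.0, 120.0]

def mid_at(mid_times, mid_prices, t):
    """Last observed midprice at or before time t (mid_times sorted ascending)."""
    idx = bisect_right(mid_times, t) - 1
    if idx < 0:
        raise ValueError("no midprice observed at or before t")
    return mid_prices[idx]

def markouts(side, fill_time, fill_price, mid_times, mid_prices, horizons=HORIZONS_S):
    """Signed markout side * (mid at fill_time + tau - fill_price) per horizon tau."""
    return {
        tau: side * (mid_at(mid_times, mid_prices, fill_time + tau) - fill_price)
        for tau in horizons
    }

# Hypothetical tape: a buy at 100.01 that looks fine at 1s and poor at 30s.
example = markouts(+1, 0.5, 100.01,
                   mid_times=[0.0, 1.0, 10.0, 40.0],
                   mid_prices=[100.00, 100.02, 99.98, 99.95])
```

Note that a horizon reaching past the last mid observation silently reuses the stale last mid in this sketch; real label pipelines would usually drop or flag such fills.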

Parent-order objective (simplified):

[ J = \mathbb{E}[IS] + \lambda \cdot \text{CVaR}_{q}(IS) + \eta \cdot P(\text{deadline miss}) ]

Key point: optimizing only (\text{Markout}(1s)) is generally not equivalent to minimizing (J).
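
As a rough illustration of the parent-order objective, the sketch below evaluates it empirically from a sample of parent orders. The function names, the empirical tail-mean estimator for CVaR, and the default q are assumptions; IS is treated as a cost (higher is worse).

```python
import math

def empirical_cvar(costs, q):
    """Mean of the worst (1 - q) fraction of costs, where higher cost is worse."""
    ordered = sorted(costs)
    # round() guards against float noise such as (1 - 0.95) * 20 = 1.0000000000000009
    k = max(1, math.ceil(round((1.0 - q) * len(ordered), 9)))
    tail = ordered[-k:]
    return sum(tail) / len(tail)

def parent_objective(is_costs, missed_deadline, lam, eta, q=0.95):
    """J = mean(IS) + lam * CVaR_q(IS) + eta * P(deadline miss), all estimated from samples."""
    mean_is = sum(is_costs) / len(is_costs)
    miss_prob = sum(missed_deadline) / len(missed_deadline)
    return mean_is + lam * empirical_cvar(is_costs, q) + eta * miss_prob
```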

3) Observable diagnostics for horizon mismatch

3.1 Horizon Inversion Rate (HIR)

Fraction of fills where the short- and long-horizon markouts disagree in sign, writing (M(\tau)) as shorthand for the markout at horizon (\tau):

[ HIR = P\big(\text{sign}(M(1s)) \neq \text{sign}(M(30s))\big) ]

A high HIR means the sign of the short-horizon markout is a poor predictor of the longer-horizon outcome, so rankings built on it are unstable.

3.2 Delayed Regret Delta (DRD)

Gap between short-horizon win rate and long-horizon win rate:

[ DRD = P(M(1s)>0) - P(M(30s)>0) ]

Large positive DRD indicates “early win, later pain.”
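
HIR and DRD can both be computed from paired per-fill markouts at a short and a long horizon. The sketch below assumes two equal-length lists (one entry per fill) and treats an exactly zero markout as its own sign; both choices are illustrative.

```python
def _sign(x):
    """Return -1, 0, or +1."""
    return (x > 0) - (x < 0)

def horizon_inversion_rate(short, long):
    """Fraction of fills whose short- and long-horizon markouts differ in sign."""
    pairs = list(zip(short, long))
    return sum(_sign(s) != _sign(l) for s, l in pairs) / len(pairs)

def delayed_regret_delta(short, long):
    """Short-horizon win rate minus long-horizon win rate."""
    win_short = sum(s > 0 for s in short) / len(short)
    win_long = sum(l > 0 for l in long) / len(long)
    return win_short - win_long

# Hypothetical per-fill markouts at 1s and 30s.
m_1s = [0.02, 0.01, -0.01, 0.03]
m_30s = [-0.01, 0.02, -0.02, -0.04]
hir = horizon_inversion_rate(m_1s, m_30s)
drd = delayed_regret_delta(m_1s, m_30s)
```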

3.3 Horizon Consistency Spread (HCS)

Cross-horizon dispersion per tactic/venue bucket:

[ HCS = \operatorname{Std}_{\tau \in \mathcal{T}}\big(\mathbb{E}[M(\tau)]\big) ]

Use it to identify unstable tactics that look good on only one horizon.
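
A small sketch of HCS for a single tactic/venue bucket. It assumes the bucket's markouts are held in a dict keyed by horizon and uses the population standard deviation; whether to use the sample or population form is a choice the text leaves open.

```python
from statistics import mean, pstdev

def horizon_consistency_spread(markouts_by_horizon):
    """Std across horizons of the per-horizon mean markout for one bucket."""
    horizon_means = [mean(values) for values in markouts_by_horizon.values()]
    return pstdev(horizon_means)

# Hypothetical bucket that looks good at 1s and poor at 30s.
unstable = horizon_consistency_spread({1.0: [0.02, 0.04], 30.0: [-0.01, -0.03]})
stable = horizon_consistency_spread({1.0: [0.01, 0.01], 30.0: [0.01, 0.01]})
```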

3.4 Completion-Adjusted Markout (CAM)

Blend realized markouts with a cost proxy for the unfilled residual:

[ CAM(\tau)= \text{FilledMarkout}(\tau) - c_{res}\cdot\text{ResidualRatio} ]

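Here (c_{res}) is a per-unit cost proxy for the residual and ResidualRatio is the unfilled fraction of the target quantity. The adjustment prevents under-filling tactics from appearing artificially strong.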
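
The sketch below shows one way CAM might be computed for a single parent order. Taking FilledMarkout as the simple average of fill markouts (rather than a quantity-weighted average) and the argument names are assumptions.

```python
def completion_adjusted_markout(filled_markouts, filled_qty, target_qty, c_res):
    """Average filled markout minus c_res times the unfilled fraction of the target."""
    residual_ratio = 1.0 - filled_qty / target_qty
    if filled_markouts:
        filled_mean = sum(filled_markouts) / len(filled_markouts)
    else:
        filled_mean = 0.0  # nothing filled: only the residual penalty remains
    return filled_mean - c_res * residual_ratio

# Hypothetical comparison: a passive tactic with better raw markout but only 40% filled,
# against an aggressive tactic that completes the order.
passive = completion_adjusted_markout([0.03, 0.05], filled_qty=400, target_qty=1000, c_res=0.10)
aggressive = completion_adjusted_markout([0.01, 0.01], filled_qty=1000, target_qty=1000, c_res=0.10)
```

With these illustrative numbers the passive tactic's raw markout advantage is more than offset by its residual penalty.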

4) Modeling architecture

Use a multi-horizon, multi-head model instead of a single-horizon label.

4.1 Output heads

Predict quantiles (not only mean) for each horizon:

[ \hat M_q(\tau),\quad q\in\{0.5,0.9,0.99\},\ \tau\in\mathcal{T} ]
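
Quantile heads are commonly trained with the pinball (quantile) loss. The sketch below is one such formulation averaged over horizons and quantiles; the playbook does not prescribe a loss, so the choice of pinball loss and the data layout are assumptions.

```python
QUANTILES = (0.5, 0.9, 0.99)

def pinball_loss(y_true, y_pred, q):
    """Quantile loss: under-prediction is weighted by q, over-prediction by (1 - q)."""
    diff = y_true - y_pred
    return max(q * diff, (q - 1.0) * diff)

def multi_horizon_loss(realized, predicted, quantiles=QUANTILES):
    """Average pinball loss over all horizons and quantiles.

    realized:  {tau: realized markout}
    predicted: {tau: {q: predicted q-quantile of the markout}}
    """
    total = 0.0
    for tau, y in realized.items():
        for q in quantiles:
            total += pinball_loss(y, predicted[tau][q], q)
    return total / (len(realized) * len(quantiles))
```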

4.2 Structural constraints

Add soft consistency penalties:

adjacent-horizon smoothness,
sign-flip penalties when not justified by regime features,
monotone-risk constraints on spread/quote-age/latency variables.
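
Two of the soft penalties listed above could be sketched as follows, operating on one fill's predicted mean markouts ordered from shortest to longest horizon. The functional forms (squared adjacent differences, and the smaller magnitude across each sign flip) and the flip_justified flag are hypothetical choices, not forms given in the playbook; the monotone-risk constraint is omitted.

```python
def smoothness_penalty(preds):
    """Sum of squared differences between predictions at adjacent horizons."""
    return sum((b - a) ** 2 for a, b in zip(preds, preds[1:]))

def sign_flip_penalty(preds, flip_justified=False):
    """Penalize sign changes between adjacent horizons unless regime features justify them.

    Each flip costs the smaller of the two magnitudes, so flips between
    near-zero predictions are cheap and confident flips are expensive.
    """
    if flip_justified:
        return 0.0
    return sum(min(abs(a), abs(b)) for a, b in zip(preds, preds[1:]) if a * b < 0)

# Hypothetical predictions at three increasing horizons.
preds = [0.02, 0.01, -0.01]
penalty = smoothness_penalty(preds) + sign_flip_penalty(preds)
```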

4.3 Regime gate

Gate by liquidity/volatility/latency state:

STABLE
TRANSITION
SHOCK

Estimate:

[ P(R_t\mid X_t),\quad \hat M(\tau)=\sum_R P(R_t=R\mid X_t)\,\hat M_R(\tau) ]

This reduces “one-size-fits-all horizon behavior.”
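
The regime-weighted prediction is a probability-weighted mixture of per-regime predictions. A minimal sketch, assuming the regime classifier and the per-regime heads already exist and return plain dicts (the data layout is hypothetical):

```python
def mix_by_regime(regime_probs, per_regime_markout):
    """Mixture prediction per horizon: sum over regimes of P(regime) * regime-specific markout.

    regime_probs:       {regime: probability}, expected to sum to 1
    per_regime_markout: {regime: {tau: predicted markout}}
    """
    total = sum(regime_probs.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("regime probabilities must sum to 1")
    horizons = next(iter(per_regime_markout.values())).keys()
    return {
        tau: sum(p * per_regime_markout[r][tau] for r, p in regime_probs.items())
        for tau in horizons
    }

# Hypothetical inputs for the three regimes named in the text.
probs = {"STABLE": 0.7, "TRANSITION": 0.2, "SHOCK": 0.1}
heads = {
    "STABLE": {1.0: 0.01, 30.0: 0.02},
    "TRANSITION": {1.0: 0.01, 30.0: -0.01},
    "SHOCK": {1.0: 0.02, 30.0: -0.05},
}
mixed = mix_by_regime(probs, heads)
```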

5) Policy layer: from markouts to action score

For candidate action (a), compute horizon-weighted utility:

[ U(a)=\sum_{\tau\in\mathcal{T}} w_\tau(t_{deadline})\cdot \hat M(a,\tau) -\lambda\,\widehat{\text{CVaR}}_q(a)-\eta\,\widehat{\text{MissProb}}(a) ]

where the weights (w_\tau) shift with the time remaining until the deadline:

far from deadline -> more weight on medium/long horizon,
near deadline -> more weight on short horizon + completion.

This prevents “always optimize 1s markout” behavior.
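
One way to make the weights deadline-dependent is sketched below. The exponential form exp(-tau / remaining) is a hypothetical choice, not the playbook's: with a lot of time remaining the weights are close to uniform across horizons, and as the deadline approaches the mass concentrates on the shortest horizons.

```python
import math

def horizon_weights(horizons, remaining_s):
    """Normalized weights exp(-tau / remaining), floored so the shortest horizon keeps mass."""
    scale = max(remaining_s, min(horizons))
    raw = [math.exp(-tau / scale) for tau in horizons]
    total = sum(raw)
    return [r / total for r in raw]

def action_utility(pred_markout, horizons, remaining_s, cvar_hat, miss_prob_hat, lam, eta):
    """Horizon-weighted predicted markout minus tail-risk and deadline-miss penalties."""
    weights = horizon_weights(horizons, remaining_s)
    weighted = sum(w * pred_markout[tau] for w, tau in zip(weights, horizons))
    return weighted - lam * cvar_hat - eta * miss_prob_hat

# Hypothetical actions with identical risk terms but opposite horizon profiles.
horizons = [1.0, 30.0]
cross = {1.0: 0.02, 30.0: -0.03}   # good at 1s, poor at 30s
post = {1.0: -0.01, 30.0: 0.02}    # poor at 1s, good at 30s
far_pref = action_utility(post, horizons, 600.0, 0.0, 0.0, 1.0, 1.0) > \
    action_utility(cross, horizons, 600.0, 0.0, 0.0, 1.0, 1.0)
near_pref = action_utility(cross, horizons, 5.0, 0.0, 0.0, 1.0, 1.0) > \
    action_utility(post, horizons, 5.0, 0.0, 0.0, 1.0, 1.0)
```

In this toy setup the passive action scores higher with ten minutes remaining and the crossing action scores higher with five seconds remaining, which is the qualitative behavior described above.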

6) Live control rules (operator-friendly)

Define a simple state machine:

HORIZON_ALIGNED (low HIR, low DRD)
SHORT_BIASED (high DRD, mild completion risk)
UNSTABLE (high HIR + high volatility/liquidity churn)
SAFE_COMPLETION (deadline risk dominates)

Control examples:

if state == SHORT_BIASED:
  reduce aggressive-cross bonus
  increase passive dwell window (bounded)
  require medium-horizon uplift confirmation

if state == UNSTABLE:
  shrink tactic-switch frequency
  cap venue hopping
  increase uncertainty penalty in action score

if state == SAFE_COMPLETION:
  prioritize completion reliability over short markout edge
  tighten residual budget and escalation ladder
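
The state machine and control rules above might be wired together as in the sketch below. Every threshold, knob name, and scaling factor is a hypothetical placeholder to be calibrated per desk, and the precedence order (deadline risk first, then instability, then short bias) is an assumption.

```python
from enum import Enum

class HorizonState(Enum):
    HORIZON_ALIGNED = "HORIZON_ALIGNED"
    SHORT_BIASED = "SHORT_BIASED"
    UNSTABLE = "UNSTABLE"
    SAFE_COMPLETION = "SAFE_COMPLETION"

def classify_state(hir, drd, miss_prob, churn_high,
                   hir_hi=0.40, drd_hi=0.15, miss_hi=0.20):
    """Map diagnostics to a state; deadline risk takes precedence over the others."""
    if miss_prob >= miss_hi:
        return HorizonState.SAFE_COMPLETION
    if hir >= hir_hi and churn_high:
        return HorizonState.UNSTABLE
    if drd >= drd_hi:
        return HorizonState.SHORT_BIASED
    return HorizonState.HORIZON_ALIGNED

def apply_controls(state, knobs, max_dwell_ms=2000):
    """Return an adjusted copy of the control knobs for the given state."""
    out = dict(knobs)
    if state is HorizonState.SHORT_BIASED:
        out["aggressive_cross_bonus"] *= 0.5
        out["passive_dwell_ms"] = min(out["passive_dwell_ms"] * 2, max_dwell_ms)  # bounded
        out["require_medium_horizon_uplift"] = True
    elif state is HorizonState.UNSTABLE:
        out["max_tactic_switches_per_min"] = max(1, out["max_tactic_switches_per_min"] // 2)
        out["max_venue_hops"] = max(1, out["max_venue_hops"] // 2)
        out["uncertainty_penalty"] *= 2.0
    elif state is HorizonState.SAFE_COMPLETION:
        out["completion_priority"] = True
        out["residual_budget"] *= 0.5
    return out

base_knobs = {
    "aggressive_cross_bonus": 1.0, "passive_dwell_ms": 1500,
    "require_medium_horizon_uplift": False, "max_tactic_switches_per_min": 6,
    "max_venue_hops": 4, "uncertainty_penalty": 1.0,
    "completion_priority": False, "residual_budget": 0.10,
}
adjusted = apply_controls(classify_state(0.10, 0.30, 0.05, False), base_knobs)
```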

7) Backtest and validation protocol

Step A — Cross-horizon calibration

For each horizon:

calibration error (quantile coverage),
sign-accuracy,
tail error (q90/q99).
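
Quantile coverage for one horizon can be checked as below: a well-calibrated q-quantile prediction should sit at or above the realized markout in roughly a fraction q of cases. The function names are illustrative.

```python
def quantile_coverage(realized, predicted_quantile):
    """Fraction of observations at or below their predicted quantile."""
    hits = sum(y <= p for y, p in zip(realized, predicted_quantile))
    return hits / len(realized)

def coverage_error(realized, predicted_quantile, q):
    """Absolute gap between empirical coverage and the nominal level q."""
    return abs(quantile_coverage(realized, predicted_quantile) - q)

# Hypothetical check of a constant q90 prediction against ten realized markouts.
realized = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
err = coverage_error(realized, [0.9] * len(realized), q=0.9)
```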

Step B — Ranking stability

Compare action rankings under 1s-only objective vs multi-horizon utility. Track rank-correlation drift by regime/time-of-day.
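
Ranking stability between the two objectives can be summarized with Spearman rank correlation. The sketch below uses the closed-form expression that is valid when there are no ties; handling ties (for example with average ranks) is left out for brevity, and the action names are hypothetical.

```python
def ranks(scores):
    """Rank actions by score, 1 = highest. Assumes no tied scores."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}

def spearman_rho(scores_a, scores_b):
    """Spearman rank correlation between two scorings of the same actions (no ties)."""
    ranks_a, ranks_b = ranks(scores_a), ranks(scores_b)
    n = len(ranks_a)
    if n < 2:
        raise ValueError("need at least two actions to compare rankings")
    d_squared = sum((ranks_a[k] - ranks_b[k]) ** 2 for k in ranks_a)
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

# Hypothetical scores for three actions under each objective.
score_1s_only = {"cross": 0.03, "join": 0.01, "post": -0.01}
score_multi_horizon = {"cross": -0.02, "join": 0.00, "post": 0.02}
rho = spearman_rho(score_1s_only, score_multi_horizon)
```

A value near 1 means the 1s-only objective ranks actions the same way as the multi-horizon utility; values near zero or negative in a given regime or time-of-day bucket flag where the short-horizon objective is misleading.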

Step C — Economic objective check

Report out-of-sample changes in:

parent IS mean,
IS tail (q95/CVaR),
completion ratio,
late catch-up frequency.

Step D — Counterfactual fairness

Ensure gains are not from hidden selection bias:

include no-fill/residual penalties,
verify by symbol liquidity tiers and volatile sessions,
run venue-stratified diagnostics.

Step E — Canary rollout

5% -> 20% -> 50% flow
rollback if completion or q95 IS breaches threshold
keep per-horizon telemetry dashboard during rollout
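
The rollout ladder and rollback rule can be expressed as a small gate function. The threshold arguments, the basis-point unit for q95 IS, and the choice to roll back to zero flow on any breach are assumptions for illustration.

```python
STAGES = (0.05, 0.20, 0.50)  # flow fractions from the rollout plan

def next_flow_fraction(current, completion_ratio, q95_is_bps,
                       min_completion, max_q95_is_bps):
    """Advance one stage if both guardrails hold; roll back to 0.0 on any breach."""
    if completion_ratio < min_completion or q95_is_bps > max_q95_is_bps:
        return 0.0
    if current not in STAGES:
        return STAGES[0]  # not yet on the ladder: start at the first stage
    idx = STAGES.index(current)
    return STAGES[min(idx + 1, len(STAGES) - 1)]

# Hypothetical guardrails: completion of at least 95%, q95 IS of at most 20 bps.
step = next_flow_fraction(0.05, completion_ratio=0.98, q95_is_bps=12.0,
                          min_completion=0.95, max_q95_is_bps=20.0)
```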

8) Typical pitfalls

Horizon leakage: training longer-horizon heads on features that would not have been available at decision time.
Survivorship bias: evaluating only filled orders, ignoring residual/timeout cost.
Over-penalizing maker tactics: short adverse markout can coexist with better medium-horizon outcomes.
Static horizon weights: deadline and regime should change the weighting policy.
Metric monoculture: single KPI (e.g., 1s markout) quietly distorts routing behavior.

9) Minimal 2-week implementation plan

Week 1

Build multi-horizon labels and telemetry (M(100ms..120s)).
Add HIR/DRD/HCS/CAM dashboards by tactic/venue/session.
Train baseline multi-head model (no regime gate yet).

Week 2

Add regime gating + horizon-adaptive action score.
Shadow run against current policy, then small canary.
Publish operator report: tail IS delta, completion delta, inversion-rate delta.

One-line takeaway

If your policy is optimized on one short markout horizon, it can look locally brilliant while globally leaking slippage; multi-horizon consistency is the missing control surface.