Monotonic-Constrained Slippage Models: Production Sanity Playbook
Date: 2026-03-16
Category: research
Audience: small quant/operator teams that need slippage models to behave safely under live drift
Why this research matters
In live execution, most model incidents come not from low average accuracy but from economically impossible local behavior:
- predicted cost goes down when spread widens,
- predicted impact goes down when participation rises,
- predicted tail risk drops as latency increases.
These violations look small in offline metrics, but they can flip routing/tactic choices at exactly the wrong time.
Monotonic constraints are a practical way to encode first-principles guardrails so models stay policy-aligned during regime stress.
1) Define the monotonic contract first (before training)
For signed slippage cost (C) in bps (higher = worse), create a feature sign contract:
- (\partial C / \partial \text{halfSpread} \ge 0)
- (\partial C / \partial \text{participation} \ge 0)
- (\partial C / \partial \text{latencyMs} \ge 0)
- (\partial C / \partial \text{queueAhead} \ge 0)
- (\partial C / \partial \text{depthAtTouch} \le 0)
Not every feature should be constrained. Keep unconstrained those with known non-monotone effects (e.g., intraday U-shape time-of-day indicators).
Operational rule: version this contract (e.g., mono_contract_v3) and tie it to model artifacts.
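The steps above can be sketched as a versioned sign contract. This is an illustrative Python sketch (names like `MONO_CONTRACT` and `contract_fingerprint` are hypothetical, not from any library); the fingerprint ties a specific contract version to each model artifact.

```python
import hashlib
import json

# Hypothetical versioned sign contract. +1 = cost must not decrease as the
# feature rises, -1 = cost must not increase, 0 = unconstrained.
MONO_CONTRACT = {
    "version": "mono_contract_v3",
    "signs": {
        "half_spread": +1,
        "participation": +1,
        "latency_ms": +1,
        "queue_ahead": +1,
        "depth_at_touch": -1,
        "tod_harmonic_1": 0,  # known non-monotone: intraday U-shape
    },
}

def contract_fingerprint(contract: dict) -> str:
    """Stable fingerprint to stamp onto model artifacts alongside the
    model hash, so serving can refuse contract/model mismatches."""
    blob = json.dumps(contract, sort_keys=True).encode()
    return contract["version"] + "-" + hashlib.sha256(blob).hexdigest()[:8]
```

Storing the fingerprint (rather than the whole contract) next to each trained model makes "which sign policy was this model trained under?" answerable during an incident.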
2) Hard vs soft constraints (what to use in production)
Hard constraints
Use model classes that enforce shape globally at inference time.
- Gradient-boosted trees with monotone constraints (XGBoost/LightGBM)
- Calibrated lattice models (TensorFlow Lattice)
Pros:
- impossible to violate declared signs at serve time
- good for safety-critical feature axes
Cons:
- some loss of fit relative to an unconstrained model when data are noisy or the contract is mis-specified
Soft constraints
Add penalty terms for shape violations in custom objectives.
Pros:
- more flexible fit
Cons:
- can still violate signs under drift
- harder to reason about in incidents
For execution systems, use hard constraints for core microstructure physics and soft constraints (if any) only for secondary interactions.
3) Recommended model stack
Use a two-head setup:
- Mean cost head: monotonic-constrained GBDT
- Tail head (q95): monotonic quantile model
Then rank tactics with:
[ \text{Score} = \mathbb{E}[C] + \lambda_{\text{tail}} \cdot Q_{95}(C) + \lambda_{\text{deadline}} \cdot P(\text{unfinished}) ]
This prevents “cheap-on-average, catastrophic-in-tail” tactic picks.
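A minimal sketch of that scoring rule in Python; the lambda values are illustrative placeholders, not recommended settings, and the function name is hypothetical:

```python
def tactic_score(mean_cost_bps: float,
                 q95_cost_bps: float,
                 p_unfinished: float,
                 lambda_tail: float = 0.25,
                 lambda_deadline: float = 10.0) -> float:
    """Rank execution tactics: lower score is better.
    Lambdas should be calibrated to the desk's tail/deadline risk tolerance."""
    return (mean_cost_bps
            + lambda_tail * q95_cost_bps
            + lambda_deadline * p_unfinished)
```

For example, a tactic that is cheap on average but has a fat tail (mean 2 bps, q95 30 bps) scores worse than one with mean 3 bps and q95 8 bps once the tail term is included, which is exactly the flip this rule is meant to force.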
4) Minimal implementation recipe
Step A: Feature buckets
- Constrained (+): spread, participation, latency, queue-ahead, reject-rate
- Constrained (-): displayed depth, passive queue priority percentile
- Unconstrained: time-of-day harmonics, venue dummies, event flags
Step B: Encode constraints
Example sign vector by column order (XGBoost style):
(+1,+1,+1,+1,-1,0,0,0,...)
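A small sketch of building that sign vector from a named contract rather than writing it by hand (the helper name and feature names are illustrative); deriving it from the contract keeps column order and signs from silently drifting apart:

```python
# Signs per feature, per the contract; 0 = unconstrained.
SIGNS = {
    "half_spread": +1, "participation": +1, "latency_ms": +1,
    "queue_ahead": +1, "depth_at_touch": -1,
    "tod_sin": 0, "tod_cos": 0, "venue_a": 0,
}

# Must match the training matrix's column order exactly.
FEATURE_ORDER = ["half_spread", "participation", "latency_ms",
                 "queue_ahead", "depth_at_touch", "tod_sin", "tod_cos", "venue_a"]

def monotone_constraint_vector(order, signs):
    """Fail loudly (KeyError) on features missing from the contract
    rather than silently defaulting them to unconstrained."""
    return tuple(signs[f] for f in order)

# XGBoost and LightGBM both accept this via their `monotone_constraints`
# parameter, e.g. xgboost.XGBRegressor(monotone_constraints=vec).
```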
Step C: Train with robust loss
- Huber for mean head
- pinball loss for q90/q95 heads
- purged time-series CV + symbol stratification
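Illustrative LightGBM-style parameter dicts for the two heads (hyperparameters are placeholders, not tuned values); LightGBM supports `objective="huber"` for a robust mean fit and `objective="quantile"` with `alpha` for pinball loss, and both heads take the same `monotone_constraints` vector:

```python
# Sign vector must follow the feature matrix's column order.
MONO = [1, 1, 1, 1, -1, 0, 0, 0]

mean_head_params = {
    "objective": "huber",           # robust mean-cost fit
    "monotone_constraints": MONO,   # hard shape constraints, enforced at serve time
    "learning_rate": 0.05,          # placeholder; tune via purged time-series CV
    "num_leaves": 63,
}

q95_head_params = {
    "objective": "quantile",        # pinball loss
    "alpha": 0.95,                  # target quantile for the tail head
    "monotone_constraints": MONO,   # same contract as the mean head
    "learning_rate": 0.05,
    "num_leaves": 63,
}
```

Keeping one shared `MONO` vector for both heads means the mean and tail predictions cannot disagree about the sign of a constrained axis.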
Step D: Validate shape explicitly
Run pairwise monotone checks over sampled feature pairs:
- hold others fixed,
- bump constrained feature by (\Delta),
- verify sign of prediction delta.
Track violation rate; target is 0% on hard-constrained axes.
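The bump-and-check procedure above can be sketched as a model-agnostic helper (the function name is hypothetical; `predict` stands in for any model's prediction call):

```python
def monotone_violation_rate(predict, rows, feature_idx, sign, delta=1e-3):
    """Bump one constrained feature by `delta`, holding all others fixed,
    and count predictions that move against the declared sign.
    `predict` maps a feature vector (list of floats) to a scalar cost."""
    violations = 0
    for row in rows:
        bumped = list(row)
        bumped[feature_idx] += delta
        d = predict(bumped) - predict(row)
        if sign * d < 0:  # prediction moved against the contract
            violations += 1
    return violations / len(rows)
```

Running this over a sample of live feature vectors for every hard-constrained axis gives the `mono_violation_rate_by_feature` metric directly; a toy monotone model like `lambda x: 2 * x[0] - 0.5 * x[1]` yields a 0% rate on a (+) constraint for index 0 and 100% on a (+) constraint for index 1.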
5) Monitoring in live operations
Add a “shape health” dashboard alongside regular TCA:
- mono_violation_rate_by_feature (should stay 0 for hard constraints)
- edge_flip_count_due_to_small_feature_move
- tail_underprediction_rate (realized > predicted q95)
- constraint_tension_index (fit loss gap: constrained vs unconstrained challenger)
If tension index spikes, do not remove constraints first. Investigate:
- broken feature pipeline,
- regime shift,
- stale fee/latency snapshots,
- wrong sign assumption for that bucket.
6) Where teams usually fail
Over-constraining everything
Forces underfit and hides useful nonlinearities. Constrain only economically invariant directions.
No bucketed contracts
One global sign policy can be wrong for specific order types (e.g., midpoint peg vs IOC). Keep contract variants by tactic class when needed.
Ignoring interaction order effects
A feature can be monotone marginally but unstable in interaction. Run local grid checks on key pairs (spread × participation, latency × urgency).
Constraint drift blindness
Teams monitor MAE but not shape tension. MAE can look stable while decision boundaries become fragile.
7) 10-day rollout plan
Days 1-2
Define constrained feature contract + artifact schema.
Days 3-4
Train unconstrained baseline and constrained challenger (mean + q95).
Days 5-6
Run shape test suite + tail calibration checks by liquidity buckets.
Days 7-8
Shadow route with policy score using constrained predictions only.
Day 9
Canary on low-risk symbols with strict kill-switch thresholds.
Day 10
Promote if tail breach and shape health are within budget; freeze runbook v1.
Bottom line
Monotonic constraints are not about model elegance; they are about execution safety under uncertainty.
If you can only ship one improvement this month: enforce monotonicity on spread, participation, and latency for both mean and q95 slippage heads, then monitor shape health in production. That usually reduces policy flips and costly tail incidents faster than adding more exotic features.
References
- XGBoost monotonic constraints tutorial: https://xgboost.readthedocs.io/en/stable/tutorials/monotonic.html
- XGBoost parameter reference (monotone_constraints): https://xgboost.readthedocs.io/en/stable/parameter.html
- LightGBM parameter reference (monotone_constraints): https://lightgbm.readthedocs.io/en/latest/Parameters.html
- TensorFlow Lattice overview (shape-constrained models): https://www.tensorflow.org/lattice/overview
- Perold, A. F. (1988), "The Implementation Shortfall: Paper versus Reality": https://www.hbs.edu/faculty/Pages/item.aspx?num=2083
- Almgren, R., Chriss, N. (2000), "Optimal Execution of Portfolio Transactions": https://www.smallake.kr/wp-content/uploads/2016/03/optliq.pdf