Monotonic-Constrained Slippage Models: Production Sanity Playbook
Date: 2026-03-16
Category: research
Audience: small quant/operator teams that need slippage models to behave safely under live drift
Why this research matters
In live execution, most model incidents come not from low average accuracy but from economically impossible local behavior:
- predicted cost goes down when spread widens,
- predicted impact goes down when participation rises,
- predicted tail risk drops as latency increases.
These violations look small in offline metrics, but they can flip routing/tactic choices at exactly the wrong time.
Monotonic constraints are a practical way to encode first-principles guardrails so models stay policy-aligned during regime stress.
1) Define the monotonic contract first (before training)
For signed slippage cost (C) in bps (higher = worse), create a feature sign contract:
- (\partial C / \partial \text{halfSpread} \ge 0)
- (\partial C / \partial \text{participation} \ge 0)
- (\partial C / \partial \text{latencyMs} \ge 0)
- (\partial C / \partial \text{queueAhead} \ge 0)
- (\partial C / \partial \text{depthAtTouch} \le 0)
Not every feature should be constrained. Keep unconstrained those with known non-monotone effects (e.g., intraday U-shape time-of-day indicators).
Operational rule: version this contract (e.g., mono_contract_v3) and tie it to model artifacts.
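The steps above can be sketched as a versioned sign contract. This is an illustrative Python sketch (names like `MONO_CONTRACT` and `contract_fingerprint` are hypothetical, not from any library); the fingerprint ties a specific contract version to each model artifact.

```python
import hashlib
import json

# Hypothetical versioned sign contract. +1 = cost must not decrease as the
# feature rises, -1 = cost must not increase, 0 = unconstrained.
MONO_CONTRACT = {
    "version": "mono_contract_v3",
    "signs": {
        "half_spread": +1,
        "participation": +1,
        "latency_ms": +1,
        "queue_ahead": +1,
        "depth_at_touch": -1,
        "tod_harmonic_1": 0,  # known non-monotone: intraday U-shape
    },
}

def contract_fingerprint(contract: dict) -> str:
    """Stable fingerprint to stamp onto model artifacts alongside the
    model hash, so serving can refuse contract/model mismatches."""
    blob = json.dumps(contract, sort_keys=True).encode()
    return contract["version"] + "-" + hashlib.sha256(blob).hexdigest()[:8]
```

Storing the fingerprint (rather than the whole contract) next to each trained model makes "which sign policy was this model trained under?" answerable during an incident.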
2) Hard vs soft constraints (what to use in production)
Hard constraints
Use model classes that enforce shape globally at inference time.
- Gradient-boosted trees with monotone constraints (XGBoost/LightGBM)
- Calibrated lattice models (TensorFlow Lattice)
Pros:
- impossible to violate declared signs at serve time
- good for safety-critical feature axes
Cons:
- some loss of fit relative to an unconstrained model when data are noisy or the contract is mis-specified
Soft constraints
Add penalty terms for shape violations in custom objectives.
Pros:
- more flexible fit
Cons:
- can still violate signs under drift
- harder to reason about in incidents
For execution systems, use hard constraints for core microstructure physics and soft constraints (if any) only for secondary interactions.
3) Recommended model stack
Use a two-head setup:
- Mean cost head: monotonic-constrained GBDT
- Tail head (q95): monotonic quantile model
Then rank tactics with:
[ \text{Score} = \mathbb{E}[C] + \lambda_{\text{tail}} \cdot Q_{95}(C) + \lambda_{\text{deadline}} \cdot P(\text{unfinished}) ]
This prevents “cheap-on-average, catastrophic-in-tail” tactic picks.
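A minimal sketch of that scoring rule in Python; the lambda values are illustrative placeholders, not recommended settings, and the function name is hypothetical:

```python
def tactic_score(mean_cost_bps: float,
                 q95_cost_bps: float,
                 p_unfinished: float,
                 lambda_tail: float = 0.25,
                 lambda_deadline: float = 10.0) -> float:
    """Rank execution tactics: lower score is better.
    Lambdas should be calibrated to the desk's tail/deadline risk tolerance."""
    return (mean_cost_bps
            + lambda_tail * q95_cost_bps
            + lambda_deadline * p_unfinished)
```

For example, a tactic that is cheap on average but has a fat tail (mean 2 bps, q95 30 bps) scores worse than one with mean 3 bps and q95 8 bps once the tail term is included, which is exactly the flip this rule is meant to force.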
4) Minimal implementation recipe
Step A: Feature buckets
- Constrained (+): spread, participation, latency, queue-ahead, reject-rate
- Constrained (-): displayed depth, passive queue priority percentile
- Unconstrained: time-of-day harmonics, venue dummies, event flags
Step B: Encode constraints
Example sign vector by column order (XGBoost style):
(+1,+1,+1,+1,-1,0,0,0,...)
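A small sketch of building that sign vector from a named contract rather than writing it by hand (the helper name and feature names are illustrative); deriving it from the contract keeps column order and signs from silently drifting apart:

```python
# Signs per feature, per the contract; 0 = unconstrained.
SIGNS = {
    "half_spread": +1, "participation": +1, "latency_ms": +1,
    "queue_ahead": +1, "depth_at_touch": -1,
    "tod_sin": 0, "tod_cos": 0, "venue_a": 0,
}

# Must match the training matrix's column order exactly.
FEATURE_ORDER = ["half_spread", "participation", "latency_ms",
                 "queue_ahead", "depth_at_touch", "tod_sin", "tod_cos", "venue_a"]

def monotone_constraint_vector(order, signs):
    """Fail loudly (KeyError) on features missing from the contract
    rather than silently defaulting them to unconstrained."""
    return tuple(signs[f] for f in order)

# XGBoost and LightGBM both accept this via their `monotone_constraints`
# parameter, e.g. xgboost.XGBRegressor(monotone_constraints=vec).
```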
Step C: Train with robust loss
- Huber for mean head
- pinball loss for q90/q95 heads
- purged time-series CV + symbol stratification
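Illustrative LightGBM-style parameter dicts for the two heads (hyperparameters are placeholders, not tuned values); LightGBM supports `objective="huber"` for a robust mean fit and `objective="quantile"` with `alpha` for pinball loss, and both heads take the same `monotone_constraints` vector:

```python
# Sign vector must follow the feature matrix's column order.
MONO = [1, 1, 1, 1, -1, 0, 0, 0]

mean_head_params = {
    "objective": "huber",           # robust mean-cost fit
    "monotone_constraints": MONO,   # hard shape constraints, enforced at serve time
    "learning_rate": 0.05,          # placeholder; tune via purged time-series CV
    "num_leaves": 63,
}

q95_head_params = {
    "objective": "quantile",        # pinball loss
    "alpha": 0.95,                  # target quantile for the tail head
    "monotone_constraints": MONO,   # same contract as the mean head
    "learning_rate": 0.05,
    "num_leaves": 63,
}
```

Keeping one shared `MONO` vector for both heads means the mean and tail predictions cannot disagree about the sign of a constrained axis.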
Step D: Validate shape explicitly
Run pairwise monotone checks over sampled feature pairs:
- hold others fixed,
- bump constrained feature by (\Delta),
- verify sign of prediction delta.
Track violation rate; target is 0% on hard-constrained axes.
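The bump-and-check procedure above can be sketched as a model-agnostic helper (the function name is hypothetical; `predict` stands in for any model's prediction call):

```python
def monotone_violation_rate(predict, rows, feature_idx, sign, delta=1e-3):
    """Bump one constrained feature by `delta`, holding all others fixed,
    and count predictions that move against the declared sign.
    `predict` maps a feature vector (list of floats) to a scalar cost."""
    violations = 0
    for row in rows:
        bumped = list(row)
        bumped[feature_idx] += delta
        d = predict(bumped) - predict(row)
        if sign * d < 0:  # prediction moved against the contract
            violations += 1
    return violations / len(rows)
```

Running this over a sample of live feature vectors for every hard-constrained axis gives the `mono_violation_rate_by_feature` metric directly; a toy monotone model like `lambda x: 2 * x[0] - 0.5 * x[1]` yields a 0% rate on a (+) constraint for index 0 and 100% on a (+) constraint for index 1.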
5) Monitoring in live operations
Add a “shape health” dashboard alongside regular TCA:
- mono_violation_rate_by_feature (should stay 0 for hard constraints)
- edge_flip_count_due_to_small_feature_move
- tail_underprediction_rate (realized > predicted q95)
- constraint_tension_index (fit loss gap: constrained vs unconstrained challenger)
If tension index spikes, do not remove constraints first. Investigate:
- broken feature pipeline,
- regime shift,
- stale fee/latency snapshots,
- wrong sign assumption for that bucket.
6) Where teams usually fail
Over-constraining everything
Forces underfit and hides useful nonlinearities. Constrain only economically invariant directions.
No bucketed contracts
One global sign policy can be wrong for specific order types (e.g., midpoint peg vs IOC). Keep contract variants by tactic class when needed.
Ignoring interaction order effects
A feature can be monotone marginally but unstable in interaction. Run local grid checks on key pairs (spread × participation, latency × urgency).
Constraint drift blindness
Teams monitor MAE but not shape tension. MAE can look stable while decision boundaries become fragile.
7) 10-day rollout plan
Days 1-2
Define constrained feature contract + artifact schema.
Days 3-4
Train unconstrained baseline and constrained challenger (mean + q95).
Days 5-6
Run shape test suite + tail calibration checks by liquidity buckets.
Days 7-8
Shadow route with policy score using constrained predictions only.
Day 9
Canary on low-risk symbols with strict kill-switch thresholds.
Day 10
Promote if tail breach and shape health are within budget; freeze runbook v1.
Bottom line
Monotonic constraints are not about model elegance; they are about execution safety under uncertainty.
If you can only ship one improvement this month: enforce monotonicity on spread, participation, and latency for both mean and q95 slippage heads, then monitor shape health in production. That usually reduces policy flips and costly tail incidents faster than adding more exotic features.
References
- XGBoost monotonic constraints tutorial: https://xgboost.readthedocs.io/en/stable/tutorials/monotonic.html
- XGBoost parameter reference (monotone_constraints): https://xgboost.readthedocs.io/en/stable/parameter.html
- LightGBM parameter reference (monotone_constraints): https://lightgbm.readthedocs.io/en/latest/Parameters.html
- TensorFlow Lattice overview (shape-constrained models): https://www.tensorflow.org/lattice/overview
- Perold, A. F. (1988), "The Implementation Shortfall: Paper versus Reality": https://www.hbs.edu/faculty/Pages/item.aspx?num=2083
- Almgren, R., Chriss, N. (2000), "Optimal Execution of Portfolio Transactions": https://www.smallake.kr/wp-content/uploads/2016/03/optliq.pdf