Order-Book Depth Truncation & Hidden-Gap Slippage Playbook
Date: 2026-04-11
Category: research
Scope: How L1/L5/L10-style partial-depth feeds understate aggressive cost and distort slippage control
Why this matters
Many production execution stacks do not have the full book available at decision time.
Common realities:
- vendor only delivers top N price levels,
- storage budget keeps only truncated depth in historical replay,
- model features are built from L1/L5/L10 snapshots even when deeper book exists,
- routing logic prices child slices off visible cumulative depth only.
That creates a specific failure mode:
the model thinks it knows the local supply curve, but it only knows the front porch.
When urgency rises or displayed depth thins, the order sweeps beyond the observed ladder, hits hidden gaps, and realized cost jumps several ticks beyond the model forecast. The result is not just noisier execution — it is a systematic tail-underestimation problem.
One-line intuition
Partial depth is fine until your order needs the first level you cannot see. After that, slippage becomes a hidden-gap problem, not a spread problem.
Failure mechanism (operator timeline)
- Training data stores only top 5 or top 10 levels.
- Cost model learns impact from visible cumulative depth and near-touch imbalance.
- Router sizes marketable or urgency-escalated child orders using that truncated view.
- In calm states, execution stays inside observed depth often enough that the model looks good on average.
- In thin or stressed states, the order consumes past the last visible level.
- True deeper-book spacing is wider than implied by the truncated ladder.
- Realized implementation shortfall jumps, especially in p95/p99 tails.
- Postmortem says “regime shift” or “sudden liquidity shock,” but part of the damage was simply book-information insufficiency.
Extend slippage decomposition with an information-level term
Let:
- (C_{obs,N}(q)): estimated sweep cost for size (q) using only top (N) levels,
- (C_{true}(q)): realized or full-book cost,
- (IS): implementation shortfall.
Then write:
[ IS = IS_{market} + IS_{impact} + IS_{timing} + IS_{fees} + \underbrace{IS_{trunc}}_{\text{depth truncation / hidden-gap tax}} ]
with
[ IS_{trunc}(q, N) \approx C_{true}(q) - C_{obs,N}(q). ]
This term is near zero when the order stays inside observed depth, but becomes convex once sweep size crosses the truncation boundary.
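Whenever a shadow full-depth ladder is available, the truncation term can be measured directly. A minimal sketch, assuming ask-side ladders as (price, size) lists and treating the truncation-blind estimate as "fill the unseen remainder at the last visible price" (ladder layout and numbers are illustrative):

```python
# Sketch: price one sweep against the truncated ladder and the full ladder,
# then take IS_trunc ~= C_true(q) - C_obs,N(q). Ladders are ask-side
# (price, size) lists, best price first; the example book is made up.

def sweep_cost(ladder, qty):
    """Notional cost of walking qty through the ladder."""
    remaining, cost = qty, 0.0
    for price, size in ladder:
        take = min(remaining, size)
        cost += take * price
        remaining -= take
        if remaining == 0:
            return cost
    raise ValueError("order exceeds available depth")

def is_trunc(full_ladder, n_visible, qty):
    """Truncation tax: realized sweep cost minus the truncated-book estimate.
    A truncation-blind model implicitly assumes the unseen remainder fills
    at the last visible price; that optimistic assumption is encoded here."""
    c_true = sweep_cost(full_ladder, qty)
    visible = full_ladder[:n_visible]
    visible_size = sum(size for _, size in visible)
    if qty <= visible_size:
        return 0.0  # order stays inside observed depth
    c_obs = sweep_cost(visible, visible_size) + (qty - visible_size) * visible[-1][0]
    return c_true - c_obs

# Hidden gap: after 100.0 and 100.1 the true book jumps to 100.5.
book = [(100.0, 200), (100.1, 150), (100.5, 300)]
print(round(is_trunc(book, n_visible=2, qty=500), 2))  # → 60.0 extra notional
```

The convexity in the text shows up directly: for qty ≤ 350 the function returns 0.0, and the tax grows with every share past the truncation boundary.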
Core production metrics
1) Overflow Probability (OP)
Probability that decision size (q) requires liquidity beyond the last visible level (N):
[ OP_N(q) = P\big(L^*(q) > N \mid x_t\big) ]
where (L^*(q)) is the deepest level actually needed to fill size (q), and (x_t) is state (spread, depth, volatility, imbalance, event intensity, venue, time-of-day).
This is the first metric to operationalize. If you cannot estimate overflow risk, you cannot trust the truncated-book cost estimate.
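A simple way to operationalize it: compute (L^*(q)) per snapshot from a shadow-sampled ladder, then take the empirical overflow frequency per state bucket. An illustrative sketch (snapshot layout is made up):

```python
# Sketch: compute L*(q), the deepest level needed to fill size q, then
# estimate OP_N(q) empirically over a sample of book snapshots.

def deepest_level_needed(ladder, qty):
    """1-based index of the deepest level consumed; None if book too shallow."""
    remaining = qty
    for level, (_, size) in enumerate(ladder, start=1):
        remaining -= size
        if remaining <= 0:
            return level
    return None  # even the full sampled book cannot absorb q

def overflow_prob(snapshots, qty, n_visible):
    """Empirical OP_N(q): share of snapshots where the fill needs level > N."""
    hits = sum(
        1 for ladder in snapshots
        if (lvl := deepest_level_needed(ladder, qty)) is None or lvl > n_visible
    )
    return hits / len(snapshots)

snaps = [
    [(100.0, 400), (100.1, 400)],                 # deep: q=500 fills at level 2
    [(100.0, 100), (100.1, 100), (100.2, 400)],   # thin: q=500 needs level 3
]
print(overflow_prob(snaps, qty=500, n_visible=2))  # → 0.5
```

In production you would condition this on (x_t) (bucket by spread regime, time-of-day, venue) rather than pooling all snapshots.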
2) Conditional Hidden-Gap Burden (HGB)
Extra cost once overflow happens:
[ HGB_N(q) = E\big[C_{true}(q) - C_{obs,N}(q) \mid L^*(q) > N, x_t\big]. ]
Think of this as “how bad it gets when the visible book runs out.”
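Empirically, HGB is just a conditional mean over overflow events, given per-order labels. A small sketch assuming a made-up label layout:

```python
# Sketch: conditional hidden-gap burden from per-child-order labels.
# Each event is (deepest_level_needed, realized_cost, truncated_estimate);
# the layout is illustrative, not a real log schema.

def hidden_gap_burden(events, n_visible):
    """Mean extra cost over orders that needed liquidity beyond level N."""
    extras = [c_true - c_obs
              for level, c_true, c_obs in events
              if level > n_visible]
    return sum(extras) / len(extras) if extras else 0.0

events = [(2, 10.0, 10.0),   # stayed inside visible depth: excluded
          (4, 25.0, 20.0),   # overflowed: 5 extra
          (5, 30.0, 22.0)]   # overflowed: 8 extra
print(hidden_gap_burden(events, n_visible=3))  # → 6.5
```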
3) Truncation Coverage Ratio (TCR)
Observed fraction of required executable depth:
[ TCR_N(q) = \frac{D_{obs,N}(q)}{D_{req}(q) + \epsilon} ]
where:
- (D_{obs,N}(q)): visible cumulative depth up to level (N),
- (D_{req}(q)): depth actually required to complete the child order.
Low TCR means the model is making decisions with incomplete local supply information.
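For one child order the ratio is a one-liner once deepest-level-reached logging exists. An illustrative sketch:

```python
# Sketch of the Truncation Coverage Ratio for one child order. Inputs are
# illustrative: the visible ladder and the depth the fill actually consumed.

def tcr(visible_ladder, required_depth, eps=1e-9):
    """TCR_N(q) = D_obs,N(q) / (D_req(q) + eps)."""
    d_obs = sum(size for _, size in visible_ladder)
    return d_obs / (required_depth + eps)

# Visible top-2 ladder shows 350 shares; the fill actually consumed 500.
print(round(tcr([(100.0, 200), (100.1, 150)], required_depth=500), 2))  # → 0.7
```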
4) Information Sufficiency Curve (ISC)
Measure forecast quality as a function of depth level count:
[ ISC(N) = 1 - \frac{\mathrm{Loss}(N)}{\mathrm{Loss}(N_{full})} ]
for a cost or markout prediction loss of your choice.
This tells you whether L5 is “almost all the signal” or whether the step from L10 to L20 materially improves tail cost prediction.
5) Tail Overflow Loss Share (TOLS)
Fraction of p95/p99 cost attributable to truncation events:
[ TOLS = \frac{\sum_i IS_{trunc,i} \cdot \mathbf{1}\{IS_i > q_{0.95}\}}{\sum_i IS_i \cdot \mathbf{1}\{IS_i > q_{0.95}\}}. ]
If TOLS is large, your tail problem is partly a data-contract problem, not only a policy problem.
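Given per-order IS and its truncation component, TOLS is a direct ratio over the empirical tail. A sketch using the standard library quantile estimator (arrays and toy numbers are illustrative):

```python
# Sketch: Tail Overflow Loss Share from per-order implementation shortfall
# and its truncation component.
import statistics

def tols(is_total, is_trunc_part, tail_q=0.95):
    """Share of tail IS (beyond the tail_q quantile) explained by IS_trunc."""
    cut = statistics.quantiles(is_total, n=100)[int(round(tail_q * 100)) - 1]
    tail = [(t, tr) for t, tr in zip(is_total, is_trunc_part) if t > cut]
    denom = sum(t for t, _ in tail)
    return sum(tr for _, tr in tail) / denom if denom else 0.0

# 99 quiet fills plus one overflow blowout that is 80% truncation tax:
print(tols([1.0] * 99 + [100.0], [0.0] * 99 + [80.0]))  # → 0.8
```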
The key modeling split: two-stage truncation-aware cost model
A practical production model should separate:
- Will we overflow the visible book?
- If yes, how expensive is the hidden remainder?
Stage 1 — overflow classifier
Predict:
[ \hat{p}_{over} = P(L^*(q)>N \mid x_t). ]
Useful features:
- spread and spread regime,
- cumulative visible depth at L1/L3/L5/L10,
- imbalance and microprice skew,
- recent depletion/refill velocity,
- cancel intensity by side,
- trade intensity and sweep frequency,
- time-of-day / auction proximity / halt transition state,
- venue and symbol liquidity bucket.
Stage 2 — conditional overflow severity model
Given overflow, predict extra ticks/bps:
[ \widehat{HGB}_N(q) = E\big[C_{true}(q)-C_{obs,N}(q) \mid L^*(q) > N, x_t\big]. ]
Good targets:
- extra ticks beyond truncated estimate,
- extra bps vs baseline cost curve,
- worst-level reached,
- post-trade short-horizon markout when overflow forced urgency escalation.
Combined estimator
[ E[C_{true}(q)\mid x_t] \approx C_{obs,N}(q) + \hat{p}_{over}(x_t)\cdot \widehat{HGB}_N(q, x_t). ]
This is much more robust than pretending truncated depth is the whole book.
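The combination itself is a one-line correction on top of the truncated-book estimate. A minimal sketch; in production p_over and hgb come from the two trained stages, and the numbers below are placeholders:

```python
# Sketch of the combined truncation-aware estimator: visible-book sweep cost
# plus the overflow-probability-weighted hidden-gap burden.

def expected_true_cost(c_obs, p_over, hgb):
    """E[C_true(q) | x_t] ~= C_obs,N(q) + p_over(x_t) * HGB_N(q, x_t)."""
    return c_obs + p_over * hgb

# Visible ladder prices the sweep at 50030; the overflow classifier says 30%
# chance of running past the visible book, costing ~200 extra when it does.
print(expected_true_cost(c_obs=50030.0, p_over=0.3, hgb=200.0))  # → 50090.0
```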
Why average performance lies
A depth-truncated model can look good in backtests for three reasons:
- most child orders are small and stay inside visible depth,
- calm periods dominate sample count,
- mean loss hides overflow tails.
So a model may “win” on average while still failing exactly when:
- parent schedule is behind,
- spreads widen,
- queues evaporate,
- one more tick matters.
This is why promotion gates should be tail-first, not mean-first.
Control states for a live router
GREEN — VISIBLE_BOOK_SUFFICIENT
- low (OP_N(q))
- low hidden-gap burden
- small slices can trust visible depth
Actions:
- normal child sizing,
- standard passive/aggressive mix,
- ordinary confidence weight on visible supply curve.
YELLOW — OVERFLOW_RISK_RISING
- overflow probability climbing,
- touch depth thinning,
- refill slower than usual
Actions:
- reduce child clip size,
- shorten confidence horizon on truncated-book forecasts,
- require more evidence before sweeping.
ORANGE — HIDDEN_GAP_EXPOSED
- high overflow probability,
- conditional burden materially positive,
- p95 uplift concentrated in overflow states
Actions:
- cap marketable size per wave,
- re-route toward deeper venues,
- widen uncertainty bands in implementation shortfall forecast,
- use more conservative schedule catch-up.
RED — INFORMATION-INSUFFICIENT EXECUTION
- visible book clearly not decision-grade for current size/state,
- overflow burden unstable or exploding,
- live model confidence collapsing
Actions:
- switch to bounded deterministic sizing,
- tighten notional caps,
- prefer time-spreading over large sweeps,
- escalate operator visibility for large residuals.
Use hysteresis and minimum dwell time; otherwise the controller will flap around thin-book transitions.
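One way to sketch that hysteresis: escalate immediately when overflow probability crosses an entry threshold, but de-escalate only one state at a time after OP has stayed below a lower exit threshold for a minimum dwell. All thresholds below are made up for illustration:

```python
# Sketch: GREEN/YELLOW/ORANGE/RED controller with hysteresis and dwell.
# Escalation is immediate; relaxation requires sustained calm readings.

GREEN, YELLOW, ORANGE, RED = 0, 1, 2, 3
ENTER = {YELLOW: 0.05, ORANGE: 0.15, RED: 0.30}  # escalate when OP >= these
EXIT = {YELLOW: 0.03, ORANGE: 0.10, RED: 0.20}   # relax only when OP < these

class OverflowController:
    def __init__(self, dwell=3):
        self.state, self.dwell, self.calm_ticks = GREEN, dwell, 0

    def update(self, op):
        target = GREEN
        for s in (YELLOW, ORANGE, RED):
            if op >= ENTER[s]:
                target = s
        if target > self.state:           # escalate immediately
            self.state, self.calm_ticks = target, 0
            return self.state
        if self.state > GREEN and op < EXIT[self.state]:
            self.calm_ticks += 1          # hysteresis band + dwell counter
            if self.calm_ticks >= self.dwell:
                self.state, self.calm_ticks = self.state - 1, 0
        else:
            self.calm_ticks = 0           # calm streak broken: stay put
        return self.state
```

Because EXIT thresholds sit below ENTER thresholds, an OP reading oscillating around a single boundary cannot flap the state every tick.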
Engineering patterns that help in the real world
1) Keep a shadow full-depth sample even if live decisions use truncated depth
You do not need full depth for every symbol at every microsecond to estimate truncation damage.
Even one of these helps:
- sampled full-depth capture,
- periodic full-book snapshots,
- venue-specific sweep reconstruction from historical updates,
- post-trade deepest-level-reached logs.
Without any shadow truth, truncation tax becomes invisible.
2) Version the information level in your feature store
The information level (depth_levels = 1, 5, 10, or full) should be explicit metadata on every dataset.
Otherwise you silently mix:
- models trained on L10,
- replays reconstructed from L50,
- live decisions running on L5.
That is a hidden train/serve mismatch.
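A cheap way to make the mismatch loud instead of silent is a compatibility check at model load time. A sketch with illustrative field and function names, not any particular feature store's schema:

```python
# Sketch: carry the information level as explicit metadata and refuse to
# serve a model on shallower depth than it was trained on.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class DepthSnapshotMeta:
    symbol: str
    venue: str
    depth_levels: Optional[int]  # 1, 5, 10, ...; None means full book

def check_serving_compatible(train_meta, live_meta):
    """Raise if live depth is shallower than training depth."""
    train_n = train_meta.depth_levels
    live_n = live_meta.depth_levels
    if live_n is not None and (train_n is None or live_n < train_n):
        raise ValueError(
            f"trained on {'full book' if train_n is None else f'L{train_n}'}, "
            f"serving L{live_n}"
        )

# L10-trained model on an L5 live feed is silent corruption; make it loud:
# check_serving_compatible(DepthSnapshotMeta("XYZ", "V1", 10),
#                          DepthSnapshotMeta("XYZ", "V1", 5))  # raises
```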
3) Make child sizing conditional on information sufficiency
Do not use one global “safe marketable clip” if the visible depth level count changes by venue or symbol.
Safer rule:
[ q_{max}^{safe}(x_t) = \max\big\{\, q : OP_N(q \mid x_t) \le \alpha \,\big\} ]
for an overflow-risk budget (\alpha).
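Since OP is monotone non-decreasing in size, the safe clip can be found by walking a size grid. A sketch where op_model is whatever Stage 1 classifier you trust; the toy curve below just stands in for it:

```python
# Sketch of overflow-budgeted clip sizing: the largest q whose estimated
# overflow probability stays within the budget alpha.

def safe_clip(op_model, state, alpha, q_grid):
    """Largest q on the grid with OP_N(q | x_t) <= alpha (0 if none)."""
    best = 0
    for q in sorted(q_grid):
        if op_model(q, state) <= alpha:
            best = q
        else:
            break  # OP is monotone non-decreasing in q
    return best

def toy_op(q, state):  # toy stand-in: overflow risk grows with q vs depth
    return min(1.0, q / state["visible_depth"])

state = {"visible_depth": 1000}
print(safe_clip(toy_op, state, alpha=0.2, q_grid=range(50, 1001, 50)))  # → 200
```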
4) Track deepest-level-reached as a first-class execution label
If your logs only keep fill price and notional, you miss the easiest proxy for truncation damage.
Track at least:
- deepest visible level consumed,
- whether deeper-than-visible execution was needed,
- slippage delta vs truncated estimate.
5) Separate “book is thin” from “book is unknown” in controls
Those are not the same problem.
- Thin but known book -> aggressive execution may still be optimal.
- Partially unknown book -> uncertainty penalty should rise even before realized impact does.
Validation protocol
- Build paired datasets using:
- truncated-depth features at level (N), and
- deeper-book or realized sweep truth.
- Compare baseline cost model vs truncation-aware two-stage model.
- Segment by:
- symbol liquidity tier,
- venue,
- urgency bucket,
- spread regime,
- time-of-day.
- Report:
- mean IS,
- p95/p99 IS,
- overflow calibration error,
- completion reliability,
- false-safe rate (predicted safe but overflowed badly).
- Promote only if tail metrics improve without unacceptable completion degradation.
A useful canary question:
When the order exceeded visible depth, did the new model know that before the order was sent?
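That canary question has a quantitative counterpart in the false-safe rate. A sketch with an illustrative record layout and made-up thresholds:

```python
# Sketch: false-safe rate from the validation report — orders the model
# flagged as safe (low predicted overflow probability) that still
# overflowed badly in execution.

def false_safe_rate(records, safe_thresh=0.05, bad_ticks=2):
    """records: (predicted_op, overflowed, extra_ticks) per child order."""
    flagged_safe = [r for r in records if r[0] <= safe_thresh]
    if not flagged_safe:
        return 0.0
    bad = [r for r in flagged_safe if r[1] and r[2] >= bad_ticks]
    return len(bad) / len(flagged_safe)

records = [(0.01, True, 3),   # predicted safe, overflowed badly: false-safe
           (0.02, False, 0),  # predicted safe, fine
           (0.50, True, 5)]   # model knew in advance: not false-safe
print(false_safe_rate(records))  # → 0.5
```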
Observability checklist
- overflow probability by symbol/venue/time bucket
- conditional hidden-gap burden histogram
- deepest-level-reached distribution
- truncated vs realized sweep-cost delta
- information sufficiency curve by level count
- p95/p99 IS split into overflow vs non-overflow events
- live share of notional sent while (OP_N(q)) exceeds the guardrail
Common mistakes
- Treating cumulative visible depth as executable depth. It is only executable depth if you never need the next unseen level.
- Optimizing mean cost only. Overflow damage is a tail phenomenon.
- Ignoring information-level train/serve mismatch. L10 training and L5 live inference is silent model corruption.
- Assuming deeper-book ignorance is random noise. It is state-dependent and often worst exactly when urgency is highest.
- Using larger slices because average slippage looked stable. Mean stability can coexist with catastrophic overflow tails.
Minimal implementation checklist
- Log the information level (L1/L5/L10/full) with every feature snapshot and replay dataset
- Label overflow events: whether execution required liquidity beyond visible level (N)
- Train a two-stage model: overflow probability + conditional hidden-gap burden
- Put overflow-aware caps on marketable child sizing
- Add tail dashboards split by overflow vs non-overflow events
- Shadow-sample deeper book data for ongoing calibration
Practical takeaway
If your execution model sees only the first few levels, it should not pretend to forecast full sweep cost directly. First ask whether you are about to run out of visible book; then price the hidden remainder explicitly.