Queue-Depletion Velocity × Refill-Latency Asymmetry Slippage Playbook
Date: 2026-03-30
Category: research
Focus: Practical slippage modeling for markets where displayed depth disappears faster than it refills (especially during microbursts and queue-shock regimes).
1) Problem Framing
Most execution models assume “depth loss is temporary” and that refill arrives quickly enough to keep impact near expected curves.
In production, that assumption fails in specific windows:
- quote lifetimes collapse,
- one side of the book is repeatedly depleted,
- refill arrives late and thinner,
- spread widens in discrete jumps,
- passive orders age into toxicity.
This creates a convex slippage regime: small participation changes can produce outsized cost changes.
The useful abstraction is:
- QDV (Queue Depletion Velocity): speed of near-touch depth removal
- RLA (Refill-Latency Asymmetry): delay/skew of replenishment across bid/ask
When QDV rises while RLA worsens (its magnitude grows), passive edge decays quickly and aggressive catch-up becomes expensive.
2) Core Metrics (Live + Research)
2.1 QDV — Queue Depletion Velocity
For side (s \in \{\text{bid}, \text{ask}\}), level band (L) (e.g., top 3 levels), and horizon (\Delta t):
[ QDV_s(t)=\frac{\max(0, D_s(t)-D_s(t+\Delta t))}{\Delta t} ]
Where (D_s) is displayed size in the monitored band.
Use robust versions:
- median QDV over 200–500 ms windows,
- tail QDV (p90/p95) for stress detection,
- event-time normalization (per book event) to avoid clock aliasing.
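The definition and its robust variants can be sketched as follows. This is a minimal illustration, assuming depth in the monitored band is sampled on a regular clock grid; the function names (`qdv_series`, `percentile`, `robust_qdv`) and the nearest-rank percentile rule are choices made for this example, not part of the playbook.

```python
import math
from statistics import median

def qdv_series(depth, dt):
    """One-sided depletion velocity between consecutive depth samples.

    depth: displayed size in the monitored band, sampled every dt seconds.
    Refills (depth increases) contribute zero, matching max(0, ...).
    """
    return [max(0.0, d0 - d1) / dt for d0, d1 in zip(depth, depth[1:])]

def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * q / 100))
    return ordered[rank - 1]

def robust_qdv(depth, dt, window):
    """Rolling (median, p95) of QDV over the trailing `window` samples."""
    qdv = qdv_series(depth, dt)
    out = []
    for i in range(window, len(qdv) + 1):
        chunk = qdv[i - window:i]
        out.append((median(chunk), percentile(chunk, 95)))
    return out
```

For event-time normalization, the same functions could be fed one sample per book event with `dt=1`, so that QDV reads as size removed per event rather than per second.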
2.2 RLA — Refill-Latency Asymmetry
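After a depletion shock at time (t), measure the time to recover a fraction (\rho) (e.g., 70%) of pre-shock depth (D_s(t^-)):
[ \tau_s(\rho)=\inf\{u>0: D_s(t+u) \ge \rho D_s(t^-)\} ]
Then, with a small (\epsilon>0) to keep the ratio finite,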
[ RLA(t)=\log\left(\frac{\tau_{ask}(\rho)+\epsilon}{\tau_{bid}(\rho)+\epsilon}\right) ]
- (RLA>0): ask refills slower (buy-side urgency risk)
- (RLA<0): bid refills slower (sell-side urgency risk)
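A small sketch of the refill-time and RLA calculation, assuming aligned time/depth samples per side and a known shock index. The names `refill_time` and `rla`, and the choice to return `None` when depth never recovers within the sample (a censored observation), are assumptions made for this example.

```python
import math

def refill_time(times, depth, shock_idx, rho=0.7):
    """Time after the shock until depth recovers to rho * pre-shock depth.

    depth[shock_idx - 1] is treated as the pre-shock level D(t^-).
    Returns None if the band never recovers inside the sample (censored).
    """
    target = rho * depth[shock_idx - 1]
    t0 = times[shock_idx]
    for t, d in zip(times[shock_idx + 1:], depth[shock_idx + 1:]):
        if d >= target:
            return t - t0
    return None

def rla(tau_ask, tau_bid, eps=1e-3):
    """Log ratio of ask to bid refill times; positive means ask refills slower."""
    return math.log((tau_ask + eps) / (tau_bid + eps))
```

Censored refills need an explicit policy before they enter the ratio, for example substituting the observation-window length; silently dropping them would bias RLA toward zero in exactly the regimes of interest.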
2.3 DSI — Depletion Synchrony Index
Measures how synchronized depletion bursts are across venues or symbol peers, given a stress threshold (q):
[ DSI = \text{corr}\left(\mathbb{1}[QDV_s>q], \mathbb{1}[QDV_{peer,s}>q]\right) ]
High DSI means local routing won’t easily escape pressure.
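As an illustration, DSI can be computed as the Pearson correlation of two exceedance indicators. The function name `dsi` and the convention of returning 0.0 when either indicator is constant (correlation undefined) are assumptions of this sketch.

```python
import math

def dsi(qdv_local, qdv_peer, q):
    """Pearson correlation between the indicators 1[QDV > q] on two series."""
    a = [1.0 if v > q else 0.0 for v in qdv_local]
    b = [1.0 if v > q else 0.0 for v in qdv_peer]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: no bursts (or all bursts) carries no synchrony signal
    return cov / math.sqrt(var_a * var_b)
```

If venues differ a lot in typical depth, a per-series threshold (for example each venue's own p90) may be more sensible than a shared (q); that would be a departure from the formula above.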
2.4 QHL — Queue Half-Life
Time for available queue ahead to halve (passive order perspective). Short QHL with adverse drift implies queue position decays before fill odds materialize.
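One way to estimate QHL from observed queue-ahead samples is a log-linear fit. The sketch below assumes roughly exponential decay, q(t) = q0 * exp(-k t), which is a modeling assumption of this example rather than something the playbook prescribes; `queue_half_life` is a hypothetical name.

```python
import math

def queue_half_life(times, queue_ahead):
    """Half-life of queue ahead from a least-squares fit of log(size) on time.

    Under exponential decay the half-life is ln(2) / k, where -k is the slope.
    Returns math.inf when the queue is not shrinking.
    """
    logs = [math.log(q) for q in queue_ahead]  # requires strictly positive sizes
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        return math.inf
    return math.log(2) / -slope
```

A first-crossing estimate (time until queue ahead first drops below half its starting value) is a simpler alternative when the decay is clearly not exponential.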
3) Model Architecture
Use a two-head model sharing microstructure features.
Head A: Fill Hazard Model
Estimate short-horizon fill probability / time-to-fill:
[ \lambda_{fill}(t)=f_1(\text{queueAhead}, QDV, RLA, spread, OFI, quoteAge, venueState) ]
Suggested families:
- survival model (Cox / AFT / piecewise exponential),
- calibrated GBDT with monotonic constraints,
- state-conditioned hazard (different params in NORMAL vs SHOCK).
Head B: Conditional Slippage Model
Estimate expected cost given action and realized fill path:
[ E[Slip|a_t, x_t] = f_2(a_t, QDV, RLA, volatilityBurst, DSI, latency, childSize) ]
Use quantile outputs (p50/p90/p99), not only mean.
Coupling
Expected action value:
[ J(a_t)=E[Slip|a_t] + \lambda \cdot P(\text{not filled by deadline}|a_t) ]
where (\lambda) converts deadline risk into bps penalty.
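To make the coupling concrete, the sketch below turns a fill hazard into a miss probability and combines it with expected slippage. It assumes a piecewise-constant hazard over the remaining intervals, so that survival is exp(-sum(hazard * dt)); the function names and the numbers in the usage note are illustrative only.

```python
import math

def miss_probability(hazards, dt):
    """P(not filled by deadline) under a piecewise-constant fill hazard.

    hazards: per-interval fill intensities (1/s) up to the deadline.
    dt: interval length in seconds.
    """
    return math.exp(-sum(h * dt for h in hazards))

def action_value(expected_slip_bps, hazards, dt, deadline_penalty_bps):
    """J(a) = E[Slip | a] + lambda * P(not filled by deadline | a)."""
    return expected_slip_bps + deadline_penalty_bps * miss_probability(hazards, dt)

# Illustrative comparison: a cheap but slow passive action vs a costly, fast one.
passive = action_value(0.5, [0.2] * 10, 1.0, deadline_penalty_bps=20.0)
aggressive = action_value(2.5, [5.0] * 10, 1.0, deadline_penalty_bps=20.0)
best = "aggressive" if aggressive < passive else "passive"
```

With these made-up inputs the passive action carries a miss probability of exp(-2), about 0.135, so its deadline penalty outweighs its slippage advantage. A smaller penalty or a higher passive hazard flips the choice.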
4) Regime State Machine (Execution Controls)
Define practical states from QDV and RLA:
- S0 STABLE: low QDV, balanced refill
- S1 THINNING: moderate QDV rise, mild refill lag
- S2 FRAGILE_SIDE: one-sided refill delay + fast depletion
- S3 BURST_STRESS: high QDV tails + synchronized depletion (high DSI)
- S4 SAFE_DEGRADE: model confidence low or telemetry inconsistent
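A rule-based version of this state mapping might look like the sketch below. All threshold values, the `Thresholds` and `classify` names, and the choice to express QDV as a multiple of its own baseline are hypothetical; real cutoffs would come from calibration per symbol bucket.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Thresholds:
    # Hypothetical cutoffs; QDV values are multiples of a per-symbol baseline.
    qdv_moderate: float = 1.5
    qdv_fast: float = 3.0
    qdv_tail: float = 6.0
    rla_mild: float = 0.3
    rla_one_sided: float = 1.0
    dsi_high: float = 0.6
    min_confidence: float = 0.5

def classify(qdv_p50, qdv_p95, rla, dsi, telemetry_ok, confidence, th=Thresholds()):
    """Map live metrics to a regime label, checking the most severe states first."""
    if not telemetry_ok or confidence < th.min_confidence:
        return "S4_SAFE_DEGRADE"
    if qdv_p95 >= th.qdv_tail and dsi >= th.dsi_high:
        return "S3_BURST_STRESS"
    if qdv_p50 >= th.qdv_fast and abs(rla) >= th.rla_one_sided:
        return "S2_FRAGILE_SIDE"
    if qdv_p50 >= th.qdv_moderate or abs(rla) >= th.rla_mild:
        return "S1_THINNING"
    return "S0_STABLE"
```

Checking S4 first means a telemetry problem always wins over a market-driven state, which matches the fail-conservative intent of SAFE_DEGRADE.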
Example policy by state
- S0: normal passive participation; standard child cadence.
- S1: reduce child size, shorten quote TTL, tighten cancel/replace guard.
- S2: avoid hanging passive orders on fragile side; switch to hybrid IOC + short passive peeks.
- S3: enforce urgency budget caps, widen expected-cost guardrails, and diversify across venues only if DSI is low enough that other venues are not depleting in step.
- S4: fail conservatively: cap participation, prioritize rules that favor completion certainty, and emit an operator alert.
5) Feature Set That Actually Matters
Microstructure features
- queue ahead / behind by level
- spread and spread velocity
- depth imbalance and imbalance acceleration
- order-flow imbalance (OFI) in event-time buckets
- cancel intensity vs trade intensity ratio
- quote age distribution (p50/p90)
- hidden/odd-lot proxy activity (if available)
System/transport features
- decision-to-wire latency
- ACK dispersion / reject burst counts
- market-data freshness gap
- sequencer/gateway mode flags
Context features
- opening/closing window flag
- auction proximity and imbalance publications
- macro/event windows
- venue health score
Avoid feature leakage:
- strict point-in-time joins,
- event-time alignment with uncertainty bounds,
- late packet correction marked, not silently overwritten.
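A strict point-in-time lookup can be sketched with a binary search over receive timestamps. The helper name `as_of` and the `max_staleness` cutoff are assumptions of this example; the key point is that timestamps should reflect when the value became known locally, not when the exchange generated it.

```python
from bisect import bisect_right

def as_of(feature_times, feature_values, decision_time, max_staleness):
    """Latest feature value received at or before decision_time.

    feature_times must be sorted receive timestamps.
    Returns None if nothing was known yet or the latest value is too stale.
    """
    i = bisect_right(feature_times, decision_time) - 1
    if i < 0:
        return None
    if decision_time - feature_times[i] > max_staleness:
        return None
    return feature_values[i]
```

Returning `None` instead of a stale value pushes the staleness decision to the caller, which can then mark the feature as missing rather than silently training on old data.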
6) Calibration & Validation Ladder
Offline
- Build shock-labeled datasets via QDV/RLA thresholds.
- Calibrate fill hazard first (Brier + calibration curves by regime).
- Calibrate slippage quantiles by side, venue, and state.
- Evaluate deadline-adjusted objective (J(a_t)), not standalone RMSE.
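The two calibration scores mentioned above can be sketched as follows: a Brier score grouped by regime for the fill-hazard head, and a pinball (quantile) loss for the slippage quantiles. Function names and the record layout are assumptions of this example.

```python
from collections import defaultdict

def brier_by_regime(records):
    """Mean squared error of predicted fill probability, grouped by regime.

    records: iterable of (regime, predicted_fill_prob, filled) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for regime, p, filled in records:
        sums[regime][0] += (p - float(filled)) ** 2
        sums[regime][1] += 1
    return {regime: total / count for regime, (total, count) in sums.items()}

def pinball_loss(y_true, y_pred, q):
    """Average pinball loss for quantile level q in (0, 1)."""
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        diff = y - y_hat
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)
```

For a p90 slippage head, under-predicting realized cost is penalized nine times as heavily as over-predicting it, which is the asymmetry a tail guardrail needs.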
Counterfactual replay
- Compare baseline policy vs new policy on same market tape.
- Track:
  - mean bps,
  - p95/p99 bps,
  - completion shortfall,
  - cancel-to-fill waste ratio.
Live ramp
- 5% shadow → 10% guarded → 25% with kill-switch.
- Promotion gates should include tail metrics under S2/S3, not only overall mean.
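A promotion gate along these lines could be expressed as a small check over per-state replay or live metrics. The metric keys (`"ALL"`, `"mean_bps"`, `"p99_bps"`), the function name, and the zero-regression default are hypothetical choices for this sketch.

```python
def passes_gate(baseline, candidate, stress_states=("S2", "S3"),
                max_tail_regression_bps=0.0):
    """Promote only if the overall mean and the stressed-state tails both hold up.

    baseline / candidate: {state: {"mean_bps": float, "p99_bps": float}},
    with "ALL" holding the unconditional metrics.
    """
    overall_ok = candidate["ALL"]["mean_bps"] <= baseline["ALL"]["mean_bps"]
    tails_ok = all(
        candidate[s]["p99_bps"] <= baseline[s]["p99_bps"] + max_tail_regression_bps
        for s in stress_states
    )
    return overall_ok and tails_ok
```

A candidate that improves mean cost while worsening p99 in S3 fails this gate, which is the case an overall-mean check would miss.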
7) Operational Guardrails
- Tail budget: hard p99 slippage ceiling by symbol bucket.
- Deadline budget: max tolerated completion miss probability.
- Quote staleness guard: cancel passive quotes when quote-age z-score exceeds threshold during S2/S3.
- Burst freeze: temporary child-order throttle if QDV tail + reject burst cross joint trigger.
- Telemetry integrity checks: if market-data freshness or ACK clocks are suspect, force S4.
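Two of these guards reduce to simple predicates. The sketch below assumes short state labels ("S2", "S3"), a rolling mean and standard deviation of quote age supplied by the caller, and illustrative trigger values; all names and defaults are hypothetical.

```python
def should_cancel_stale_quote(quote_age_ms, age_mean_ms, age_std_ms, state,
                              z_threshold=2.0):
    """Quote staleness guard: active only in S2/S3, fires on a high age z-score."""
    if state not in ("S2", "S3") or age_std_ms <= 0:
        return False
    z = (quote_age_ms - age_mean_ms) / age_std_ms
    return z > z_threshold

def burst_freeze(qdv_p95, qdv_trigger, reject_count, reject_trigger):
    """Burst freeze: throttle only when both the QDV tail and reject burst fire."""
    return qdv_p95 >= qdv_trigger and reject_count >= reject_trigger
```

Requiring both conditions for the freeze keeps a noisy gateway alone, or a fast book alone, from halting child orders.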
8) Minimal Pseudocode
for each decision tick t:
    x <- build_features(t)
    state <- regime_classifier(QDV, RLA, DSI, telemetry_health)
    for action in action_set:
        fill_hazard <- model_fill(action, x, state)
        slip_dist <- model_slip(action, x, state)
        score[action] <- E[slip_dist] + lambda * P(miss_deadline | fill_hazard)
    action* <- argmin(score) subject to risk/tail/deadline guards
    if telemetry_health bad or confidence too low:
        action* <- safe_degrade_policy()
    execute(action*)
9) Failure Modes to Watch
- False stability: mean spread looks normal while refill latency drifts up.
- Venue mirage: routing to “thicker” venue that is simultaneously depleting (high DSI).
- Passive toxicity trap: high queue age mistaken for queue edge.
- Control oscillation: overly reactive state transitions causing self-induced churn.
- Timestamp drift contamination: apparent refill lag due to clock mismatch, not market mechanics.
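Control oscillation in particular is commonly damped with asymmetric hysteresis: escalate immediately, de-escalate only after sustained calm. The sketch below is one such scheme, assuming states S0 to S3 are encoded as integer severities 0 to 3 (S4 would be handled separately); the class name and `hold` parameter are hypothetical.

```python
class StateDebouncer:
    """Escalate at once; step down only after `hold` consecutive calmer readings."""

    def __init__(self, hold=3, initial=0):
        self.hold = hold
        self.state = initial
        self._calm = []  # calmer readings seen since the last escalation

    def update(self, raw_severity):
        if raw_severity >= self.state:
            # Same or worse: adopt immediately and restart the calm streak.
            self.state = raw_severity
            self._calm.clear()
        else:
            self._calm.append(raw_severity)
            if len(self._calm) >= self.hold:
                # Step down to the worst level seen during the calm streak.
                self.state = max(self._calm)
                self._calm.clear()
        return self.state
```

A raw signal that flaps between 2 and 0 on every tick stays pinned at 2 under this scheme, so the controller does not churn child orders in response.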
10) Practical Takeaway
Slippage spikes in modern books are often less about the static level of liquidity and more about liquidity recovery kinetics.
A robust execution stack should explicitly model:
- how fast depth disappears (QDV),
- how unevenly it comes back (RLA),
- and how that changes fill-vs-cost tradeoffs in real time.
If your controller does not account for depletion/refill asymmetry, it will over-trust passive exposure exactly when queue edge is evaporating.
Suggested reading (for deeper implementation)
- Large, J. (2006), Measuring the resiliency of an electronic limit order book.
- Xu et al. (2015), Resiliency of the limit order book.
- Jain, Kochems, Treleaven (2024), Limit Order Book Simulations: A Review.
- Cartea, Jaimungal, Penalva, Algorithmic and High-Frequency Trading (impact + execution control foundations).