Queue-Depletion Velocity × Refill-Latency Asymmetry Slippage Playbook

Date: 2026-03-30
Category: research
Focus: Practical slippage modeling for markets where displayed depth disappears faster than it refills (especially during microbursts and queue-shock regimes).

1) Problem Framing

Most execution models assume “depth loss is temporary” and that refill arrives quickly enough to keep impact near expected curves.

In production, that assumption fails in specific windows:

quote lifetimes collapse,
one side of the book is repeatedly depleted,
refill arrives late and thinner,
spread widens in discrete jumps,
passive orders age into toxicity.

This creates a convex slippage regime: small participation changes can produce outsized cost changes.

The useful abstraction is:

QDV (Queue Depletion Velocity): speed of near-touch depth removal
RLA (Refill-Latency Asymmetry): delay/skew of replenishment across bid/ask

When QDV rises while RLA worsens, passive edge decays quickly and aggressive catch-up fills become expensive.

2) Core Metrics (Live + Research)

2.1 QDV — Queue Depletion Velocity

For side (s \in \{bid, ask\}), level band (L) (e.g., top 3 levels), horizon (\Delta t):

[ QDV_s(t)=\frac{\max(0, D_s(t)-D_s(t+\Delta t))}{\Delta t} ]

Where (D_s) is displayed size in the monitored band.

Use robust versions:

median QDV over 200–500 ms windows,
tail QDV (p90/p95) for stress detection,
event-time normalization (per book event) to avoid clock aliasing.
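
As a minimal sketch of the definition and its robust variants, the snippet below computes per-step QDV from a list of depth snapshots and summarizes it with a median and a p95. The function names, the sample depths, and the 50 ms sampling step are illustrative assumptions, not part of any production stack. Passing dt=1 with one snapshot per book event gives the event-time variant.

```python
from statistics import median, quantiles

def qdv_series(depth, dt):
    """Per-step depletion velocity max(0, D[i] - D[i+1]) / dt; refills count as zero."""
    return [max(0.0, a - b) / dt for a, b in zip(depth, depth[1:])]

def robust_qdv(depth, dt):
    """Median and p95 of per-step QDV over one window of depth snapshots."""
    v = qdv_series(depth, dt)
    p95 = quantiles(v, n=20, method="inclusive")[-1]  # last of 19 cut points = 95th percentile
    return median(v), p95

# Hypothetical top-3-level depth snapshots, one every 50 ms (dt expressed in ms).
depth = [1000, 900, 950, 600, 600, 300]
med, p95 = robust_qdv(depth, dt=50)
print(med, p95)  # size units removed per ms
```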

2.2 RLA — Refill-Latency Asymmetry

Measure time to recover a fraction (\rho) (e.g., 70%) of pre-shock depth:

[ \tau_s(\rho)=\inf\{u>0: D_s(t+u) \ge \rho D_s(t^-)\} ]

Then

[ RLA(t)=\log\left(\frac{\tau_{ask}(\rho)+\epsilon}{\tau_{bid}(\rho)+\epsilon}\right) ]

(RLA>0): ask refills slower (buy-side urgency risk)
(RLA<0): bid refills slower (sell-side urgency risk)
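
The two formulas above can be sketched as follows, assuming depth snapshots with timestamps and a known shock index. The helper names, the sample data, and the choice of eps = 1 ms are assumptions for illustration. A side that never recovers inside the observation window returns None and needs explicit handling (for example, censoring at the window length) before RLA is computed.

```python
import math

def refill_time(times, depth, shock_idx, rho=0.7):
    """Time from the shock until depth >= rho * pre-shock depth; None if never observed."""
    target = rho * depth[shock_idx - 1]  # D_s(t^-): last snapshot before the shock
    for t, d in zip(times[shock_idx + 1:], depth[shock_idx + 1:]):
        if d >= target:
            return t - times[shock_idx]
    return None

def rla(tau_ask, tau_bid, eps=1.0):
    """Log ratio of refill times; eps (same units as tau) keeps the ratio finite."""
    return math.log((tau_ask + eps) / (tau_bid + eps))

# Hypothetical snapshots in ms; the shock lands at index 1 on both sides.
times = [0, 10, 20, 30, 40, 50]
ask_depth = [1000, 200, 300, 500, 650, 720]
bid_depth = [1000, 250, 720, 900, 950, 980]

tau_ask = refill_time(times, ask_depth, shock_idx=1)
tau_bid = refill_time(times, bid_depth, shock_idx=1)
value = rla(tau_ask, tau_bid)  # positive here: the ask side refills slower
```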

2.3 DSI — Depletion Synchrony Index

How synchronized are depletion bursts across venues/symbol peers:

[ DSI = \text{corr}\left(\mathbb{1}[QDV_s>q], \mathbb{1}[QDV_{peer,s}>q]\right) ]

High DSI means local routing won’t easily escape pressure.
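
A minimal sketch of DSI as a Pearson correlation between two exceedance-indicator series, assuming both QDV series are already aligned on the same time or event grid. Returning 0.0 when either indicator never varies is an assumption made here to keep the value defined; the threshold q and the sample series are hypothetical.

```python
import math

def dsi(qdv_local, qdv_peer, q):
    """Correlation of the indicators 1[QDV > q] for a local and a peer series."""
    a = [1.0 if v > q else 0.0 for v in qdv_local]
    b = [1.0 if v > q else 0.0 for v in qdv_peer]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # assumption: no bursts (or all bursts) carries no synchrony signal
    return cov / math.sqrt(va * vb)

local = [1, 5, 1, 6, 1, 1]
same = dsi(local, [2, 7, 1, 9, 0, 1], q=3)      # bursts coincide
opposite = dsi(local, [5, 1, 5, 1, 5, 5], q=3)  # bursts alternate
```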

2.4 QHL — Queue Half-Life

Time for available queue ahead to halve (passive order perspective). Short QHL with adverse drift implies queue position decays before fill odds materialize.
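
One simple way to measure QHL empirically, assuming a tracked series of queue-ahead sizes for a resting order, is to take the first elapsed time at which the queue ahead has fallen to half its starting value. The function name and sample data are illustrative, and a smoothed or model-based estimate may be preferable in practice.

```python
def queue_half_life(times, queue_ahead):
    """Elapsed time until queue ahead <= half its initial size; None if not reached."""
    half = queue_ahead[0] / 2
    for t, q in zip(times, queue_ahead):
        if q <= half:
            return t - times[0]
    return None

# Hypothetical queue-ahead observations (ms, shares) for one resting order.
qhl = queue_half_life([0, 100, 200, 300], [800, 700, 450, 380])
```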

3) Model Architecture

Use a two-head model sharing microstructure features.

Head A: Fill Hazard Model

Estimate short-horizon fill probability / time-to-fill:

[ \lambda_{fill}(t)=f_1(\text{queueAhead}, QDV, RLA, spread, OFI, quoteAge, venueState) ]

Suggested families:

survival model (Cox / AFT / piecewise exponential),
calibrated GBDT with monotonic constraints,
state-conditioned hazard (different params in NORMAL vs SHOCK).
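
To make the piecewise-exponential and state-conditioned options concrete, here is a small sketch that converts a piecewise-constant hazard into a fill probability via P(fill by T) = 1 - exp(-sum of hazard times interval width). The per-state hazard values are invented for illustration; in practice they would come from the fitted model.

```python
import math

def fill_prob(hazards, widths):
    """Fill probability over consecutive intervals with constant hazard in each."""
    cumulative = sum(h * w for h, w in zip(hazards, widths))
    return 1.0 - math.exp(-cumulative)

# Hypothetical hazards (fills per second) over two 1-second intervals, by state.
hazards_by_state = {"NORMAL": [0.5, 0.5], "SHOCK": [0.1, 0.1]}
widths = [1.0, 1.0]

p_normal = fill_prob(hazards_by_state["NORMAL"], widths)
p_shock = fill_prob(hazards_by_state["SHOCK"], widths)
```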

Head B: Conditional Slippage Model

Estimate expected cost given action and realized fill path:

[ E[Slip|a_t, x_t] = f_2(a_t, QDV, RLA, volatilityBurst, DSI, latency, childSize) ]

Use quantile outputs (p50/p90/p99), not only mean.
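
Quantile heads are commonly trained and scored with the pinball (quantile) loss; the text does not specify a loss, so treat this as one reasonable choice rather than the prescribed one. The sketch shows how under-prediction is penalized more heavily at a high quantile.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss at quantile q (0 < q < 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += max(q * diff, (q - 1) * diff)
    return total / len(y_true)

# Realized slippage 10 bps: predicting 8 (too low) costs more at p90 than predicting 12.
under = pinball_loss([10.0], [8.0], q=0.9)
over = pinball_loss([10.0], [12.0], q=0.9)
```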

Coupling

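Expected action cost (lower is better):

[ J(a_t)=E[Slip|a_t] + \lambda \cdot P(\text{not filled by deadline}|a_t) ]

where (\lambda) is a penalty weight that converts deadline-miss probability into bps; it is unrelated to the fill hazard (\lambda_{fill}) of Head A.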
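
A worked example of the coupling, with made-up numbers: three candidate actions, each with an expected slippage in bps and a deadline-miss probability, scored with a 10 bps miss penalty. The action names and values are assumptions chosen only to show how the penalty can shift the choice away from the cheapest-looking action.

```python
def action_cost(expected_slip_bps, p_miss, penalty_bps):
    """J(a) = E[Slip | a] + penalty * P(not filled by deadline | a)."""
    return expected_slip_bps + penalty_bps * p_miss

# Hypothetical (expected slippage bps, deadline-miss probability) per action.
candidates = {
    "passive": (0.5, 0.40),
    "hybrid": (1.5, 0.15),
    "aggressive": (3.0, 0.02),
}
penalty_bps = 10.0

scores = {name: action_cost(s, p, penalty_bps) for name, (s, p) in candidates.items()}
best = min(scores, key=scores.get)  # lowest total cost wins
```

Here passive has the lowest expected slippage but the highest total cost once the miss penalty is included.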

4) Regime State Machine (Execution Controls)

Define practical states from QDV and RLA:

S0 STABLE: low QDV, balanced refill
S1 THINNING: moderate QDV rise, mild refill lag
S2 FRAGILE_SIDE: one-sided refill delay + fast depletion
S3 BURST_STRESS: high QDV tails + synchronized depletion (high DSI)
S4 SAFE_DEGRADE: model confidence low or telemetry inconsistent

Example policy by state

S0: normal passive participation; standard child cadence.
S1: reduce child size, shorten quote TTL, tighten cancel/replace guard.
S2: avoid hanging passive orders on fragile side; switch to hybrid IOC + short passive peeks.
S3: enforce urgency budget caps, widen expected-cost guardrails, diversify across venues only if DSI is low enough.
S4: fail conservative: cap participation, prioritize completion certainty rules, emit operator alert.
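
The state definitions can be sketched as a rule-based classifier. Every threshold below, the use of a QDV z-score, and the confidence input are placeholder assumptions; real cutoffs would be calibrated per symbol bucket. S4 is checked first so that bad telemetry can never be read as a calm market.

```python
from enum import Enum

class Regime(Enum):
    STABLE = 0
    THINNING = 1
    FRAGILE_SIDE = 2
    BURST_STRESS = 3
    SAFE_DEGRADE = 4

def classify(qdv_z, abs_rla, dsi, telemetry_ok, confidence):
    """Map live metrics to a regime; thresholds are illustrative only."""
    if not telemetry_ok or confidence < 0.5:
        return Regime.SAFE_DEGRADE
    if qdv_z > 3.0 and dsi > 0.6:          # tail depletion, synchronized across peers
        return Regime.BURST_STRESS
    if qdv_z > 2.0 and abs_rla > 0.7:      # fast depletion with one-sided refill delay
        return Regime.FRAGILE_SIDE
    if qdv_z > 1.0 or abs_rla > 0.3:       # moderate rise or mild refill lag
        return Regime.THINNING
    return Regime.STABLE
```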

5) Feature Set That Actually Matters

Microstructure features

queue ahead / behind by level
spread and spread velocity
depth imbalance and imbalance acceleration
order-flow imbalance (OFI) in event-time buckets
cancel intensity vs trade intensity ratio
quote age distribution (p50/p90)
hidden/odd-lot proxy activity (if available)

System/transport features

decision-to-wire latency
ACK dispersion / reject burst counts
market-data freshness gap
sequencer/gateway mode flags

Context features

opening/closing window flag
auction proximity and imbalance publications
macro/event windows
venue health score

Avoid feature leakage:

strict point-in-time joins,
event-time alignment with uncertainty bounds,
late packet correction marked, not silently overwritten.
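
A strict point-in-time join can be sketched with a binary search over feature timestamps, assuming the timestamps are sorted and represent when each value became available locally (receive time, not exchange time). The function name and sample data are illustrative.

```python
from bisect import bisect_right

def as_of(feature_times, feature_values, decision_time):
    """Latest feature value available at or before decision_time; None if none yet."""
    i = bisect_right(feature_times, decision_time)
    return feature_values[i - 1] if i else None

# Hypothetical feature updates keyed by local receive time (ms).
times = [100, 200, 300]
values = ["a", "b", "c"]
seen = as_of(times, values, 250)  # only updates received by t=250 are visible
```

Swapping bisect_right for bisect_left excludes values stamped exactly at the decision time, which is the more conservative choice when timestamp uncertainty is comparable to the decision interval.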

6) Calibration & Validation Ladder

Offline

Build shock-labeled datasets via QDV/RLA thresholds.
Calibrate fill hazard first (Brier + calibration curves by regime).
Calibrate slippage quantiles by side, venue, and state.
Evaluate deadline-adjusted objective (J(a_t)), not standalone RMSE.
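
For the fill-hazard calibration step, a per-regime Brier score is straightforward to compute. The record layout (regime label, predicted fill probability, realized fill flag) is an assumption made for this sketch.

```python
from collections import defaultdict

def brier_by_regime(records):
    """Mean squared error between predicted fill probability and outcome, per regime."""
    acc = defaultdict(lambda: [0.0, 0])  # regime -> [sum of squared errors, count]
    for regime, p, filled in records:
        acc[regime][0] += (p - filled) ** 2
        acc[regime][1] += 1
    return {regime: total / n for regime, (total, n) in acc.items()}

# Hypothetical (regime, predicted P(fill), filled 0/1) records.
records = [("S0", 0.8, 1), ("S0", 0.2, 0), ("S3", 0.9, 0)]
scores = brier_by_regime(records)
```

A model that looks well calibrated overall can still be badly off in S2/S3, which is why the score is split by regime.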

Counterfactual replay

Compare baseline policy vs new policy on same market tape.
Track:
- mean bps,
- p95/p99 bps,
- completion shortfall,
- cancel-to-fill waste ratio.
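
The tracked metrics can be computed with the standard library, as sketched below. The exact definitions of completion shortfall and cancel-to-fill ratio are not given in the text, so the ones used here (unfilled fraction of the target, and cancels per fill) are assumptions.

```python
from statistics import mean, quantiles

def tail_summary(slippage_bps):
    """Mean, p95 and p99 of per-order slippage in bps."""
    cuts = quantiles(slippage_bps, n=100, method="inclusive")  # 99 cut points
    return {"mean": mean(slippage_bps), "p95": cuts[94], "p99": cuts[98]}

def completion_shortfall(target_qty, filled_qty):
    """Unfilled fraction of the parent order (assumed definition)."""
    return max(0.0, (target_qty - filled_qty) / target_qty)

def cancel_to_fill_ratio(n_cancels, n_fills):
    """Cancels per fill (assumed definition); infinite when nothing filled."""
    return n_cancels / n_fills if n_fills else float("inf")

# Hypothetical replay output: slippage of 0..100 bps across 101 orders.
summary = tail_summary([float(i) for i in range(101)])
```

Running the same functions on baseline and candidate policies over the same tape gives a like-for-like comparison.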

Live ramp

5% shadow → 10% guarded → 25% with kill-switch.
Promotion gates should include tail metrics under S2/S3, not only overall mean.

7) Operational Guardrails

Tail budget: hard p99 slippage ceiling by symbol bucket.
Deadline budget: max tolerated completion miss probability.
Quote staleness guard: cancel passive quotes when quote-age z-score exceeds threshold during S2/S3.
Burst freeze: temporary child-order throttle if QDV tail + reject burst cross joint trigger.
Telemetry integrity checks: if market-data freshness or ACK clocks are suspect, force S4.
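
Two of these guards are simple enough to sketch directly. The z-score baseline (recent quote ages), the string state labels, and all thresholds are illustrative assumptions.

```python
from statistics import mean, pstdev

def quote_age_z(age_ms, recent_ages_ms):
    """Z-score of a quote's age against recent quote ages; 0.0 if there is no spread."""
    sd = pstdev(recent_ages_ms)
    return 0.0 if sd == 0 else (age_ms - mean(recent_ages_ms)) / sd

def should_cancel(state, age_ms, recent_ages_ms, z_max=2.0):
    """Quote staleness guard: only active in the fragile and stress states."""
    return state in ("S2", "S3") and quote_age_z(age_ms, recent_ages_ms) > z_max

def burst_freeze(qdv_p95, qdv_limit, reject_count, reject_limit):
    """Joint trigger: QDV tail and reject burst must both cross their limits."""
    return qdv_p95 > qdv_limit and reject_count >= reject_limit

recent = [10, 20, 30]  # hypothetical recent quote ages in ms
stale = should_cancel("S2", 50, recent)
```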

8) Minimal Pseudocode

for each decision tick t:
  x <- build_features(t)
  state <- regime_classifier(QDV, RLA, DSI, telemetry_health)

  for action in action_set:
    fill_hazard <- model_fill(action, x, state)
    slip_dist   <- model_slip(action, x, state)
    score[action] <- E[slip_dist] + lambda * P(miss_deadline | fill_hazard)

  action* <- argmin(score) subject to risk/tail/deadline guards

  if telemetry_health bad or confidence too low:
    action* <- safe_degrade_policy()

  execute(action*)

9) Failure Modes to Watch

False stability: mean spread looks normal while refill latency drifts up.
Venue mirage: routing to “thicker” venue that is simultaneously depleting (high DSI).
Passive toxicity trap: high queue age mistaken for queue edge.
Control oscillation: overly reactive state transitions causing self-induced churn.
Timestamp drift contamination: apparent refill lag due to clock mismatch, not market mechanics.
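
Control oscillation in particular is often damped with a dwell requirement: a proposed state must persist for several consecutive ticks before the controller switches. The class below is one such sketch, with a hypothetical min_dwell parameter; a real controller would likely let escalation to S4 bypass the dwell.

```python
class DebouncedState:
    """Switch state only after the same proposal repeats min_dwell times in a row."""

    def __init__(self, initial, min_dwell):
        self.state = initial
        self.min_dwell = min_dwell
        self._candidate = None
        self._count = 0

    def update(self, proposed):
        if proposed == self.state:
            self._candidate, self._count = None, 0  # proposal agrees: reset the streak
            return self.state
        if proposed == self._candidate:
            self._count += 1
        else:
            self._candidate, self._count = proposed, 1
        if self._count >= self.min_dwell:
            self.state = proposed
            self._candidate, self._count = None, 0
        return self.state

ctl = DebouncedState("S0", min_dwell=3)
path = [ctl.update(s) for s in ["S1", "S0", "S1", "S1", "S1"]]
```

The single S1 blip at the start is ignored; the switch happens only on the third consecutive S1.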

10) Practical Takeaway

Slippage spikes in modern books are often less about static liquidity level and more about liquidity recovery kinetics.

A robust execution stack should explicitly model:

how fast depth disappears (QDV),
how unevenly it comes back (RLA),
and how that changes fill-vs-cost tradeoffs in real time.

If your controller does not account for depletion/refill asymmetry, it will over-trust passive exposure exactly when queue edge is evaporating.
