Sub-Penny Queue-Jump & Price-Improvement Slippage Playbook

Date: 2026-03-10 Category: research (quant execution / slippage modeling)

Why this matters

A lot of execution models still assume a clean world:

join best bid/ask,
wait in queue,
fill probability follows visible queue dynamics,
spread capture is the main passive edge.

In practice, that is often false in names where sub-penny price improvement and midpoint/internalized flow are active. You can quote at the displayed best and still lose economic priority to hidden or internalized liquidity that improves by tiny amounts.

That creates a recurring tax:

lower-than-expected passive fills,
delayed completion and late chasing,
false confidence from “tight displayed spread” regimes.

This playbook treats that tax as first-class slippage risk.

Core concept: Sub-Penny Queue-Jump Tax (SQT)

Define SQT as the expected cost impact caused by hidden or internalized price-improved executions that reduce your displayed-queue edge.

At parent-order horizon:

[ \text{Total Slippage} = \text{Spread/Fee} + \text{Impact} + \text{Delay} + \text{Opportunity} + \text{SQT} ]

Here SQT is not a fee; it is a microstructure competitiveness penalty.

Mechanism map (how the tax appears)

You rest passively at displayed best.
Contra flow is intercepted by internalizers or hidden price-improved liquidity.
Your queue position advances more slowly than the visible tape would suggest.
The residual builds up while alpha decays.
You switch to higher urgency later and pay impact/markout.

So SQT is tightly coupled to both:

fill hazard distortion (lower passive completion odds), and
branch transition risk (forced late aggression).

Observable proxy metrics

Use a small metric stack you can compute intraday.

1) Queue-Jump Pressure (QJP)

A proxy for how much contra flow bypasses the displayed queue.

[ \text{QJP} = 1 - \frac{\Delta \text{Displayed Queue Depletion Explained by Prints-at-touch}}{\Delta \text{Displayed Queue Depletion}} ]

Interpretation:

near 0: queue depletion mostly consistent with visible prints,
high: hidden/internalized executions likely siphoning flow.
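
A minimal sketch of the QJP ratio for one measurement window. The function name, the argument names, the clamping to [0, 1], and the choice to return 0 when nothing depleted are all assumptions made for illustration, not part of the definition above.

```python
def queue_jump_pressure(depletion_total: float, depletion_from_prints: float) -> float:
    """QJP = 1 - (depletion explained by prints at touch) / (total depletion).

    Both inputs are share counts over one window, measured on the displayed
    queue ahead of the resting order.
    """
    if depletion_total <= 0:
        # Assumption: with no depletion there is nothing to explain.
        return 0.0
    # Clamp so noisy attribution cannot push the ratio outside [0, 1].
    explained = min(max(depletion_from_prints, 0.0), depletion_total)
    return 1.0 - explained / depletion_total


# 1,000 shares left the queue ahead of us; prints at the touch explain 900.
print(round(queue_jump_pressure(1000, 900), 3))
```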

2) Price-Improvement Share (PIS)

Fraction of executed volume that prints with sub-tick price improvement relative to the displayed NBBO touch.

Higher PIS usually implies greater risk that displayed passive orders lose economic priority.

3) Midpoint Diversion Ratio (MDR)

Share of eligible flow routed/filling at midpoint or midpoint-like venues versus lit-touch participation.

High MDR often correlates with weaker realized fill rates for passive slices that join the touch.
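
PIS and MDR are both volume shares, so they can be computed from the same print stream. The sketch below assumes a hypothetical `Print` record, assumes every print in the window counts as eligible flow, and flags "sub-tick improvement" as any execution priced strictly between the touch and one tick away from it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Print:
    size: int          # executed shares
    price: float       # execution price
    touch: float       # displayed NBBO touch on the side being traded against
    tick: float        # minimum displayed price increment
    at_midpoint: bool  # hypothetical flag: midpoint or midpoint-like venue fill


def volume_shares(prints: List[Print]) -> Tuple[float, float]:
    """Return (PIS, MDR) as volume fractions for one window."""
    total = sum(p.size for p in prints)
    if total == 0:
        return 0.0, 0.0
    # Sub-tick improvement: priced off the touch, but by less than one tick.
    improved = sum(p.size for p in prints if 0 < abs(p.price - p.touch) < p.tick)
    midpoint = sum(p.size for p in prints if p.at_midpoint)
    return improved / total, midpoint / total
```

In production the midpoint flag would usually come from venue and order-type metadata rather than from price alone.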

4) Displayed Fill Efficiency (DFE)

[ \text{DFE} = \frac{\text{Realized passive fills at displayed touch}}{\text{Model-implied passive fills from queue dynamics alone}} ]

DFE < 1 means realized passive fills fall short of the standard queue model, which points to queue-jump pressure the model does not capture.
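
A sketch of DFE aggregated over child orders, assuming the queue model emits a fill probability per child so that model-implied fills are size times probability. The tuple layout is a hypothetical convenience.

```python
from typing import Iterable, Optional, Tuple


def displayed_fill_efficiency(
    children: Iterable[Tuple[float, float, float]],
) -> Optional[float]:
    """children: (order_size, model_fill_prob, filled_size) per passive child order.

    Returns realized fills divided by model-implied fills, or None when the
    model implies no fills at all (ratio undefined).
    """
    rows = list(children)
    expected = sum(size * prob for size, prob, _ in rows)
    realized = sum(filled for _, _, filled in rows)
    if expected <= 0:
        return None
    return realized / expected


# Model expected 50 + 50 shares; we realized 30 + 40.
print(displayed_fill_efficiency([(100, 0.5, 30), (200, 0.25, 40)]))
```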

5) Late Catch-Up Cost (LCC)

Extra bps paid in the final execution window after passive underfill has accumulated.

Track by parent order and by symbol regime; this is where the SQT invoice is paid.
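
One possible way to measure LCC, sketched under a stated assumption: "extra" is taken relative to a caller-supplied reference price (for example the average price of the earlier passive fills). The text above does not fix the benchmark, so treat `ref_price` and the function name as placeholders.

```python
from typing import List, Tuple


def late_catchup_cost_bps(
    fills: List[Tuple[float, float, float]],
    side: int,
    window_start: float,
    ref_price: float,
) -> float:
    """fills: (timestamp, price, size). side: +1 for a buy parent, -1 for a sell.

    Returns the signed cost in bps of final-window fills versus ref_price;
    positive means the catch-up fills were worse than the reference.
    """
    late = [(px, sz) for ts, px, sz in fills if ts >= window_start]
    qty = sum(sz for _, sz in late)
    if qty == 0:
        return 0.0
    vwap = sum(px * sz for px, sz in late) / qty
    return side * (vwap - ref_price) / ref_price * 1e4
```

Summing this by parent order and grouping by symbol regime gives the tracking view described above.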

Modeling approach

Use a two-layer model.

Layer A: Passive fill hazard with queue-jump correction

Start with your existing queue-based survival/fill model, then add QJP/PIS/MDR/DFE features and interaction terms:

queue position × QJP,
spread state × PIS,
volatility × MDR,
time-to-deadline × DFE.

This yields a corrected passive completion probability.
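
A sketch of the Layer A correction. The interaction terms mirror the list above; the functional form (a linear shift of the base model's log-odds) and the weight dictionary are assumptions chosen for illustration, and any fitted survival or hazard model could take their place.

```python
import math
from typing import Dict


def interaction_features(queue_pos, spread_state, volatility, time_to_deadline,
                         qjp, pis, mdr, dfe) -> Dict[str, float]:
    """Raw proxy metrics plus the four interaction terms."""
    return {
        "qjp": qjp, "pis": pis, "mdr": mdr, "dfe": dfe,
        "queue_pos_x_qjp": queue_pos * qjp,
        "spread_x_pis": spread_state * pis,
        "vol_x_mdr": volatility * mdr,
        "ttd_x_dfe": time_to_deadline * dfe,
    }


def corrected_fill_prob(base_prob: float, features: Dict[str, float],
                        weights: Dict[str, float]) -> float:
    """Shift the queue model's log-odds by a weighted sum of the features."""
    p = min(max(base_prob, 1e-9), 1 - 1e-9)  # keep the logit finite
    logit = math.log(p / (1 - p))
    logit += sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-logit))
```

With all weights at zero the corrected probability equals the base model's, which makes the correction easy to shadow-test before it is switched on.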

Layer B: Branch cost model

Condition on branch outcomes:

Branch 1: passive completion succeeds,
Branch 2: partial fill then controlled aggression,
Branch 3: underfill until deadline then catch-up aggression.

Estimate branch-specific expected cost and q95 tail cost.
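
If the three branches are treated as mutually exclusive and exhaustive for a given action a, the branch estimates combine through the law of total expectation. The symbols b and P(b | a) are notation introduced here for illustration:

[ \mathbb{E}[\text{Cost} \mid a] = \sum_{b=1}^{3} P(b \mid a) \, \mathbb{E}[\text{Cost} \mid b, a] ]

Under this reading, Layer A supplies the branch probabilities through the corrected passive completion probability, and Layer B supplies the per-branch costs.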

Final decision objective:

[ \min_a \; \mathbb{E}[\text{Cost} \mid a] + \lambda \cdot \text{CVaR}_{95}(\text{Cost} \mid a) + \eta \cdot \text{MissPenalty} ]

where action a is the route policy: join, improve, mid, take, or split.
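
A sketch of scoring actions against this objective from simulated or historical cost samples. The empirical CVaR estimator, the sample dictionaries, and the default values of `lam` and `eta` are illustrative assumptions.

```python
import math
from typing import Dict, List


def cvar(costs: List[float], alpha: float = 0.95) -> float:
    """Mean of the worst (1 - alpha) share of sampled costs (higher = worse)."""
    ordered = sorted(costs)
    # round() guards against float noise such as 20 * 0.05 > 1.
    k = max(1, math.ceil(round(len(ordered) * (1 - alpha), 9)))
    tail = ordered[-k:]
    return sum(tail) / len(tail)


def score(costs: List[float], miss_penalty: float, lam: float, eta: float) -> float:
    """E[Cost] + lam * CVaR_95 + eta * MissPenalty for one action."""
    return sum(costs) / len(costs) + lam * cvar(costs) + eta * miss_penalty


def best_action(samples: Dict[str, List[float]], miss: Dict[str, float],
                lam: float = 0.5, eta: float = 1.0) -> str:
    """Pick the action with the lowest tail-aware score."""
    return min(samples, key=lambda a: score(samples[a], miss[a], lam, eta))


samples = {"join": [1.0] * 9 + [21.0], "take": [4.0] * 10}  # cost in bps
miss = {"join": 0.0, "take": 0.0}
print(best_action(samples, miss, lam=0.0), best_action(samples, miss, lam=0.5))
```

In the toy data, joining wins on mean cost alone but loses once the tail term is weighted in, which is the behavior the objective is meant to produce.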

Execution state machine

Use explicit regime states (with hysteresis to avoid flapping):

DISPLAYED_EDGE_OK: low QJP, DFE near 1. Passive-friendly.
JUMP_RISK: QJP/PIS rising, DFE deteriorating. Reduce blind touch-join.
DIVERTED_FLOW: persistent high MDR + low DFE. Increase midpoint/internalization-aware routing mix and controlled improvement tactics.
SAFE: extreme underfill risk near deadline or unstable microstructure. Protect completion with hard caps and controlled aggression.

State transitions should be driven by smoothed metrics and minimum dwell time.
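
A sketch of such a state machine with EWMA smoothing, hysteresis, and a minimum dwell. All numeric thresholds, the class name, and the `deadline_risk` flag are hypothetical placeholders; real values would come from calibration.

```python
from enum import IntEnum


class Regime(IntEnum):
    DISPLAYED_EDGE_OK = 0
    JUMP_RISK = 1
    DIVERTED_FLOW = 2
    SAFE = 3


class RegimeMachine:
    def __init__(self, alpha: float = 0.3, min_dwell: int = 3):
        self.alpha = alpha          # EWMA weight on the newest observation
        self.min_dwell = min_dwell  # updates to wait before leaving a state
        self.state = Regime.DISPLAYED_EDGE_OK
        self.dwell = 0
        self.smooth = None          # smoothed (qjp, mdr, dfe)

    def _target(self, qjp: float, mdr: float, dfe: float) -> Regime:
        # Hysteresis: once stressed, looser thresholds keep us stressed.
        stressed = self.state != Regime.DISPLAYED_EDGE_OK
        qjp_hi = 0.35 if stressed else 0.45   # placeholder thresholds
        dfe_lo = 0.85 if stressed else 0.75
        if mdr > 0.5 and dfe < dfe_lo:
            return Regime.DIVERTED_FLOW
        if qjp > qjp_hi or dfe < dfe_lo:
            return Regime.JUMP_RISK
        return Regime.DISPLAYED_EDGE_OK

    def update(self, qjp: float, mdr: float, dfe: float,
               deadline_risk: bool = False) -> Regime:
        x = (qjp, mdr, dfe)
        if self.smooth is None:
            self.smooth = x
        else:
            self.smooth = tuple(self.alpha * new + (1 - self.alpha) * old
                                for new, old in zip(x, self.smooth))
        self.dwell += 1
        target = Regime.SAFE if deadline_risk else self._target(*self.smooth)
        # Escalation to SAFE is immediate; other moves wait out the dwell.
        if target != self.state and (target == Regime.SAFE
                                     or self.dwell >= self.min_dwell):
            self.state, self.dwell = target, 0
        return self.state
```

Using an `IntEnum` keeps the states ordered, so a guardrail can express "at least JUMP_RISK" as a simple comparison.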

Practical policy knobs

When JUMP_RISK or DIVERTED_FLOW is active:

Reduce pure touch-join weight.
Increase tactical price-improvement attempts within risk limits.
Shorten passive dwell times when corrected fill hazard falls below threshold.
Pre-allocate completion buffer earlier (avoid terminal catch-up).
Tighten venue eligibility if a venue shows persistent high LCC contribution.

Data contract (minimum)

Per child order / event:

timestamp (decision, send, ack, fill, cancel),
side, price, size, displayed vs non-displayed flag (if available),
venue, order type, midpoint eligibility,
queue-position proxies,
NBBO/touch snapshot features,
markout ladder (e.g., 100ms/1s/5s/30s),
parent-order deadline and residual trajectory.

Without reliable event-time sequencing, QJP and DFE quickly become noisy.
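
A sketch of the per-event record as a dataclass, with a basic sequencing check. Field names, types, and nanosecond timestamps are assumptions; the check covers only the simple decision, send, ack, then fill-or-cancel ordering and ignores partial fills followed by a cancel.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ChildOrderEvent:
    parent_id: str
    ts_decision_ns: int
    ts_send_ns: int
    ts_ack_ns: Optional[int]
    ts_fill_ns: Optional[int]
    ts_cancel_ns: Optional[int]
    side: str                         # e.g. "B" or "S"
    price: float
    size: int
    displayed: Optional[bool]         # None when the flag is unavailable
    venue: str
    order_type: str
    midpoint_eligible: bool
    queue_ahead_est: Optional[float]  # queue-position proxy
    nbbo_bid: float
    nbbo_ask: float
    markouts_bps: Dict[str, float] = field(default_factory=dict)  # e.g. {"1s": -0.4}
    parent_deadline_ns: int = 0
    parent_residual: int = 0

    def is_sequenced(self) -> bool:
        """True when the timestamps that are present are non-decreasing."""
        terminal = self.ts_fill_ns if self.ts_fill_ns is not None else self.ts_cancel_ns
        stamps = [self.ts_decision_ns, self.ts_send_ns, self.ts_ack_ns, terminal]
        present = [t for t in stamps if t is not None]
        return all(a <= b for a, b in zip(present, present[1:]))
```

Rejecting or quarantining events that fail the check before computing QJP and DFE is one way to keep those metrics from degrading.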

Calibration & monitoring

Weekly

Refit hazard correction terms.
Re-estimate branch costs by liquidity bucket and symbol cohort.
Validate calibration of passive completion probabilities.

Daily

Monitor DFE drift by symbol/venue.
Track LCC contribution to total slippage.
Alert on QJP regime frequency shifts.

Intraday guardrails

If realized DFE breaches its lower bound for N consecutive windows, force state ≥ JUMP_RISK.
If residual ratio + deadline pressure exceeds its threshold while DFE is low, escalate to SAFE.
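
The two guardrails can be sketched as a single function that returns the minimum regime they demand. The argument names, the string return values, and the normalization of `residual_ratio` and `deadline_pressure` to comparable scales are assumptions; `n` is taken to be at least 1.

```python
from typing import List


def guardrail_floor(dfe_history: List[float], dfe_lower: float, n: int,
                    residual_ratio: float, deadline_pressure: float,
                    pressure_threshold: float) -> str:
    """Return 'SAFE', 'JUMP_RISK', or 'NONE' as the floor on the regime state."""
    recent = dfe_history[-n:]
    # Rule 1: DFE below its lower bound for n consecutive windows.
    breached = len(recent) == n and all(d < dfe_lower for d in recent)
    # Rule 2: residual plus deadline pressure too high while DFE is low now.
    low_dfe_now = bool(dfe_history) and dfe_history[-1] < dfe_lower
    if low_dfe_now and residual_ratio + deadline_pressure > pressure_threshold:
        return "SAFE"
    if breached:
        return "JUMP_RISK"
    return "NONE"
```

The result acts as a floor only: the state machine may already sit in a more defensive state, in which case the guardrail changes nothing.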

Rollout plan

Shadow mode (2–3 weeks): compute SQT metrics and counterfactual decisions only.
Canary (5–10% flow): enable state-machine policy with strict kill-switch.
Expand by cohort: high-liquidity names first, then harder names.
Promotion gates:
- q95 slippage non-inferior or better,
- completion reliability stable,
- LCC reduced,
- no unacceptable turnover burst.

Rollback immediately if completion drops or tail costs widen materially.
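
The promotion gates can be encoded as explicit checks so that a canary review is mechanical. The metric keys, the non-inferiority margins, and the turnover cap below are hypothetical and would need to be set per desk.

```python
from typing import Dict, Tuple


def promotion_gates(baseline: Dict[str, float], candidate: Dict[str, float],
                    q95_margin_bps: float = 0.5,
                    completion_margin: float = 0.005,
                    turnover_cap: float = 1.2) -> Tuple[bool, Dict[str, bool]]:
    """Compare canary metrics to baseline; return (promote, per-gate results)."""
    gates = {
        "q95_non_inferior":
            candidate["q95_slippage_bps"] <= baseline["q95_slippage_bps"] + q95_margin_bps,
        "completion_stable":
            candidate["completion_rate"] >= baseline["completion_rate"] - completion_margin,
        "lcc_reduced":
            candidate["lcc_bps"] < baseline["lcc_bps"],
        "turnover_ok":
            candidate["turnover"] <= turnover_cap * baseline["turnover"],
    }
    return all(gates.values()), gates
```

Returning the per-gate results alongside the overall decision makes it clear which gate blocked a promotion.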

Common failure modes

Treating displayed spread compression as proof of passive edge.
Overfitting hidden-liquidity proxies to one venue regime.
Ignoring deadline coupling (where most damage occurs).
Using mean-cost improvement as success while q95 worsens.

Bottom line

In sub-penny and price-improved microstructure regimes, displayed queue position does not tell the full truth about priority.

If you model only visible queue dynamics, you will systematically overestimate passive fill quality and underestimate late catch-up cost.

Model SQT explicitly, control with regime states, and optimize for tail-aware completion, not just average spread capture.
