Peg Repricing Latency and the Step-Behind Tax

Date: 2026-03-06
Category: research (execution / slippage modeling)

Why this playbook exists

Pegged orders (primary-peg, market-peg, midpoint-peg) look elegant in calm tapes: less manual repricing, lower spread cost, cleaner queue behavior. In stress, they leak bps through a different channel:

reference price moves,
your peg update arrives late (or is rejected),
you either get filled at a stale level (adverse selection) or miss the fill and chase,
both paths create hidden implementation shortfall.

I call this hidden leak the Step-Behind Tax (SBT).

Core failure mode

For a buy child order:

NBBO midprice lifts from m_t to m_{t+1}.
Pegged order should reprice up.
Reprice is delayed (Δreprice) by gateway latency, exchange throttling, or a reject queue.
During delay window, either:
- you fill against stale displayed liquidity and post-fill markout turns negative, or
- you do not fill, then cross later at worse levels.

Symmetric for sell side.

The desk typically sees this as “normal volatility slippage”. It is often control-plane lag + microstructure interaction, not just volatility.

Data contract (minimum)

At child-order granularity:

parent_id, child_id, symbol, side, qty
venue, order_type, peg_type (primary/market/midpoint)
submit_ts, ack_ts, replace_send_ts, replace_ack_ts
replace_reject_code, reject_ts (if any)
nbbo_bid, nbbo_ask, mid, microprice sampled at 1-10ms
fill_ts, fill_px, fill_qty
fee/rebate fields
optional: queue estimate, odd-lot inside touch, venue-specific top-of-book depth

Without precise replace_send_ts and replace_ack_ts, this entire risk class becomes invisible.

Metrics that expose SBT

1) Peg Reprice Lag (PRL)

[ PRL = t_{replace_ack} - t_{ref_move} ]

where t_ref_move is the first timestamp at which the peg reference (e.g., NBBO/mid) changed by at least one tick.

Track p50/p90/p95 by symbol, venue, and regime.
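
A minimal sketch of the PRL computation and the percentile tracking, assuming nanosecond integer timestamps and a nearest-rank percentile convention. Both are my assumptions, and the function names are hypothetical; swap in whatever your tick store uses.

```python
import math
from typing import Sequence


def peg_reprice_lag_ms(ref_move_ts_ns: int, replace_ack_ts_ns: int) -> float:
    """PRL in milliseconds: replace ack time minus first reference-move time."""
    return (replace_ack_ts_ns - ref_move_ts_ns) / 1e6


def percentile(values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile (assumed convention), q in (0, 100]."""
    if not values:
        raise ValueError("no PRL samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


# Illustrative PRL samples (ms) for one symbol/venue/regime bucket.
lags_ms = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 9.0, 12.0, 18.0, 25.0]
prl_p50 = percentile(lags_ms, 50)
prl_p90 = percentile(lags_ms, 90)
prl_p95 = percentile(lags_ms, 95)
```

In practice the samples would be grouped by (symbol, venue, regime) before taking percentiles; the flat list here only shows the arithmetic.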

2) Effective Stale Window (ESW)

[ ESW = \max(0, t_{effective_new_price} - t_{ref_move}) ]

t_effective_new_price is the time at which the matching engine has accepted the new price (not when the router emitted the replace).

3) Step-Behind Fill Share (SBFS)

[ SBFS = \frac{\sum \text{fill_qty where fill during ESW}}{\sum \text{total fill_qty}} ]

High SBFS means your fills are concentrated in stale windows.
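
A rough sketch of ESW and SBFS over labeled stale windows. The StaleWindow and Fill types are hypothetical, timestamps are plain integers in one common unit, and I assume a fill counts as stale when it lands in the half-open interval [t_ref_move, t_effective_new_price).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StaleWindow:
    start_ts: int  # t_ref_move
    end_ts: int    # t_effective_new_price

    @property
    def esw(self) -> int:
        """Effective Stale Window, floored at zero."""
        return max(0, self.end_ts - self.start_ts)


@dataclass(frozen=True)
class Fill:
    ts: int
    qty: float


def step_behind_fill_share(fills: Sequence[Fill],
                           windows: Sequence[StaleWindow]) -> float:
    """Share of filled quantity that executed inside any stale window."""
    total = sum(f.qty for f in fills)
    if total == 0:
        return 0.0
    stale = sum(
        f.qty for f in fills
        if any(w.start_ts <= f.ts < w.end_ts for w in windows)
    )
    return stale / total


# Illustrative data: the second window has zero ESW (new price effective
# before the reference move was stamped), so it can never capture a fill.
windows = [StaleWindow(100, 112), StaleWindow(300, 295)]
fills = [Fill(105, 200), Fill(150, 300), Fill(298, 500)]
sbfs = step_behind_fill_share(fills, windows)
```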

4) Step-Behind Tax (SBT, bps)

For buys: [ SBT = 10^4 \cdot \frac{\sum q_i (p_i - p^{ref}_{i})}{\sum q_i p^{ref}_{i}} ]

For sells, flip the sign of the numerator, i.e. use (p_ref - p_i) in place of (p_i - p_ref).
p_ref should be a contemporaneous fair reference (mid or microprice at the fill instant), not the arrival price alone.
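
A small sketch of the SBT calculation for both sides. The tuple layout (qty, fill_px, ref_px) and the function name are assumptions for illustration; which fills to include (all fills, or only those inside the ESW) is left to the caller.

```python
from typing import Iterable, Tuple


def step_behind_tax_bps(fills: Iterable[Tuple[float, float, float]],
                        side: str) -> float:
    """SBT in bps from (qty, fill_px, ref_px) tuples.

    ref_px is the contemporaneous reference (mid or microprice at fill time).
    Positive means the fills were worse than the reference.
    """
    if side not in ("buy", "sell"):
        raise ValueError(f"unknown side: {side!r}")
    sign = 1.0 if side == "buy" else -1.0
    rows = list(fills)
    numerator = sum(sign * q * (p - ref) for q, p, ref in rows)
    denominator = sum(q * ref for q, _, ref in rows)
    if denominator == 0:
        raise ValueError("no filled notional")
    return 1e4 * numerator / denominator


# Illustrative: 100 shares paid one cent over reference, 300 at reference.
example_fills = [(100, 10.01, 10.00), (300, 10.00, 10.00)]
sbt_buy = step_behind_tax_bps(example_fills, "buy")    # about 2.5 bps
sbt_sell = step_behind_tax_bps(example_fills, "sell")  # about -2.5 bps
```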

5) Reprice Reject Loop Rate (RRLR)

[ RRLR = \frac{#(replace_rejects)}{#(replace_attempts)} ]

Segment by reject reason: throttle, price-band, invalid-state, queue-change race.
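
A sketch of RRLR with the per-reason breakdown, assuming one entry per replace attempt carrying replace_reject_code (None when the replace was accepted). The reason strings are illustrative labels, not venue codes.

```python
from collections import Counter
from typing import Dict, Optional, Sequence, Tuple


def reprice_reject_loop_rate(
    reject_codes: Sequence[Optional[str]],
) -> Tuple[float, Dict[str, float]]:
    """Return (overall RRLR, per-reason rate), each as a fraction of attempts."""
    attempts = len(reject_codes)
    if attempts == 0:
        return 0.0, {}
    rejects = Counter(code for code in reject_codes if code is not None)
    overall = sum(rejects.values()) / attempts
    by_reason = {reason: n / attempts for reason, n in rejects.items()}
    return overall, by_reason


# Illustrative: 20 replace attempts, 3 throttle rejects, 1 price-band reject.
codes = [None] * 16 + ["throttle"] * 3 + ["price-band"]
rrlr, rrlr_by_reason = reprice_reject_loop_rate(codes)
```

Per-reason rates share the same denominator as the overall rate, so they sum to RRLR.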

6) Missed-Peg Opportunity Cost (MPOC)

No fill during stale window, then later aggressive completion:

[ MPOC = \text{AggressiveCatchupCost} - \text{CounterfactualPegCost} ]

Estimate counterfactual with replay simulator or conservative depth model.

Modeling blueprint (branch-aware)

Model net child-order cost as branch mixture:

[ C = \pi_{stale-fill} C_{stale-fill} + \pi_{stale-miss} C_{stale-miss} + \pi_{clean} C_{clean} ]

Where probabilities come from a multinomial/competing-risks model.
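
To make the mixture concrete, a worked sketch with made-up numbers. The probabilities and per-branch costs below are purely illustrative, not calibrated values, and the function name is hypothetical.

```python
from typing import Dict


def expected_child_cost_bps(probs: Dict[str, float],
                            costs_bps: Dict[str, float]) -> float:
    """Branch-mixture cost: sum over branches of pi_branch * C_branch."""
    if set(probs) != set(costs_bps):
        raise ValueError("branches in probs and costs_bps must match")
    if abs(sum(probs.values()) - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return sum(probs[b] * costs_bps[b] for b in probs)


# Illustrative inputs only.
branch_probs = {"stale_fill": 0.2, "stale_miss": 0.1, "clean": 0.7}
branch_costs = {"stale_fill": 4.0, "stale_miss": 9.0, "clean": 1.0}

# 0.2 * 4.0 + 0.1 * 9.0 + 0.7 * 1.0 = 2.4 bps
expected_cost = expected_child_cost_bps(branch_probs, branch_costs)
```

Even with a 70% clean share, the two stale branches contribute 1.7 of the 2.4 bps in this toy case, which is the reason to model them separately rather than fold them into one average.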

Branch A: stale-fill adverse selection

Predict markout-conditioned cost for fills inside ESW.

Features:

PRL, ESW
local volatility (rv_100ms, rv_1s)
spread state, depth imbalance, cancel intensity
queue position proxy
venue micro-latency and reject burst flags

Model: quantile regression (q50/q90/q95) or distributional head.

Branch B: stale-miss then catch-up

Hazard/survival model for fill miss during stale window + expected catch-up cost.

Features:

fill hazard in stale vs fresh state
urgency remaining (time-to-deadline)
residual size fraction
event regime (open/close/news/luld proximity)

Branch C: clean peg tracking

Baseline peg behavior when PRL low and rejects low. Used as reference regime.

Regime state machine

Use explicit operating states with hysteresis:

SYNCED
- PRL_p95 <= 8ms
- RRLR <= 0.5%
- normal peg usage
LAGGING
- 8ms < PRL_p95 <= 20ms or rising reject bursts
- reduce passive size, shorten TTL, tighten cancel/replace pacing
DISLOCATED
- PRL_p95 > 20ms or RRLR > 2% or SBT burn-rate breach
- degrade from peg to explicit limit bands or controlled IOC slices
SAFE
- repeated gate breaches, uncertain position, or venue instability
- hard cap participation, optional pause/quarantine venue

Require stronger evidence to leave DISLOCATED than to enter it (anti-flap hysteresis).
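
One way to sketch the state machine with anti-flap hysteresis. The PRL and RRLR thresholds come from the list above; everything else is an assumption: RRLR between 0.5% and 2% stands in for "rising reject bursts", SAFE is driven by an external flag, and leaving DISLOCATED needs three consecutive calm observations (a hypothetical default).

```python
from enum import Enum


class PegState(Enum):
    SYNCED = "SYNCED"
    LAGGING = "LAGGING"
    DISLOCATED = "DISLOCATED"
    SAFE = "SAFE"


def raw_state(prl_p95_ms: float, rrlr: float,
              sbt_breach: bool = False, safe_trigger: bool = False) -> PegState:
    """Map current metrics to a state with no memory (rrlr is a fraction)."""
    if safe_trigger:
        return PegState.SAFE
    if prl_p95_ms > 20 or rrlr > 0.02 or sbt_breach:
        return PegState.DISLOCATED
    if prl_p95_ms > 8 or rrlr > 0.005:
        return PegState.LAGGING
    return PegState.SYNCED


class PegRegime:
    """Stateful wrapper: entering DISLOCATED is immediate, leaving is slow."""

    def __init__(self, exit_confirmations: int = 3) -> None:
        self.state = PegState.SYNCED
        self.exit_confirmations = exit_confirmations
        self._calm_streak = 0

    def update(self, prl_p95_ms: float, rrlr: float,
               sbt_breach: bool = False, safe_trigger: bool = False) -> PegState:
        target = raw_state(prl_p95_ms, rrlr, sbt_breach, safe_trigger)
        calmer = target in (PegState.SYNCED, PegState.LAGGING)
        if self.state is PegState.DISLOCATED and calmer:
            # Hold DISLOCATED until enough consecutive calm observations.
            self._calm_streak += 1
            if self._calm_streak < self.exit_confirmations:
                return self.state
        self._calm_streak = 0
        self.state = target
        return self.state
```

The release policy for SAFE (manual sign-off, timer, or similar) is deliberately left out; in this sketch SAFE clears as soon as the external flag does.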

Execution controls

Control 1: Peg TTL guard

Cancel pegged child if no effective reprice ack within TTL.

liquid names: TTL 10-20ms
thin names: TTL 20-50ms (avoid over-churn)

Control 2: Replace pacing with reject-aware backoff

Avoid self-induced reject storms:

dynamic min inter-replace interval
jittered backoff after throttle reject
max replace attempts per second by venue
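
A sketch of the jittered backoff piece, using capped exponential growth with uniform jitter. The base and cap values are hypothetical placeholders, not venue limits; the per-venue attempts-per-second cap would sit alongside this and is not shown.

```python
import random
from typing import Optional


def next_replace_delay_ms(consecutive_throttle_rejects: int,
                          base_ms: float = 2.0,
                          cap_ms: float = 50.0,
                          rng: Optional[random.Random] = None) -> float:
    """Minimum wait before the next replace, in milliseconds.

    With no recent throttle rejects the base interval applies. After n
    consecutive throttle rejects the delay is drawn uniformly from
    [base_ms, min(cap_ms, base_ms * 2**n)], so retries from many child
    orders do not re-synchronize into another burst.
    """
    if consecutive_throttle_rejects <= 0:
        return base_ms
    ceiling = min(cap_ms, base_ms * 2 ** consecutive_throttle_rejects)
    source = rng if rng is not None else random
    return source.uniform(base_ms, ceiling)


# Usage: reset the reject counter once a replace is acked.
seeded = random.Random(7)
delays = [next_replace_delay_ms(n, rng=seeded) for n in range(1, 10)]
```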

Control 3: Peg mode switch

midpoint peg -> primary peg -> explicit limit ladder
switch based on PRL, RRLR, and live SBT burn-rate

Control 4: Residual urgency split

When stale-miss risk rises:

small protective aggression slice (to prevent deadline blow-up)
retain passive remainder only if queue/fill hazard still favorable

Control 5: Venue quarantine

If one venue shows persistent high PRL + reject loop, downweight or temporarily quarantine.

Calibration workflow

Build child-order event tape with microsecond ordering where possible.
Label stale windows and branch outcomes.
Fit branch probabilities + branch cost models.
Backtest with replay (include reject mechanics and replace latency).
Validate by regime (open, midday, close, event windows).
Promote only if tail improvements hold out-of-sample.

Promotion gates (shadow -> canary -> live)

Minimum pass criteria (example):

q95 net slippage improvement >= 6 bps in target regime
SBT mean reduced >= 20%
underfill increase <= 1.5 pp
RRLR not worse than baseline by > 0.4 pp
no increase in kill-switch events

Rollback triggers:

2 consecutive windows with q95 breach > +8 bps vs control
RRLR spike > 3% for 5+ minutes
unexpected SAFE transitions above incident threshold
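
The example pass criteria can be encoded as a simple gate check. This is a sketch: the CanaryResult fields are hypothetical names, and the thresholds are the example values listed above, not recommendations.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CanaryResult:
    q95_improvement_bps: float      # q95 net slippage improvement vs control
    sbt_mean_reduction_pct: float   # reduction in mean SBT, percent
    underfill_increase_pp: float    # underfill increase, percentage points
    rrlr_delta_pp: float            # RRLR minus baseline, percentage points
    kill_switch_delta: int          # change in kill-switch event count


def failed_gates(r: CanaryResult) -> List[str]:
    """Return the names of the promotion gates that did not pass."""
    checks = {
        "q95_slippage": r.q95_improvement_bps >= 6.0,
        "sbt_mean": r.sbt_mean_reduction_pct >= 20.0,
        "underfill": r.underfill_increase_pp <= 1.5,
        "rrlr": r.rrlr_delta_pp <= 0.4,
        "kill_switch": r.kill_switch_delta <= 0,
    }
    return [name for name, ok in checks.items() if not ok]


passing = CanaryResult(6.5, 25.0, 1.0, 0.2, 0)
failing = CanaryResult(6.5, 15.0, 1.0, 0.5, 0)
```

Promotion would proceed only when failed_gates returns an empty list; rollback triggers need windowed history and are better handled in the monitoring layer.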

Common false conclusions

"Peg is broken"
Often false. Broken component may be replace pipeline or venue-specific throttling policy.
"Volatility day, nothing to do"
Incomplete. PRL and reject loops are controllable; not all cost is exogenous.
"More replaces = better tracking"
Can invert into throttle loops and worse ESW.
"Arrival benchmark says fine"
Arrival-only hides stale-window markout damage. Need multi-horizon markout and branch attribution.

Minimal pseudo-policy

if state == SYNCED:
  use peg_default
elif state == LAGGING:
  reduce child_size
  set peg_ttl=tight
  increase replace_spacing
elif state == DISLOCATED:
  switch peg->explicit_limit_band
  reserve urgency_slice
  downweight high-RRLR venues
elif state == SAFE:
  cap participation hard
  pause affected venue(s) if needed

if SBT_burn_rate > threshold and residual_time_short:
  controlled_catchup()

Desk-level takeaway

Pegged execution is not “set and forget.”
In real markets, peg quality = reference quality × reprice latency × reject mechanics.

If you do not model stale-window branches explicitly, Step-Behind Tax will appear as random noise and keep charging you every volatile session.