Routing Eligibility Drift Slippage Playbook

Date: 2026-03-14
Category: research
Focus: Hidden slippage from stale eligibility assumptions (session/rule/entitlement drift)

1) Why this deserves its own model

Many execution stacks treat order eligibility as a static pre-check:

symbol is tradable
venue is open
side/type/TIF flags are allowed
account permissions are valid

But in live markets, eligibility is time-varying:

session phase flips (auction/open/close/after-hours)
short-sale constraints or borrow gates activate
venue-specific rule flags switch intraday
broker risk/entitlement policies change on the fly

If the router acts on stale eligibility state, the first order wave gets rejected, then emergency fallback logic fires. The resulting cost is usually mislabeled as “market moved.”

2) Hidden cost branches

When local eligibility truth diverges from venue/broker truth, slippage leaks through four channels:

Reject-Loop Delay Tax (RDT): time lost between the first reject and a valid re-submit.
Fallback Toxicity Tax (FTT): re-routed flow lands in worse liquidity (wider spread, thinner touch, higher markout).
Retry Amplification Tax (RAT): extra message churn (reject/retry/re-route) increases queue resets and control-plane stress.
Opportunity Loss Tax (OLT): favorable windows missed while intent is trapped in invalid paths.

Incremental cost decomposition:

[ \Delta C_{elig} = RDT + FTT + RAT + OLT ]

3) Data contract (must-have fields)

Persist eligibility metadata per order intent:

intent_id, parent_id, child_id
eligibility_snapshot_id (hash of compiled rule context)
eligibility_snapshot_age_ms
router_decision_time, first_send_time
first_reject_time, reject_reason_code
first_valid_resubmit_time, first_fill_time
attempt_count_before_valid
initial_venue, fallback_venue
session_phase_local, session_phase_venue
constraint_flags (short/buy-in/auction/type/TIF/locate/etc)
was_reject_expected (based on current local policy model)

Without snapshot IDs and reason codes, eligibility drift is invisible in TCA.
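
One way to pin the contract down is a typed record per intent. The sketch below is illustrative only: field names follow the list above, while the types, the epoch-millisecond time unit, the defaults, and the two helper methods (phase_mismatch, is_tca_attributable) are assumptions rather than part of the contract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EligibilityRecord:
    """One row per order intent; field names follow the data contract list."""
    intent_id: str
    parent_id: str
    child_id: str
    eligibility_snapshot_id: str          # hash of compiled rule context
    eligibility_snapshot_age_ms: int
    router_decision_time: int             # epoch ms (assumed unit)
    first_send_time: int
    initial_venue: str
    session_phase_local: str
    session_phase_venue: str
    constraint_flags: frozenset = frozenset()
    # Reject-path fields stay None when the first attempt is accepted.
    first_reject_time: Optional[int] = None
    reject_reason_code: Optional[str] = None
    first_valid_resubmit_time: Optional[int] = None
    first_fill_time: Optional[int] = None
    attempt_count_before_valid: int = 0
    fallback_venue: Optional[str] = None
    was_reject_expected: Optional[bool] = None

    def phase_mismatch(self) -> bool:
        """Local and venue views of the session phase disagree (hypothetical helper)."""
        return self.session_phase_local != self.session_phase_venue

    def is_tca_attributable(self) -> bool:
        """A reject is attributable to drift only with a snapshot ID and a reason code."""
        return bool(self.eligibility_snapshot_id) and (
            self.first_reject_time is None or self.reject_reason_code is not None
        )


# Made-up example values; venue and reason-code names are placeholders.
rec = EligibilityRecord(
    intent_id="i-1", parent_id="p-1", child_id="c-1",
    eligibility_snapshot_id="9f2c", eligibility_snapshot_age_ms=1800,
    router_decision_time=1_000, first_send_time=1_002,
    initial_venue="VENUE_A",
    session_phase_local="CONTINUOUS", session_phase_venue="CLOSING_AUCTION",
    first_reject_time=1_010, reject_reason_code="SESSION_PHASE",
    was_reject_expected=False,
)
```

Keeping the reject-path fields optional lets the same record describe clean first attempts and rejected ones, so drift rates can be computed over all intents rather than only the failures.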

4) Core metrics

4.1 Eligibility Drift Rate (EDR)

Fraction of intents whose first route violated true eligibility, i.e. whose first attempt was rejected for an eligibility reason.

[ EDR = P(\text{first attempt rejected for eligibility}) ]

4.2 Reject-to-Valid Lag (RVL)

[ RVL_i = t_{\text{first\_valid\_resubmit}}^{(i)} - t_{\text{first\_reject}}^{(i)} ]

Track q50/q90/q95 by venue and instrument tier.

4.3 Stale Snapshot Ratio (SSR)

[ SSR = P(\text{eligibility\_snapshot\_age\_ms} > \tau) ]

Use per-regime thresholds (open/close windows should be stricter).

4.4 Fallback Cost Delta (FCD)

Realized slippage difference between fallback execution and counterfactual primary-path estimate.

4.5 Unexpected Reject Share (URS)

Share of rejects whose reason code the local eligibility compiler did not predict.

High URS means the rules model itself is stale, not just the snapshot timing.
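
A minimal sketch of how EDR, SSR, URS, and the RVL quantiles could be computed from first-attempt records. The reason codes in ELIGIBILITY_CODES, the nearest-rank quantile convention, the choice of all rejects as the URS denominator, and the sample rows are assumptions made for illustration; timestamps are in milliseconds.

```python
import math

# Hypothetical reason codes treated as eligibility rejects; real codes are venue/broker specific.
ELIGIBILITY_CODES = {"SESSION_PHASE", "SHORT_RESTRICTED", "TIF_NOT_ALLOWED", "NO_ENTITLEMENT"}


def nearest_rank(values, q):
    """Nearest-rank quantile: smallest value with at least a fraction q of the data at or below it."""
    ordered = sorted(values)
    k = max(1, math.ceil(q * len(ordered)))
    return ordered[k - 1]


def drift_metrics(records, tau_ms):
    """Each record describes the first attempt of one intent."""
    n = len(records)
    rejects = [r for r in records if r["reject_reason_code"] is not None]
    elig = [r for r in rejects if r["reject_reason_code"] in ELIGIBILITY_CODES]
    stale = [r for r in records if r["eligibility_snapshot_age_ms"] > tau_ms]
    unexpected = [r for r in rejects if not r["was_reject_expected"]]
    rvl = [r["first_valid_resubmit_time"] - r["first_reject_time"]
           for r in rejects if r["first_valid_resubmit_time"] is not None]
    return {
        "EDR": len(elig) / n,
        "SSR": len(stale) / n,
        "URS": len(unexpected) / len(rejects) if rejects else 0.0,
        "RVL_q50": nearest_rank(rvl, 0.50) if rvl else None,
        "RVL_q95": nearest_rank(rvl, 0.95) if rvl else None,
    }


def rec(age, code=None, expected=None, reject_t=None, resubmit_t=None):
    """Build a sample record with only the fields the metrics need."""
    return {"eligibility_snapshot_age_ms": age, "reject_reason_code": code,
            "was_reject_expected": expected, "first_reject_time": reject_t,
            "first_valid_resubmit_time": resubmit_t}


sample = [
    rec(50),                                              # accepted first time
    rec(900, "SHORT_RESTRICTED", False, 10_000, 10_400),  # unexpected eligibility reject
    rec(100, "SESSION_PHASE", True, 20_000, 20_100),      # expected eligibility reject
    rec(1200, "PRICE_BAND", False, 30_000, 30_200),       # non-eligibility reject
]
m = drift_metrics(sample, tau_ms=500)
```

In production these would be grouped by venue, instrument tier, and session regime before any thresholds are applied, with a per-regime tau_ms.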

5) Modeling approach

Use a two-layer model:

Layer A — baseline execution cost

(\hat{C}_{base}): spread, depth, volatility, queue pressure, urgency.

Layer B — eligibility drift residual

[ \hat{\epsilon}_{elig} = f(EDR, RVL, SSR, FCD, URS, \text{attempt\_count}, \text{phase\_boundary}) ]

Total forecast:

[ \hat{C}_{total} = \hat{C}_{base} + \hat{\epsilon}_{elig} ]

Train quantile models (q50/q90/q95) rather than a single mean, since drift events are sparse but tail-heavy and a mean fit would understate the tail.
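
Quantile targets are usually fit by minimizing the pinball (quantile) loss. The sketch below shows that loss in plain Python and, as a toy stand-in for a real model, picks the best constant forecast from a candidate list. The helper names and the cost values are illustrative assumptions, not measured data.

```python
def pinball_loss(actual, predicted, q):
    """Mean quantile (pinball) loss for target quantile q in (0, 1)."""
    total = 0.0
    for y, p in zip(actual, predicted):
        diff = y - p
        # Under-prediction costs q per unit; over-prediction costs (1 - q) per unit.
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(actual)


def best_constant(actual, q, candidates):
    """Pick the candidate constant forecast with the lowest pinball loss."""
    return min(candidates, key=lambda c: pinball_loss(actual, [c] * len(actual), q))


# Illustrative costs in bps: mostly small, with three drift-driven tail events.
costs = [1.0] * 22 + [15.0, 20.0, 40.0]
candidates = sorted(set(costs))
q50_fit = best_constant(costs, 0.50, candidates)
q95_fit = best_constant(costs, 0.95, candidates)
```

With sparse tail events the q50 fit stays at the typical cost while the q95 fit moves into the tail, which is the behavior the eligibility residual needs to capture.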

6) State machine

ALIGNED — rejects mostly non-eligibility; low snapshot age
DRIFTING — eligibility rejects rising near phase/rule boundaries
DEGRADED — repeated first-attempt rejects + growing RVL/FCD
SAFE_COMPILE_ONLY — hard gate: no send without fresh eligibility compile

Example transitions:

ALIGNED -> DRIFTING: EDR > 0.02 or SSR > 0.10
DRIFTING -> DEGRADED: q95(RVL) breach + URS spike
DEGRADED -> SAFE_COMPILE_ONLY: tail-budget burn or repeated unexpected reject bursts
Step-down only after sustained clean window (hysteresis)
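
The transitions above can be sketched as a small state machine evaluated once per monitoring window. The EDR and SSR thresholds come from the example transitions; everything else is an assumption for illustration: the boolean signal names, the definition of a clean window, stepping down one level at a time, and the default of three clean windows.

```python
from dataclasses import dataclass
from enum import IntEnum


class DriftState(IntEnum):
    ALIGNED = 0
    DRIFTING = 1
    DEGRADED = 2
    SAFE_COMPILE_ONLY = 3


@dataclass
class WindowSignals:
    edr: float = 0.0
    ssr: float = 0.0
    rvl_q95_breach: bool = False
    urs_spike: bool = False
    tail_budget_burn: bool = False
    repeated_unexpected_bursts: bool = False

    def is_clean(self) -> bool:
        # Assumed definition of a clean window: below entry thresholds, no flags raised.
        return self.edr <= 0.02 and self.ssr <= 0.10 and not (
            self.rvl_q95_breach or self.urs_spike
            or self.tail_budget_burn or self.repeated_unexpected_bursts)


class DriftStateMachine:
    def __init__(self, clean_windows_to_step_down: int = 3):
        self.state = DriftState.ALIGNED
        self.clean_needed = clean_windows_to_step_down
        self.clean_streak = 0

    def _should_escalate(self, s: WindowSignals) -> bool:
        if self.state == DriftState.ALIGNED:
            return s.edr > 0.02 or s.ssr > 0.10
        if self.state == DriftState.DRIFTING:
            return s.rvl_q95_breach and s.urs_spike
        if self.state == DriftState.DEGRADED:
            return s.tail_budget_burn or s.repeated_unexpected_bursts
        return False  # SAFE_COMPILE_ONLY is the top state

    def update(self, s: WindowSignals) -> DriftState:
        if self._should_escalate(s):
            self.state = DriftState(self.state + 1)
            self.clean_streak = 0
        elif s.is_clean():
            self.clean_streak += 1
            if self.clean_streak >= self.clean_needed and self.state > DriftState.ALIGNED:
                self.state = DriftState(self.state - 1)  # step down one level at a time
                self.clean_streak = 0
        else:
            self.clean_streak = 0  # noisy but not escalating: restart the clean count
        return self.state


sm = DriftStateMachine(clean_windows_to_step_down=2)
history = [
    sm.update(WindowSignals(edr=0.05)),
    sm.update(WindowSignals(edr=0.05, rvl_q95_breach=True, urs_spike=True)),
    sm.update(WindowSignals()),
    sm.update(WindowSignals()),
]
```

Escalation is immediate while step-down requires a streak of clean windows, which keeps the router from flapping between states around a phase boundary.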

7) Control policy by state

ALIGNED

Normal route scoring
Standard cache TTL for eligibility snapshots

DRIFTING

Shorten eligibility TTL
Recompile eligibility on boundary-sensitive intents
Penalize venues with rising eligibility rejects

DEGRADED

Pre-send “compile-and-validate” mandatory for all urgent orders
Increase fallback penalty in route objective
Cap retries before switching to deterministic safe path

SAFE_COMPILE_ONLY

Block sends with stale snapshot
Route only through validated venue/type combinations
Promote completion safety over fee/rebate optimization
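
One way to keep these per-state controls auditable is a single policy table that the router consults before each send. The sketch below assumes controls are cumulative as the state escalates. Every number in it is a placeholder to be calibrated, and the names StatePolicy, compile_scope, and must_recompile are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatePolicy:
    snapshot_ttl_ms: int         # max snapshot age before a recompile is forced
    compile_scope: str           # "ttl_only" | "boundary" | "urgent" | "all"
    fallback_penalty: float      # multiplier on fallback cost in the route objective
    max_retries: int             # retries allowed before the deterministic safe path
    validated_paths_only: bool


# All numbers are placeholders, not recommendations.
POLICY = {
    "ALIGNED":           StatePolicy(5_000, "ttl_only", 1.0, 3, False),
    "DRIFTING":          StatePolicy(1_000, "boundary", 1.0, 3, False),
    "DEGRADED":          StatePolicy(1_000, "urgent",   2.0, 1, False),
    "SAFE_COMPILE_ONLY": StatePolicy(1_000, "all",      2.0, 1, True),
}


def must_recompile(state, snapshot_age_ms, *, urgent=False, boundary_sensitive=False):
    """Decide whether eligibility must be recompiled before this send."""
    p = POLICY[state]
    if snapshot_age_ms > p.snapshot_ttl_ms or p.compile_scope == "all":
        return True
    if p.compile_scope == "urgent":
        # Assumed cumulative: DEGRADED keeps the DRIFTING rule for boundary-sensitive intents.
        return urgent or boundary_sensitive
    if p.compile_scope == "boundary":
        return boundary_sensitive
    return False
```

For example, must_recompile("DRIFTING", 100, boundary_sensitive=True) forces a recompile even though the snapshot is within TTL, while the same call in ALIGNED does not.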

8) Stress replay design

Replay intent streams under three perturbations:

Boundary shock: rapid session-phase transitions
Rule flip shock: synthetic intraday policy/entitlement change
Reason-code ambiguity shock: delayed/coarse reject taxonomy

Evaluate:

q50/q90/q95 implementation shortfall
first-attempt validity rate
retry burst frequency
completion reliability

Promotion gate: lower q95 with no material completion regression.
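
The promotion gate can be written as an explicit check over replay summaries. This is a sketch: the dictionary keys, the sample numbers, and the 0.5 percentage-point completion tolerance are assumptions, since "material" regression has to be defined per desk.

```python
def passes_promotion_gate(baseline, candidate, max_completion_drop=0.005):
    """True when the candidate lowers q95 shortfall without a material completion regression.

    max_completion_drop is an assumed tolerance: the largest acceptable
    absolute drop in completion rate relative to the baseline.
    """
    improves_tail = candidate["q95_shortfall_bps"] < baseline["q95_shortfall_bps"]
    completion_ok = (baseline["completion_rate"] - candidate["completion_rate"]
                     <= max_completion_drop)
    return improves_tail and completion_ok


# Made-up replay summaries for illustration.
baseline = {"q95_shortfall_bps": 18.0, "completion_rate": 0.990}
good = {"q95_shortfall_bps": 14.5, "completion_rate": 0.988}
bad = {"q95_shortfall_bps": 14.5, "completion_rate": 0.970}

promote_good = passes_promotion_gate(baseline, good)
promote_bad = passes_promotion_gate(baseline, bad)
```

The second candidate improves the tail by the same amount but is rejected because it buys that improvement with incomplete orders.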

9) Production guardrails

Eligibility snapshot TTL by regime (tight near open/close/rule events)
Unexpected-reject breaker (if URS spikes, force safe mode)
Retry budget caps per symbol/venue
Fallback toxicity floor (block panic reroutes into known-toxic paths)
Rule-source freshness monitor (metadata/control-plane lag alerts)
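
As one example of these guardrails, a retry budget cap per symbol/venue can be sketched as a sliding-window counter. The class name, the millisecond clock, and the window semantics are assumptions, and a production version would also need thread safety and eviction of idle keys.

```python
from collections import defaultdict, deque


class RetryBudget:
    """Sliding-window cap on retries per (symbol, venue)."""

    def __init__(self, max_retries: int, window_ms: int):
        self.max_retries = max_retries
        self.window_ms = window_ms
        self._sent = defaultdict(deque)   # (symbol, venue) -> retry timestamps in ms

    def allow(self, symbol: str, venue: str, now_ms: int) -> bool:
        stamps = self._sent[(symbol, venue)]
        # Drop retries that have aged out of the window.
        while stamps and stamps[0] <= now_ms - self.window_ms:
            stamps.popleft()
        if len(stamps) >= self.max_retries:
            return False                  # budget exhausted: caller should take the safe path
        stamps.append(now_ms)
        return True


budget = RetryBudget(max_retries=2, window_ms=1_000)
decisions = [budget.allow("XYZ", "VENUE_A", t) for t in (0, 100, 200, 1_200)]
other_key = budget.allow("XYZ", "VENUE_B", 200)
```

Here the third retry inside the window is refused, the budget recovers once the earlier retries age out, and a different venue for the same symbol has its own budget.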

10) Minimal implementation sketch

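# Refresh the snapshot when it is older than the TTL for the current regime.
if snapshot_age_ms > ttl_for_regime(current_phase):
    recompile_eligibility(intent)

if first_reject_is_eligibility:
    # EDR is a rate: count the event here, divide by total intents at report time.
    metrics.eligibility_first_rejects += 1
    state = escalate(state)
    # Assumes states are ordered ALIGNED < DRIFTING < DEGRADED < SAFE_COMPILE_ONLY.
    if state >= DEGRADED:
        enforce_compile_before_send()

# Two-layer forecast: baseline cost plus eligibility drift residual.
cost_forecast = base_cost(features) + eligibility_residual(features)

if state == SAFE_COMPILE_ONLY:
    block_stale_intents()
    route_validated_paths_only()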

11) Common mistakes

Treating eligibility metadata as static config instead of time-series state
Using one global cache TTL across calm and boundary regimes
Ignoring reject reason quality (coarse reason codes hide drift cause)
Optimizing only fee/spread while retry loops quietly burn queue priority

12) Practical takeaway

Eligibility drift is a control-plane slippage source: the order is “correct in theory” but invalid in real-time rule context.

If you model reject-path costs explicitly and enforce fresh eligibility compilation in unstable windows, you can usually reduce q95 slippage without sacrificing completion stability.