Routing Eligibility Drift Slippage Playbook
Date: 2026-03-14
Category: research
Focus: Hidden slippage from stale eligibility assumptions (session/rule/entitlement drift)
1) Why this deserves its own model
Many execution stacks treat order eligibility as a static pre-check:
- symbol is tradable
- venue is open
- side/type/TIF flags are allowed
- account permissions are valid
But in live markets, eligibility is time-varying:
- session phase flips (auction/open/close/after-hours)
- short-sale constraints or borrow gates activate
- venue-specific rule flags switch intraday
- broker risk/entitlement policies change on the fly
If the router acts on stale eligibility state, the first order wave gets rejected, then emergency fallback logic fires. The resulting cost is usually mislabeled as “market moved.”
2) Hidden cost branches
When local eligibility truth diverges from venue/broker truth, slippage leaks through four channels:
- Reject-Loop Delay Tax (RDT): time lost between first reject and valid re-submit.
- Fallback Toxicity Tax (FTT): re-routed flow lands in worse liquidity (wider spread, thinner touch, higher markout).
- Retry Amplification Tax (RAT): extra message churn (reject/retry/re-route) increases queue resets and control-plane stress.
- Opportunity Loss Tax (OLT): missed favorable windows while intent is trapped in invalid paths.
Incremental cost decomposition:
[ \Delta C_{elig} = RDT + FTT + RAT + OLT ]
3) Data contract (must-have fields)
Persist eligibility metadata per order intent:
- intent_id, parent_id, child_id
- eligibility_snapshot_id (hash of compiled rule context)
- eligibility_snapshot_age_ms
- router_decision_time, first_send_time
- first_reject_time, reject_reason_code
- first_valid_resubmit_time, first_fill_time
- attempt_count_before_valid
- initial_venue, fallback_venue
- session_phase_local, session_phase_venue
- constraint_flags (short/buy-in/auction/type/TIF/locate/etc.)
- was_reject_expected (based on current local policy model)
Without snapshot IDs and reason codes, eligibility drift is invisible in TCA.
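One way to pin the field list down is a typed record. The sketch below is illustrative only: the field names come from the list above, but the Python types, the defaults, and the `phase_mismatch` helper are assumptions, not part of the contract.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class EligibilityRecord:
    """Per-intent eligibility metadata (types and defaults are assumed)."""
    intent_id: str
    eligibility_snapshot_id: str          # hash of compiled rule context
    eligibility_snapshot_age_ms: int
    router_decision_time: float           # epoch seconds (assumed unit)
    first_send_time: float
    initial_venue: str
    session_phase_local: str
    session_phase_venue: str
    parent_id: Optional[str] = None
    child_id: Optional[str] = None
    first_reject_time: Optional[float] = None
    reject_reason_code: Optional[str] = None
    first_valid_resubmit_time: Optional[float] = None
    first_fill_time: Optional[float] = None
    attempt_count_before_valid: int = 0
    fallback_venue: Optional[str] = None
    constraint_flags: FrozenSet[str] = frozenset()
    was_reject_expected: Optional[bool] = None

    def phase_mismatch(self) -> bool:
        # Hypothetical helper: local and venue views of the session disagree.
        return self.session_phase_local != self.session_phase_venue


record = EligibilityRecord(
    intent_id="I-1",
    eligibility_snapshot_id="abc123",
    eligibility_snapshot_age_ms=850,
    router_decision_time=1000.0,
    first_send_time=1000.002,
    initial_venue="VENUE_A",
    session_phase_local="continuous",
    session_phase_venue="closing_auction",
    constraint_flags=frozenset({"short", "locate"}),
)
```

Keeping both `session_phase_local` and `session_phase_venue` on the same record is what lets TCA attribute a reject to a phase disagreement rather than to price movement.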
4) Core metrics
4.1 Eligibility Drift Rate (EDR)
Fraction of intents where first route violated true eligibility.
[ EDR = P(\text{first attempt rejected for eligibility}) ]
4.2 Reject-to-Valid Lag (RVL)
[ RVL_i = t_{\text{first\_valid\_resubmit}}^{(i)} - t_{\text{first\_reject}}^{(i)} ]
Track q50/q90/q95 by venue and instrument tier.
4.3 Stale Snapshot Ratio (SSR)
[ SSR = P(\text{eligibility\_snapshot\_age\_ms} > \tau) ]
Use per-regime thresholds (open/close windows should be stricter).
4.4 Fallback Cost Delta (FCD)
Realized slippage difference between fallback execution and counterfactual primary-path estimate.
4.5 Unexpected Reject Share (URS)
Share of rejects whose reason code was not predicted by the local eligibility compiler.
High URS means the rules model itself is stale, not just timing.
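The metrics above can be computed directly from the persisted records. The following is a minimal sketch under stated assumptions: records are plain dicts keyed by the data-contract field names, the reason codes in `ELIGIBILITY_REASONS` are hypothetical placeholders, URS is taken over all rejects, and quantiles use a simple nearest-rank rule. FCD is omitted because it needs a counterfactual primary-path estimate.

```python
import math

# Hypothetical reason codes treated as eligibility rejects.
ELIGIBILITY_REASONS = {"SESSION_CLOSED", "SHORT_RESTRICTED", "TIF_NOT_ALLOWED"}


def nearest_rank_quantile(values, q):
    """Nearest-rank quantile; returns None for an empty sample."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def drift_metrics(records, tau_ms):
    """Compute EDR, SSR, URS and q95(RVL) over a batch of intent records."""
    n = len(records)
    all_rejects = [r for r in records if r.get("reject_reason_code") is not None]
    elig_rejects = [r for r in all_rejects
                    if r["reject_reason_code"] in ELIGIBILITY_REASONS]
    # RVL is only defined once a valid resubmit has actually happened.
    rvl = [r["first_valid_resubmit_time"] - r["first_reject_time"]
           for r in elig_rejects
           if r.get("first_valid_resubmit_time") is not None]
    stale = sum(1 for r in records if r["eligibility_snapshot_age_ms"] > tau_ms)
    unexpected = sum(1 for r in all_rejects
                     if r.get("was_reject_expected") is False)
    return {
        "EDR": len(elig_rejects) / n if n else 0.0,
        "SSR": stale / n if n else 0.0,
        "URS": unexpected / len(all_rejects) if all_rejects else 0.0,
        "RVL_q95": nearest_rank_quantile(rvl, 0.95),
    }


sample = [
    {"eligibility_snapshot_age_ms": 50},
    {"eligibility_snapshot_age_ms": 500, "reject_reason_code": "SESSION_CLOSED",
     "first_reject_time": 10.0, "first_valid_resubmit_time": 10.25,
     "was_reject_expected": False},
    {"eligibility_snapshot_age_ms": 50, "reject_reason_code": "PRICE_BAND",
     "first_reject_time": 1.0, "first_valid_resubmit_time": 2.0,
     "was_reject_expected": True},
    {"eligibility_snapshot_age_ms": 300},
]
result = drift_metrics(sample, tau_ms=200)
```

In practice the same function would be applied per venue, instrument tier, and session regime, with a regime-specific `tau_ms`.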
5) Modeling approach
Use a two-layer model:
Layer A — baseline execution cost
(\hat{C}_{base}): spread, depth, volatility, queue pressure, urgency.
Layer B — eligibility drift residual
[ \hat{\epsilon}_{elig} = f(EDR, RVL, SSR, FCD, URS, \text{attempt\_count}, \text{phase\_boundary}) ]
Total forecast:
[ \hat{C}_{total} = \hat{C}_{base} + \hat{\epsilon}_{elig} ]
Train quantiles (q50/q90/q95), since drift events are sparse but tail-heavy.
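Quantile training is commonly done by minimizing the pinball (quantile) loss. The sketch below shows only that loss, as one possible objective; the playbook does not prescribe a specific learner, and the function name is an assumption.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss for quantile level q in (0, 1).

    Under-forecasts (y > prediction) are weighted by q and
    over-forecasts by (1 - q), so a high q penalizes missing the tail.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Realized cost 10 bps vs. forecast 8 bps (under-forecast) at q = 0.95
under = pinball_loss([10.0], [8.0], 0.95)
# Realized cost 8 bps vs. forecast 10 bps (over-forecast) at q = 0.95
over = pinball_loss([8.0], [10.0], 0.95)
```

At q95 the under-forecast is penalized far more heavily than an over-forecast of the same size, which is the behavior wanted when drift events are rare but expensive.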
6) State machine
- ALIGNED — rejects mostly non-eligibility; low snapshot age
- DRIFTING — eligibility rejects rising near phase/rule boundaries
- DEGRADED — repeated first-attempt rejects + growing RVL/FCD
- SAFE_COMPILE_ONLY — hard gate: no send without fresh eligibility compile
Example transitions:
- ALIGNED -> DRIFTING: EDR > 0.02 or SSR > 0.10
- DRIFTING -> DEGRADED: q95(RVL) breach + URS spike
- DEGRADED -> SAFE_COMPILE_ONLY: tail-budget burn or repeated unexpected reject bursts
- Step-down only after sustained clean window (hysteresis)
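The transitions above can be sketched as a small state machine. This is an illustration under assumptions: the EDR and SSR thresholds are the example values from the list, the remaining triggers are passed in as precomputed booleans (how a "breach" or "spike" is detected is left open), and hysteresis is modeled as a hypothetical count of consecutive clean evaluation windows.

```python
from dataclasses import dataclass
from enum import IntEnum


class RouterState(IntEnum):
    ALIGNED = 0
    DRIFTING = 1
    DEGRADED = 2
    SAFE_COMPILE_ONLY = 3


@dataclass
class Signals:
    edr: float
    ssr: float
    rvl_q95_breach: bool = False
    urs_spike: bool = False
    tail_budget_burn: bool = False
    unexpected_reject_burst: bool = False


class DriftStateMachine:
    def __init__(self, clean_windows_to_step_down=3):
        self.state = RouterState.ALIGNED
        self.clean_windows_to_step_down = clean_windows_to_step_down
        self._clean_streak = 0

    def _escalation_due(self, s):
        if self.state == RouterState.ALIGNED:
            return s.edr > 0.02 or s.ssr > 0.10
        if self.state == RouterState.DRIFTING:
            return s.rvl_q95_breach and s.urs_spike
        if self.state == RouterState.DEGRADED:
            return s.tail_budget_burn or s.unexpected_reject_burst
        return False  # SAFE_COMPILE_ONLY is the top state

    @staticmethod
    def _is_clean(s):
        flags = (s.rvl_q95_breach, s.urs_spike,
                 s.tail_budget_burn, s.unexpected_reject_burst)
        return s.edr <= 0.02 and s.ssr <= 0.10 and not any(flags)

    def update(self, s):
        """Evaluate one window: escalate one step, or step down after a clean run."""
        if self._escalation_due(s):
            self.state = RouterState(self.state + 1)
            self._clean_streak = 0
        elif self._is_clean(s):
            self._clean_streak += 1
            if (self.state > RouterState.ALIGNED
                    and self._clean_streak >= self.clean_windows_to_step_down):
                self.state = RouterState(self.state - 1)
                self._clean_streak = 0
        else:
            self._clean_streak = 0  # noisy window: hold state, restart the clock
        return self.state


sm = DriftStateMachine(clean_windows_to_step_down=2)
s1 = sm.update(Signals(edr=0.05, ssr=0.0))
s2 = sm.update(Signals(edr=0.05, ssr=0.0, rvl_q95_breach=True, urs_spike=True))
s3 = sm.update(Signals(edr=0.0, ssr=0.0))
s4 = sm.update(Signals(edr=0.0, ssr=0.0))
```

Escalation moves one state per window while step-down requires an uninterrupted clean run, so a single quiet window cannot relax controls right after a burst.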
7) Control policy by state
ALIGNED
- Normal route scoring
- Standard cache TTL for eligibility snapshots
DRIFTING
- Shorten eligibility TTL
- Recompile eligibility on boundary-sensitive intents
- Penalize venues with rising eligibility rejects
DEGRADED
- Pre-send “compile-and-validate” mandatory for all urgent orders
- Increase fallback penalty in route objective
- Cap retries before switching to deterministic safe path
SAFE_COMPILE_ONLY
- Block sends with stale snapshot
- Route only through validated venue/type combinations
- Promote completion safety over fee/rebate optimization
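The per-state controls lend themselves to a lookup table consulted before each send. The sketch below is one possible encoding; every numeric value (TTLs, penalty multipliers, retry caps) is a made-up placeholder, and the field and function names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlPolicy:
    snapshot_ttl_ms: int
    # Which intents must recompile before send:
    # "never", "boundary_sensitive", "urgent", or "always".
    compile_before_send: str
    fallback_penalty_multiplier: float
    max_retries: int
    validated_paths_only: bool


# Placeholder values for illustration only; tune per venue and regime.
POLICY_BY_STATE = {
    "ALIGNED": ControlPolicy(5000, "never", 1.0, 3, False),
    "DRIFTING": ControlPolicy(1000, "boundary_sensitive", 1.0, 3, False),
    "DEGRADED": ControlPolicy(500, "urgent", 2.0, 1, False),
    "SAFE_COMPILE_ONLY": ControlPolicy(0, "always", 2.0, 0, True),
}


def must_compile_before_send(state, urgent, boundary_sensitive, snapshot_age_ms):
    """Return True if the intent needs a fresh eligibility compile before send."""
    policy = POLICY_BY_STATE[state]
    if snapshot_age_ms > policy.snapshot_ttl_ms:
        return True  # snapshot is stale under this state's TTL
    mode = policy.compile_before_send
    if mode == "always":
        return True
    if mode == "urgent":
        return urgent
    if mode == "boundary_sensitive":
        return boundary_sensitive
    return False
```

Keeping the policy as data rather than branching logic makes it easy to review and replay: a stress run can swap in an alternative table without touching router code.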
8) Stress replay design
Replay intent streams under three perturbations:
- Boundary shock: rapid session-phase transitions
- Rule flip shock: synthetic intraday policy/entitlement change
- Reason-code ambiguity shock: delayed/coarse reject taxonomy
Evaluate:
- q50/q90/q95 implementation shortfall
- first-attempt validity rate
- retry burst frequency
- completion reliability
Promotion gate: lower q95 with no material completion regression.
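The promotion gate can be expressed as a simple predicate over replay summaries. In this sketch the dict keys and the `max_completion_drop` tolerance are assumptions; what counts as a "material" completion regression is a policy choice that the default value does not settle.

```python
def passes_promotion_gate(baseline, candidate, max_completion_drop=0.002):
    """Candidate must lower q95 shortfall without a material completion drop.

    baseline / candidate: dicts with "is_q95_bps" (q95 implementation
    shortfall in bps) and "completion_rate" (fraction of intents completed).
    """
    lower_tail = candidate["is_q95_bps"] < baseline["is_q95_bps"]
    completion_ok = (candidate["completion_rate"]
                     >= baseline["completion_rate"] - max_completion_drop)
    return lower_tail and completion_ok


baseline = {"is_q95_bps": 14.0, "completion_rate": 0.991}
better_tail = {"is_q95_bps": 11.5, "completion_rate": 0.990}
fragile = {"is_q95_bps": 9.0, "completion_rate": 0.980}
```

Here `better_tail` passes the gate, while `fragile` fails despite the lower q95 because its completion rate falls by more than the tolerance.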
9) Production guardrails
- Eligibility snapshot TTL by regime (tight near open/close/rule events)
- Unexpected-reject breaker (if URS spikes, force safe mode)
- Retry budget caps per symbol/venue
- Fallback toxicity floor (block panic reroutes into known-toxic paths)
- Rule-source freshness monitor (metadata/control-plane lag alerts)
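As an illustration of one guardrail from the list, a per-symbol/venue retry budget can be kept as a windowed counter. This is a minimal sketch: the class name, the explicit `reset_window` call, and the budget size are all assumptions, and a production version would need time-based windows and thread safety.

```python
from collections import defaultdict


class RetryBudget:
    """Caps retries per (symbol, venue) pair within one budget window."""

    def __init__(self, max_retries_per_window):
        self.max_retries_per_window = max_retries_per_window
        self._used = defaultdict(int)

    def try_consume(self, symbol, venue):
        """Consume one retry if budget remains; return False when exhausted."""
        key = (symbol, venue)
        if self._used[key] >= self.max_retries_per_window:
            return False
        self._used[key] += 1
        return True

    def reset_window(self):
        """Start a new budget window (caller decides the cadence)."""
        self._used.clear()


budget = RetryBudget(max_retries_per_window=2)
attempts = [budget.try_consume("XYZ", "VENUE_A") for _ in range(3)]
other_venue = budget.try_consume("XYZ", "VENUE_B")
```

When `try_consume` returns False, the router would switch to its deterministic safe path instead of retrying again.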
10) Minimal implementation sketch
# Refresh the snapshot when it is older than the regime's TTL
if snapshot_age_ms > ttl_for_regime(current_phase):
    recompile_eligibility(intent)

# Count eligibility first-rejects (the EDR numerator) and escalate
if first_reject_is_eligibility:
    metrics.eligibility_first_rejects += 1
    state = escalate(state)

if state >= DEGRADED:
    enforce_compile_before_send()

cost_forecast = base_cost(features) + eligibility_residual(features)

if state == SAFE_COMPILE_ONLY:
    block_stale_intents()
    route_validated_paths_only()
11) Common mistakes
- Treating eligibility metadata as static config instead of time-series state
- Using one global cache TTL across calm and boundary regimes
- Ignoring reject reason quality (coarse reason codes hide drift cause)
- Optimizing only fee/spread while retry loops quietly burn queue priority
12) Practical takeaway
Eligibility drift is a control-plane slippage source: the order is “correct in theory” but invalid in real-time rule context.
If you model reject-path costs explicitly and enforce fresh eligibility compilation in unstable windows, you can usually reduce q95 slippage without sacrificing completion stability.