Venue Hop Tax: Cross-Venue Reroute Slippage Modeling Playbook
Date: 2026-03-07
Category: research (execution / slippage modeling)
Why this playbook exists
Most fragmented-market execution stacks optimize each child order at send time, then quietly assume venue choice is stable. In production, venue choice often flips mid-flight due to:
- quote fade or reject bursts,
- gateway throttling / venue micro-outages,
- sudden toxicity changes,
- policy constraints (auction locks, short-sale states, maker/taker mode switches).
Every forced reroute pays a hidden cost I call Venue Hop Tax (VHT):
- you lose queue age accumulated at the original venue,
- you re-enter at a worse queue rank (or wider spread),
- you add decision + routing latency,
- urgency often increases after the first miss.
This tax is usually scattered across logs and blamed on “market noise.” It is modelable and controllable.
Core failure mode
For a buy parent order in fragmented lit venues:
- Child posted on Venue A (good queue position building).
- Local signal sees fading touch / rising reject risk.
- Router cancels and hops to Venue B.
- Venue B queue is already crowded; fill hazard drops.
- Remaining quantity now has less time, so controller increases aggression.
- Later fills print through worse levels than if the strategy had stayed or crossed earlier in a controlled way.
This is not “one bad decision.” It is a branching path-dependent cost from repeated queue resets + latency accumulation.
Data contract (minimum)
At child-order granularity:
- parent_id, child_id, symbol, side, qty
- venue_from, venue_to, hop_reason (fade/reject/throttle/toxicity/policy/manual)
- post_ts, cancel_send_ts, cancel_ack_ts, new_send_ts, new_ack_ts
- queue_estimate_before, queue_estimate_after (or proxy rank buckets)
- spread_before, spread_after, depth_topk_before/after
- markout_5s, markout_30s, fill_px, arrival_px, decision_px
- venue_latency_ms, throttle_state, reject_code, toxicity_score
- time_to_deadline, remaining_parent_qty, urgency_state
Without venue_from -> venue_to transitions and cancel/ack timestamps, VHT cannot be measured.
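As a rough illustration, the transition and timestamp part of the contract could be carried in a small record with a measurability check. This is a sketch only: the field names come from the list above, while the `HopEvent` class, the `is_measurable` helper, and the epoch-nanosecond timestamp unit are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Standardized hop reasons, taken from the data contract.
HOP_REASONS = {"fade", "reject", "throttle", "toxicity", "policy", "manual"}

@dataclass(frozen=True)
class HopEvent:
    """Hypothetical record holding the fields VHT measurement depends on."""
    parent_id: str
    child_id: str
    venue_from: str
    venue_to: str
    hop_reason: str
    cancel_send_ts: Optional[int]  # assumed unit: epoch nanoseconds
    cancel_ack_ts: Optional[int]
    new_send_ts: Optional[int]
    new_ack_ts: Optional[int]

def is_measurable(ev: HopEvent) -> bool:
    """True if the event has a real venue transition and a full set of clocks."""
    if not ev.venue_from or not ev.venue_to or ev.venue_from == ev.venue_to:
        return False
    if ev.hop_reason not in HOP_REASONS:
        return False
    stamps = (ev.cancel_send_ts, ev.cancel_ack_ts, ev.new_send_ts, ev.new_ack_ts)
    if any(t is None for t in stamps):
        return False
    # Each ack must not precede its own send.
    return ev.cancel_send_ts <= ev.cancel_ack_ts and ev.new_send_ts <= ev.new_ack_ts
```

Events that fail the check are better counted and reported than silently dropped, since a high failure share means the VHT estimate covers only part of the flow.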
Metrics that expose reroute cost
1) Hop Rate (HR)
[ HR = \frac{\#\{\text{child orders that changed venue at least once}\}}{\#\{\text{all child orders}\}} ]
Track by symbol × time-of-day × volatility regime.
2) Queue Reset Loss (QRL)
[ QRL = E\left[\Delta rank\right],\quad \Delta rank = rank_{after} - rank_{counterfactual,stay} ]
If direct rank is unavailable, use expected fill-hazard drop as a proxy.
3) Hop Latency Drag (HLD)
[ HLD = t_{\text{new\_ack}} - t_{\text{cancel\_send}} ]
This captures dead time where neither old nor new venue is fill-capable.
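A minimal sketch of computing HLD per hop and summarizing it, assuming timestamps are epoch nanoseconds on synchronized clocks. The function names and the nearest-rank p95 are illustrative choices, not part of the playbook.

```python
import math

def hop_latency_drag_ms(cancel_send_ts_ns, new_ack_ts_ns):
    """HLD for one hop, in milliseconds (timestamps assumed in nanoseconds)."""
    if new_ack_ts_ns < cancel_send_ts_ns:
        raise ValueError("new_ack precedes cancel_send; check clock sync")
    return (new_ack_ts_ns - cancel_send_ts_ns) / 1e6

def summarize_hld(pairs):
    """pairs: iterable of (cancel_send_ts_ns, new_ack_ts_ns) tuples."""
    drags = sorted(hop_latency_drag_ms(c, a) for c, a in pairs)
    if not drags:
        raise ValueError("no hops to summarize")
    # Nearest-rank percentile: smallest value with at least 95% of data at or below it.
    p95_index = math.ceil(0.95 * len(drags)) - 1
    return {"mean_ms": sum(drags) / len(drags), "p95_ms": drags[p95_index]}
```

Reporting the tail alongside the mean matters here because throttle and outage hops tend to produce a few very long dead windows.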
4) Venue Hop Tax (VHT, bps)
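[ VHT = 10^4 \cdot \frac{\sum_i q_i\,(p_i^{\text{actual}} - p_i^{\text{cf,no\_hop}})}{\sum_i q_i\,p_i^{\text{cf,no\_hop}}} ]
The counterfactual keeps the same constraints but disallows non-essential hops. As written, a higher fill price counts as a cost, which matches the buy-side example; flip the sign for sells.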
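The VHT formula is a quantity-weighted price difference scaled to basis points. The sketch below assumes the counterfactual no-hop prices already exist (for example from replay). The `side_sign` parameter is an addition for the example so that sells also report cost as a positive number.

```python
def venue_hop_tax_bps(fills, side_sign=1):
    """fills: iterable of (qty, actual_px, counterfactual_no_hop_px).

    side_sign is +1 for buys and -1 for sells (an assumption added here).
    """
    fills = list(fills)
    cost = sum(q * (p_actual - p_cf) for q, p_actual, p_cf in fills)
    notional = sum(q * p_cf for q, _, p_cf in fills)
    if notional <= 0:
        raise ValueError("counterfactual notional must be positive")
    return 1e4 * side_sign * cost / notional

# Worked example with made-up numbers: two buy fills, each 2 cents worse than no-hop.
fills = [(100, 10.02, 10.00), (200, 10.03, 10.01)]
vht = venue_hop_tax_bps(fills)  # 1e4 * 6 / 3002, roughly 20 bps
```

The number is only as reliable as the counterfactual prices, so replay assumptions belong next to it in any report.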
5) Hop Cascade Ratio (HCR)
[ HCR = \frac{\#\{\text{parents with }\ge 2\text{ hops}\}}{\#\{\text{parents with }\ge 1\text{ hop}\}} ]
High HCR indicates control-loop instability rather than single-event adaptation.
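The two counting metrics, HR and HCR, can be sketched from per-child hop counts. The input shapes are assumptions for the example, as is the convention that a parent's hop count is the sum of its children's hops.

```python
from collections import defaultdict

def hop_rate(child_hops):
    """child_hops: dict of child_id -> number of venue changes."""
    if not child_hops:
        return 0.0
    return sum(1 for n in child_hops.values() if n >= 1) / len(child_hops)

def hop_cascade_ratio(parent_of, child_hops):
    """parent_of: dict of child_id -> parent_id."""
    per_parent = defaultdict(int)
    for child, n in child_hops.items():
        per_parent[parent_of[child]] += n
    hopped = [n for n in per_parent.values() if n >= 1]
    if not hopped:
        return 0.0
    return sum(1 for n in hopped if n >= 2) / len(hopped)

# Made-up example: four children across three parents.
child_hops = {"c1": 0, "c2": 1, "c3": 2, "c4": 0}
parent_of = {"c1": "P1", "c2": "P1", "c3": "P2", "c4": "P3"}
hr = hop_rate(child_hops)                       # 2 of 4 children hopped
hcr = hop_cascade_ratio(parent_of, child_hops)  # 1 of 2 hopped parents had >= 2 hops
```

In practice both would be grouped by symbol, time of day, and volatility regime before being compared.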
Modeling blueprint
Treat child execution as a multi-branch survival process with transition penalties.
State representation
[ X_t = (book_t, tox_t, lat_t, queue_t, rem_t, ttl_t, venue_t) ]
Action set
[ a_t \in \{\text{stay\_passive}, \text{improve\_passive}, \text{take\_local}, \text{hop\_passive}, \text{hop\_aggressive}, \text{pause}\} ]
Cost decomposition
[ C_t(a) = C_{spread} + C_{impact} + C_{delay} + C_{miss} + C_{hop} ]
with explicit hop penalty:
[ C_{hop} = \beta_1\,HLD + \beta_2\,QRL + \beta_3\,\mathbf{1}(\text{urgency escalates after hop}) ]
Transition-aware objective
[ a_t^* = \arg\min_a \left\{ E[C_t(a)\mid X_t] + \lambda\,CVaR_{95}(C_t(a)\mid X_t) \right\} ]
Key: include transition penalty directly in optimization; otherwise the policy over-hops under noisy short-horizon signals.
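One way to approximate the objective is to score each action on sampled costs, where samples for hop actions already include C_hop. The sketch below uses a plain empirical CVaR (average of the worst tail). The sample values, `lam`, and the function names are illustrative assumptions.

```python
import math

def cvar(samples, alpha=0.95):
    """Average of the worst (1 - alpha) share of sampled costs."""
    ordered = sorted(samples)
    # Small epsilon guards against float noise in n * (1 - alpha).
    k = max(1, math.ceil(len(ordered) * (1 - alpha) - 1e-9))
    tail = ordered[-k:]
    return sum(tail) / len(tail)

def choose_action(cost_samples, lam=0.5, alpha=0.95):
    """cost_samples: dict of action -> list of sampled total costs given X_t."""
    def score(samples):
        return sum(samples) / len(samples) + lam * cvar(samples, alpha)
    return min(cost_samples, key=lambda a: score(cost_samples[a]))

# Made-up samples: hopping is slightly cheaper on average but has a heavier tail.
samples = {
    "stay_passive": [1.0] * 19 + [3.0],
    "hop_passive": [0.8] * 19 + [6.0],
}
mean_only = choose_action(samples, lam=0.0)   # picks hop_passive
tail_aware = choose_action(samples, lam=0.5)  # picks stay_passive
```

The contrast between the two calls is the point: a mean-only policy takes the hop, while the tail-aware policy stays once the hop's tail cost is priced in.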
Control design
Control 1) Non-essential hop budget
Per parent order, cap discretionary hops (e.g., max 1 in calm, 2 in stress). Beyond cap, only risk-critical hops allowed.
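A hop budget can be a small per-parent counter. In this sketch the caps mirror the example values above (1 in calm, 2 in stress). The class name and the `risk_critical` flag are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class HopBudget:
    """Per-parent cap on discretionary hops (illustrative defaults)."""
    max_calm: int = 1
    max_stress: int = 2
    used: int = 0

    def allow(self, risk_critical: bool, stressed: bool) -> bool:
        if risk_critical:
            return True  # risk-critical hops do not consume the budget
        cap = self.max_stress if stressed else self.max_calm
        if self.used < cap:
            self.used += 1
            return True
        return False

budget = HopBudget()
first = budget.allow(risk_critical=False, stressed=False)   # within the calm cap
second = budget.allow(risk_critical=False, stressed=False)  # over the calm cap
forced = budget.allow(risk_critical=True, stressed=False)   # always allowed
```

Whether the budget should refill over time for long-lived parents is left open here.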
Control 2) Dwell-time hysteresis
After a venue switch, require minimum dwell or strong evidence before the next hop to prevent ping-pong.
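The hysteresis rule reduces to a two-condition gate. The dwell time and evidence threshold below are placeholder values, and `evidence` stands for whatever normalized signal strength the router already produces.

```python
def may_hop_again(now_ms, last_hop_ms, evidence,
                  min_dwell_ms=250, strong_evidence=0.9):
    """Allow another hop only after the dwell period or on strong evidence.

    Thresholds are hypothetical; evidence is assumed to lie in [0, 1].
    """
    if last_hop_ms is None:
        return True  # no previous hop on this child
    dwelled = (now_ms - last_hop_ms) >= min_dwell_ms
    return dwelled or evidence >= strong_evidence
```

For example, `may_hop_again(1100, 1000, evidence=0.5)` is blocked because only 100 ms have passed, while the same call with `evidence=0.95` goes through.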
Control 3) Hop reason hierarchy
Priority of allowed reasons (highest first):
1. hard policy/compliance block,
2. deterministic venue outage/throttle block,
3. extreme toxicity breach,
4. soft alpha preference.
Only (1)-(2) bypass hop budget automatically.
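The hierarchy maps naturally onto an ordered enum, where a lower value means higher priority. The tier names are invented for this sketch. How raw hop_reason values map onto tiers is left to the caller.

```python
from enum import IntEnum

class HopTier(IntEnum):
    """Lower value means higher priority (names are illustrative)."""
    POLICY_BLOCK = 1
    VENUE_BLOCK = 2
    TOXICITY_BREACH = 3
    SOFT_ALPHA = 4

def bypasses_budget(tier: HopTier) -> bool:
    """Only the two hard-block tiers skip the hop budget automatically."""
    return tier <= HopTier.VENUE_BLOCK

def dominant_tier(active):
    """Highest-priority tier among the currently active hop triggers, or None."""
    active = list(active)
    return min(active) if active else None
```

When several triggers fire at once, logging the dominant tier as the hop reason keeps attribution unambiguous.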
Control 4) Queue-value-aware decisioning
Estimate queue option value before canceling:
[ QV = P(\text{fill}\mid \text{stay},\Delta t) \cdot \text{edge}_{\text{maker}} - P(\text{miss}\mid \text{stay},\Delta t) \cdot \text{chase\_cost} ]
Hop only if expected gain exceeds transition penalty buffer.
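A small numeric sketch of the queue-value test. All inputs are made-up values in basis points. `value_hop`, `hop_penalty`, and `buffer` are hypothetical names for the expected value of the hop branch, the estimated C_hop, and the safety margin.

```python
def queue_value(p_fill, p_miss, edge_maker, chase_cost):
    """QV of staying: expected maker edge minus expected chase cost."""
    return p_fill * edge_maker - p_miss * chase_cost

def should_hop(qv_stay, value_hop, hop_penalty, buffer=0.0):
    """Hop only if the expected gain clears the transition penalty plus a buffer."""
    return (value_hop - qv_stay) > (hop_penalty + buffer)

# Illustrative inputs in bps.
qv = queue_value(p_fill=0.6, p_miss=0.4, edge_maker=0.5, chase_cost=1.5)  # about -0.3
no_buffer = should_hop(qv, value_hop=0.2, hop_penalty=0.4)                # gain 0.5 > 0.4
with_buffer = should_hop(qv, value_hop=0.2, hop_penalty=0.4, buffer=0.2)  # gain 0.5 < 0.6
```

The buffer is what absorbs estimation noise in the fill probabilities, so widening it directly reduces marginal hops.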
Control 5) Cascade breaker
If parent enters repeated-hop pattern (HCR condition), auto-fallback to SAFE policy (single-venue conservative or controlled-cross completion).
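A cascade breaker can be sketched as a latching counter over a sliding time window. The two-hop threshold follows the HCR definition. The window length and the class itself are assumptions.

```python
from collections import deque

class CascadeBreaker:
    """Latches a parent into SAFE mode after repeated hops in a short window."""

    def __init__(self, max_hops=2, window_ms=5000):
        self.max_hops = max_hops
        self.window_ms = window_ms
        self.hops = deque()
        self.safe_mode = False

    def record_hop(self, ts_ms):
        """Record a hop timestamp and return True once SAFE mode is active."""
        self.hops.append(ts_ms)
        # Drop hops that fell out of the window.
        while self.hops and ts_ms - self.hops[0] > self.window_ms:
            self.hops.popleft()
        if len(self.hops) >= self.max_hops:
            self.safe_mode = True  # stays latched for the rest of the parent
        return self.safe_mode
```

Latching is deliberate in this sketch: once a parent has shown ping-pong behavior, it finishes under the SAFE policy rather than re-entering the adaptive one.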
Calibration workflow
- Build event timeline and transition graph (venue_from -> venue_to) per parent.
- Estimate fill hazards for stay vs hop branches with competing-risks survival modeling.
- Fit hop penalty parameters (beta) from historical realized costs.
- Run replay backtests with hop-budget and dwell constraints.
- Shadow-run in production and compare VHT, completion, and tail slippage.
- Promote in canary waves with explicit rollback triggers.
Promotion gates (example)
Promote only if canary period shows:
- VHT reduction >= 15%
- p95 parent slippage improvement >= 2.5 bps
- HCR reduction >= 20%
- completion ratio change >= -0.8pp
- reject-rate non-inferior (no >0.3pp degradation)
Rollback if any two hold for two consecutive sessions:
- p95 slippage worsens > 4 bps
- deadline misses worsen > 1.0pp
- hop cascade incidents exceed expected envelope
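The gates above can be encoded as plain predicates over canary metrics. The dictionary keys are hypothetical names. The rollback rule is read here as "at least two triggers fire in each of the last two sessions", which is one interpretation of the wording.

```python
def should_promote(m):
    """m: canary metrics with hypothetical key names; thresholds from the gates above."""
    return (
        m["vht_reduction_pct"] >= 15.0
        and m["p95_slippage_improvement_bps"] >= 2.5
        and m["hcr_reduction_pct"] >= 20.0
        and m["completion_change_pp"] >= -0.8
        and m["reject_rate_degradation_pp"] <= 0.3
    )

def should_rollback(sessions):
    """sessions: chronological list of per-session metric dicts."""
    def triggers(s):
        return sum([
            s["p95_slippage_worsening_bps"] > 4.0,
            s["deadline_miss_worsening_pp"] > 1.0,
            s["cascade_incidents"] > s["cascade_envelope"],
        ])
    # Assumed reading: two or more triggers in each of the last two sessions.
    return len(sessions) >= 2 and all(triggers(s) >= 2 for s in sessions[-2:])
```

Keeping the thresholds in configuration rather than code would make it easier to vary them by symbol bucket, which this sketch does not attempt.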
Common false conclusions
- "More rerouting always means smarter adaptation." Often it means noisy control loops paying repeated queue-reset tax.
- "Cancel sent means opportunity preserved." Dead time between cancel-send and new-ack is pure exposure.
- "Per-child local optimum equals parent optimum." Parent cost accumulates path-dependently across transitions.
- "One global hop rule is enough." Hop economics differ by symbol liquidity, venue microstructure, and session phase.
Practical implementation checklist
- Transition logs include both venue_from and venue_to
- Cancel/ack/new-ack clocks validated and synchronized
- Hop reasons standardized (no free-text ambiguity)
- Counterfactual replay for no-hop / reduced-hop baselines
- Hop-budget + dwell constraints exposed as runtime controls
- SAFE fallback tested in chaos-style venue-throttle drills
Bottom line
In fragmented markets, a venue switch is not free adaptation—it is a capital-consuming state transition.
If your slippage model treats reroutes as neutral plumbing, you are underpricing tail cost. Model the Venue Hop Tax explicitly, and execution policy will stop paying queue-reset tuition disguised as “smart routing.”