Transfer-Entropy Lead-Lag Toxicity Routing Slippage Playbook

2026-03-02 · finance

Date: 2026-03-02 Category: finance / execution research

Why this model

In fragmented markets, slippage often comes less from insufficient displayed depth than from information arriving unevenly across venues.

Typical router logic (best price + fee + queue estimate) misses a key dynamic: toxicity that appears on one venue often reaches the others with a lag.

A lead-lag toxicity model based on transfer entropy (TE) helps detect directional information flow between venues and route away from "about-to-turn-toxic" books.


Core idea

Build a real-time graph where each directed edge (i \to j) measures how much recent flow/toxicity on venue (i) improves prediction of venue (j)'s next-state toxicity beyond (j)'s own history.

If (TE_{i\to j}) is high and recent signals on (i) deteriorate, reduce passive exposure on (j) before the shock fully transmits.


Feature schema

At 100ms–1s slices (symbol-level), collect per venue (v):

Define a venue toxicity state (T^v_t) (continuous score or discrete bins LOW/MID/HIGH):

[ T^v_t = w_1 \cdot z(M^v_{t+\tau}) + w_2 \cdot z(C^v_t) + w_3 \cdot z(|Q^v_t|) + w_4 \cdot z(\text{spread jump}) ]
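A minimal sketch of the composite score for one venue, using only the Python standard library. The argument names `m`, `c`, `q`, and `spread_jump` simply mirror the symbols in the formula; the weights, the LOW/MID/HIGH thresholds, and the whole-window z-scoring are illustrative assumptions, and a live system would more likely standardize against a rolling window. Because the first term is indexed at (t+\tau), the score for slice (t) is only complete once that term has been observed.

```python
from statistics import fmean, pstdev

def zscore(xs):
    """Standardize a sequence to zero mean and unit variance (0.0 if constant)."""
    mu, sd = fmean(xs), pstdev(xs)
    return [(x - mu) / sd if sd > 0 else 0.0 for x in xs]

def toxicity_scores(m, c, q, spread_jump, weights=(0.4, 0.2, 0.2, 0.2)):
    """Composite T_t = w1*z(M) + w2*z(C) + w3*z(|Q|) + w4*z(spread jump).

    m must already be aligned so that m[t] holds M_{t+tau}.
    The default weights are placeholders, not calibrated values.
    """
    w1, w2, w3, w4 = weights
    cols = (zscore(m), zscore(c), zscore([abs(x) for x in q]), zscore(spread_jump))
    return [w1 * a + w2 * b + w3 * d + w4 * e for a, b, d, e in zip(*cols)]

def to_state(score, low=-0.5, high=0.5):
    """Discretize a score: 0=LOW, 1=MID, 2=HIGH (thresholds are illustrative)."""
    if score < low:
        return 0
    return 2 if score > high else 1
```

The discrete states from `to_state` are the natural input for a binned transfer-entropy estimator, while the continuous score suits kNN or regression-style estimators.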


Transfer-entropy layer

For directed pair ((i,j)):

[ TE_{i\to j}(L) = I\left(T^i_{t-L:t-1};\, T^j_t \,\middle|\, T^j_{t-L:t-1}\right) ]

where (I(\cdot;\cdot \mid \cdot)) is the conditional mutual information and (L) is the lag window length, counted in slices.
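It can help to expand the definition using the standard identity for conditional mutual information, which writes TE as a drop in conditional entropy:

[ TE_{i\to j}(L) = H\left(T^j_t \,\middle|\, T^j_{t-L:t-1}\right) - H\left(T^j_t \,\middle|\, T^j_{t-L:t-1},\, T^i_{t-L:t-1}\right) ]

The first term is the uncertainty about venue (j)'s next state given only its own history; the second is the uncertainty that remains once venue (i)'s history is added. The true quantity is non-negative and equals zero exactly when (i)'s history adds nothing beyond (j)'s own past, although finite-sample estimates can come out slightly negative.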

Practical estimation options:

  1. Discrete binning + Miller-Madow correction (fast, robust)
  2. kNN entropy estimators (more flexible, noisier)
  3. Logistic surrogate: compare predictive cross-entropy with/without source-venue history
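Option 1 can be sketched as follows: a plug-in estimate over discrete toxicity states with the Miller-Madow term ((K-1)/(2N), where (K) is the number of occupied bins) added to each entropy. The function names and the choice to clip negative estimates at zero are assumptions made for illustration, not a reference implementation.

```python
from collections import Counter
from math import log

def entropy_mm(symbols):
    """Plug-in Shannon entropy in nats plus the Miller-Madow bias term (K-1)/(2N)."""
    counts = Counter(symbols)
    n = sum(counts.values())
    h = -sum((k / n) * log(k / n) for k in counts.values())
    return h + (len(counts) - 1) / (2 * n)

def transfer_entropy(src, dst, lag=1):
    """Estimate TE_{src->dst}(lag) from two aligned sequences of discrete states.

    Uses TE = H(Y, Yp) - H(Yp) - H(Y, Yp, Xp) + H(Yp, Xp), where Y is the
    destination's current state and Yp, Xp are the lagged histories.
    """
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same length")
    rows = [(dst[t], tuple(dst[t - lag:t]), tuple(src[t - lag:t]))
            for t in range(lag, len(dst))]
    if not rows:
        return 0.0
    te = (entropy_mm([(y, yp) for y, yp, xp in rows])
          - entropy_mm([yp for y, yp, xp in rows])
          - entropy_mm(rows)
          + entropy_mm([(yp, xp) for y, yp, xp in rows]))
    return max(te, 0.0)  # clip small negative estimates (an assumption)
```

The number of joint bins grows as (number of states)^(2L+1), so with three states even L = 3 needs far more samples than a short window supplies. That is one reason to keep both the bin count and the lag small.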

Use rolling estimation (e.g., 15–30 min windows) with exponential decay to keep edge weights adaptive.
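One simple way to apply the decay, assuming each rolling window yields a fresh TE estimate per directed edge, is to blend it into the stored edge weight with a half-life. The helper names and the dictionary keyed by `(source, destination)` are hypothetical; decaying the underlying bin counts instead is an equally plausible design.

```python
def decay_alpha(step_seconds, half_life_seconds):
    """Blend weight such that an old estimate loses half its influence per half-life."""
    return 1.0 - 0.5 ** (step_seconds / half_life_seconds)

def update_edges(edges, fresh, alpha):
    """Return new edge weights: (1 - alpha) * old + alpha * fresh.

    Edges missing from either dict are treated as 0.0, so an edge with no
    fresh estimate decays toward zero (an assumption of this sketch).
    """
    keys = set(edges) | set(fresh)
    return {k: (1.0 - alpha) * edges.get(k, 0.0) + alpha * fresh.get(k, 0.0)
            for k in keys}
```

For example, re-estimating every 60 seconds with a 15-minute half-life gives `decay_alpha(60, 900)`, which is roughly 0.045, so a single noisy window moves an edge only slightly.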


Lead-lag shock score

For destination venue (j), define propagated shock risk:

[ S^j_t = \sum_{i\neq j} \hat{TE}_{i\to j,t} \cdot \psi\left(T^i_t - \bar T^i_t\right) ]

Then combine with local conditions:

[ \text{Risk}^j_t = a\,S^j_t + b\,T^j_t + c\,\text{QueueFragility}^j_t ]

This creates a forward-looking risk score: "venue (j) is currently okay, but incoming toxicity probability is high."
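The two formulas above can be sketched as below. This sketch assumes that (\psi) is the positive part, so only above-baseline toxicity on a source venue contributes, and that (\bar T^i_t) is a rolling baseline supplied by the caller. The coefficients `a`, `b`, `c` and the venue labels are placeholders.

```python
def psi(x):
    """Assumed response function: only deterioration above baseline counts."""
    return max(x, 0.0)

def shock_score(dest, te, tox, baseline):
    """S^j = sum over i != j of TE_{i->j} * psi(T^i - Tbar^i).

    te maps (source, destination) pairs to edge weights; tox and baseline
    map venue labels to the current score and its rolling baseline.
    """
    return sum(w * psi(tox[i] - baseline[i])
               for (i, j), w in te.items() if j == dest and i != dest)

def venue_risk(dest, te, tox, baseline, queue_fragility, a=1.0, b=0.5, c=0.5):
    """Risk^j = a * S^j + b * T^j + c * QueueFragility^j (coefficients illustrative)."""
    return (a * shock_score(dest, te, tox, baseline)
            + b * tox[dest]
            + c * queue_fragility[dest])
```

With the positive-part choice, a venue whose sources are all at or below baseline has (S^j_t = 0) and its risk reduces to the purely local terms.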


Router policy mapping

Translate (\text{Risk}^j_t) into route controls:

Example policy:

Use hysteresis bands to prevent route thrashing.
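A hysteresis band can be as simple as two thresholds per venue: switch to a defensive mode when risk rises above the upper one, and return to normal only after it falls below the lower one. The class name, the two-mode policy, and the threshold values are illustrative assumptions.

```python
class HysteresisGate:
    """Two-threshold switch that avoids flapping when risk hovers near a cutoff."""

    def __init__(self, enter_at=1.0, exit_at=0.6):
        if exit_at >= enter_at:
            raise ValueError("exit_at must be below enter_at")
        self.enter_at = enter_at
        self.exit_at = exit_at
        self.defensive = False

    def update(self, risk):
        """Feed the latest risk value; return True while in defensive mode."""
        if self.defensive:
            if risk <= self.exit_at:
                self.defensive = False
        elif risk >= self.enter_at:
            self.defensive = True
        return self.defensive
```

Feeding the sequence 0.5, 1.1, 0.8, 0.9, 0.55, 0.8 keeps the gate defensive through the 0.8 and 0.9 readings, where a single cutoff at 1.0 would have switched back to normal as soon as risk dipped below it.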


Training & validation design

1) Offline research track

2) Counterfactual simulation

Compare:

  1. fee+spread baseline router
  2. toxicity-only router (no TE graph)
  3. TE-augmented router (proposed)

Metrics:

3) Live shadow mode

Run the TE risk engine without routing control for at least two weeks:


Production guardrails

  1. Data quality hard checks

    • venue clock skew bounds
    • stale book detection
    • crossed/locked quote sanitization
  2. Estimator stability checks

    • minimum sample requirement per rolling window
    • edge-weight shrinkage to prior during sparse flow
    • cap total incoming TE mass to avoid runaway scoring
  3. Fail-safe behavior

    • if TE engine unhealthy, revert to baseline toxicity router
    • keep emergency completion path independent of TE layer
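The estimator-stability and fail-safe items above might look like the following sketch. The minimum sample count, the shrinkage constant `k`, the mass cap, and the function names are all hypothetical placeholders that would need calibration.

```python
def shrink_to_prior(te_hat, n_samples, prior=0.0, n_min=200, k=500.0):
    """Return the prior below n_min samples; otherwise blend with weight n / (n + k)."""
    if n_samples < n_min:
        return prior
    lam = n_samples / (n_samples + k)
    return lam * te_hat + (1.0 - lam) * prior

def cap_incoming(te, dest, max_mass=1.0):
    """Rescale edges into dest so their total weight does not exceed max_mass."""
    mass = sum(w for (i, j), w in te.items() if j == dest)
    if mass <= max_mass:
        return dict(te)
    scale = max_mass / mass
    return {(i, j): (w * scale if j == dest else w) for (i, j), w in te.items()}

def select_risk(te_risk, baseline_risk, engine_healthy):
    """Fail-safe: fall back to the baseline toxicity risk when the TE engine is unhealthy."""
    return te_risk if engine_healthy else baseline_risk
```

Proportional rescaling in `cap_incoming` preserves the relative ranking of source venues while bounding the total shock a single destination can accumulate.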

Common failure modes

  1. Spurious causality from common shocks

    • Mitigate with conditioning on market-wide factors and auction/event flags.
  2. Non-stationary lag structure

    • Re-estimate lag buckets by session and volatility regime.
  3. Overreaction in thin names

    • Use stronger shrinkage + coarser bins + higher action thresholds.
  4. Operational churn (too many cancels)

    • Penalize cancel intensity directly in policy objective.
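For the first failure mode, one way to condition on a market-wide factor is to add its lagged history to the conditioning set, so that an edge only earns weight for information the common factor does not already explain. The sketch below assumes a discretized factor series (for example, binned index-futures returns); `conditional_te` is a hypothetical helper that uses the same plug-in estimator with a Miller-Madow term.

```python
from collections import Counter
from math import log

def entropy_mm(symbols):
    """Plug-in Shannon entropy in nats plus the Miller-Madow bias term (K-1)/(2N)."""
    counts = Counter(symbols)
    n = sum(counts.values())
    h = -sum((k / n) * log(k / n) for k in counts.values())
    return h + (len(counts) - 1) / (2 * n)

def conditional_te(src, dst, factor, lag=1):
    """TE from src to dst, conditioned on dst's own past and the factor's past.

    Uses H(Y, Yp, Zp) - H(Yp, Zp) - H(Y, Yp, Xp, Zp) + H(Yp, Xp, Zp).
    """
    rows = [(dst[t], tuple(dst[t - lag:t]), tuple(src[t - lag:t]),
             tuple(factor[t - lag:t]))
            for t in range(lag, len(dst))]
    if not rows:
        return 0.0
    te = (entropy_mm([(y, yp, zp) for y, yp, xp, zp in rows])
          - entropy_mm([(yp, zp) for y, yp, xp, zp in rows])
          - entropy_mm(rows)
          + entropy_mm([(yp, xp, zp) for y, yp, xp, zp in rows]))
    return max(te, 0.0)  # clip small negative estimates (an assumption)
```

If two venues both echo the same factor at different delays, the unconditioned estimate shows a strong edge between them, while the conditioned estimate falls toward zero once the lag window covers the factor's delay. Conditioning also multiplies the number of joint bins, so it raises the sample requirement per window.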

Minimal implementation checklist


One-line takeaway

Slippage in fragmented markets is often a propagation problem; a transfer-entropy lead-lag graph lets the router act on where toxicity is going next, not just where it already is.