Aggressor-Side Misclassification and Toxicity-Label Drift in Slippage Models



Date: 2026-04-11
Category: research (execution / slippage modeling)

Why this playbook exists

A lot of execution stacks quietly assume they know which trades were buyer-initiated and which were seller-initiated.

That assumption leaks everywhere:

The problem is that many production pipelines do not observe the true aggressor side directly. They infer it from:

When that inference is wrong, the damage is not limited to a noisy dashboard. It becomes a control error.

The model starts learning from mislabeled toxicity:

This note turns that problem into a production modeling and control framework.

A useful companion note is:

That file is about how trade signing methods work. This file is about what happens when those labels feed live slippage and routing models.


The core failure mode

Let:

Your model often builds features like:

But the model actually sees:

[ \hat{F}_t = \sum_{i \in W_t} \hat{s}_i v_i ]

instead of the flow you really care about:

[ F^*_t = \sum_{i \in W_t} s^*_i v_i ]

If sign error is random, the signal gets attenuated. If sign error is state-dependent, the signal becomes biased.
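The attenuation-versus-bias distinction can be made concrete with a small simulation. The sketch below (all parameters illustrative: a 20% "stressed" trade share, 10% random flips vs 40%-under-stress flips) compares a uniformly noisy sign stream against a state-dependent one:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

true_sign = rng.choice([-1, 1], size=n)          # s*_i, the true aggressor sign
volume = rng.lognormal(mean=0.0, sigma=1.0, size=n)
stress = rng.random(n) < 0.2                     # hypothetical "stressed" trades

true_flow = true_sign * volume                   # per-trade contribution to F*_t

def observed_flow(flip_prob):
    """Signed flow under a per-trade sign-flip probability."""
    flips = rng.random(n) < flip_prob
    s_hat = np.where(flips, -true_sign, true_sign)
    return s_hat * volume

# Random 10% flips everywhere: the signal is attenuated but unbiased.
random_err = observed_flow(np.full(n, 0.10))
# State-dependent flips (40% under stress, 2% otherwise): error concentrates
# exactly in the regime where slippage decisions matter.
state_err = observed_flow(np.where(stress, 0.40, 0.02))

for name, f in [("random", random_err), ("state-dependent", state_err)]:
    corr = np.corrcoef(true_flow, f)[0, 1]
    err = np.mean(np.sign(f[stress]) != np.sign(true_flow[stress]))
    print(f"{name:16s} corr={corr:.3f}  sign-error-in-stress={err:.1%}")
```

The two streams can have similar aggregate flip rates and similar overall correlation with the true flow, while the state-dependent one is several times worse inside the stress subset.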

State-dependent error is the dangerous case. That happens when misclassification rises specifically during:

In other words:

your labels get worse exactly when microstructure is hardest and slippage matters most.


Why this hurts slippage models more than people expect

Trade-sign noise does not only degrade one feature. It contaminates the whole feedback loop.

1. Toxicity models underreact to real informed flow

If real buyer-initiated sweeps are partially mislabeled as sells or unknowns, the model underestimates buy-side pressure. Then:

2. Passive fills look safer than they really are

A passive fill is often evaluated against what happened right after it. If post-fill aggressive flow is mislabeled, the system understates adverse selection. That flatters passive tactics in backtests and shadow evaluation.

3. Aggressive routes learn the wrong venue map

A venue may look safe because its toxic flow is harder to sign correctly:

The router then mistakes measurement weakness for venue quality.

4. Attribution shifts from model error to market error

When labels are noisy, teams often conclude:

Sometimes the real answer is simpler:

your aggressor labels degraded, so the model stopped seeing toxicity correctly.


Mechanism map

1) Trade/quote clock skew

Classic quote-rule and Lee-Ready-style logic depend on the quote being the one the trade actually interacted with. If the quote stream is too early or too late relative to the trade print, you sign against the wrong midpoint.
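For reference, a minimal sketch of the quote-rule-plus-tick-test logic this section is talking about (the `Trade` container and the matching of each trade to a `mid` are assumptions; in production the quote-matching step is exactly where the skew problem lives):

```python
from dataclasses import dataclass

@dataclass
class Trade:
    price: float
    mid: float   # midpoint of the quote this trade is matched against

def lee_ready_sign(trades):
    """Minimal Lee-Ready-style signing: quote rule first, tick test at the mid.

    Returns +1 (buy), -1 (sell), or 0 (unknown). The fragile part in
    production is not this logic but choosing which quote, and therefore
    which `mid`, each trade actually interacted with.
    """
    signs = []
    prev_px = None       # previous trade price
    last_diff_px = None  # most recent price different from the current run
    for t in trades:
        ref = prev_px if (prev_px is not None and t.price != prev_px) else last_diff_px
        if t.price > t.mid:
            s = 1                              # above mid: buyer-initiated
        elif t.price < t.mid:
            s = -1                             # below mid: seller-initiated
        elif ref is not None:
            s = 1 if t.price > ref else -1     # tick test at the midpoint
        else:
            s = 0                              # no evidence: unknown
        signs.append(s)
        if prev_px is not None and t.price != prev_px:
            last_diff_px = prev_px
        prev_px = t.price
    return signs
```

Note that every branch of this function silently depends on `mid` being the quote the trade actually hit; shift the quote stream a few milliseconds and the same code produces different labels.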

That flips labels especially when:

2) Midpoint and hidden-liquidity executions

A midpoint print is weak evidence of direction from price alone. If your fallback logic forces a buy/sell guess anyway, sign quality collapses in midpoint-heavy venues.

3) Odd-lot and sub-round-lot regimes

Odd lots can carry real informational content while interacting awkwardly with displayed-touch logic. A classifier that still mentally lives in a 100-share-touch world will misread modern flow.

4) Off-exchange and delayed reporting

If trade reports arrive late or in bursts, the quote state visible at arrival time may be unrelated to the economic state at execution time. That makes post hoc signing look cleaner than live signing.

5) Bulk classification used as trade-level truth

Bulk Volume Classification (BVC) can be useful for interval-level signed volume. It is not a per-trade aggressor oracle. Using it as trade-level truth poisons event-level toxicity and fill models.

6) Corrections, cancels, and sale-condition drift

If a print is corrected, canceled, or reclassified later, your signed-flow history is rewritten after the model may already have acted on it. That creates label inconsistency across:


A more useful abstraction: sign quality is a latent state

Instead of pretending every trade label is equally trustworthy, define:

Then your model can use:

[ \tilde{F}_t = \sum_{i \in W_t} \ell_i v_i ]

instead of naïve signed flow.

This immediately separates two very different situations:

  1. strong, trustworthy flow imbalance,
  2. apparent flow imbalance generated by weak sign evidence.

That difference matters operationally. A routing model should react strongly to the first and cautiously to the second.
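One concrete choice for the soft label (an assumption, not the only option) is the expected true sign under confidence q_i, i.e. ℓ_i = q_i·ŝ_i + (1 − q_i)·(−ŝ_i) = (2q_i − 1)·ŝ_i, so a label with q = 0.5 carries zero evidence and contributes zero flow:

```python
import numpy as np

def soft_signed_flow(s_hat, volume, q):
    """Confidence-weighted signed flow using ell_i = (2 q_i - 1) * s_hat_i.

    A trade signed with q = 0.5 (coin-flip confidence) contributes nothing;
    a trade with q = 1.0 contributes its full signed volume.
    """
    ell = (2.0 * np.asarray(q) - 1.0) * np.asarray(s_hat)
    return float(np.sum(ell * np.asarray(volume)))

# Same apparent imbalance, very different evidence quality:
strong = soft_signed_flow([1, 1, 1], [100, 100, 100], [0.95, 0.95, 0.95])  # ~270
weak   = soft_signed_flow([1, 1, 1], [100, 100, 100], [0.55, 0.55, 0.55])  # ~30
```

A naïve signed-flow feature would score both cases as +300; the soft version shrinks the weak-evidence case toward zero, which is exactly the cautious reaction the routing model should have.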


Cost decomposition

A practical decomposition is:

[ C_{\text{total}} = C_{\text{base}} + C_{\text{signal loss}} + C_{\text{wrong reaction}} + C_{\text{venue misrank}} + C_{\text{attribution drift}} ]

Where:

A simple way to think about the live piece:

[ C_{\text{wrong reaction}} \approx \kappa_1 \cdot |F^*_t - \tilde{F}_t| + \kappa_2 \cdot \text{policy flip rate} + \kappa_3 \cdot \text{false passive dwell time} ]

That last term matters a lot. A toxic flow signal that arrives late or inverted often does not cause one bad fill. It causes a sequence:

  1. keep passive order live too long,
  2. get negatively selected,
  3. cancel late,
  4. cross later in a worse book,
  5. attribute the pain to “market move” instead of label contamination.

The exact model bug people miss

Many teams validate trade signing by aggregate statistics:

That is not enough.

A classifier can look acceptable in aggregate while still being disastrous in the exact subset that matters for execution:

This is the same pathology as a slippage model with an acceptable RMSE and a terrible q95 (tail error).

Average correctness is not the relevant KPI. Correctness in decision-critical regimes is.


Public grounding

A few public references make this problem very real:

The punchline is simple:

trade sign is not a timeless label living inside the print. It is a reconstruction whose quality depends on clocks, venue semantics, and market regime.


Features that belong in a slippage stack

A. Sign-quality features

B. Venue / print-semantics features

C. Market-state features

D. Strategy-state features

The important idea:

label quality is itself a first-class feature. Do not hide it in preprocessing and pretend the downstream model is working with truth.


Metrics worth monitoring

1. SIR — Sign Inversion Rate

Estimated rate at which the inferred sign disagrees with better ground truth on a benchmark subset.

Break it out by:

2. QAD — Quote Alignment Drift

Distribution of trade-to-quote timing mismatch used by the classifier.

If QAD shifts, sign quality may shift even when market behavior itself does not.

3. MDS — Method Dispersion Share

Share of labels produced by each method:

A rising fallback share is often an early-warning indicator.

4. MSD — Markout Sign Disagreement

Compare markout statistics when grouped by inferred sign vs higher-confidence sign on a labeled subset.

This measures how much toxicity inference is being bent by label noise.

5. VCR — Venue Contamination Ratio

Fraction of flow for a venue that lands in low-confidence sign buckets.

This catches the classic failure mode where a venue looks “safer” only because your labels are weaker there.

6. PFD — Passive Fill Distortion

Difference in expected passive-fill markout using naïve signs vs confidence-aware signs.

This is the business metric that often reveals the problem fastest.
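Two of these metrics, MDS and VCR, reduce to simple grouped counts. A sketch (metric names and the 0.6 low-confidence threshold follow this note's conventions and are illustrative):

```python
from collections import Counter, defaultdict

def method_dispersion_share(methods):
    """MDS: share of labels produced by each signing method."""
    counts = Counter(methods)
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()}

def venue_contamination_ratio(venues, q, threshold=0.6):
    """VCR: fraction of each venue's trades in low-confidence sign buckets."""
    low = defaultdict(int)
    tot = defaultdict(int)
    for v, qi in zip(venues, q):
        tot[v] += 1
        low[v] += qi < threshold
    return {v: low[v] / tot[v] for v in tot}

mds = method_dispersion_share(["quote", "quote", "tick", "fallback"])
vcr = venue_contamination_ratio(["A", "A", "B"], [0.9, 0.5, 0.8])
```

A rising `fallback` share in MDS, or a venue whose VCR is far above its peers, is exactly the early warning the section above describes: the labels, not the market, are moving.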


State machine for live control

LABEL_CLEAN

QUOTE_SKEWED

Triggered when quote-age / trade-quote alignment deteriorates.

HIDDEN_FLOW_HEAVY

Triggered when midpoint / odd-lot / non-standard print share rises.

OFFEX_DELAYED

Triggered when delayed reporting or correction activity rises.

SAFE_ABSTAIN

Triggered when sign quality is persistently poor.

A bad sign regime should degrade into safer, simpler control, not into random overfitting.
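The states above can be encoded as a small classifier over monitored label-quality statistics. A sketch, with all thresholds illustrative; the one deliberate design choice is that SAFE_ABSTAIN dominates every localized trigger:

```python
from enum import Enum, auto

class SignRegime(Enum):
    LABEL_CLEAN = auto()
    QUOTE_SKEWED = auto()
    HIDDEN_FLOW_HEAVY = auto()
    OFFEX_DELAYED = auto()
    SAFE_ABSTAIN = auto()

def classify_regime(quote_skew_ms, hidden_share, delayed_share, low_conf_share):
    """Map label-quality monitors to a control regime.

    Thresholds are illustrative. Persistent poor sign quality
    (low_conf_share) wins over any single localized trigger, so a bad
    sign regime degrades into simpler control rather than mixed signals.
    """
    if low_conf_share > 0.5:
        return SignRegime.SAFE_ABSTAIN
    if delayed_share > 0.3:
        return SignRegime.OFFEX_DELAYED
    if hidden_share > 0.4:
        return SignRegime.HIDDEN_FLOW_HEAVY
    if quote_skew_ms > 5.0:
        return SignRegime.QUOTE_SKEWED
    return SignRegime.LABEL_CLEAN
```

In practice the trigger inputs would be smoothed (e.g. rolling medians) so the controller does not flap between regimes.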


Modeling blueprint

Layer 1 — Preserve label provenance

For every signed trade, store at least:

If you do not store provenance, you will never debug label drift cleanly.
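A minimal provenance record might look like the following (field names are an illustrative schema, not a standard; the point is that method, quote context, confidence, logic version, and live-vs-repaired status all travel with the label):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SignProvenance:
    trade_id: str
    method: str        # e.g. "quote_rule", "tick_test", "bvc", "fallback"
    quote_ts: int      # timestamp of the quote used for classification
    quote_age_us: int  # age of that quote relative to the trade print
    sign: int          # -1, 0 (unknown), +1
    confidence: float  # q_i, if a confidence model exists
    label_version: str # version of the signing logic that produced the label
    as_of_live: bool   # live-as-of label vs hindsight-repaired replay label
```

With this record, later questions like "which fills were graded with hindsight-repaired labels?" become a filter instead of an archaeology project.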

Layer 2 — Build a confidence model

Estimate:

[ q_i = P(\hat{s}_i = s^*_i \mid x_i) ]

using benchmark subsets such as:
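However the benchmark subset is sourced, a simple bucketed estimator is often enough to start. A sketch, where the bucket keys and Laplace smoothing are illustrative choices:

```python
from collections import defaultdict

def fit_sign_confidence(buckets, agree):
    """Estimate q = P(inferred sign == ground truth) per context bucket.

    `buckets`: discrete context keys, e.g. (signing method, quote-age bin).
    `agree`: 1 where the inferred sign matched higher-quality ground truth
    on the benchmark subset. Laplace smoothing keeps small buckets off 0/1.
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for b, a in zip(buckets, agree):
        total[b] += 1
        hits[b] += a
    return {b: (hits[b] + 1) / (total[b] + 2) for b in total}

# Hypothetical benchmark: quote-rule labels agree 2/2, fallback labels 1/2.
q = fit_sign_confidence(
    ["quote_rule", "quote_rule", "fallback", "fallback"],
    [1, 1, 1, 0],
)
```

A lookup table like this is easy to monitor for drift; a learned model (e.g. calibrated logistic regression over continuous features) can replace it later without changing the downstream interface.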

Layer 3 — Replace hard labels with soft labels

Instead of training on (\hat{s}_i) alone, use:

Layer 4 — Train toxicity and markout models conditional on sign quality

Predict:

[ E[C \mid x_t, \tilde{F}_t, q_t] ]

not merely:

[ E[C \mid x_t, \hat{F}_t] ]

That lets the model learn different responses for:

Layer 5 — Freeze or regularize online adaptation under label stress

If venue scores adapt online from mislabeled toxicity, the router will drift the wrong way. Use:
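Whatever concrete mechanism is chosen, the common shape is to shrink adaptation speed as label stress rises. A sketch with illustrative thresholds:

```python
def effective_learning_rate(base_lr, low_conf_share, freeze_above=0.5):
    """Shrink online-adaptation speed as label stress rises.

    Linear shrinkage toward zero, with a hard freeze once the share of
    low-confidence labels passes `freeze_above` (thresholds illustrative).
    During a freeze the router keeps its last trusted venue scores rather
    than learning from flow it cannot sign.
    """
    if low_conf_share >= freeze_above:
        return 0.0
    return base_lr * (1.0 - low_conf_share / freeze_above)
```

The key property is monotonicity: worse labels can only slow adaptation, never speed it up, so a label-stress window cannot push the router into learning faster from contaminated data.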


Practical policy rules

Rule 1: unknown is a valid label

Forcing weak guesses is often worse than carrying uncertainty.

Rule 2: keep trade signing and slippage modeling loosely coupled

Signing logic will change. If your whole feature store assumes one eternal sign classifier, future fixes become painful and misleading.

Rule 3: benchmark subsets must match decision-critical regimes

A high-confidence benchmark only on calm large-cap lit flow is not enough. You need validation where the live model bleeds:

Rule 4: venue quality and label quality are different things

If a venue looks safe, ask whether it is actually safe or just hard to sign.

Rule 5: TCA must distinguish live-as-of labels from hindsight-repaired labels

Otherwise replay studies flatter the model by grading it with cleaner labels than the live controller had.


30-day rollout plan

Week 1 — Instrument label provenance

Week 2 — Build benchmark subsets

Week 3 — Confidence-aware shadow models

Week 4 — Controlled activation


Common anti-patterns


What good looks like

A production execution stack should be able to answer:

  1. Which trades were signed with strong evidence vs weak evidence?
  2. How does sign quality change by venue, symbol, and time regime?
  3. How much passive-fill markout worsens when sign confidence is low?
  4. Does a venue look safe because of real outcomes, or because its flow is hard to classify?
  5. Does online adaptation slow down, or does it become reckless, during label-stress windows?

If you cannot answer those, your toxicity model may be learning from mislabeled flow.

And mislabeled flow is one of the cleanest ways to pay real slippage for imaginary signal.



Bottom line

Trade-sign classification error is not just a data-cleaning nuisance.

It is a slippage-model contamination channel.

When aggressor-side labels degrade, toxicity features attenuate or invert, passive fills get misgraded, venue rankings drift, and online adaptation starts learning from the wrong market.

The right response is not “pick one better classifier and forget it.” It is:

In short:

before you trust signed-flow alpha, make sure you trust the signs.