Exchange Message-Throttle Saturation Slippage Playbook
Modeling Per-Port Messaging Limits as a First-Class Execution Risk (Not Just an Infra Alert)
Why this note: Many execution desks model spread, impact, and fill risk, but treat exchange/session message throttles as “ops incidents.” In reality, throttle saturation is a repeatable microstructure regime that changes cancel/replace behavior, queue aging, and forced aggression costs.
1) Failure Mode in One Sentence
When cancel/replace bursts hit per-session message limits, you lose quote agility exactly when market conditions demand it, and slippage tails blow out through stale queue exposure + late forced crossing.
2) Production Objective (Add Throttle-Risk Term)
For action \(a\) in context \(x\):
\[ J(a \mid x) = \mathbb{E}[IS \mid x,a] + \lambda\,\text{CVaR}_{q}(IS \mid x,a) + \eta\,\text{MissRisk}(x,a) + \rho\,\text{ThrottleRisk}(x,a) \]
Where:
- \(\mathbb{E}[IS]\): expected implementation shortfall
- \(\text{CVaR}_{q}\): tail-loss penalty (e.g., \(q = 95\%\))
- \(\text{MissRisk}\): deadline/non-completion risk
- \(\text{ThrottleRisk}\): expected incremental cost from message-limit proximity, soft-throttle delay, and reject cascades
If you don’t price \(\text{ThrottleRisk}\), the router overuses high-churn tactics in already-saturated sessions.
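As a concrete sketch of how the throttle term changes action selection: the function below scores two candidate tactics under the objective above. All weights and per-term risk estimates are illustrative assumptions, not calibrated values.

```python
# Hypothetical scoring of J(a|x) with a ThrottleRisk term; all inputs in bps.
def action_cost(exp_is_bps, cvar95_is_bps, miss_risk_bps, throttle_risk_bps,
                lam=0.5, eta=0.3, rho=0.4):
    """J(a|x) = E[IS] + lambda*CVaR_q(IS) + eta*MissRisk + rho*ThrottleRisk."""
    return exp_is_bps + lam * cvar95_is_bps + eta * miss_risk_bps + rho * throttle_risk_bps

# A high-churn tactic with cheaper expected IS can lose to a calmer tactic
# once throttle risk is priced in (numbers are made up for illustration).
churny = action_cost(exp_is_bps=2.0, cvar95_is_bps=9.0,
                     miss_risk_bps=0.5, throttle_risk_bps=4.0)
calm = action_cost(exp_is_bps=2.6, cvar95_is_bps=7.0,
                   miss_risk_bps=0.8, throttle_risk_bps=0.5)
```

With \(\rho = 0\) the churny tactic would win on expected IS alone; pricing its throttle exposure flips the ranking.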
3) Minimal Throttle Dynamics You Can Actually Deploy
Model session capacity with token-bucket state (conceptual):
\[ B_{t+\Delta} = \min\bigl(B_{\max},\; B_t + r\Delta - m_t\bigr) \]
- \(B_t\): available messaging budget at time \(t\)
- \(r\): refill rate (msgs/sec)
- \(m_t\): outgoing message count in \([t, t+\Delta]\)
Define a latent throttle regime \(S_t \in \{\text{GREEN}, \text{AMBER}, \text{RED}\}\):
- GREEN: budget healthy, normal ack/reject profile
- AMBER: budget near depletion, ack-latency convexity starts
- RED: reject loop / hard throttle, message admission degraded
A simple HMM or Markov-switching classifier over telemetry is enough; do not wait for perfect exchange internals.
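A minimal deployable version of the above: a token-bucket state update plus a threshold classifier as a stand-in for the latent regime (an HMM over the same telemetry can replace the classifier later). The budget fractions and ack-latency multiplier are placeholder assumptions to be calibrated per session.

```python
def bucket_step(b, b_max, r, dt, m):
    """Token-bucket update: B_{t+dt} = min(B_max, B_t + r*dt - m_t), floored at 0."""
    return max(0.0, min(b_max, b + r * dt - m))

def regime(b, b_max, ack_p99_us, ack_green_us=500.0):
    """Threshold proxy for the latent S_t. Cutoffs (5%/25% budget, 3x baseline
    ack p99) are illustrative assumptions, not exchange-published values."""
    if b <= 0.05 * b_max:
        return "RED"    # near-empty budget: reject loop / hard throttle likely
    if b <= 0.25 * b_max or ack_p99_us > 3.0 * ack_green_us:
        return "AMBER"  # budget thinning or ack-latency convexity starting
    return "GREEN"
```

The AMBER test deliberately ORs the budget proxy with ack latency: soft saturation often shows up in ack tails before the local budget estimate looks dangerous.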
4) Telemetry Contract (Required Fields)
Per child-order decision and per session/port window:
A) Message-Budget Signals
- `msgs_per_sec`, `new_per_sec`, `cancel_per_sec`, `replace_per_sec`
- `cancel_to_new_ratio`
- `burst_index` (e.g., p99/p50 msgs-per-sec or Fano factor)
- `local_throttle_wait_us` (if your gateway shapes traffic before send)
B) Exchange Response Signals
- `ack_latency_us_p50/p95/p99`
- `business_reject_count` (with reject reason class)
- `reject_streak_len`
- `pending_unacked_msgs`
C) Execution Consequence Signals
- `quote_age_at_fill/cancel`
- `stale_quote_markout_1s/5s`
- `forced_aggression_bps` (extra cost vs. the intended passive path)
- `deadline_residual_qty`
D) Context Features
- spread, top-of-book depth, volatility, imbalance
- urgency state, participation cap, symbol liquidity tier
- venue/session ID and gateway path
Without this contract, throttle incidents become anecdotal and impossible to model.
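One way to make the contract concrete is a single per-session/window record type that every child-order decision must be joinable to. The field names below mirror the lists above but are an illustrative schema, not a standard.

```python
from dataclasses import dataclass

@dataclass
class ThrottleTelemetry:
    """One per-session/window telemetry record (illustrative schema)."""
    session_id: str                 # venue/session ID + gateway path key
    msgs_per_sec: float             # outgoing message rate in the window
    cancel_to_new_ratio: float      # churn proxy
    burst_index: float              # e.g., p99/p50 msgs-per-sec in the window
    ack_latency_us_p99: float       # exchange response tail
    business_reject_count: int      # with reject reason class kept elsewhere
    reject_streak_len: int          # consecutive rejects on this session
    pending_unacked_msgs: int       # in-flight, unacked messages
    quote_age_at_fill_ms: float     # execution-consequence signal
    forced_aggression_bps: float    # extra cost vs. intended passive path
```

Keeping all four signal groups on one keyed record is what lets you attribute a stale-quote markout to a specific budget state after the fact.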
5) Label Design (Don’t Use Rejects Only)
Use three labels:
- HardThrottleEvent
- direct business-level rejects attributable to rate limits
- SoftThrottleEvent
- reject=0 but ack-latency and pending-unacked jump above calibrated regime thresholds
- ThrottleCostEvent
- realized incremental IS linked to stale queue + delayed cancel/replace + forced aggression
Labeling only hard rejects misses most of the cost: soft saturation often dominates the PnL drag.
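The three labels above can be emitted per telemetry window with a small rule set. The thresholds here are placeholders that must be calibrated per session (the text calls for regime-calibrated thresholds; fixed numbers are used only to make the sketch runnable).

```python
def throttle_labels(rejects, ack_p99_us, pending_unacked, incr_is_bps,
                    ack_thresh_us=2000.0, pending_thresh=50, cost_thresh_bps=1.0):
    """Return the set of throttle labels for one window.
    Thresholds are illustrative placeholders, not calibrated values."""
    labels = set()
    if rejects > 0:
        # Direct business-level rejects attributable to rate limits.
        labels.add("HardThrottleEvent")
    if rejects == 0 and ack_p99_us > ack_thresh_us and pending_unacked > pending_thresh:
        # No rejects, but ack latency and pending-unacked jump together.
        labels.add("SoftThrottleEvent")
    if incr_is_bps > cost_thresh_bps:
        # Realized incremental IS linked to stale queue / forced aggression.
        labels.add("ThrottleCostEvent")
    return labels
```

Note that soft and cost labels are independent: a window can carry a ThrottleCostEvent with no throttle label at all, which is exactly the attribution gap the label design closes.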
6) Modeling Stack (Practical)
Layer A — Throttle Onset Hazard
Estimate \(P(\text{AMBER/RED within horizon } \tau \mid x_t, a_t)\) with a survival or discrete-hazard model.
Layer B — Regime-Conditional Slippage
Model IS distribution conditioned on regime:
\[ p(IS \mid x,a) = \sum_{s} p(IS \mid x,a, S=s)\,P(S=s \mid x,a) \]
Use quantile models for p50/p90/p99 to keep tail control actionable.
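A quick way to see why the mixture matters for tails: sample IS from the regime mixture and read off a quantile. The per-regime samplers and regime probabilities below are illustrative assumptions; in production each component would come from the Layer B quantile models.

```python
import random

def mixture_is_quantile(regime_probs, regime_samplers, q=0.99, n=20000, seed=7):
    """Draw n IS samples from p(IS|x,a) = sum_s p(IS|x,a,s) P(s|x,a)
    and return the empirical q-quantile."""
    rng = random.Random(seed)
    regimes = list(regime_probs)
    weights = [regime_probs[s] for s in regimes]
    draws = sorted(
        regime_samplers[rng.choices(regimes, weights=weights)[0]](rng)
        for _ in range(n)
    )
    return draws[min(n - 1, int(q * n))]

# Illustrative per-regime IS distributions (bps); parameters are assumptions.
samplers = {
    "GREEN": lambda rng: rng.gauss(2.0, 1.0),
    "RED":   lambda rng: rng.gauss(8.0, 3.0),
}
p99_mixed = mixture_is_quantile({"GREEN": 0.9, "RED": 0.1}, samplers)
p99_green = mixture_is_quantile({"GREEN": 1.0, "RED": 0.0}, samplers)
```

Even a 10% RED weight moves the p99 well away from the GREEN-only tail, which is why averaging over regimes before fitting quantiles hides the throttle tax.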
Layer C — Counterfactual Burst Simulator
Replay message streams with token-bucket approximation to estimate:
- reject probability under candidate policy
- expected stale-quote exposure time
- forced aggression uplift in bps
This gives router-safe policy comparisons before rollout.
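A minimal replay core, assuming one token per message: push a candidate policy's message timestamps through the token-bucket approximation and count admits vs. rejects. Stale-quote exposure and aggression uplift would be layered on top of the reject stream; bucket size and refill rate are per-session inputs.

```python
def replay_policy(msg_times, b_max, refill_rate):
    """Replay a sorted message-timestamp stream (seconds) through a token
    bucket; return the number of messages that would be rejected.
    Assumes a cost of one token per message."""
    b, last_t, rejects = float(b_max), 0.0, 0
    for t in msg_times:
        b = min(b_max, b + refill_rate * (t - last_t))  # refill since last msg
        last_t = t
        if b >= 1.0:
            b -= 1.0          # admitted
        else:
            rejects += 1      # would be throttled under this policy
    return rejects
```

Running two candidate cancel/replace schedules through the same bucket parameters gives a like-for-like reject-probability comparison before anything ships.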
7) KPIs That Expose Hidden Throttle Tax
- Throttle Probability Calibration Error (TPCE)
  - \( \mathrm{TPCE} = \hat{p}_{\text{throttle}} - p_{\text{throttle}}^{\text{emp}} \)
- Reject-With-Residual Ratio (RWRR)
  - fraction of rejects occurring while meaningful residual parent quantity remains
- Stale-Quote Cost per 1k Messages (SQC-1k)
  - markout/IS attributed to quote aging, normalized by message traffic
- Forced-Aggression Uplift (FAU)
  - \( \mathrm{FAU} = IS_{\text{forced agg}} - IS_{\text{planned path}} \)
- Throttle Recovery Half-Life (TRH)
  - time from RED trigger back to stable GREEN telemetry
If SQC-1k and FAU rise while fill-rate looks stable, you’re paying hidden throttle rent.
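The two "hidden rent" KPIs reduce to simple normalizations once the telemetry contract is in place; a sketch (inputs in bps, illustrative values in the usage):

```python
def sqc_1k(stale_quote_is_bps_total, msgs_sent):
    """Stale-Quote Cost per 1k messages: quote-aging IS normalized by traffic."""
    return 1000.0 * stale_quote_is_bps_total / max(msgs_sent, 1)

def fau(is_forced_agg_bps, is_planned_path_bps):
    """Forced-Aggression Uplift: realized crossing cost minus the cost
    the planned passive path would have incurred."""
    return is_forced_agg_bps - is_planned_path_bps

# Example: 4 bps of stale-quote IS over 2,000 messages -> 2.0 bps per 1k msgs.
hidden_rent = sqc_1k(4.0, 2000)
uplift = fau(6.5, 4.0)
```

Tracking both per session (not pooled) is what makes a rising throttle tax visible while headline fill-rate stays flat.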
8) Control Policy (GREEN → SAFE)
- GREEN: normal tactic set
- AMBER_PRETHROTTLE:
- cap replace cadence
- widen min reprice interval
- prioritize high-value cancel/replace intents
- reduce low-value queue churn
- RED_THROTTLE_ACTIVE:
- freeze non-essential quote updates
- switch to lower-churn tactics
- route urgent residual to completion-safe path
- SAFE_FALLBACK:
- deterministic low-message policy until TRH guard passes
Use hysteresis + minimum dwell times to avoid flapping.
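The hysteresis-plus-dwell rule can be captured in a small state machine: escalate immediately on a worse proposed state, but de-escalate only one level at a time and only after a minimum dwell. Treating SAFE as the most restrictive level and the 2-second dwell are illustrative choices.

```python
class ThrottleController:
    """GREEN/AMBER/RED/SAFE controller with minimum dwell time to avoid
    flapping. Ordering treats SAFE as most restrictive; dwell is a
    placeholder to be tuned per session."""

    ORDER = {"GREEN": 0, "AMBER": 1, "RED": 2, "SAFE": 3}
    DOWN = {"SAFE": "RED", "RED": "AMBER", "AMBER": "GREEN", "GREEN": "GREEN"}

    def __init__(self, min_dwell_s=2.0):
        self.state, self.since, self.min_dwell = "GREEN", 0.0, min_dwell_s

    def step(self, t, proposed):
        """Advance to time t (seconds) given the classifier's proposed state."""
        if self.ORDER[proposed] > self.ORDER[self.state]:
            # Escalate immediately: saturation risk is asymmetric.
            self.state, self.since = proposed, t
        elif self.ORDER[proposed] < self.ORDER[self.state] and t - self.since >= self.min_dwell:
            # De-escalate one level only, and only after the dwell passes.
            self.state, self.since = self.DOWN[self.state], t
        return self.state
```

The asymmetry (instant escalation, stepped recovery) is the hysteresis: a single GREEN telemetry window can never bounce the controller straight out of RED.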
9) Rollout Blueprint
- Shadow phase (2 weeks): compute throttle states + SQC/FAU only.
- Paper control: replay AMBER/RED actions on historical message traces.
- Canary by session: low-notional symbols, strict rollback triggers.
- Promotion gates: improve p95/p99 IS and reduce reject/stale metrics without completion collapse.
- Drills: synthetic burst storms (open/close/news windows) and verify SAFE fallback.
10) Common Mistakes
- Treating message limits as a binary reject/not-reject variable.
- Ignoring ack-latency convexity before first reject appears.
- Optimizing fill/capture KPIs while cancel churn silently burns throttle budget.
- No per-session modeling (pooling across sessions hides true saturation states).
- Missing reject reason taxonomy, making root-cause attribution impossible.
11) Fast Implementation Checklist
[ ] Log per-session message budget proxies + ack/reject telemetry
[ ] Label HARD + SOFT throttle events, not rejects only
[ ] Add ThrottleRisk term to routing objective
[ ] Build regime-conditioned IS tails (p90/p99)
[ ] Ship GREEN/AMBER/RED/SAFE controller with hysteresis
[ ] Gate rollout on SQC-1k + FAU + completion reliability
References
- Nasdaq Nordic (2020), Changes of throttling limits for FIX and OUCH order entry ports (example of explicit per-port MPS controls and reject behavior).
- CME Group Client Systems Wiki (updated 2025), Messaging Controls (session-level TPS controls, reject/terminate threshold concepts for iLink).
- ESMA MiFID II, Article 17 & Article 48 (algorithmic-trading risk controls, capacity, and order-to-trade / message-flow governance requirements).
- Heinanen & Guerin (1999), RFC 2697, A Single Rate Three Color Marker (token-bucket style control intuition for burst/rate policing).
- Mounjid & Lehalle (2016/2018), Limit Order Strategic Placement with Adverse Selection Risk and the Role of Latency (latency-adverse-selection interaction in limit-order control).
- Almgren & Chriss (2000), Optimal Execution of Portfolio Transactions (baseline execution-cost control framing).
TL;DR
Exchange/session messaging throttles are not just infrastructure alerts; they are a predictable slippage regime. Model throttle onset and soft saturation explicitly, price throttle risk in action selection, and deploy a regime controller that reduces low-value churn before reject cascades force expensive aggression.