QUIC ACK_FREQUENCY Feedback-Aliasing Slippage Playbook

Why this matters

Many low-latency execution paths now run over QUIC-based gateways. When operators enable ACK_FREQUENCY to reduce ACK traffic, they often improve aggregate efficiency but accidentally degrade feedback timing resolution.

For execution engines, this can create a hidden failure mode:

transport feedback gets coarser,
RTT/loss estimates react later,
retransmission and pacing corrections bunch,
child orders arrive in bursts instead of a smooth cadence,
realized slippage rises despite no obvious application-layer bug.

This note gives a practical modeling and control framework for that failure mode.

Mechanism: from ACK thinning to execution slippage

1) ACK thinning changes estimator responsiveness

ACK_FREQUENCY allows a sender to request less frequent ACKs from its peer (a higher ack-eliciting threshold and a larger requested max ACK delay). Fewer ACK events mean fewer fresh RTT samples, so the sender corrects for path changes more slowly.
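
A minimal sketch of this effect, assuming one RTT sample per ACK frame and using only the smoothed-RTT update from RFC 9002 (srtt = 7/8 * srtt + 1/8 * sample). It ignores the ack_delay adjustment and min_rtt handling, and the step-change path and the function name are hypothetical, chosen for illustration.

```python
def srtt_trace(path_rtts_ms, packets_per_ack):
    """Return the smoothed RTT seen after each packet, updating only on ACK frames."""
    srtt = None
    trace = []
    for i, rtt in enumerate(path_rtts_ms, start=1):
        if i % packets_per_ack == 0:  # an ACK frame arrives with one RTT sample
            srtt = rtt if srtt is None else 0.875 * srtt + 0.125 * rtt
        trace.append(srtt)
    return trace

# Hypothetical path: RTT steps from 10 ms to 20 ms after packet 40.
path = [10.0] * 40 + [20.0] * 40
dense = srtt_trace(path, packets_per_ack=2)   # ACK every 2 packets
thin = srtt_trace(path, packets_per_ack=10)   # ACK every 10 packets
print(round(dense[-1], 2), round(thin[-1], 2))
```

Over the same 40 packets after the step, the dense-ACK estimator takes 20 samples and the thinned one takes 4, so the thinned srtt is still well short of the new path RTT.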

2) Loss and PTO timing become less precise under stress

QUIC loss detection and PTO logic rely on timely ACK information. Coarser ACK cadence can delay or destabilize loss inference when path RTT shifts quickly.
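
RFC 9002 computes the probe timeout as smoothed_rtt + max(4 * rttvar, kGranularity) + max_ack_delay, so a larger ACK delay budget lengthens PTO directly. The sketch below works that arithmetic through for two budgets. It assumes the sender uses the requested delay as max_ack_delay in its PTO calculation; the RTT values are illustrative, not measurements.

```python
K_GRANULARITY_MS = 1.0  # timer granularity recommended by RFC 9002

def pto_ms(srtt_ms, rttvar_ms, max_ack_delay_ms):
    """Base PTO period for the application data packet number space (no backoff)."""
    return srtt_ms + max(4.0 * rttvar_ms, K_GRANULARITY_MS) + max_ack_delay_ms

print(pto_ms(10.0, 1.0, 25.0))   # 39.0
print(pto_ms(10.0, 1.0, 100.0))  # 114.0
```

On a 10 ms path, raising the ACK delay budget from 25 ms to 100 ms roughly triples the time before a tail loss is probed.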

3) Recovery traffic bunches

When feedback arrives in lumps, sender pacing/rate adaptation can also lump (especially around slow-start exits, path transitions, or transient queueing), producing burstier packet delivery.

4) Execution cadence aliases

Execution routers using transport-latency signals for urgency gating can misclassify urgency state and release child slices in clumps. That increases:

queue-priority resets,
adverse selection after delayed reactions,
market-impact convexity from microburst participation.

Observability: transport-to-slippage bridge metrics

Track these over 1–5 s windows, per venue path.

Transport side

AFR (ACK Frequency Ratio)
- AFR = acked_packets / ack_frames
- Higher AFR = more ACK thinning.
MADU (Max Ack Delay Utilization)
- MADU = p95(observed_ack_delay) / configured_max_ack_delay
- Near 1.0 means ACK delay budget is fully consumed.
RRS (RTT Responsiveness Score)
- RRS = corr(Δqueueing_proxies, Δsrtt_next) over a short set of lags
- Falling RRS indicates RTT estimate lagging reality.
PBR (PTO Burst Ratio)
- PBR = bytes_sent_in_10ms_after_PTO / median_10ms_bytes
- Captures recovery bunching.
AIV (ACK Inter-arrival Variability)
- coefficient of variation of ACK inter-arrival times.
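
The ratio-style metrics above can be sketched per window as follows. This assumes the inputs (counters, ACK delay samples, ACK arrival timestamps, 10 ms byte bins) are already extracted from connection logs and that each window is non-empty; the function names and the nearest-rank p95 are illustrative choices. RRS is left out because it depends on which queueing proxies and lags are chosen.

```python
import math
import statistics

def afr(acked_packets, ack_frames):
    """ACK Frequency Ratio: packets acknowledged per ACK frame."""
    return acked_packets / ack_frames if ack_frames else float("nan")

def p95(values):
    """Nearest-rank 95th percentile."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(0.95 * len(ordered)) - 1)]

def madu(observed_ack_delays_ms, configured_max_ack_delay_ms):
    """Max Ack Delay Utilization: p95 observed delay over the configured budget."""
    return p95(observed_ack_delays_ms) / configured_max_ack_delay_ms

def coeff_var(values):
    """Coefficient of variation (population std / mean)."""
    mean = statistics.fmean(values)
    return statistics.pstdev(values) / mean if mean else float("nan")

def aiv(ack_arrival_times_ms):
    """ACK Inter-arrival Variability: CV of gaps between consecutive ACK frames."""
    gaps = [b - a for a, b in zip(ack_arrival_times_ms, ack_arrival_times_ms[1:])]
    return coeff_var(gaps)

def pbr(bytes_in_10ms_after_pto, bytes_per_10ms_bins):
    """PTO Burst Ratio: post-PTO 10 ms volume over the median 10 ms volume."""
    return bytes_in_10ms_after_pto / statistics.median(bytes_per_10ms_bins)
```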

Execution side

CDI (Child Dispatch Irregularity)
- CV of inter-child dispatch intervals.
QRT (Queue Reset Tax)
- bps cost attributable to cancel/replace or missed queue aging.
RSL (Residual Slippage Lift)
- realized bps − baseline model bps (without ACK-thinning features).
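
CDI and RSL reduce to a few lines once dispatch timestamps and baseline forecasts are available. A minimal sketch, assuming millisecond dispatch timestamps per parent order and a baseline forecast already expressed in bps; the function names are hypothetical. QRT is omitted because attributing cost to queue resets depends on the venue's queue model.

```python
import statistics

def cdi(dispatch_times_ms):
    """Child Dispatch Irregularity: CV of gaps between consecutive child dispatches."""
    gaps = [b - a for a, b in zip(dispatch_times_ms, dispatch_times_ms[1:])]
    if len(gaps) < 2:
        return float("nan")  # not enough dispatches to measure irregularity
    mean = statistics.fmean(gaps)
    return statistics.pstdev(gaps) / mean if mean else float("nan")

def rsl_bps(realized_bps, baseline_model_bps):
    """Residual Slippage Lift: realized slippage minus the baseline forecast."""
    return realized_bps - baseline_model_bps
```

A perfectly even cadence gives a CDI of 0, while slices released in a clump followed by a long gap push it above 1.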

Coupling signal

TAI (Transport Aliasing Index)
- TAI = z(AFR) + z(MADU) + z(PBR) + z(AIV) - z(RRS)
- High TAI should co-move with CDI/QRT if aliasing is real.
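
A minimal sketch of the TAI composite. It assumes z(·) is a z-score of the current window against a trailing reference history for the same path, which the definition above does not pin down; the dictionary layout and the zero-variance fallback are illustrative choices.

```python
import statistics

# Sign of each component in the TAI sum: RRS enters negatively.
TAI_SIGNS = {"AFR": 1, "MADU": 1, "PBR": 1, "AIV": 1, "RRS": -1}

def zscore(value, history):
    """Standardize value against a reference history (0.0 if the history is flat)."""
    mu = statistics.fmean(history)
    sigma = statistics.pstdev(history)
    return 0.0 if sigma == 0 else (value - mu) / sigma

def tai(current, history):
    """TAI = z(AFR) + z(MADU) + z(PBR) + z(AIV) - z(RRS)."""
    return sum(sign * zscore(current[name], history[name])
               for name, sign in TAI_SIGNS.items())

# Hypothetical reference history and a window where only RRS has fallen.
hist = {name: [1.0, 2.0, 3.0] for name in TAI_SIGNS}
now = {"AFR": 2.0, "MADU": 2.0, "PBR": 2.0, "AIV": 2.0, "RRS": 1.0}
print(round(tai(now, hist), 3))
```

Because RRS is subtracted, a drop in RTT responsiveness raises TAI just as a rise in AFR or PBR does.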

Modeling architecture

Use a two-layer setup.

Layer A: baseline slippage model

Your normal microstructure model (spread, depth, imbalance, volatility, participation, queue features).

Layer B: transport-aliasing correction

Predict residual uplift:

RSL_t = f(TAI_t, AFR_t, MADU_t, PBR_t, AIV_t, CDI_t, regime_t)

Recommended model:

monotonic GBM / GAM with sign constraints:
- increasing in TAI, PBR, CDI;
- weakly increasing in AFR, MADU.

Final forecast:

Slippage_hat = Baseline_hat + RSL_hat
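
To illustrate the sign-constrained residual layer without a modeling library, the sketch below swaps in a much simpler technique: one-dimensional isotonic regression (pool-adjacent-violators) of RSL on TAI alone. It is a stand-in for the monotonic GBM / GAM, not a replacement, and the training points are made up for illustration.

```python
def isotonic_fit(xs, ys):
    """Least-squares non-decreasing step fit of ys against xs (pool-adjacent-violators)."""
    blocks = []  # each block holds [sum_y, count, max_x]
    for x, y in sorted(zip(xs, ys)):
        blocks.append([y, 1, x])
        # Merge backwards while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            s, n, xmax = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
            blocks[-1][2] = xmax
    return [(xmax, s / n) for s, n, xmax in blocks]

def predict_rsl(steps, tai_value):
    """Step-function lookup: level of the first block whose upper bound covers tai_value."""
    for xmax, level in steps:
        if tai_value <= xmax:
            return level
    return steps[-1][1]  # beyond the training range, hold the last level

def slippage_hat(baseline_hat_bps, steps, tai_value):
    """Final forecast: Baseline_hat + RSL_hat."""
    return baseline_hat_bps + predict_rsl(steps, tai_value)

# Hypothetical (TAI, residual bps) observations with one monotonicity violation.
steps = isotonic_fit([0, 1, 2, 3], [0.0, 0.4, 0.2, 0.9])
print(steps, slippage_hat(2.0, steps, 1.5))
```

The violating pair (0.4, 0.2) is pooled to a single level, so the fitted uplift never decreases as TAI rises, which is the property the sign constraints are meant to enforce.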

Why residual modeling works:

keeps market microstructure signal stable,
isolates transport-policy effects,
supports cleaner rollback decisions when network tuning changes.

Regime controller (production)

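Define policy states from TAI + realized residual error. The AMBER and RED conditions can both hold in the same window, so evaluate RED first, then AMBER; GREEN is what remains.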

GREEN: TAI < 1.0 and RSL_p50 <= 0
- normal ACK_FREQUENCY profile.
AMBER: 1.0 <= TAI < 2.0 or RSL_p50 > 0
- lower the ack-eliciting threshold (called packet tolerance in early drafts),
- lower max ACK delay request,
- tighten child-order cadence caps.
RED: TAI >= 2.0 or RSL_p90 breach for N windows
- force denser ACK behavior (or send IMMEDIATE_ACK frames on critical transitions),
- switch to conservative execution template (lower burst size, narrower urgency steps),
- freeze aggressive participation ramps.

Recovery hysteresis:

require M consecutive GREEN windows before restoring aggressive profile.
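
The state rules and hysteresis can be sketched as a small controller. This assumes RED takes precedence over AMBER, that escalation is immediate, and that only the return to the normal profile is gated by M consecutive GREEN windows; the class name, profile labels, and default N and M are hypothetical.

```python
class RegimeController:
    """Classifies each window as GREEN/AMBER/RED and applies recovery hysteresis."""

    def __init__(self, n_breach=3, m_green=5):
        self.n_breach = n_breach  # N: consecutive RSL_p90 breaches that force RED
        self.m_green = m_green    # M: consecutive GREEN windows before restoring
        self.breach_run = 0
        self.green_run = 0
        self.profile = "NORMAL"

    def classify(self, tai, rsl_p50, rsl_p90_breached):
        self.breach_run = self.breach_run + 1 if rsl_p90_breached else 0
        if tai >= 2.0 or self.breach_run >= self.n_breach:
            return "RED"
        if tai >= 1.0 or rsl_p50 > 0:
            return "AMBER"
        return "GREEN"

    def step(self, tai, rsl_p50, rsl_p90_breached=False):
        state = self.classify(tai, rsl_p50, rsl_p90_breached)
        if state == "GREEN":
            self.green_run += 1
            if self.green_run >= self.m_green:
                self.profile = "NORMAL"  # restore only after M clean windows
        else:
            self.green_run = 0
            self.profile = "CONSERVATIVE" if state == "RED" else "TIGHTENED"
        return state, self.profile

ctl = RegimeController(n_breach=2, m_green=3)
history = [ctl.step(2.5, 0.1), ctl.step(0.5, -0.1), ctl.step(0.5, -0.1), ctl.step(0.5, -0.1)]
print(history)
```

In the usage above, the first two GREEN windows after a RED still run the conservative profile; the normal profile returns only on the third.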

Experimental design (to prove causality)

Path-level A/B
- control: default ACK behavior
- treatment: thinned ACK behavior
- stratify by symbol liquidity, volatility, and time bucket.
Switchback schedule
- alternate treatment/control by short time blocks to reduce confounding by market regime.
Primary endpoints
- RSL, CDI, QRT, tail slippage (p95/p99).
Guardrail endpoints
- reject rate, timeout rate, PTO incidence, missed participation.
Promotion rule
- only keep ACK thinning if infra savings are positive and slippage tail cost stays within budget.
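
A minimal sketch of the switchback assignment and the promotion rule. It assumes fixed-length blocks keyed on epoch time, an optional per-path offset so that paths are not all in the same arm at once, and savings and tail cost already measured in comparable units; the function names and the 300-second default are hypothetical.

```python
def switchback_arm(epoch_seconds, block_seconds=300, path_offset=0):
    """Alternate control/treatment by time block; path_offset shifts the phase per path."""
    block = int(epoch_seconds // block_seconds) + path_offset
    return "treatment" if block % 2 else "control"

def promote(infra_savings, tail_cost_delta, tail_budget):
    """Keep ACK thinning only if savings are positive and tail cost stays within budget."""
    return infra_savings > 0 and tail_cost_delta <= tail_budget

print([switchback_arm(t) for t in (0, 299, 300, 599, 600)])
print(promote(infra_savings=1.0, tail_cost_delta=0.2, tail_budget=0.5))
```

Strict alternation matches the schedule described above; randomizing the arm per block is a common variant when periodic market patterns could line up with the block length.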

Practical rollout checklist

Record the ACK_FREQUENCY parameters in effect in versioned packet/connection metadata.
Persist transport metrics in the same clock domain as execution events.
Add TAI and regime state to TCA dashboards.
Enforce config parity checks across gateways (avoid mixed policy drift).
Keep one-click rollback to dense-ACK profile.

Common mistakes

Looking only at median latency
- aliasing mostly hurts tails and cadence regularity, which p50 does not show.
Blaming venue microstructure only
- transport policy can inject synthetic burstiness that mimics liquidity deterioration.
No hysteresis in controller
- frequent toggling between ACK profiles worsens instability.
Training without policy flags
- model silently entangles transport regime with market features.

References

Iyengar, J. & Thomson, M. RFC 9000: QUIC: A UDP-Based Multiplexed and Secure Transport (IETF, 2021). https://www.rfc-editor.org/rfc/rfc9000
Iyengar, J. & Swett, I. RFC 9002: QUIC Loss Detection and Congestion Control (IETF, 2021). https://www.rfc-editor.org/rfc/rfc9002
QUIC WG. QUIC Acknowledgment Frequency (IETF Internet-Draft / QUIC WG draft). https://datatracker.ietf.org/doc/draft-ietf-quic-ack-frequency/
Cheng, Y., Cardwell, N., Dukkipati, N., & Jha, P. RFC 8985: The RACK-TLP Loss Detection Algorithm for TCP (IETF, 2021). https://www.rfc-editor.org/rfc/rfc8985
Cardwell, N. et al. BBR: Congestion-Based Congestion Control. ACM Queue 14(5), 2016. https://queue.acm.org/detail.cfm?id=3022184

One-line takeaway

ACK_FREQUENCY is not just a network tuning knob; for execution systems it is a slippage regime switch unless you model and control feedback aliasing explicitly.