Clock-Servo Correction Shock Slippage Playbook
Why this exists
Most desks model slippage as a function of market microstructure and strategy behavior. But when clock discipline enters a correction phase (PTP/NTP servo ramps, holdover exit, source switch), feature age and event ordering can become temporarily biased without any obvious hard failure.
That bias silently changes routing urgency, queue confidence, and toxicity estimation.
Result: execution overpays while dashboards still look “within latency SLO.”
Core failure mode
During clock-servo correction shocks:
- event-time and ingest-time drift apart transiently,
- quote/trade/depth age features are misestimated,
- stale quotes look fresh (or fresh quotes look stale),
- urgency logic flips at the wrong boundary,
- tail slippage rises before conventional alerts fire.
This is not generic “clock drift.” It is control-loop transition risk.
Slippage decomposition with clock-correction terms
For parent order i:
[ IS_i = C_{delay} + C_{impact} + C_{miss} + C_{correction-shock} ]
Where:
[ C_{correction-shock} = C_{age-bias} + C_{ordering-bias} + C_{controller-flip} ]
- Age bias: stale-feature overtrust or undertrust
- Ordering bias: causal inversion around tight timing windows
- Controller flip: wrong state transitions due to biased risk scores
Feature set (production-ready)
1) Clock correction stress features
- servo_offset_ns (abs and signed)
- servo_freq_ppb
- servo_state (locked / converging / holdover / source-switch)
- offset_velocity_ns_per_s
- offset_accel_ns_per_s2
- time_source_quality_score
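As a rough illustration of how the velocity and acceleration features might be derived, the sketch below applies finite differences to the last three servo offset samples. The `ServoSample` type and its field names are hypothetical, and the sample times are assumed to come from a monotonic clock that the servo does not step or slew.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ServoSample:
    t_s: float        # sample time in seconds (assumed: monotonic, not the disciplined clock)
    offset_ns: float  # signed servo offset in nanoseconds


def offset_dynamics(samples: Sequence[ServoSample]) -> Tuple[float, float]:
    """Return (offset_velocity_ns_per_s, offset_accel_ns_per_s2) from the last three samples."""
    if len(samples) < 3:
        raise ValueError("need at least three samples")
    a, b, c = samples[-3:]
    dt1 = b.t_s - a.t_s
    dt2 = c.t_s - b.t_s
    if dt1 <= 0 or dt2 <= 0:
        raise ValueError("sample times must be strictly increasing")
    v1 = (b.offset_ns - a.offset_ns) / dt1
    v2 = (c.offset_ns - b.offset_ns) / dt2
    # Acceleration: change in velocity over the spacing between interval midpoints.
    accel = (v2 - v1) / ((dt1 + dt2) / 2.0)
    return v2, accel
```

Raw finite differences are noisy at typical servo polling rates, so a production version would likely smooth the inputs first; that step is omitted here.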
2) Feature-age integrity features
- quote_age_ns_model vs quote_age_ns_wire
- trade_age_ns_model vs trade_age_ns_wire
- depth_age_ns_model vs depth_age_ns_wire
- age_bias_ratio = model_age / wire_age
3) Causality integrity features
- cross_channel_ordering_conflict_rate
- negative_interarrival_rate
- event_resequence_distance_p95
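The playbook does not pin down exact definitions for these features, so the following is a minimal sketch under assumed ones: the negative inter-arrival rate is the fraction of consecutive events (in arrival order) whose event timestamp goes backwards, and the resequence distance is how many positions an event moves when the stream is re-sorted by event time. Both function names are illustrative.

```python
from typing import List, Sequence


def negative_interarrival_rate(event_ts_ns: Sequence[int]) -> float:
    """Fraction of consecutive pairs, in arrival order, whose event time decreases."""
    if len(event_ts_ns) < 2:
        return 0.0
    negatives = sum(
        1 for prev, cur in zip(event_ts_ns, event_ts_ns[1:]) if cur < prev
    )
    return negatives / (len(event_ts_ns) - 1)


def resequence_distances(event_ts_ns: Sequence[int]) -> List[int]:
    """Positions each event moves between arrival order and event-time order.

    Assumed definition: |sorted position - arrival position|, using a stable sort
    so that events with equal timestamps keep their arrival order.
    """
    order = sorted(range(len(event_ts_ns)), key=lambda i: event_ts_ns[i])
    return [abs(sorted_pos - arrival_pos) for sorted_pos, arrival_pos in enumerate(order)]
```

A p95 over the returned distances would then give something like event_resequence_distance_p95 for a window.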
4) Execution outcome features
- branch labels: passive_fill, miss_then_chase, toxic_fill
- horizon markouts (10ms/100ms/1s/5s)
- completion deficit under deadline
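For the horizon markouts, one possible computation is sketched below. The sign convention (positive means the price moved in the fill's favor), the use of a mid price as reference, and the function names are all assumptions rather than anything the playbook specifies.

```python
from typing import Dict, Mapping


def markout_bps(side: int, fill_price: float, mid_after: float) -> float:
    """Signed markout in basis points; side is +1 for a buy, -1 for a sell.

    Assumed convention: positive means the mid moved in the fill's favor.
    """
    if side not in (1, -1):
        raise ValueError("side must be +1 (buy) or -1 (sell)")
    if fill_price <= 0:
        raise ValueError("fill_price must be positive")
    return side * (mid_after - fill_price) / fill_price * 1e4


def horizon_markouts(
    side: int, fill_price: float, mid_by_horizon: Mapping[str, float]
) -> Dict[str, float]:
    """Markout per horizon label, e.g. '10ms', '100ms', '1s', '5s'."""
    return {h: markout_bps(side, fill_price, mid) for h, mid in mid_by_horizon.items()}
```

During a correction shock the horizon lookup itself depends on timestamps, so the mids passed in should be selected using the most trustworthy clock available.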
Model architecture
Use a two-layer setup.
- Baseline slippage model (existing desk model)
  - impact + fill + markout components
- Correction-shock overlay model
  - predicts incremental uplift on mean and q95 slippage: delta_is_mean, delta_is_q95
Final forecast:
[ \hat{IS}_{final} = \hat{IS}_{baseline} + \Delta\hat{IS}_{correction} ]
Train overlay on episodes around servo transitions and source changes; include non-transition controls to avoid confounding with pure volatility spikes.
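The additive structure can be sketched as follows. The `Forecast` type and both helper names are hypothetical; the only parts taken from the text are the additive combination and the two overlay outputs, delta_is_mean and delta_is_q95. The training target shown (realized IS minus the baseline forecast) is one natural choice, not a stated requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Forecast:
    mean_bps: float
    q95_bps: float


def apply_overlay(baseline: Forecast, delta_is_mean: float, delta_is_q95: float) -> Forecast:
    """Final forecast = baseline forecast + correction-shock uplift, per statistic."""
    return Forecast(
        mean_bps=baseline.mean_bps + delta_is_mean,
        q95_bps=baseline.q95_bps + delta_is_q95,
    )


def overlay_training_target(realized_is_bps: float, baseline_forecast_bps: float) -> float:
    """Residual the overlay would learn to predict (assumed target definition)."""
    return realized_is_bps - baseline_forecast_bps
```

Keeping the baseline frozen and fitting only the residual means the overlay can be disabled without retraining the desk model.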
Regime state machine
State A: CLOCK_STABLE
- low offset velocity, low ordering conflicts
- normal participation policy
State B: CLOCK_WATCH
- correction dynamics rising, mild age-bias divergence
- tighten passive timeout, reduce queue-confidence prior
State C: CLOCK_CORRECTION_ACTIVE
- high offset dynamics and/or ordering conflict burst
- reduce passive exposure, cap child size, widen safety buffers
State D: SAFE_TIME_INTEGRITY
- severe causality degradation
- fail over to a conservative routing template, suppress fragile tactics
Use hysteresis and minimum dwell times to avoid state flapping.
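A minimal sketch of such a state machine follows, assuming the inputs have already been collapsed into a single stress score in [0, 1]. The thresholds, the dwell time, and the choice to escalate immediately while gating only de-escalation are illustrative assumptions, not values from this playbook.

```python
from enum import IntEnum


class ClockState(IntEnum):
    CLOCK_STABLE = 0
    CLOCK_WATCH = 1
    CLOCK_CORRECTION_ACTIVE = 2
    SAFE_TIME_INTEGRITY = 3


# Hypothetical thresholds: exit levels sit below enter levels (hysteresis band).
ENTER = {
    ClockState.CLOCK_WATCH: 0.3,
    ClockState.CLOCK_CORRECTION_ACTIVE: 0.6,
    ClockState.SAFE_TIME_INTEGRITY: 0.9,
}
EXIT = {
    ClockState.CLOCK_WATCH: 0.2,
    ClockState.CLOCK_CORRECTION_ACTIVE: 0.5,
    ClockState.SAFE_TIME_INTEGRITY: 0.8,
}
MIN_DWELL_S = 5.0  # hypothetical minimum time in a state before stepping down


class ClockRegime:
    def __init__(self, now_s: float = 0.0) -> None:
        self.state = ClockState.CLOCK_STABLE
        self.entered_s = now_s

    def update(self, stress: float, now_s: float) -> ClockState:
        # Escalate immediately to the highest state whose enter threshold is met.
        target = self.state
        for s in ClockState:
            if s > self.state and stress >= ENTER[s]:
                target = s
        # De-escalate one step at a time, only after the dwell time has elapsed
        # and stress has fallen below the current state's exit threshold.
        if target == self.state and self.state > ClockState.CLOCK_STABLE:
            dwelled = now_s - self.entered_s >= MIN_DWELL_S
            if dwelled and stress < EXIT[self.state]:
                target = ClockState(self.state - 1)
        if target != self.state:
            self.state = target
            self.entered_s = now_s
        return self.state
```

Stepping down one state at a time means a recovery from SAFE_TIME_INTEGRITY passes through each intermediate policy, with a fresh dwell period at every level.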
Key metrics
- CABI (Clock-Age Bias Index): p95(|log(age_bias_ratio)|)
- OCI (Ordering Conflict Index): weighted conflict/resequence score
- CFR (Controller Flip Rate): unexpected tactic-state transitions per minute
- CSU (Correction Shock Uplift): realized IS minus baseline IS during clock events
- TSC (Time Source Churn): source-switch count per session
Track these by venue + symbol-liquidity bucket.
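As an example of turning one of these definitions into code, CABI can be computed directly from paired model and wire ages. The sketch below uses a nearest-rank percentile and skips non-positive ages; both choices are assumptions, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct in (0, 100]."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def cabi(model_age_ns: Sequence[float], wire_age_ns: Sequence[float]) -> float:
    """Clock-Age Bias Index: p95(|log(model_age / wire_age)|)."""
    logs: List[float] = []
    for m, w in zip(model_age_ns, wire_age_ns):
        if m <= 0 or w <= 0:
            # Assumption: non-positive ages are excluded here and counted elsewhere,
            # since they are an integrity signal in their own right.
            continue
        logs.append(abs(math.log(m / w)))
    return percentile_nearest_rank(logs, 95)
```

Taking the absolute log makes the index symmetric: a model age of half the wire age scores the same as one of double the wire age.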
Rollout plan
Phase 0: Observe-only
- compute all clock-shock features and metrics
- no policy changes
- validate that CABI/OCI lead CSU tail events
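One simple way to approach the lead validation is to measure how often a CSU tail event was preceded by a metric breach within a lookback window. The sketch below assumes breach times and tail-event times have already been extracted; the function name and the breach definition are hypothetical.

```python
import bisect
from typing import Sequence


def lead_hit_rate(
    breach_times_s: Sequence[float],
    tail_event_times_s: Sequence[float],
    lookback_s: float,
) -> float:
    """Fraction of tail events with at least one breach in [t - lookback_s, t)."""
    if not tail_event_times_s:
        return 0.0
    breaches = sorted(breach_times_s)
    hits = 0
    for t in tail_event_times_s:
        # First breach at or after the start of the lookback window.
        i = bisect.bisect_left(breaches, t - lookback_s)
        if i < len(breaches) and breaches[i] < t:
            hits += 1
    return hits / len(tail_event_times_s)
```

A high hit rate alone does not establish a lead relationship. It should be compared against the same statistic computed over non-tail windows, since frequent breaches would produce hits by chance.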
Phase 1: Shadow control
- run state machine + actions in shadow
- compare counterfactual vs live using replay or off-policy evaluation (OPE)
Phase 2: Canary
- 5–10% flow on conservative symbols
- hard rollback on q95 deterioration or completion breach
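The hard-rollback rule could be expressed as a small gate like the one below. The threshold defaults and parameter names are placeholders for illustration; real limits would come from the desk's own risk tolerances.

```python
def should_rollback(
    canary_q95_bps: float,
    control_q95_bps: float,
    completion_rate: float,
    *,
    max_q95_deterioration_bps: float = 1.0,  # hypothetical limit
    min_completion_rate: float = 0.98,       # hypothetical limit
) -> bool:
    """True if the canary breaches either the q95 slippage or the completion limit."""
    q95_breach = (canary_q95_bps - control_q95_bps) > max_q95_deterioration_bps
    completion_breach = completion_rate < min_completion_rate
    return q95_breach or completion_breach
```

Evaluating the gate on a fixed cadence with a minimum sample count would reduce the chance of rolling back on noise, though that logic is left out of this sketch.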
Phase 3: Progressive expansion
- scale by liquidity tier
- keep symbol-level kill switch
Failure drills (must run)
- PTP source-switch drill
  - verify transition to CLOCK_WATCH within SLO
- Holdover exit drill
  - ensure no controller thrash when offset re-converges
- Synthetic resequencing drill
  - verify escalation to SAFE_TIME_INTEGRITY
- Rollback drill
  - prove one-command disable of overlay logic
Anti-patterns
- Treating “clock offset < threshold” as sufficient
- Using only average age bias (tail is what hurts)
- Allowing controller state transitions without hysteresis
- Blaming all tail slippage on market volatility
Bottom line
When time discipline enters correction dynamics, your execution model can misread reality while still looking healthy at coarse observability layers.
Model clock-servo transition risk as a first-class slippage component, and convert it into explicit state controls before hidden timing debt turns into basis-point loss.