Clock-Servo Correction Shock Slippage Playbook

Why this exists

Most desks model slippage as a function of market microstructure and strategy behavior. But when clock discipline enters a correction phase (PTP/NTP servo ramps, holdover exit, or a source switch), feature age and event ordering can become temporarily biased without any obvious hard failure.

That bias silently changes routing urgency, queue confidence, and toxicity estimation.

Result: execution overpays while dashboards still look “within latency SLO.”

Core failure mode

During clock-servo correction shocks:

event-time and ingest-time drift apart transiently,
quote/trade/depth age features are misestimated,
stale quotes look fresh (or fresh quotes look stale),
urgency logic flips at the wrong boundary,
tail slippage rises before conventional alerts fire.

This is not generic “clock drift.” It is control-loop transition risk.

Slippage decomposition with clock-correction terms

For parent order (i):

[ IS_i = C_{delay} + C_{impact} + C_{miss} + C_{correction-shock} ]

Where:

[ C_{correction-shock} = C_{age-bias} + C_{ordering-bias} + C_{controller-flip} ]

Age bias: stale-feature overtrust or undertrust
Ordering bias: causal inversion around tight timing windows
Controller flip: wrong state transitions due to biased risk scores

Feature set (production-ready)

1) Clock correction stress features

servo_offset_ns (abs and signed)
servo_freq_ppb
servo_state (locked / converging / holdover / source-switch)
offset_velocity_ns_per_s
offset_accel_ns_per_s2
time_source_quality_score
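
The two derivative features above can be approximated with finite differences over recent servo offset samples. The following is a minimal sketch, not a reference implementation: the function name `offset_dynamics` and the `(time_s, servo_offset_ns)` sample layout are assumptions for illustration, and sample times are assumed to come from a monotonic clock that the servo does not step.

```python
from typing import List, Tuple


def offset_dynamics(samples: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (offset_velocity_ns_per_s, offset_accel_ns_per_s2).

    `samples` holds (time_s, servo_offset_ns) pairs in time order.
    Only the last three samples are used (hypothetical layout).
    """
    if len(samples) < 3:
        raise ValueError("need at least three samples")
    (t0, o0), (t1, o1), (t2, o2) = samples[-3:]
    if not (t0 < t1 < t2):
        raise ValueError("sample times must be strictly increasing")

    # First differences: offset velocity over each of the two intervals.
    v_prev = (o1 - o0) / (t1 - t0)
    v_last = (o2 - o1) / (t2 - t1)

    # Each velocity is centred on its interval midpoint, so the acceleration
    # divides by the distance between the two midpoints, (t2 - t0) / 2.
    accel = (v_last - v_prev) / ((t2 - t0) / 2.0)
    return v_last, accel


velocity, accel = offset_dynamics([(0.0, 100.0), (1.0, 400.0), (2.0, 1000.0)])
```

Raw differences are noisy at nanosecond scale, so a production version would likely smooth the offsets first; that choice is left open here.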

2) Feature-age integrity features

quote_age_ns_model vs quote_age_ns_wire
trade_age_ns_model vs trade_age_ns_wire
depth_age_ns_model vs depth_age_ns_wire
age_bias_ratio = model_age / wire_age

3) Causality integrity features

cross_channel_ordering_conflict_rate
negative_interarrival_rate
event_resequence_distance_p95
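
One way to compute two of these features from a single channel is sketched below. The definitions are assumptions, since the playbook does not pin them down: negative interarrival is taken as an event timestamp that goes backwards relative to the previous arrival, and resequence distance as the number of positions an event moves when the arrival-ordered stream is stably sorted by event timestamp. The helper names are hypothetical.

```python
import math
from typing import List, Sequence


def negative_interarrival_rate(event_ts_ns: Sequence[int]) -> float:
    """Fraction of consecutive arrivals whose event timestamp goes backwards."""
    if len(event_ts_ns) < 2:
        return 0.0
    negatives = sum(1 for a, b in zip(event_ts_ns, event_ts_ns[1:]) if b < a)
    return negatives / (len(event_ts_ns) - 1)


def resequence_distances(event_ts_ns: Sequence[int]) -> List[int]:
    """Positions each event moves when arrival order is sorted by event time."""
    # sorted() is stable, so events with equal timestamps keep arrival order.
    order = sorted(range(len(event_ts_ns)), key=lambda i: event_ts_ns[i])
    distances = [0] * len(event_ts_ns)
    for new_pos, arrival_pos in enumerate(order):
        distances[arrival_pos] = abs(new_pos - arrival_pos)
    return distances


def p95(values: Sequence[float]) -> float:
    """Nearest-rank 95th percentile; 0.0 for an empty input."""
    if not values:
        return 0.0
    ranked = sorted(values)
    return ranked[math.ceil(0.95 * len(ranked)) - 1]


arrivals = [10, 20, 15, 30, 40]  # event timestamps in arrival order
rate = negative_interarrival_rate(arrivals)
event_resequence_distance_p95 = p95(resequence_distances(arrivals))
```

Cross-channel ordering conflicts need a per-venue definition of which channel should lead, so they are not covered by this sketch.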

4) Execution outcome features

branch labels: passive_fill, miss_then_chase, toxic_fill
horizon markouts (10ms/100ms/1s/5s)
completion deficit under deadline

Model architecture

Use a two-layer setup.

Baseline slippage model (existing desk model)
- impact + fill + markout components
Correction-shock overlay model
- predicts incremental uplift on mean and q95 slippage:
  - delta_is_mean
  - delta_is_q95

Final forecast:

[ \hat{IS}_{final} = \hat{IS}_{baseline} + \Delta\hat{IS}_{correction} ]

Train overlay on episodes around servo transitions and source changes; include non-transition controls to avoid confounding with pure volatility spikes.
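
Building that training set starts with tagging each order as a transition episode or a control. The following is a rough sketch of one way to do it: the function name `label_episodes` and the 30-second window are assumptions chosen for illustration, not values from this playbook, and all timestamps are assumed to share one time base.

```python
from bisect import bisect_left
from typing import List, Sequence


def label_episodes(
    order_ts_s: Sequence[float],
    transition_ts_s: Sequence[float],
    window_s: float = 30.0,  # hypothetical half-width around each transition
) -> List[str]:
    """Label each order 'transition' if it falls within window_s of any
    servo transition or source change, otherwise 'control'."""
    transitions = sorted(transition_ts_s)
    labels = []
    for ts in order_ts_s:
        i = bisect_left(transitions, ts)
        # The nearest transition is either just before or just after ts.
        nearest = min(
            (abs(ts - transitions[j]) for j in (i - 1, i)
             if 0 <= j < len(transitions)),
            default=float("inf"),
        )
        labels.append("transition" if nearest <= window_s else "control")
    return labels


labels = label_episodes([0.0, 95.0, 131.0, 500.0], [100.0, 480.0])
```

Controls should then be matched to transition episodes on volatility and liquidity bucket, so the overlay learns the clock effect rather than the volatility spike.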

Regime state machine

State A: `CLOCK_STABLE`

low offset velocity, low ordering conflicts
normal participation policy

State B: `CLOCK_WATCH`

correction dynamics rising, mild age-bias divergence
tighten passive timeout, reduce queue-confidence prior

State C: `CLOCK_CORRECTION_ACTIVE`

high offset dynamics and/or ordering conflict burst
reduce passive exposure, cap child size, widen safety buffers

State D: `SAFE_TIME_INTEGRITY`

severe causality degradation
fail over to a conservative routing template, suppress fragile tactics

Use hysteresis and minimum dwell times to avoid state flapping.
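
A minimal sketch of the four states with hysteresis and a minimum dwell time follows. Everything numeric here is an assumption: the playbook does not define a single stress score, so `stress` stands in for a hypothetical composite in [0, 1] built from offset dynamics, age-bias divergence, and ordering conflicts, and the thresholds and 5-second dwell are placeholders. Escalation is immediate; de-escalation steps down one state at a time.

```python
from dataclasses import dataclass
from enum import IntEnum


class ClockState(IntEnum):
    CLOCK_STABLE = 0
    CLOCK_WATCH = 1
    CLOCK_CORRECTION_ACTIVE = 2
    SAFE_TIME_INTEGRITY = 3


# Hypothetical thresholds. Each exit threshold sits below its entry
# threshold; the gap between them is the hysteresis band.
ENTER = {
    ClockState.CLOCK_WATCH: 0.30,
    ClockState.CLOCK_CORRECTION_ACTIVE: 0.60,
    ClockState.SAFE_TIME_INTEGRITY: 0.85,
}
EXIT = {
    ClockState.CLOCK_WATCH: 0.20,
    ClockState.CLOCK_CORRECTION_ACTIVE: 0.45,
    ClockState.SAFE_TIME_INTEGRITY: 0.70,
}


@dataclass
class RegimeMachine:
    min_dwell_s: float = 5.0
    state: ClockState = ClockState.CLOCK_STABLE
    entered_at_s: float = 0.0

    def update(self, stress: float, now_s: float) -> ClockState:
        # Escalate at once to the highest state whose entry threshold is met.
        target = self.state
        for candidate, threshold in ENTER.items():
            if stress >= threshold and candidate > target:
                target = candidate
        if target > self.state:
            self.state, self.entered_at_s = target, now_s
            return self.state

        # Step down one level only after the dwell time has elapsed and the
        # score is below the exit threshold of the current state.
        if (
            self.state > ClockState.CLOCK_STABLE
            and now_s - self.entered_at_s >= self.min_dwell_s
            and stress < EXIT[self.state]
        ):
            self.state = ClockState(self.state - 1)
            self.entered_at_s = now_s
        return self.state


machine = RegimeMachine()
machine.update(0.65, now_s=1.0)  # jumps straight to CLOCK_CORRECTION_ACTIVE
machine.update(0.40, now_s=3.0)  # below exit threshold, but dwell not met
```

`now_s` should come from a monotonic clock; driving dwell timers from the clock under correction would reintroduce the bias this state machine is meant to contain.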

Key metrics

CABI (Clock-Age Bias Index): p95(|log(age_bias_ratio)|)
OCI (Ordering Conflict Index): weighted conflict/resequence score
CFR (Controller Flip Rate): unexpected tactic-state transitions per minute
CSU (Correction Shock Uplift): realized IS minus baseline-forecast IS during clock events
TSC (Time Source Churn): source-switch count per session

Track these by venue + symbol-liquidity bucket.
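
As an illustration, CABI can be computed directly from paired model and wire ages. This sketch assumes a nearest-rank percentile and drops non-positive ages because the log is undefined for them; both choices, and the function name `cabi`, are assumptions rather than part of the metric definition above.

```python
import math
from typing import Sequence


def cabi(model_age_ns: Sequence[float], wire_age_ns: Sequence[float]) -> float:
    """Clock-Age Bias Index: p95 of |log(model_age / wire_age)|."""
    if len(model_age_ns) != len(wire_age_ns):
        raise ValueError("age series must have equal length")
    # Assumption: skip pairs with a non-positive age, since log is undefined.
    deviations = sorted(
        abs(math.log(m / w))
        for m, w in zip(model_age_ns, wire_age_ns)
        if m > 0 and w > 0
    )
    if not deviations:
        return 0.0
    # Nearest-rank 95th percentile.
    return deviations[math.ceil(0.95 * len(deviations)) - 1]


# A feature seen as twice its wire age and one seen as half its wire age
# contribute the same deviation, log(2).
value = cabi([100.0, 200.0, 50.0], [100.0, 100.0, 100.0])
```

The absolute log ratio makes overtrust and undertrust symmetric, which a raw ratio would not. Dropped non-positive ages are worth counting separately, since a negative age is itself a sign of timestamp bias.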

Rollout plan

Phase 0: Observe-only

compute all clock-shock features and metrics
no policy changes
validate that CABI/OCI lead CSU tail events

Phase 1: Shadow control

run state machine + actions in shadow
compare counterfactual vs live using replay or off-policy evaluation (OPE)

Phase 2: Canary

5–10% flow on conservative symbols
hard rollback on q95 slippage deterioration or a completion-deadline breach

Phase 3: Progressive expansion

scale by liquidity tier
keep symbol-level kill switch

Failure drills (must run)

PTP source-switch drill
- verify transition to CLOCK_WATCH within SLO
Holdover exit drill
- ensure no controller thrash when offset re-converges
Synthetic resequencing drill
- verify escalation to SAFE_TIME_INTEGRITY
Rollback drill
- prove one-command disable of overlay logic

Anti-patterns

Treating “clock offset < threshold” as sufficient evidence of time integrity
Using only average age bias (tail is what hurts)
Allowing controller state transitions without hysteresis
Blaming all tail slippage on market volatility

Bottom line

When time discipline enters correction dynamics, your execution model can misread reality while still looking healthy at coarse observability layers.

Model clock-servo transition risk as a first-class slippage component, and convert it into explicit state controls before hidden timing debt turns into basis-point loss.
