Linux Timer-Slack Coalescing & Timer-Migration Slippage Playbook

2026-03-24 · finance

Scope: How Linux timer coalescing (timerslack) and cross-CPU timer migration leak into execution cadence and create hidden slippage tails

Why this matters

Execution stacks often focus on wire/network microbursts, but many child-order bursts are born before the socket write.

A common hidden path: the pacing timer's slack lets the kernel coalesce or migrate its wakeups, wakeups slip by tens of microseconds to sub-millisecond, and child orders that should be evenly spaced get emitted in clusters.

The median loop latency can still look fine while q95/q99 slippage degrades.


Failure mechanism (operator timeline)

  1. Parent execution loop targets smooth cadence (e.g., every 200–500µs).
  2. Critical thread keeps default timer slack (often inherited), or slack drifts too large for loop period.
  3. Under load, timer expirations are coalesced and/or wakeups land on a different CPU path.
  4. Effective wakeups are delayed by tens of microseconds to sub-millisecond bursts.
  5. Child emission becomes “quiet then clustered” instead of evenly spaced.
  6. Schedule deficit accumulates; urgency logic increases aggression.
  7. Burst re-entry crosses thinner queue depth and pays impact + queue-reset tax.

Key point: this is OS timer-policy leakage into execution cost, not purely market randomness.
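Steps 3–6 can be made concrete with a toy simulation (illustrative numbers only): a hypothetical 1ms slack against a 200µs loop, with coalescing idealized as deferral of each expiration to the next shared slack boundary.

```python
PERIOD_NS = 200_000   # 200µs target cadence
SLACK_NS = 1_000_000  # 1ms timer slack: far too large for this loop period

targets = [k * PERIOD_NS for k in range(50)]
# Idealized coalescing: each wakeup is deferred to the next slack boundary.
actuals = [(t // SLACK_NS + 1) * SLACK_NS for t in targets]

def counts_per_bucket(times_ns, bucket_ns=250_000):
    """Child emissions per 250µs bucket (the same binning CBI uses below)."""
    counts = {}
    for t in times_ns:
        counts[t // bucket_ns] = counts.get(t // bucket_ns, 0) + 1
    return counts

print(max(counts_per_bucket(targets).values()))  # 2: evenly spread
print(max(counts_per_bucket(actuals).values()))  # 5: five wakeups collapse onto one tick
```

The total number of wakeups is unchanged; only their placement moves, which is exactly why mean latency dashboards miss it.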


Extend slippage decomposition with timer-policy term

[ IS = IS_{market} + IS_{impact} + IS_{timing} + IS_{fees} + \underbrace{IS_{timer}}_{\text{coalescing/migration cadence tax}} ]

Practical approximation:

[ IS_{timer,t} \approx a\cdot TSR_t + b\cdot WJL_t + c\cdot CBI_t + d\cdot TMR_t + e\cdot PHE_t ]

Where TSR, WJL, CBI, TMR, and PHE are the production metrics defined below, and a–e are coefficients fit per strategy/venue cohort.

Metrics to add in production

1) Timer Slack Ratio (TSR)

[ TSR = \frac{\text{timerslack\_ns}}{\text{loop\_period\_ns} + \epsilon} ]

If loop period is 200µs and slack is 50µs, TSR=0.25 (already material).
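As a sketch, TSR is a one-liner; the epsilon guard only prevents division by zero on a missing period.

```python
def timer_slack_ratio(timerslack_ns: int, loop_period_ns: int, eps: float = 1.0) -> float:
    """TSR = timerslack_ns / (loop_period_ns + eps)."""
    return timerslack_ns / (loop_period_ns + eps)

print(round(timer_slack_ratio(50_000, 200_000), 4))  # -> 0.25, already material
```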

2) Wakeup Jitter Lag (WJL)

[ WJL = p99(t_{wake,actual} - t_{wake,target}) ]

Measured in microseconds from monotonic timestamps.
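A minimal nearest-rank estimator over paired monotonic timestamps (the pairing of actual to target wakeups is assumed to come from your loop telemetry):

```python
def wakeup_jitter_lag_p99(actual_wake_ns, target_wake_ns):
    """p99 of (actual - target) wakeup lag, nearest-rank over the sample."""
    lags = sorted(a - t for a, t in zip(actual_wake_ns, target_wake_ns))
    return lags[min(len(lags) - 1, int(0.99 * len(lags)))]

# 100 wakeups whose lag ramps from 0ns to 99ns: p99 picks the worst here.
print(wakeup_jitter_lag_p99(list(range(100)), [0] * 100))  # -> 99
```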

3) Coalesced Burst Index (CBI)

[ CBI = \frac{p95(\text{child orders per }250\mu s)}{median(\text{child orders per }250\mu s)+\epsilon} ]

High CBI indicates “dribble then burst” behavior.
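A sketch over child-order timestamps; note it only counts occupied buckets, so production code should zero-fill empty 250µs windows over the full measurement interval.

```python
def coalesced_burst_index(child_ts_ns, bucket_ns=250_000, eps=1e-9):
    """p95 / median of child-order counts per 250µs bucket (occupied buckets only)."""
    counts = {}
    for t in child_ts_ns:
        counts[t // bucket_ns] = counts.get(t // bucket_ns, 0) + 1
    vals = sorted(counts.values())
    p95 = vals[min(len(vals) - 1, int(0.95 * len(vals)))]
    med = vals[len(vals) // 2]
    return p95 / (med + eps)

# Even cadence -> CBI ~ 1; a front-loaded burst -> CBI ~ 10.
print(coalesced_burst_index([i * 250_000 for i in range(20)]))
print(coalesced_burst_index([0] * 10 + [i * 250_000 for i in range(1, 11)]))
```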

4) Timer Migration Rate (TMR)

[ TMR = \frac{\#(\text{timer wakeups where target CPU} \neq \text{dispatch CPU})}{\#(\text{timer wakeups})} ]

Proxy with scheduler tracepoint joins if direct timer ownership is hard.
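Once target and dispatch CPUs are joined per wakeup (however you source them), TMR reduces to a count:

```python
def timer_migration_rate(wakeups) -> float:
    """wakeups: iterable of (target_cpu, dispatch_cpu) pairs, e.g. joined
    from scheduler/timer tracepoints keyed by timer id and timestamp."""
    wakeups = list(wakeups)
    if not wakeups:
        return 0.0
    migrated = sum(1 for target, dispatch in wakeups if target != dispatch)
    return migrated / len(wakeups)

print(timer_migration_rate([(0, 0), (0, 1), (2, 2), (3, 0)]))  # -> 0.5
```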

5) Phase Error (PHE)

[ PHE = p95\left(|t_{child,actual} - t_{child,target}|\right) ]

Directly translates kernel timing drift into execution timing damage.

6) Slack-At-Risk Exposure (SARE)

[ SARE = P(TSR > \tau \land urgency > u^*) ]

This interaction is usually where tail IS explodes.
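An empirical estimate over telemetry windows; the thresholds tau and u_star below are hypothetical placeholders to be calibrated per strategy.

```python
def slack_at_risk_exposure(tsr, urgency, tau=0.2, u_star=0.7):
    """Empirical P(TSR > tau AND urgency > u*): the fraction of windows where
    an oversized slack ratio coincides with high urgency."""
    n = len(tsr)
    hits = sum(1 for s, u in zip(tsr, urgency) if s > tau and u > u_star)
    return hits / n if n else 0.0

print(slack_at_risk_exposure([0.1, 0.3, 0.5, 0.05], [0.9, 0.8, 0.2, 0.95]))  # -> 0.25
```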


Modeling architecture

Stage 1: timer-regime detector

Features: TSR, WJL, CBI, TMR, PHE, plus load and urgency context.

Output: p_timer — the probability that the current window is in a distorted timer regime, consumed by Stage 2 and the controller.

Stage 2: conditional slippage uplift model

[ \Delta IS \sim \beta_1\,\text{urgency} + \beta_2\,p_{timer} + \beta_3\,(\text{urgency}\times p_{timer}) ]

Interpretation: urgency alone hurts, timer distortion alone hurts, but the interaction hurts the most.
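Mechanically, the uplift model is ordinary least squares with an interaction column. A synthetic sketch, where the "true" coefficients (2, 1, 5) are invented purely to show the fit recovering a dominant interaction term:

```python
import numpy as np

rng = np.random.default_rng(0)
urgency = rng.uniform(0.0, 1.0, 500)
p_timer = rng.uniform(0.0, 1.0, 500)
# Synthetic Delta-IS: interaction coefficient deliberately the largest.
d_is = 2.0 * urgency + 1.0 * p_timer + 5.0 * urgency * p_timer \
       + rng.normal(0.0, 0.05, 500)

X = np.column_stack([urgency, p_timer, urgency * p_timer])
beta, *_ = np.linalg.lstsq(X, d_is, rcond=None)
print(beta)  # ~ [2, 1, 5]: the urgency x p_timer term dominates
```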


Controller state machine

GREEN — CADENCE_STABLE: metrics nominal; run the baseline pacing policy.

YELLOW — COALESCING_RISK: TSR or WJL elevated; run a guarded policy and watch for clustering.

ORANGE — DISTORTION_ACTIVE: coalescing/migration visibly distorting cadence; catch up smoothly rather than bursting into the schedule deficit.

RED — TAIL_CONTAINMENT: tail damage live; contain urgency escalation until cadence recovers.

Use hysteresis to avoid flip-flopping.
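One way to implement the hysteresis (thresholds and cooldown below are hypothetical): escalate immediately on a breach, de-escalate one level only after several consecutive calm observations.

```python
STATES = ["GREEN", "YELLOW", "ORANGE", "RED"]

class TimerStateMachine:
    """Escalate as soon as p_timer crosses a threshold; de-escalate one
    level only after `cooldown` consecutive calmer observations."""

    def __init__(self, up_thresholds=(0.3, 0.6, 0.85), cooldown=3):
        self.up = up_thresholds   # hypothetical p_timer cut-offs for Y/O/R
        self.cooldown = cooldown
        self.level = 0            # index into STATES
        self.calm = 0             # consecutive observations below current level

    def step(self, p_timer: float) -> str:
        target = sum(p_timer >= t for t in self.up)
        if target > self.level:        # worsen immediately
            self.level, self.calm = target, 0
        elif target < self.level:      # recover slowly (hysteresis)
            self.calm += 1
            if self.calm >= self.cooldown:
                self.level, self.calm = self.level - 1, 0
        else:
            self.calm = 0
        return STATES[self.level]
```

A spike to p_timer = 0.9 flips the machine to RED in one step, but three calm readings are needed before it steps back down to ORANGE, so the controller cannot flap.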


Engineering mitigations (high ROI first)

  1. Set explicit low timer slack on critical execution threads
    Use prctl(PR_SET_TIMERSLACK, ...) or /proc/<pid>/timerslack_ns policy. Keep non-critical threads relaxed for power.

  2. Separate critical and non-critical work
    Don’t let logging/housekeeping threads share timing policy with dispatch-critical loops.

  3. Review kernel.timer_migration and CPU isolation strategy together
    Co-tune with core pinning/isolcpus/nohz_full design; avoid one-size-fits-all toggles.

  4. Prefer absolute-time pacing (TIMER_ABSTIME) over drift-prone relative loops
    Reduces cumulative phase walk when wakeups are occasionally late.

  5. Add short spin window only near deadline cliffs
    Hybrid sleep-then-spin can reduce worst-tail phase error while controlling thermal burn.

  6. Promote by tail metrics, not mean latency
    Gate on q95/q99 slippage + completion quality.
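For mitigation 1, per-thread slack can be set from Python via prctl(2) through ctypes on Linux (PR_SET_TIMERSLACK = 29, PR_GET_TIMERSLACK = 30 per <linux/prctl.h>); a Linux-only sketch with minimal error handling:

```python
import ctypes

PR_SET_TIMERSLACK, PR_GET_TIMERSLACK = 29, 30

libc = ctypes.CDLL("libc.so.6", use_errno=True)
libc.prctl.argtypes = [ctypes.c_int] + [ctypes.c_ulong] * 4
libc.prctl.restype = ctypes.c_int

def set_timer_slack_ns(ns: int) -> None:
    """Set the calling thread's timer slack; run this on the critical thread."""
    if libc.prctl(PR_SET_TIMERSLACK, ns, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_TIMERSLACK) failed")

def get_timer_slack_ns() -> int:
    """Read back the calling thread's current timer slack in nanoseconds."""
    return libc.prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0)

set_timer_slack_ns(10_000)   # 10µs: tight slack for a 200-500µs loop
print(get_timer_slack_ns())  # -> 10000
```

Slack is per-thread and inherited on thread creation, so set it explicitly in the critical loop's thread and leave non-critical threads at the kernel default (typically 50µs) for power.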
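For mitigation 4, Python's stdlib does not expose clock_nanosleep(TIMER_ABSTIME) directly, but the same drift-free property comes from deriving every deadline from the loop start instead of sleeping a relative period; a sketch:

```python
import time

PERIOD_NS = 200_000  # 200µs pacing target

def paced_deadlines(start_ns: int, n: int) -> list:
    # Deadline k is start + k*period: a late wakeup at step k-1 does not
    # shift any later deadline, so phase error cannot accumulate ("walk").
    return [start_ns + k * PERIOD_NS for k in range(1, n + 1)]

def run_paced(n: int, work=lambda: None) -> None:
    start = time.monotonic_ns()
    for deadline in paced_deadlines(start, n):
        work()
        remaining_ns = deadline - time.monotonic_ns()
        if remaining_ns > 0:
            time.sleep(remaining_ns / 1e9)  # sleep *to* the absolute deadline
```

Contrast with `while True: work(); time.sleep(period)`, where every late wakeup pushes all subsequent wakeups later, accumulating exactly the schedule deficit described in the failure timeline.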
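For mitigation 5, a hybrid sleep-then-spin wait might look like the sketch below; the spin margin is a hypothetical tuning knob, and the final loop burns CPU by design to cut the worst-tail phase error.

```python
import time

def sleep_then_spin(deadline_ns: int, spin_margin_ns: int = 50_000) -> int:
    """Coarse sleep until spin_margin_ns before the deadline, then busy-poll
    the monotonic clock. Returns the actual wake timestamp (ns)."""
    remaining = deadline_ns - time.monotonic_ns()
    if remaining > spin_margin_ns:
        time.sleep((remaining - spin_margin_ns) / 1e9)  # cheap, coalescible
    while time.monotonic_ns() < deadline_ns:            # precise, burns CPU
        pass
    return time.monotonic_ns()

wake = sleep_then_spin(time.monotonic_ns() + 200_000)
```

Keep the spin window near deadline cliffs only; spinning everywhere trades the cadence tax for a thermal and core-budget tax.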


Validation protocol

  1. Label windows by timer regime (stable / distorted) from telemetry.
  2. Match cohorts by symbol, spread, volatility, urgency, participation.
  3. Compare mean + q95/q99 slippage and markout between cohorts.
  4. Canary mitigations:
    • explicit low slack on critical threads,
    • CPU affinity/isolation adjustments,
    • migration/pacing policy changes.
  5. Promote only when tail improves without reliability regressions.

Observability checklist

  • TSR, WJL, CBI, TMR, PHE, SARE exported as time series.
  • Controller state, p_timer, and state transitions logged on every decision.
  • Timestamps retained so cadence telemetry can be joined with fills and markouts.

Success criterion: smaller tail slippage during urgency windows, not just lower average wakeup delay.


Pseudocode sketch

obs = collect_timer_obs()  # TSR, WJL, CBI, TMR, PHE, urgency
p_timer = timer_regime_model.predict_proba(obs)
state = decode_state(p_timer, obs)

if state == "GREEN":
    params = baseline_policy()
elif state == "YELLOW":
    params = guarded_policy()
elif state == "ORANGE":
    params = smooth_catchup_policy()
else:  # RED
    params = containment_policy()

apply_execution_params(params)
log(state=state, p_timer=p_timer)

Bottom line

Timer policy is a real execution variable.

If your slippage stack ignores timer slack and timer-migration-induced wakeup distortion, you will over-attribute losses to “market conditions” and under-fix the true cadence problem.

