Cgroup CPU Quota Throttling Burst Slippage Playbook
Date: 2026-03-18
Category: research
Why this exists
Execution stacks are increasingly containerized. That helps isolation, but it also introduces a subtle slippage tax: CFS quota throttling (cpu.max) can pause a latency-critical strategy exactly when microstructure is moving.
The host can still look “fine” on average CPU usage while the strategy experiences:
- short hard stalls,
- period-boundary burst release,
- queue-priority decay,
- panic catch-up that crosses spread.
This playbook treats quota throttling as a first-class slippage factor.
Core failure mode
In cgroup v2, `cpu.max = "<quota> <period>"` grants a group `<quota>` microseconds of CPU runtime in each `<period>` microseconds (e.g., `50000 100000` caps the group at 0.5 cores of sustained throughput). If the budget is exhausted, runnable threads in that cgroup are throttled until the next period refill.
For an execution engine, this creates a repeated pattern:
- Budget depletion during bursty compute/IO windows.
- Throttle pause in decision / amend / cancel loop.
- Unthrottle surge at period boundary.
- Dispatch clumping (late child-order burst).
- Queue reset + adverse selection.
Result: tail implementation shortfall worsens, even if mean completion stays acceptable.
Slippage decomposition with quota-throttle term
For parent order \(i\):
\[ IS_i = C_{spread} + C_{impact} + C_{opportunity} + C_{throttle} \]
Where:
\[ C_{throttle} = C_{pause} + C_{catchup} + C_{queue\_decay} + C_{timing\_alias} \]
- \(C_{pause}\): missed passive windows during throttled intervals.
- \(C_{catchup}\): convex impact from post-throttle burst sends.
- \(C_{queue\_decay}\): queue position loss from delayed amend/cancel decisions.
- \(C_{timing\_alias}\): pathological synchronization with fixed quota period boundaries.
Production observability (minimum)
1) cgroup CPU telemetry
From cpu.stat (cgroup v2):
- `nr_periods`
- `nr_throttled`
- `throttled_usec`
- `usage_usec`
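Sampling these counters is a one-file read. A minimal sketch (the field names come from the kernel's cgroup v2 `cpu.stat` format; the cgroup path is an illustrative assumption, not a fixed location):

```python
# Parse a cgroup v2 cpu.stat file into a dict of integer counters.
# The path below is a placeholder; substitute your service's cgroup.
from pathlib import Path

def read_cpu_stat(path="/sys/fs/cgroup/myservice/cpu.stat"):
    stats = {}
    for line in Path(path).read_text().splitlines():
        key, _, value = line.partition(" ")
        stats[key] = int(value)
    return stats
```

These are monotonically increasing counters, so the telemetry pipeline should store deltas between samples, not raw values.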
2) Scheduler / pressure context
- CPU PSI (`/proc/pressure/cpu`) for global contention backdrop
- per-thread run-queue delay quantiles (if available from scheduler telemetry)
3) Execution-path telemetry
- decision-to-send latency (p50/p95/p99)
- cancel-to-replace turnaround
- inter-dispatch gap variance
- child-order burst index
- passive fill ratio by latency bucket
4) Outcome telemetry
- IS by parent and urgency bucket
- short-horizon markout ladder (10ms / 100ms / 1s)
- completion deficit near deadline
Desk metrics to track
Define these over rolling windows (e.g., 1m / 5m):
- TRR (Throttle Ratio Rate)
\[ TRR = \frac{\Delta\,\text{nr\_throttled}}{\Delta\,\text{nr\_periods}} \]
- TDR (Throttle Duty Ratio)
\[ TDR = \frac{\Delta\,\text{throttled\_usec}}{\Delta\,\text{window\_usec}} \]
- PBA (Period-Boundary Aliasing)
Correlation between dispatch spikes and quota period refill boundaries.
- CBI (Catch-up Burst Index)
Post-unthrottle child-send rate divided by baseline send rate.
- QDL (Queue Decay Loss)
Passive fill-ratio drop conditioned on throttle events vs matched non-throttle windows.
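Given two successive `cpu.stat` samples, TRR and TDR reduce to simple deltas. A sketch (the sample format matches the counters listed above; the function name and window convention are assumptions):

```python
def throttle_ratios(prev, curr, window_usec):
    """Compute TRR and TDR from two cpu.stat samples taken window_usec apart.

    prev/curr: dicts of cgroup v2 cpu.stat counters (monotonic).
    """
    d_periods = curr["nr_periods"] - prev["nr_periods"]
    d_throttled = curr["nr_throttled"] - prev["nr_throttled"]
    d_throttled_usec = curr["throttled_usec"] - prev["throttled_usec"]
    trr = d_throttled / d_periods if d_periods else 0.0  # throttled periods / total periods
    tdr = d_throttled_usec / window_usec                 # fraction of wall time spent throttled
    return trr, tdr
```

PBA and CBI need execution-path timestamps as well, so they live in the dispatch telemetry rather than in the cgroup sampler.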
Modeling approach
Use a baseline + throttle-overlay architecture.
Stage A: Baseline slippage model
Normal spread/impact/fill model with market-state features.
Stage B: Quota-throttle uplift model
Predict incremental uplift:
`delta_is_mean` and `delta_is_q95`
using features:
- TRR, TDR, PBA, CBI, QDL
- effective core ceiling from `cpu.max` (quota/period)
- latency-tail stretch metrics
- urgency, participation, symbol-liquidity regime
Final estimate:
\[ \hat{IS}_{final} = \hat{IS}_{baseline} + \Delta\hat{IS}_{throttle} \]
Train with matched market windows (same symbol/session/volatility buckets) to isolate infra-induced uplift from market confounders.
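The matched-window comparison can be sketched as a toy difference-in-means estimator (the pairing of throttle-labeled and control windows into a common symbol/session/volatility bucket is assumed to be done upstream; names are illustrative):

```python
import statistics

def throttle_uplift(windows):
    """Estimate mean IS uplift: throttle-labeled windows vs matched controls.

    windows: list of (is_bps, throttled) pairs drawn from one matched
    bucket (same symbol/session/volatility), so market conditions are
    held roughly constant and the residual difference is infra-induced.
    """
    throttled = [x for x, t in windows if t]
    control = [x for x, t in windows if not t]
    return statistics.mean(throttled) - statistics.mean(control)
```

A production version would fit the uplift as a regression on the TRR/TDR/PBA/CBI/QDL features rather than a single bucket mean, but the identification idea is the same.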
Controller state machine
State 1 — NORMAL
- `TRR < 0.01`
- `TDR < 0.5%`
- no period-locking pattern
Action: normal pacing.
State 2 — QUOTA_EDGE
- `TRR >= 0.01` or `TDR >= 0.5%`
- latency tails widening
Action: reduce replace churn, smooth child spacing, raise passive selectivity.
State 3 — THROTTLE_BURST
- `TRR >= 0.05` or `TDR >= 3%`
- burst index elevated after unthrottle
Action: cap burst size, disable panic catch-up, enforce inter-send minimum gap.
State 4 — SAFE_CONTAIN
- persistent `THROTTLE_BURST` + deadline pressure
Action: conservative completion mode, stricter aggression cap, optional route to non-throttled host pool.
Use hysteresis + minimum dwell times to prevent flapping.
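The escalation/de-escalation logic can be sketched as follows (thresholds are the ones from the states above; the burst-index and deadline inputs needed for `THROTTLE_BURST` confirmation and `SAFE_CONTAIN` are omitted, and the class/dwell names are assumptions):

```python
# Minimal controller sketch: maps (TRR, TDR) to a state, escalating
# immediately but de-escalating only after a minimum dwell time, which
# provides the hysteresis that prevents flapping.
NORMAL, QUOTA_EDGE, THROTTLE_BURST = "NORMAL", "QUOTA_EDGE", "THROTTLE_BURST"
_RANK = [NORMAL, QUOTA_EDGE, THROTTLE_BURST]

class ThrottleController:
    def __init__(self, min_dwell_s=30.0):
        self.state = NORMAL
        self.entered_at = 0.0
        self.min_dwell_s = min_dwell_s

    def _target(self, trr, tdr):
        if trr >= 0.05 or tdr >= 0.03:   # TDR >= 3%
            return THROTTLE_BURST
        if trr >= 0.01 or tdr >= 0.005:  # TDR >= 0.5%
            return QUOTA_EDGE
        return NORMAL

    def update(self, now_s, trr, tdr):
        target = self._target(trr, tdr)
        escalating = _RANK.index(target) > _RANK.index(self.state)
        dwell_ok = now_s - self.entered_at >= self.min_dwell_s
        if target != self.state and (escalating or dwell_ok):
            self.state = target
            self.entered_at = now_s
        return self.state
```

Run it in shadow mode first (as the rollout checklist below suggests) and log every transition alongside the metric values that triggered it.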
Mitigation ladder
Right-size quota headroom
- Avoid sizing from average CPU use; size from p95/p99 burst demand.
- Explicitly budget risk checks + serialization + retry spikes.
Tune quota period deliberately
- Shorter period reduces max single stall, but can increase scheduling overhead.
- Validate period choice with latency-tail experiments, not CPU-average metrics.
Use `cpu.weight` + topology isolation together
- Quota alone is a blunt tool.
- Combine fair-share (`cpu.weight`) and cpuset isolation for critical paths.
Throttle-aware pacer
- Detect recent throttle events and suppress catch-up bursts.
- Prefer bounded repayment over immediate backlog flush.
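One way to sketch bounded repayment (the cap multiplier and function names are illustrative assumptions, not the desk's actual policy):

```python
def pace_backlog(backlog, baseline_rate, burst_cap=1.5):
    """Yield per-interval send counts that repay a backlog gradually.

    Instead of flushing `backlog` child orders at once after an
    unthrottle, each interval is capped at burst_cap x the baseline
    send rate, so catch-up never becomes a market-impact burst.
    """
    per_interval = max(1, int(baseline_rate * burst_cap))
    sends = []
    while backlog > 0:
        n = min(backlog, per_interval)
        sends.append(n)
        backlog -= n
    return sends
```

For example, a backlog of 10 children against a baseline rate of 4 per interval is repaid as `[6, 4]` rather than as a single burst of 10, keeping the realized CBI near the cap.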
Host-pool policy
- Keep urgent flow off aggressively capped multi-tenant pools.
- Separate “latency-critical” and “batch-contended” deployment classes.
Validation drills (must run)
Quota squeeze drill
- Temporarily tighten `cpu.max`; verify uplift detector catches IS tail rise.
Period sensitivity drill
- Sweep period values (e.g., 100ms -> 50ms -> 20ms) with fixed effective quota.
- Compare q95 dispatch latency, CBI, and q95 IS.
Catch-up policy A/B
- naive backlog flush vs bounded repayment policy.
- Select policy by q95 IS + completion stability, not mean IS alone.
Confounder separation drill
- Distinguish quota-throttle signatures from network/venue incidents.
Anti-patterns
- “CPU usage is only 40%, so throttling cannot matter.”
- fixed 100ms period with no aliasing analysis
- quota-capped strategy colocated with bursty batch jobs
- panic catch-up logic after unthrottle
- monitoring only host CPU, not cgroup `cpu.stat`
Practical rollout checklist
- Add cgroup `cpu.stat` sampling to the execution telemetry pipeline.
- Build TRR/TDR/PBA/CBI dashboards by host pool and strategy class.
- Label throttle windows in historical TCA.
- Fit throttle-uplift overlay model and backtest contribution.
- Enable state-machine controller in shadow mode.
- Run quota/period canary and compare q95 IS before broad rollout.
Bottom line
In containerized execution systems, cgroup quota is not just a resource-control setting; it is a microstructure timing control.
If you do not model and govern quota-throttle bursts explicitly, you will keep paying a hidden tail-slippage tax that looks like “market noise” but is mostly self-inflicted.
References
- Linux kernel docs — CFS bandwidth control: https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html
- Linux kernel docs — cgroup v2 CPU controller: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html
- Linux kernel docs — PSI (Pressure Stall Information): https://www.kernel.org/doc/html/latest/accounting/psi.html