RCU vs Hazard Pointers vs Epoch Reclamation Selection Playbook

2026-03-17 · software


Why this matters

Lock-free or low-lock data structures fail in production less often from CAS bugs than from memory reclamation mistakes.

If you run read-heavy infra (matching gateways, symbol maps, routing tables, session registries), choosing the right reclamation strategy is a first-order latency and reliability decision.


1) Mental model in one paragraph

All three families solve the same problem: a node is logically removed now, but some thread might still hold a pointer to it.

The design tension is always the same: a cheap reader fast path versus bounded reclamation latency when threads run slow or get stuck.
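The shared problem can be made concrete with a hedged, single-threaded sketch (all names here are illustrative): an unlinked node goes on a retire list, and the actual free is deferred until no reader can still hold a pointer to it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch only: a node is unlinked "now", but a reader may
// still hold a pointer to it, so the actual free must be deferred.
struct Node { int value; };

struct DeferredFree {
    std::vector<Node*> retired;   // unlinked, not yet freed
    int active_readers = 0;       // stand-in for "someone may hold a pointer"

    void retire(Node* n) { retired.push_back(n); }

    // Freeing is only safe once no reader can still reach a retired node.
    std::size_t try_reclaim() {
        if (active_readers > 0) return 0;        // must keep waiting
        std::size_t freed = retired.size();
        for (Node* n : retired) delete n;
        retired.clear();
        return freed;
    }
};
```

RCU, EBR, and hazard pointers differ mainly in how they answer the `active_readers > 0` question without putting a shared counter on the read path.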


2) What changes operationally (not just academically)

| Dimension | RCU / QSBR | Epoch-based reclamation | Hazard pointers |
|---|---|---|---|
| Reader fast path | Usually the cheapest; can be near-zero overhead in QSBR-style read sections | Cheap, but includes a pin/unpin protocol | Heavier: protect/retry loops + hazard-slot stores |
| Sensitivity to stalled readers | High: grace periods can delay reclamation | High: a pinned/stalled thread can block old-epoch reclaim | Lower: a stalled thread only blocks nodes it protects |
| Reclamation batching efficiency | Excellent for bulk retire | Excellent when threads progress normally | Moderate; per-retire scans/threshold tuning matter |
| Memory-bound behavior | Can inflate under grace-period lag | Can inflate under pinned participants | Better bounded by hazard slots + retire-threshold design |
| Implementation complexity | Medium (API discipline + grace-period reasoning) | Medium (epoch discipline + pin lifecycle) | High (pointer-protection protocol correctness) |
| Tail-latency risk source | Expedited grace periods may disturb CPUs (IPIs) | Epoch advancement stalls under blocked participants | Retry/scanning overhead on hot read/write paths |

Takeaway: there is no universal winner. You are choosing where to pay: reader overhead, memory headroom, or worst-case reclaim latency.


3) When each strategy tends to win

A) Prefer RCU when

Typical fits:

Practical note: the Linux kernel documentation emphasizes that RCU's performance comes from very cheap reads, and that reclamation must wait for a grace period. Expedited grace periods exist, but they shorten the wait by being deliberately more disruptive to other CPUs (e.g., via IPIs).
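That read-side cheapness can be sketched as a minimal QSBR protocol (illustrative names, not the Linux RCU API): readers pay nothing inside read sections and only periodically announce a quiescent state, and a grace period is over once every thread has announced one.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hedged QSBR sketch. Readers have zero per-read overhead; each thread
// periodically copies the global grace-period counter to announce
// "I currently hold no pointers into the structure."
struct Qsbr {
    uint64_t global = 1;
    std::vector<uint64_t> per_thread;   // last announced counter per thread

    explicit Qsbr(int nthreads) : per_thread(nthreads, 0) {}

    // Reader side: call from a context holding no live pointers.
    void quiescent(int tid) { per_thread[tid] = global; }

    // Writer side: start a new grace period, return a token to wait on.
    uint64_t start_grace_period() { return ++global; }

    // The grace period is over once every thread has announced a counter
    // at least as new as the token.
    bool grace_period_done(uint64_t token) const {
        for (uint64_t t : per_thread)
            if (t < token) return false;
        return true;
    }
};
```

One thread that never calls `quiescent()` stalls the grace period forever, which is exactly the "sensitivity to stalled readers" cost in the table above.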

B) Prefer EBR when

Typical fits:

Practical note: EBR often gives excellent throughput until one pinned participant goes pathological; then retire queues can balloon.
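The "pinned participant goes pathological" failure is visible even in a minimal three-epoch sketch (illustrative; real implementations such as crossbeam-epoch differ in detail): one lagging pin blocks the global epoch, and with it all old-epoch reclamation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hedged three-epoch EBR sketch. Nodes retired in epoch E become safe to
// free once the global epoch has advanced twice past E.
struct Ebr {
    static constexpr int kEpochs = 3;
    unsigned global = 0;
    std::vector<int> pinned;             // epoch per participant, -1 = unpinned
    std::vector<void*> limbo[kEpochs];   // retired nodes, keyed by epoch
    std::size_t freed = 0;

    explicit Ebr(int n) : pinned(n, -1) {}

    void pin(int tid)   { pinned[tid] = static_cast<int>(global % kEpochs); }
    void unpin(int tid) { pinned[tid] = -1; }

    void retire(void* p) { limbo[global % kEpochs].push_back(p); }

    // Advance only if every pinned participant has observed the current
    // epoch; then nodes retired two epochs ago are unreachable.
    bool try_advance() {
        for (int e : pinned)
            if (e != -1 && e != static_cast<int>(global % kEpochs))
                return false;                       // one lagging pin blocks all reclaim
        ++global;
        auto& old = limbo[(global + 1) % kEpochs];  // the epoch-(global-2) list
        freed += old.size();
        for (void* p : old) ::operator delete(p);
        old.clear();
        return true;
    }
};
```

Note the hostage dynamic: the blocked `try_advance()` does not just delay the stalled participant's nodes, it delays every retired node in the system.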

C) Prefer hazard pointers when

Typical fits:

Cost: more complicated APIs/protocols and usually higher steady-state reader overhead.
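The protect/retry protocol that makes readers heavier can be sketched for a single hazard slot (illustrative; production libraries such as folly's hazptr use per-thread slot pools and amortized scans):

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct HpNode { int value; };

std::atomic<HpNode*> head{nullptr};     // the shared pointer being read
std::atomic<HpNode*> hazard{nullptr};   // one hazard slot, for the demo
std::vector<HpNode*> hp_retired;
std::size_t hp_freed = 0;

// Reader: publish intent, then re-read to validate; retry if the
// structure changed between the load and the publish.
HpNode* protect() {
    HpNode* p;
    do {
        p = head.load();
        hazard.store(p);                 // announce "I may dereference p"
    } while (p != head.load());          // validate, else retry
    return p;                            // safe until the slot is cleared
}

void clear_hazard() { hazard.store(nullptr); }

// Writer: after unlinking, free only nodes no hazard slot protects.
void retire(HpNode* n) { hp_retired.push_back(n); }

void scan() {
    HpNode* in_use = hazard.load();
    std::vector<HpNode*> keep;
    for (HpNode* n : hp_retired) {
        if (n == in_use) keep.push_back(n);   // a stalled reader only
        else { delete n; ++hp_freed; }        // blocks what it protects
    }
    hp_retired.swap(keep);
}
```

Note the contrast with EBR: a stalled reader here holds up only the nodes in its hazard slot, not the entire retire backlog.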


4) Failure modes you will actually see

A) “Phantom memory leak” under RCU/EBR

Symptom:

Root cause:

B) Pinned-thread hostage in EBR

Symptom:

Root cause:

C) Hazard-pointer retry storms

Symptom:

Root cause:

D) “Fixing latency” with expedited RCU and hurting the rest

Symptom:

Root cause:


5) Selection heuristic (fast and practical)

Start with these questions:

  1. Can any reader stall for long/unbounded time?
    • Yes → bias toward HP (or isolate that component).
  2. Is read-path overhead budget ultra-tight (single-digit ns concerns)?
    • Yes → bias toward RCU/QSBR or EBR.
  3. Is memory headroom tight and bursty backlog unacceptable?
    • Yes → bias toward HP, or strict watchdog + bounded pin scopes with EBR.
  4. Can your team reliably enforce API discipline in all code paths?
    • No → choose simpler model even if slower; operational correctness beats benchmark wins.
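Folded into code, the four questions above might look like this hedged sketch (the field names, returned labels, and question ordering are illustrative, not a formal taxonomy):

```cpp
#include <cassert>
#include <string>

struct Workload {
    bool readers_can_stall;      // Q1: unbounded reader stalls possible?
    bool ns_scale_read_budget;   // Q2: single-digit-ns read-path budget?
    bool tight_memory;           // Q3: bursty retire backlog unacceptable?
    bool strong_api_discipline;  // Q4: team can enforce the API everywhere?
};

std::string pick_reclamation(const Workload& w) {
    if (!w.strong_api_discipline)
        return "simplest model you can operate correctly";    // Q4 acts as a veto
    if (w.readers_can_stall)
        return "hazard pointers (or isolate that component)"; // Q1
    if (w.ns_scale_read_budget)
        return "RCU/QSBR or EBR";                             // Q2
    if (w.tight_memory)
        return "hazard pointers, or EBR + watchdog";          // Q3
    return "EBR";
}
```

The point of writing it down is the priority order: operational correctness (Q4) vetoes everything, and stall exposure (Q1) outranks raw read-path speed (Q2).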

A good default for many teams:


6) Instrumentation you should have from day 1

Whatever scheme you pick, export:

Alert on trend, not just absolute value:

Without these, incidents become allocator blame games.
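"Alert on trend, not just absolute value" can be sketched as a trivial monotone-growth detector over a sampling window (the window size and the backlog metric are illustrative assumptions):

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hedged sketch: fire when the retired-node backlog has grown at every
// sample across the window, even if it is still below any absolute cap.
struct BacklogTrendAlert {
    std::deque<std::size_t> window;
    std::size_t window_size;

    explicit BacklogTrendAlert(std::size_t n) : window_size(n) {}

    // Feed one sample of the current retired backlog; true means alert.
    bool sample(std::size_t backlog) {
        window.push_back(backlog);
        if (window.size() > window_size) window.pop_front();
        if (window.size() < window_size) return false;
        for (std::size_t i = 1; i < window.size(); ++i)
            if (window[i] <= window[i - 1]) return false;  // growth broken
        return true;   // backlog grew every sample: reclaim is lagging
    }
};
```

A trend alert like this catches a stalled grace period or a stuck pin while the backlog is still small, well before an absolute-threshold alarm would.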


7) Safe rollout plan

Step 1 — Benchmark with adversarial scenarios

Do not benchmark only steady-state throughput. Include:

Step 2 — Define reclaim SLO explicitly

Example:

Then tune to that SLO (thresholds, batch sizes, grace-period policy).

Step 3 — Add hard guards

Step 4 — Chaos drills

Quarterly test:


8) Mapping to common ecosystems


9) 30-minute incident runbook (memory growth in lock-free subsystem)

  1. Confirm whether growth correlates with retired backlog (not generic heap fragmentation first).
  2. Check active readers/pinned participants and longest duration.
  3. If EBR/RCU backlog is blocked:
    • identify offending thread,
    • force-safe restart or isolate it,
    • shorten critical sections before touching allocator knobs.
  4. If HP overhead storm:
    • inspect retry loops and hazard scan thresholds,
    • reduce contention hot spots (sharding, read-copy snapshots).
  5. Only after reclaim-path diagnosis, revisit allocator tuning.
  6. Capture pre/post reclaim-latency and backlog metrics; update guardrails.
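Step 1 of the runbook (does heap growth track the retired backlog, rather than generic fragmentation?) can be approximated with a simple co-movement check over sampled time series (a hedged sketch; a real dashboard would use a proper correlation measure):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of sampling intervals in which heap usage and retired-node
// backlog moved in the same direction. Near 1.0 suggests the reclaim
// path, not the allocator, is driving the growth.
double co_movement(const std::vector<double>& heap,
                   const std::vector<double>& backlog) {
    std::size_t agree = 0, total = 0;
    for (std::size_t i = 1; i < heap.size() && i < backlog.size(); ++i) {
        double dh = heap[i] - heap[i - 1];
        double db = backlog[i] - backlog[i - 1];
        if (dh == 0.0 && db == 0.0) continue;   // nothing moved; skip
        ++total;
        if ((dh > 0.0) == (db > 0.0)) ++agree;
    }
    return total ? static_cast<double>(agree) / total : 0.0;
}
```

A high score says "diagnose the reclaim path first"; a low score says the allocator or another subsystem deserves the first look.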

10) Bottom line

Pick reclamation like an SRE choice, not a paper choice:

In production, the winning strategy is the one your team can observe, debug, and keep safe at 3 a.m.

