Linux MGLRU Playbook (Multi-Gen LRU for Memory-Reclaim Stability)
Date: 2026-03-18
Category: knowledge
Why this matters
Under memory pressure, many latency incidents are not CPU-bound compute issues. They are reclaim-path issues:
- kswapd spikes,
- direct reclaim stalls in request paths,
- cache thrash that repeatedly evicts and re-faults hot pages,
- tail-latency blowups while average metrics still look “fine”.
MGLRU (Multi-Gen LRU) improves reclaim decisions by aging pages across multiple generations instead of only active/inactive lists. In practice, this often means better cold-page targeting and less reclaim chaos during pressure.
1) Mental model: reclaim quality > reclaim volume
Traditional reclaim tuning often focuses on "how much" to reclaim. MGLRU shifts focus to "which pages" to reclaim first.
Goal:
- preserve true working set longer,
- evict genuinely cold pages earlier,
- reduce reclaim-induced jitter and refault churn.
If reclaim quality improves, you usually get:
- lower direct-reclaim incidence,
- lower memory-pressure tail latency,
- fewer user-visible stalls/janks under stress.
2) Core controls you should know
/sys/kernel/mm/lru_gen/enabled
- Main runtime switch and feature bitmask.
- Writing y enables all supported components; writing n disables them.
/sys/kernel/mm/lru_gen/min_ttl_ms
- Thrash-prevention guardrail.
- Protects a recent working set window from eviction (time-based).
- If memory cannot satisfy that protection, OOM kill can happen earlier (by design).
From kernel docs:
- min_ttl_ms=1000 is a common baseline for reducing visible jank on desktop-ish workloads.
- Larger values can smooth UX further but increase premature-OOM risk.
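Because enabled is a bitmask, its value can be decoded bit by bit. A minimal sketch, assuming the bit meanings documented in the kernel's multi-gen LRU admin guide (exact semantics can vary by kernel version):

```shell
#!/bin/sh
# Decode the lru_gen "enabled" bitmask. Bit meanings follow the kernel's
# multigen_lru admin guide; they can differ across kernel versions.
decode_lru_gen() {
  val=$(( $1 ))                                    # accepts 0x0007-style input
  [ $(( val & 0x0001 )) -ne 0 ] && echo "0x0001: MGLRU core"
  [ $(( val & 0x0002 )) -ne 0 ] && echo "0x0002: page-table walks (HW accessed bit)"
  [ $(( val & 0x0004 )) -ne 0 ] && echo "0x0004: non-leaf PTE accessed-bit clearing"
  [ "$val" -eq 0 ] && echo "MGLRU disabled"
  return 0
}

# Falls back to a sample value when the sysfs file is absent (e.g. no MGLRU).
decode_lru_gen "$(cat /sys/kernel/mm/lru_gen/enabled 2>/dev/null || echo 0x0007)"
```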
3) Quick verification checklist
Check kernel support
grep -E 'CONFIG_LRU_GEN|CONFIG_LRU_GEN_ENABLED' /boot/config-$(uname -r)
Check runtime interface
ls /sys/kernel/mm/lru_gen
cat /sys/kernel/mm/lru_gen/enabled
cat /sys/kernel/mm/lru_gen/min_ttl_ms
Enable all supported components
echo y | sudo tee /sys/kernel/mm/lru_gen/enabled
cat /sys/kernel/mm/lru_gen/enabled
Expected common output after full enable:
0x0007 (platform-dependent).
4) Practical rollout profiles
Profile A: Server/API (conservative)
- Enable MGLRU
- Keep min_ttl_ms=0 initially
- Measure reclaim/latency changes first
echo y | sudo tee /sys/kernel/mm/lru_gen/enabled
echo 0 | sudo tee /sys/kernel/mm/lru_gen/min_ttl_ms
Why: server workloads often prefer avoiding surprise OOM behavior before confidence is built.
Profile B: Interactive desktop/workstation
- Enable MGLRU
- Start with min_ttl_ms=1000
echo y | sudo tee /sys/kernel/mm/lru_gen/enabled
echo 1000 | sudo tee /sys/kernel/mm/lru_gen/min_ttl_ms
Why: prioritizes responsiveness under pressure.
5) Observability: what to track during canary
Pair reclaim metrics with user-facing latency.
Reclaim pressure / stalls
cat /proc/pressure/memory
vmstat 1
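PSI output is easy to post-process in a canary loop. A minimal sketch that flags elevated avg10 (the 2% threshold is an illustrative assumption, not a kernel-defined limit):

```shell
#!/bin/sh
# Flag sustained memory pressure from PSI "some"/"full" lines.
psi_flag() {
  awk '{
    split($2, a, "=")                             # $2 looks like avg10=<percent>
    suffix = (a[2] + 0 > 2.0) ? " <- investigate" : " ok"
    printf "%s avg10=%s%%%s\n", $1, a[2], suffix
  }'
}

# Demo with a captured sample; on a live host: psi_flag < /proc/pressure/memory
psi_flag <<'EOF'
some avg10=3.10 avg60=1.20 avg300=0.40 total=123456
full avg10=0.00 avg60=0.00 avg300=0.00 total=0
EOF
```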
Fault/refault health (proxy)
grep -E 'pgfault|pgmajfault|pgscan|pgsteal|workingset_refault' /proc/vmstat
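These counters are cumulative, so what matters during a canary is their delta over an interval. A small sketch, assuming two snapshots captured a few seconds apart:

```shell
#!/bin/sh
# Diff two "name value" snapshots of selected /proc/vmstat counters.
snap() {
  grep -E '^(pgmajfault|pgscan_kswapd|pgscan_direct|pgsteal_kswapd|pgsteal_direct) ' /proc/vmstat
}

vmstat_delta() {   # $1 = earlier snapshot file, $2 = later snapshot file
  awk 'NR==FNR { before[$1] = $2; next }
       { printf "%s +%d\n", $1, $2 - before[$1] }' "$1" "$2"
}

# Demo with captured samples; live use: snap >/tmp/a; sleep 10; snap >/tmp/b
printf 'pgscan_direct 100\npgsteal_direct 90\n'  > /tmp/a
printf 'pgscan_direct 400\npgsteal_direct 120\n' > /tmp/b
vmstat_delta /tmp/a /tmp/b    # pgsteal/pgscan ratio ~ reclaim efficiency
```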
kswapd + direct reclaim symptoms
- kswapd CPU usage bursts
- elevated allocstall / direct-reclaim indicators
- request latency outliers correlated with memory-pressure windows
SLO view
- p95/p99 latency during memory stress
- throughput under constrained memory
- OOM frequency / kill targets
6) Tuning guidance (safe order)
- Enable MGLRU only (enabled=y, min_ttl_ms=0)
- Compare canary vs baseline on pressure scenarios
- If interactive stalls remain, test a min_ttl_ms ladder: 300 → 1000 → 2000 (only if needed)
- Stop increasing once:
- tail-latency benefit plateaus, or
- OOM risk rises meaningfully
- Persist chosen values with distro-appropriate boot/service config
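On systemd distros, the persistence step can be handled with systemd-tmpfiles. A sketch, assuming a tmpfiles.d-capable distro; the file name is illustrative:

```
# /etc/tmpfiles.d/mglru.conf  (illustrative name; any tmpfiles.d file works)
# "w" writes the value once at boot, after sysfs is available.
w /sys/kernel/mm/lru_gen/enabled    - - - - y
w /sys/kernel/mm/lru_gen/min_ttl_ms - - - - 0
```

Apply without rebooting via systemd-tmpfiles --create; non-systemd distros can do the same writes from a boot-time rc script.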
Rule of thumb:
- lower min_ttl_ms = safer memory headroom,
- higher min_ttl_ms = stronger thrash protection but more aggressive OOM behavior.
7) Common footguns
Turning on a high min_ttl_ms globally without a canary
- can trade jank for unexpected OOM kills.
Reading average latency only
- reclaim problems are mostly tail problems.
Ignoring memcg policy interactions
- memory.high / memory.max and reclaim tuning can dominate the outcome.
No pressure replay in tests
- idle benchmarks won’t show reclaim-path improvements.
Assuming every kernel/distro default is identical
- support and default enablement vary.
8) Suggested pressure test recipe
Run the same workload with fixed memory limits in two configs:
- Baseline (old/default reclaim behavior)
- MGLRU enabled (and optionally one min_ttl_ms candidate)
Collect per run:
- p50/p95/p99 latency,
- major fault rate,
- PSI memory pressure,
- kswapd CPU,
- OOM incidents.
Promote only if p99 improves without unacceptable OOM/throughput regressions.
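The promotion decision above can be encoded as a gate script run after both canary rounds. A sketch with illustrative thresholds (at least a 5% p99 win, no new OOMs, at most 2% throughput loss; tune these to your SLOs):

```shell
#!/bin/sh
# Gate: promote the MGLRU config only if p99 improves enough without
# OOM or throughput regressions. All thresholds are illustrative assumptions.
promote() {  # base_p99 cand_p99 base_ooms cand_ooms base_tput cand_tput
  awk -v bp="$1" -v cp="$2" -v bo="$3" -v co="$4" -v bt="$5" -v ct="$6" '
    BEGIN {
      p99_win = (bp - cp) / bp >= 0.05          # >=5% tail-latency improvement
      no_oom  = co <= bo                        # no new OOM incidents
      tput_ok = (bt - ct) / bt <= 0.02          # <=2% throughput loss
      print ((p99_win && no_oom && tput_ok) ? "promote" : "reject")
    }'
}

promote 120 95 0 0 1000 990    # example numbers from two canary runs
```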
Closing
MGLRU is not a magic "faster kernel" switch. It is a reclaim-quality upgrade that can materially reduce memory-pressure tail pain when rolled out with metrics discipline.
Treat it as a controlled change:
- enable,
- observe under stress,
- tune min_ttl_ms carefully,
- keep rollback easy.
That approach captures most of the upside without turning memory pressure into an OOM lottery.