Linux MGLRU Playbook (Multi-Gen LRU for Memory-Reclaim Stability)

2026-03-18 · software

Why this matters

Under memory pressure, many latency incidents are not CPU-bound compute issues. They are reclaim-path issues: direct reclaim stalls in the allocation path, refault storms when hot pages are evicted too early, and kswapd churn that burns CPU scanning the wrong pages.

MGLRU (Multi-Gen LRU) improves reclaim decisions by aging pages across multiple generations instead of only active/inactive lists. In practice, this often means better cold-page targeting and less reclaim chaos during pressure.


1) Mental model: reclaim quality > reclaim volume

Traditional reclaim tuning often focuses on "how much" to reclaim. MGLRU shifts focus to "which pages" to reclaim first.

Goal: evict genuinely cold pages first, so the working set stays resident under pressure.

If reclaim quality improves, you usually get:

  • fewer refaults (hot pages stop being evicted and re-read)
  • less time in direct reclaim, so lower tail latency
  • smoother behavior near memory limits instead of cliff-edge stalls


2) Core controls you should know

/sys/kernel/mm/lru_gen/enabled

A bitmask selecting which MGLRU components are active. Per the kernel's multi-gen LRU documentation: 0x0001 is the main switch, 0x0002 batches clearing of the accessed bit in leaf page table entries, 0x0004 extends that to non-leaf entries, and 0x0007 enables all of the above.

/sys/kernel/mm/lru_gen/min_ttl_ms

The minimum time, in milliseconds, for which the youngest generations are protected from eviction. From the kernel docs: this protects the working set of the last min_ttl_ms, and if that working set cannot be protected under pressure, the OOM killer is invoked. Higher values trade OOM headroom for responsiveness.

3) Quick verification checklist

Check kernel support

grep -E 'CONFIG_LRU_GEN|CONFIG_LRU_GEN_ENABLED' /boot/config-$(uname -r)

Check runtime interface

ls /sys/kernel/mm/lru_gen
cat /sys/kernel/mm/lru_gen/enabled
cat /sys/kernel/mm/lru_gen/min_ttl_ms

Enable all supported components

echo y | sudo tee /sys/kernel/mm/lru_gen/enabled
cat /sys/kernel/mm/lru_gen/enabled

Expected common output after full enable: 0x0007 (platform-dependent).


4) Practical rollout profiles

Profile A: Server/API (conservative)

echo y | sudo tee /sys/kernel/mm/lru_gen/enabled
echo 0 | sudo tee /sys/kernel/mm/lru_gen/min_ttl_ms

Why: min_ttl_ms=0 is the default, so servers get MGLRU's improved page selection without taking on new OOM risk before confidence is built.

Profile B: Interactive desktop/workstation

echo y | sudo tee /sys/kernel/mm/lru_gen/enabled
echo 1000 | sudo tee /sys/kernel/mm/lru_gen/min_ttl_ms

Why: protecting roughly the last second of working set reduces visible jank under pressure, at the cost of some OOM risk.


5) Observability: what to track during canary

Pair reclaim metrics with user-facing latency.

Reclaim pressure / stalls

cat /proc/pressure/memory
vmstat 1
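To turn the PSI output into a single alertable number, extract the "some avg10" field (the share of the last 10 seconds during which at least one task stalled on memory). psi_some_avg10 is an illustrative helper, and the sample line stands in for real /proc/pressure/memory output:

```shell
# Extract the "some avg10" percentage from PSI memory output.
# psi_some_avg10 is a local helper, not a standard tool.
psi_some_avg10() {
  sed -n 's/^some avg10=\([0-9.]*\).*/\1/p'
}

# Sample line; on a real host, pipe in /proc/pressure/memory instead.
sample="some avg10=12.34 avg60=5.67 avg300=1.23 total=123456"
echo "$sample" | psi_some_avg10
```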

Fault/refault health (proxy)

grep -E 'pgfault|pgmajfault|pgscan|pgsteal|workingset_refault' /proc/vmstat
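A useful derived signal from these counters is reclaim efficiency: pgsteal divided by pgscan. Values near 1.0 mean most scanned pages were reclaimable; values well below 1.0 mean the kernel is scanning hard for little gain. A sketch, where reclaim_efficiency is a local helper and the sample text stands in for /proc/vmstat output:

```shell
# Compute reclaim efficiency (total pgsteal / total pgscan) from
# vmstat-style counter lines. reclaim_efficiency is a local helper.
reclaim_efficiency() {
  echo "$1" | awk '
    /^pgscan_/  { scan  += $2 }
    /^pgsteal_/ { steal += $2 }
    END { if (scan > 0) printf "%.2f\n", steal / scan; else print "n/a" }'
}

sample='pgscan_kswapd 1000
pgscan_direct 500
pgsteal_kswapd 900
pgsteal_direct 300'
reclaim_efficiency "$sample"    # (900+300)/(1000+500) = 0.80
```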

kswapd + direct reclaim symptoms

grep -E 'pgscan_kswapd|pgscan_direct|allocstall' /proc/vmstat

SLO view

Track p95/p99 request latency and error rate alongside the reclaim metrics, plus OOM events (journalctl -k | grep -i 'out of memory').

6) Tuning guidance (safe order)

  1. Enable MGLRU only (enabled=y, min_ttl_ms=0)
  2. Compare canary vs baseline on pressure scenarios
  3. If interactive stalls remain, test min_ttl_ms ladder:
    • 300 → 1000 → 2000 (only if needed)
  4. Stop increasing once:
    • tail-latency benefit plateaus, or
    • OOM risk rises meaningfully
  5. Persist chosen values with distro-appropriate boot/service config
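For step 5, one common option on systemd distros (an assumption; adjust for your init system) is a tmpfiles.d entry, since systemd-tmpfiles can write sysfs values at boot. The values below are Profile B placeholders; substitute what your canary validated:

```
# /etc/tmpfiles.d/mglru.conf
# "w" tells systemd-tmpfiles to write the argument into the target file.
w /sys/kernel/mm/lru_gen/enabled    - - - - y
w /sys/kernel/mm/lru_gen/min_ttl_ms - - - - 1000
```

Apply without rebooting via sudo systemd-tmpfiles --create /etc/tmpfiles.d/mglru.conf, then re-read the sysfs files to confirm.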

Rule of thumb: use the smallest min_ttl_ms that removes the stalls you can actually measure; every increase trades OOM headroom for responsiveness.


7) Common footguns

  1. Turning on high min_ttl_ms globally without canary
    • can trade jank for unexpected OOM kills.
  2. Reading average latency only
    • reclaim problems are mostly tail problems.
  3. Ignoring memcg policy interactions
    • memory.high / memory.max and reclaim tuning can dominate the outcome.
  4. No pressure replay in tests
    • idle benchmarks won’t show reclaim-path improvements.
  5. Assuming every kernel/distro default is identical
    • support and default enablement vary.

8) Suggested pressure test recipe

Run the same workload with fixed memory limits in two configs:

  • baseline: MGLRU disabled (or current defaults)
  • candidate: MGLRU enabled with the profile under test

Collect per run:

  • /proc/pressure/memory (avg10 and cumulative total)
  • p50/p99 latency and throughput
  • OOM kill count and refault counters from /proc/vmstat

Promote only if p99 improves without unacceptable OOM/throughput regressions.
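For the PSI comparison across runs, the cumulative "some total" counter (microseconds stalled) is easier to diff than the moving averages. psi_some_total is an illustrative helper; the before/after strings stand in for snapshots of /proc/pressure/memory taken around each run:

```shell
# Diff cumulative memory-stall time between two PSI snapshots.
# psi_some_total is a local helper, not a standard tool.
psi_some_total() {
  echo "$1" | sed -n 's/^some .*total=\([0-9]*\).*/\1/p'
}

before="some avg10=0.00 avg60=0.00 avg300=0.00 total=1000"
after="some avg10=3.20 avg60=1.10 avg300=0.40 total=251000"
echo "stall delta: $(( $(psi_some_total "$after") - $(psi_some_total "$before") )) us"
```

Pair each run with a fixed memory ceiling, e.g. systemd-run --scope -p MemoryMax=2G ./workload (./workload is a placeholder for your own command), so both configs face the same pressure.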


Closing

MGLRU is not a magic "faster kernel" switch. It is a reclaim-quality upgrade that can materially reduce memory-pressure tail pain when rolled out with metrics discipline.

Treat it as a controlled change: enable on a canary, replay realistic pressure, compare tails against baseline, then persist.

That approach captures most of the upside without turning memory pressure into an OOM lottery.