Linux zswap vs zram Memory-Compression Playbook
Date: 2026-04-07
Category: knowledge
Domain: software / linux / memory management / platform operations
Why this matters
A lot of Linux memory advice still collapses two different tools into one vague sentence:
“Turn on compressed swap.”
That is usually not enough.
zswap and zram are related, but they solve different problems with different trade-offs:
- zswap = a compressed cache in front of a real backing swap device
- zram = a compressed block device living in RAM, often used as swap itself
If you pick the wrong one, or stack them carelessly, you can get:
- reclaim behavior that is harder to reason about,
- extra CPU burn without enough memory win,
- hibernation surprises,
- hidden writeback to disk/flash,
- or misleading “we enabled memory compression” confidence while OOM behavior barely improved.
The practical goal is not “use the coolest kernel feature.” The goal is:
stretch usable memory under pressure without creating worse latency, worse write amplification, or harder-to-debug reclaim paths.
1) Fast mental model
zswap in one sentence
zswap intercepts pages on their way to swap, compresses them into an in-RAM pool, and only writes them to the real swap device when that pool needs to evict pages.
Implications:
- requires a real backing swap device/file/partition,
- reduces swap I/O when compressed pages can stay in RAM,
- still preserves the normal concept of “swap exists on disk,”
- fits best when you want a swap I/O buffer/cushion, not a pure-RAM swap device.
zram in one sentence
zram creates a compressed RAM-backed block device, which you can format as swap or a filesystem.
Implications:
- pages written to zram are compressed and stored in RAM itself,
- no backing disk is required for basic use,
- often used as a high-priority swap device,
- fits best when you want more effective use of RAM before touching disk.
The simplest distinction
- zswap = “compress before disk swap”
- zram = “make compressed RAM behave like a swap device”
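On a live system the distinction is easy to check directly. A minimal sketch, assuming the stock sysfs and procfs paths (the zswap module may not be present on every box, which the fallback handles):

```shell
# Quick check: is zswap enabled, and is any zram device serving as swap?
zswap_on=$(cat /sys/module/zswap/parameters/enabled 2>/dev/null || echo "absent")
# /proc/swaps lists active swap areas; first column is the device path.
zram_swaps=$(awk 'NR > 1 && $1 ~ /zram/ {print $1}' /proc/swaps 2>/dev/null)
echo "zswap enabled: ${zswap_on}"
echo "zram swap devices: ${zram_swaps:-none}"
```

If the first line says Y and the second lists a /dev/zramN device, you are in the stacked configuration discussed below.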
2) The architecture difference that actually changes operations
zswap path
Pressure path, simplified:
- kernel decides a page should be swapped,
- zswap tries to compress and keep it in its in-memory pool,
- if the pool is full or the page is rejected/evicted, it goes to the backing swap device.
Operational meaning:
- zswap is primarily about reducing swap I/O,
- not eliminating the need for swap infrastructure,
- and not removing the possibility of eventual disk writes.
This is why zswap often makes sense on:
- laptops/desktops that already have disk swap,
- VMs sharing a congested storage path,
- SSD-backed systems where you want fewer swap writes,
- systems that still need hibernate-compatible disk swap.
zram path
Pressure path, simplified:
- kernel swaps a page to /dev/zramN,
- that page is compressed and stored in RAM-backed zram memory,
- the system effectively trades CPU for storing more cold pages in memory.
Operational meaning:
- zram is primarily about RAM efficiency under pressure,
- not about reducing disk swap writes to an existing swap device,
- and by default it keeps the whole swap path in memory rather than disk.
This is why zram often makes sense on:
- small-RAM laptops/desktops,
- developer machines that want responsiveness over deep disk swap,
- edge devices / SBCs,
- systems where swap-to-disk is undesirable or absent.
3) Do not casually stack zswap in front of zram swap
This is the most practical footgun.
If you configure zram as swap and also leave zswap enabled, zswap can act as a compressed cache in front of that zram swap device. That means you are effectively putting one RAM compression layer in front of another.
In practice, that usually means:
- more complexity in reclaim behavior,
- less transparent accounting,
- extra CPU work,
- and a less useful zram device because zswap intercepts swap traffic first.
For that reason, a good default rule is:
If your primary swap strategy is zram-as-swap, disable zswap.
There are edge cases where people experiment with combining them, but as an operational default it is harder to justify than either:
- zswap + normal disk swap, or
- zram swap alone.
If you cannot clearly explain why both layers are helping your workload, keep only one.
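That rule can be turned into a quick scripted guard. This is a sketch, assuming the standard sysfs parameter path; the remediation shown in the comments requires root:

```shell
# Warn if zswap is caching in front of a zram swap device.
zswap_state=$(cat /sys/module/zswap/parameters/enabled 2>/dev/null || echo "absent")
has_zram_swap=$(awk 'NR > 1 && $1 ~ /zram/' /proc/swaps 2>/dev/null)
if [ "$zswap_state" = "Y" ] && [ -n "$has_zram_swap" ]; then
    echo "WARNING: zswap is intercepting swap traffic ahead of zram"
    # Runtime disable (root):  echo 0 > /sys/module/zswap/parameters/enabled
    # Persistent disable: boot with zswap.enabled=0 on the kernel command line
else
    echo "OK: at most one compression layer active (zswap=${zswap_state})"
fi
```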
4) The decision matrix
Choose zram when
A) You want RAM extension before disk
This is the classic zram fit:
- memory pressure happens,
- you want compressed cold pages to stay in RAM,
- and disk swap either does not exist or is too slow / undesirable.
B) You care about desktop interactivity under moderate pressure
For many desktops and laptops, zram is a very good “soft landing” before the system starts thrashing on disk.
C) You do not need hibernation to that swap device
This matters because hibernating to swap on zram is not supported. If hibernation is a hard requirement, pure zram swap is usually the wrong core design.
D) You want a simple systemd-managed setup
zram-generator makes this operationally easy on many distros.
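As a concrete illustration, a minimal /etc/systemd/zram-generator.conf can be this short (key names follow the zram-generator README; the values are starting-point assumptions, not recommendations for every machine):

```ini
# /etc/systemd/zram-generator.conf
[zram0]
zram-size = ram / 2           # cap on uncompressed payload, not reserved RAM
compression-algorithm = zstd
swap-priority = 100           # take pressure before any disk swap
```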
Choose zswap when
A) You already have real swap and want to cut swap I/O
This is zswap’s home turf. It is especially attractive when backing swap exists for policy reasons, hibernation reasons, or operational compatibility.
B) You want a compressed buffer before slower swap storage
zswap can significantly reduce disk activity when many swapped pages remain cold enough to stay in the compressed pool.
C) You want to preserve “real swap exists” semantics
This matters for environments where:
- swap accounting assumptions already exist,
- operational tooling expects a real swap device,
- or reclaim needs a real place to evict cold pages eventually.
D) You are in overcommitted virtualized environments
The kernel docs explicitly call out overcommitted guests sharing I/O as a strong zswap use case, because compressed caching can reduce swap pressure on the shared storage layer.
Choose neither until tested when
A) You are running strict low-latency services
Compressed memory is not free:
- it burns CPU,
- adds reclaim-path complexity,
- and can mask a basic capacity problem.
For latency-sensitive servers, the right answer may be:
- better workload isolation,
- PSI/cgroup tuning,
- better memory budgeting,
- or earlier admission control,
before introducing compressed swap paths.
B) Your pressure is dominated by anonymous-memory explosions that should just fail fast
If the system should shed work or OOM specific cgroups instead of “trying harder,” compressed swap can hide policy problems.
5) A practical rule of thumb by machine type
Laptop / desktop, hibernation not required
Default starting point: zram
Why:
- simple,
- fast,
- keeps moderate pressure from turning immediately into disk-thrash,
- often improves perceived responsiveness.
Laptop / desktop, hibernation required
Default starting point: zswap + real disk swap
Why:
- you still have real swap infrastructure,
- you can reduce swap I/O while preserving hibernation-friendly architecture.
Small-RAM edge device / SBC
Default starting point: zram
Why:
- best chance to stretch limited RAM,
- avoids constant writes to cheap flash storage,
- usually simpler than maintaining large disk swap behavior.
General-purpose VM with disk-backed swap already present
Default starting point: zswap
Why:
- fewer changes to swap semantics,
- reduced swap I/O on shared or slower storage,
- easier to layer onto existing host/guest expectations.
Latency-sensitive server
Start conservative
If you experiment at all:
- prefer a controlled canary,
- keep blast radius small,
- watch CPU + PSI + tail latency together,
- and do not treat memory compression as a free optimization.
6) The two biggest tuning ideas people miss
A) For in-memory swap, swappiness can reasonably be >100
The kernel VM docs explicitly note that for in-memory swap like zram or zswap, as well as hybrids where swap random I/O is faster than filesystem I/O, values beyond 100 can make sense.
That surprises people who still think of swappiness as “0–100, and higher is crazy.” That mental model is out of date: on modern kernels the accepted range is 0–200.
Why this matters:
- if swap is effectively memory-compressed and fast,
- the kernel can prefer evicting colder anonymous pages sooner,
- leaving more room for page cache or hotter working-set memory.
This does not mean “set swappiness to 200 everywhere.” It means:
When swap is memory-like rather than disk-like, low swappiness is not automatically optimal.
For zram desktops, a higher swappiness is often worth testing.
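A hedged example of what “worth testing” could look like on a zram desktop; 180 is an illustrative value, not a recommendation, and it should be validated against your own pressure metrics:

```ini
# /etc/sysctl.d/99-zram-swappiness.conf
# With in-memory swap, values above 100 are legal and bias reclaim
# toward evicting cold anonymous pages earlier.
vm.swappiness = 180

# Apply without reboot:
#   sysctl --system
```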
B) Priority matters when zram coexists with disk swap
A common and useful pattern is:
- zram at higher swap priority,
- disk swap at lower priority.
That gives you:
- compressed-RAM swap first,
- slower disk swap second.
This can be a good compromise when you want:
- zram as the first pressure cushion,
- but still want deep spillover capacity on disk.
If you do this, keep zswap disabled unless you have a very deliberate reason otherwise.
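One way to express that ordering persistently is through swap priorities in /etc/fstab (the device paths here are illustrative; the higher pri value is used first):

```
# /etc/fstab: zram absorbs pressure first, disk swap is spillover
/dev/zram0  none  swap  defaults,pri=100  0 0
/swapfile   none  swap  defaults,pri=10   0 0
```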
7) zram-specific operator guidance
Good starting profile
A pragmatic zram starting point is usually:
- one zram swap device,
- size around RAM / 2 as a first pass,
- high swap priority,
- and explicit monitoring of real memory consumed vs apparent zram disk size.
Important nuance:
zram disksize is the maximum uncompressed payload size, not the physical RAM it will consume.
So a large zram size is not immediately fully reserved. But gigantic values are still not free: per-device metadata scales with disksize, and an oversized device invites expectations the compression ratio cannot deliver.
Kernel docs note that there is little point in sizing zram much beyond roughly what a plausible compression ratio can support. As a first-pass operational default, “half of RAM” is a sane start, not a universal law.
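The sizing arithmetic and the setup steps can be sketched as follows. The computation is safe to run anywhere; the commented commands need root and assume the zram module plus the standard /sys/block/zram0 control files:

```shell
# First-pass sizing: zram disksize = half of physical RAM.
ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
zram_bytes=$(( ram_kb * 1024 / 2 ))
echo "proposed zram0 disksize: ${zram_bytes} bytes"

# Actual setup (run as root), shown as comments to keep this side-effect free:
#   modprobe zram
#   echo zstd          > /sys/block/zram0/comp_algorithm
#   echo ${zram_bytes} > /sys/block/zram0/disksize
#   mkswap /dev/zram0
#   swapon --priority 100 /dev/zram0
```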
What to watch
The most useful zram checks are:
- zramctl summary: DATA = uncompressed data stored, COMPR = compressed payload size, TOTAL = actual RAM consumed including metadata/fragmentation.
- /sys/block/zramN/mm_stat: orig_data_size, compr_data_size, mem_used_total, same_pages, huge_pages.
Interpret those numbers correctly
If TOTAL keeps climbing close to your comfort limit while the compression ratio is poor, zram is not saving you much.
If huge_pages / incompressible pages are substantial, your workload may not be a good fit for aggressive zram expectations.
If same_pages is high, zram is getting unusually efficient wins from same-filled or highly repetitive pages.
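The ratio worth watching is uncompressed data stored versus RAM actually consumed. A small sketch, assuming the mm_stat field order from the kernel zram docs (orig_data_size first, mem_used_total third):

```shell
# Effective compression ratio: orig_data_size / mem_used_total.
# A ratio near 1.0 means zram is burning CPU for almost no memory win.
ratio_from_mm_stat() {
    # $1 = orig_data_size (bytes), $2 = mem_used_total (bytes)
    awk -v orig="$1" -v used="$2" \
        'BEGIN { if (used > 0) printf "%.2f\n", orig / used; else print "n/a" }'
}

# On a live system:
#   read -r orig compr used rest < /sys/block/zram0/mm_stat
#   ratio_from_mm_stat "$orig" "$used"
ratio_from_mm_stat $((4 * 1024 * 1024 * 1024)) $((1 * 1024 * 1024 * 1024))  # prints 4.00
```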
Useful advanced features
zram has some more serious capabilities than many operators realize:
- mem_limit to bound actual memory usage,
- writeback to a backing device for idle/incompressible pages,
- writeback budgets (writeback_limit) to control flash wear,
- recompression with secondary algorithms,
- and idle-page tracking.
This means zram is not just “tiny laptop trickery.” It can be a more configurable compressed-memory tier than people expect.
Important caution on writeback
The moment you enable zram writeback to backing storage, you are no longer in a pure “RAM-only compressed swap” world. Now you are operating:
- compression,
- a RAM block device,
- plus selective spillover to storage.
That can be useful, but it changes the durability/wear/latency story. If the backing device is flash, write budget and wear become first-class concerns.
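If you do go down that road deliberately, the control files involved look roughly like this (names per the kernel zram docs; the backing partition and budget are illustrative, and everything here needs root):

```shell
# Attach a backing device (must happen before disksize is set):
#   echo /dev/sdX1 > /sys/block/zram0/backing_dev
# Cap writeback to bound flash wear (budget is in 4K-page units):
#   echo 1     > /sys/block/zram0/writeback_limit_enable
#   echo 25600 > /sys/block/zram0/writeback_limit      # ~100 MiB budget
# Mark pages idle, then write back only the idle ones:
#   echo all  > /sys/block/zram0/idle
#   echo idle > /sys/block/zram0/writeback
```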
8) zswap-specific operator guidance
Good starting profile
A pragmatic zswap starting point is usually:
- keep a normal backing swap device,
- enable zswap,
- keep the pool bounded,
- and watch whether it actually reduces swap I/O instead of just adding CPU cost.
The kernel exposes key policy knobs such as:
- max_pool_percent = max RAM share for the compressed pool,
- compressor = active compression algorithm,
- accept_threshold_percent = hysteresis for when zswap starts accepting pages again after hitting the limit,
- optional cgroup control over zswap writeback behavior.
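Reading those knobs back is a short loop over sysfs. A sketch, assuming the standard module parameter path (on a box without the zswap module, each value simply reports as unavailable):

```shell
# Print current zswap policy knobs, one per line.
zswap_params() {
    local p f
    for p in enabled compressor max_pool_percent accept_threshold_percent; do
        f="/sys/module/zswap/parameters/$p"
        if [ -r "$f" ]; then
            printf '%s = %s\n' "$p" "$(cat "$f")"
        else
            printf '%s = (unavailable)\n' "$p"
        fi
    done
}
zswap_params
```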
Why hysteresis matters
This is an underappreciated detail. If zswap is allowed to refill too aggressively right after becoming full, pages can churn in and out of the compressed pool with little real benefit.
That is why the kernel added accept_threshold_percent:
- it gives the pool room before accepting pages again,
- reducing pointless flip-flop behavior under heavy sustained pressure.
Runtime enable/disable nuance
You can enable or disable zswap at runtime, but disabling it does not instantly flush all compressed pages out of the pool.
Stored pages remain until they are invalidated or faulted back in.
To force them all out, a swapoff on the backing swap device(s) is the hard reset path.
That matters during experiments:
“I turned it off” is not the same as “the system has fully returned to pre-zswap state.”
What to watch
At minimum, watch:
- swap I/O before vs after,
- zswap pool growth,
- rejection / writeback behavior,
- CPU cost of compression,
- and whether reclaim latency improved or worsened.
If zswap fills quickly and spends its life evicting to backing swap anyway, the benefit may be small.
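Several of these signals are exposed as debugfs counters. A sketch, assuming debugfs is mounted at its usual location and you are root (otherwise it degrades to a notice):

```shell
# Dump zswap debugfs counters such as stored_pages and written_back_pages.
# written_back_pages climbing fast relative to stored_pages suggests the
# pool is mostly a pass-through to backing swap rather than a useful cache.
zswap_stats() {
    local d=/sys/kernel/debug/zswap f
    if [ ! -d "$d" ]; then
        echo "zswap debugfs stats unavailable (not root, or zswap not active)"
        return 0
    fi
    for f in "$d"/*; do
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
zswap_stats
```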
9) Compression algorithm selection: do not over-romanticize it
People can spend too much time cargo-culting “best” algorithms. The correct framing is:
- faster algorithms = lower CPU overhead, worse compression ratio,
- denser algorithms = better memory savings, higher CPU overhead,
- best choice depends on workload compressibility and stall sensitivity.
For many real systems, the better question is not:
Which algorithm wins benchmark screenshots?
It is:
Does this algorithm improve pressure behavior enough to justify its CPU cost on this machine class?
As a rough intuition:
- lower-latency goals often prefer faster algorithms,
- tighter-memory devices may benefit from better compression,
- but only if CPU is not already the bottleneck.
For zram specifically, recompression and algorithm parameters can make this more nuanced than “pick one forever.”
10) Failure modes and bad interpretations
Failure mode A — treating compression as capacity instead of delay
Compressed swap is usually best thought of as a pressure absorber or delay mechanism, not true free memory. If the working set is simply too large, compression only postpones the problem.
Failure mode B — hiding a cgroup policy problem
If one workload should be constrained or killed earlier, compressed swap can make the box look healthier while masking the actual bad actor. Use it with cgroup memory policy, not instead of cgroup memory policy.
Failure mode C — CPU becomes the new bottleneck
Compression/decompression work is real. On CPU-starved machines, memory compression can shift the pain from I/O stalls to CPU reclaim stalls.
Failure mode D — bad expectations around hibernation
If you need hibernation, do not design around zram swap as though it were ordinary disk swap. It is not.
Failure mode E — combining zram and zswap because “more compression must be better”
Usually not. More layers can mean more CPU and harder-to-predict reclaim with little incremental win.
11) A sane rollout ladder
Stage 1 — Establish baseline pressure behavior
Before enabling anything, capture:
- PSI memory pressure,
- swap-in / swap-out rates,
- page-fault behavior,
- OOM frequency,
- application tail latency,
- and CPU headroom.
Without that, you cannot tell whether compression actually helped.
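A minimal baseline snapshot can be captured with stock procfs interfaces. This sketch assumes PSI is compiled in (usual on modern kernels) and degrades gracefully when it is not:

```shell
# Snapshot the pressure/swap numbers you will compare against later.
baseline() {
    echo "--- PSI memory ---"
    cat /proc/pressure/memory 2>/dev/null || echo "(PSI not available)"
    echo "--- swap activity since boot (pages) ---"
    awk '$1 == "pswpin" || $1 == "pswpout"' /proc/vmstat 2>/dev/null
    echo "--- active swap devices ---"
    swapon --show 2>/dev/null || echo "(none, or swapon unavailable)"
}
baseline
```

Redirect the output to a dated file before and after each change, so comparisons run against recorded numbers rather than memory.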
Stage 2 — Pick one strategy
Pick exactly one starting design:
- zram-first, or
- zswap + disk swap.
Do not start by stacking both.
Stage 3 — Canary on one machine class
Test on a clear cohort:
- one desktop image,
- one edge-device fleet,
- one VM pool,
- or one server canary.
Stage 4 — Tune only the obvious knobs first
For zram:
- size,
- priority,
- swappiness,
- optional mem_limit.
For zswap:
- enablement,
- compressor,
- max_pool_percent,
- maybe accept_threshold_percent.
Do not jump immediately into exotic multi-algorithm experiments.
Stage 5 — Judge by outcomes, not ideology
Success looks like:
- lower user-visible stalls,
- fewer deep swap-I/O events,
- lower OOM frequency where that matters,
- acceptable CPU overhead,
- and more predictable degradation under pressure.
If the machine still falls apart, only now with more CPU heat and more reclaim complexity, roll back.
12) Recommended defaults by intent
“I want my laptop to stay usable when RAM gets tight”
Start with:
- zram swap,
- high swap priority,
- moderately elevated swappiness,
- zswap disabled.
“I need hibernation and want less swap pain”
Start with:
- normal disk swap,
- zswap enabled,
- conservative pool limit,
- watch actual disk-write reduction.
“I run tiny devices and hate flash wear”
Start with:
- zram swap,
- no zswap,
- careful memory limit sizing,
- avoid writeback unless you truly need it.
“I run low-latency services and I’m not sure this is worth it”
Start with:
- no change by default,
- small canary only,
- strict measurement of CPU + PSI + p99 latency + reclaim behavior.
13) Bottom line
The practical answer is simpler than the internet often makes it:
- Use zram when you want compressed RAM to act as your first swap tier.
- Use zswap when you already have real swap and want a compressed buffer in front of it.
- Do not casually run both together when zram is your swap device.
- Treat swappiness and priority as part of the design, not afterthoughts.
- Judge success by pressure behavior, not by the fact that zramctl or /sys/module/zswap shows something enabled.
Memory compression is useful. But like most kernel knobs, it is most useful when the operator is clear about the actual bottleneck:
- RAM scarcity,
- disk swap pain,
- flash wear,
- latency sensitivity,
- or policy failures around memory containment.
Pick the tool that matches that bottleneck.
Sources and further reading
- Linux kernel docs: zswap (overview, design, pool limits, runtime enable/disable, compressor selection, hysteresis, cgroup writeback behavior)
- Linux kernel docs: zram block device (disksize vs memory usage, mem_limit, writeback, recompression, stats)
- Linux kernel VM sysctl docs: swappiness guidance, including values beyond 100 for in-memory swap scenarios
- systemd zram-generator README: systemd-native configuration workflow and defaults
- ArchWiki zram: practical operator notes on zram sizing, priorities, disabling zswap when using zram swap, and example VM tuning