B-tree vs LSM-tree Storage Engine Selection Playbook
Why this matters
Choosing a storage engine is less about benchmark screenshots and more about which pain you want to pay:
- B-tree families tend to pay more on random writes, but keep reads simpler.
- LSM families absorb writes efficiently, but push complexity into compaction, read amplification, and operational tuning.
If this choice is wrong, teams usually discover it late: under production write pressure, p99 latency incidents, and disk-cost surprises.
The core trade-off triangle
A practical way to reason about both engines:
- Write amplification (WA): how many physical writes per logical write
- Read amplification (RA): how many structures/pages/SSTables touched per read
- Space amplification (SA): how much extra storage overhead you carry
Improving one usually worsens at least one of the others.
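The three quantities are just ratios over counters most engines already expose. A minimal sketch (the counter names here are hypothetical; real engines expose equivalents, e.g. RocksDB statistics or InnoDB status variables):

```python
def write_amplification(physical_bytes_written: int, logical_bytes_written: int) -> float:
    """WA: physical bytes the device absorbs per logical byte the app wrote."""
    return physical_bytes_written / logical_bytes_written

def space_amplification(bytes_on_disk: int, live_data_bytes: int) -> float:
    """SA: total storage consumed divided by live (non-obsolete) data."""
    return bytes_on_disk / live_data_bytes

# RA is usually tracked directly as a count: structures/pages/SSTables
# consulted per read, so no ratio is needed.

# Example: the app wrote 10 GB but the device absorbed 60 GB -> WA = 6.0
print(write_amplification(60 * 2**30, 10 * 2**30))  # 6.0
```

Tracking these as trends, not snapshots, is what makes the trade-off triangle actionable.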
B-tree profile (InnoDB-like mental model)
Strengths
- Strong point-read and short range-scan behavior when the working set is well indexed.
- Simpler read path under steady state.
- Operationally easier to reason about for many OLTP workloads.
Costs
- Random-write heavy workloads cause page splits and frequent rewrites.
- Write path can become expensive under very high ingest.
- Hot pages and index maintenance pressure can dominate tail latency.
Typical fit
- Moderate write rate + strict point lookup latency
- Transaction-heavy workloads with strong secondary index usage
- Teams that want predictable operational behavior over max ingest throughput
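The random-write cost above can be made concrete with a toy dirty-page model (page and row sizes are illustrative, not tuned to any real engine): a dirtied 16 KB page must eventually be flushed whole, so scattered single-row updates pay a full page each, while append-like writes coalesce many rows into one page flush.

```python
import random

PAGE_SIZE = 16 * 1024            # InnoDB-like page size
ROW_SIZE = 100                   # hypothetical row size
ROWS_PER_PAGE = PAGE_SIZE // ROW_SIZE
N_ROWS = 10_000
TABLE_PAGES = 100_000            # pages the table spans

def flushed_bytes(page_ids):
    """A dirtied page costs one full-page flush, however few rows changed."""
    return len(set(page_ids)) * PAGE_SIZE

seq = [i // ROWS_PER_PAGE for i in range(N_ROWS)]              # append-like inserts
random.seed(0)
rand = [random.randrange(TABLE_PAGES) for _ in range(N_ROWS)]  # scattered updates

logical = N_ROWS * ROW_SIZE
print(flushed_bytes(seq) / logical)   # ~1x: rows coalesce within pages
print(flushed_bytes(rand) / logical)  # ~150x: nearly every row dirties its own page
```

The model ignores redo logging and page splits, but it shows why the same logical write rate can cost two orders of magnitude more physical I/O when the key pattern is random.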
LSM profile (RocksDB/LevelDB-like mental model)
Strengths
- High write ingest: writes land as sequential appends (WAL + memtable), with flush and compaction deferred to the background.
- Good for write-heavy or bursty ingestion workloads.
- Flexible tuning space for throughput vs latency vs cost.
Costs
- Compaction debt can create p99/p999 latency spikes.
- Read path may consult multiple levels/SSTables (especially under stale tombstones or poor compaction state).
- Operational complexity is materially higher (many knobs, many failure modes).
Typical fit
- High-ingest event streams/time-series/log-style workloads
- Systems where absorbing writes under heavy load is the primary requirement
- Teams ready to own compaction observability and tuning discipline
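LSM write amplification itself can be estimated up front. A common back-of-envelope for leveled compaction is WA ≈ (number of levels) × fanout, plus small constants for the WAL and the memtable flush. The function below is that approximation only, not a measurement, and the default L1 size and fanout are placeholder values:

```python
import math

def estimated_write_amp(db_bytes: int, l1_bytes: int = 256 * 2**20, fanout: int = 10) -> int:
    """Back-of-envelope WA for leveled compaction: a byte is rewritten roughly
    `fanout` times per level it passes through, plus 1x WAL and 1x flush."""
    levels = max(1, math.ceil(math.log(db_bytes / l1_bytes, fanout)) + 1)
    return 2 + levels * fanout

print(estimated_write_amp(1 * 2**40))  # ~52 for a 1 TB database
```

Comparing that number against the B-tree's dirty-page cost for your actual key pattern is often more informative than any generic benchmark.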
Decision matrix (quick version)
Choose B-tree-first when:
- Point reads dominate and must stay stable
- Write rate is significant but not extreme
- Operational simplicity is a top requirement
Choose LSM-first when:
- Write amplification cost in B-tree becomes unacceptable
- Ingest spikes are frequent and large
- You can invest in compaction tuning + SLO-aware runbooks
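The quick matrix can be encoded as a starting-point heuristic. This is deliberately coarse (three yes/no judgments, all of them team-specific assumptions) and yields a candidate to prototype first, not a verdict:

```python
def first_candidate(point_read_slo_strict: bool,
                    extreme_or_bursty_ingest: bool,
                    can_own_compaction_ops: bool) -> str:
    """Starting point derived from the quick matrix, not a final decision."""
    if extreme_or_bursty_ingest and can_own_compaction_ops:
        return "lsm-first"
    if point_read_slo_strict or not can_own_compaction_ops:
        return "btree-first"
    return "prototype-both"

print(first_candidate(point_read_slo_strict=True,
                      extreme_or_bursty_ingest=False,
                      can_own_compaction_ops=False))  # btree-first
```

Note the asymmetry: lacking compaction ownership vetoes LSM-first even under high ingest, which matches the operational emphasis above.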
Production signals to watch
If you run B-tree
- Page split rate
- Buffer pool hit ratio by table/index
- Secondary index maintenance cost
- Checkpoint and flush pressure
If you run LSM
- Pending compaction bytes / compaction backlog
- Read amplification trend by key access pattern
- Tombstone density and stale-data retention
- Write stall events and stall duration
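These signals only help if they are wired into SLO-aware checks. A sketch for the LSM side (the metric names and thresholds are placeholders to replace with your exporter's real counters, e.g. RocksDB's pending-compaction-bytes statistic):

```python
def lsm_health(metrics: dict) -> list:
    """Return alert strings for an LSM metrics snapshot. All thresholds
    are placeholder budgets; calibrate them against your own SLOs."""
    alerts = []
    if metrics["pending_compaction_bytes"] > 64 * 2**30:
        alerts.append("compaction backlog: risk of write stalls")
    if metrics["write_stall_seconds_1h"] > 0:
        alerts.append("write stalls already occurring")
    if metrics["sstables_per_read_p99"] > 8:
        alerts.append("read amplification drifting up")
    return alerts

print(lsm_health({"pending_compaction_bytes": 80 * 2**30,
                  "write_stall_seconds_1h": 0,
                  "sstables_per_read_p99": 4}))
```

The B-tree side deserves the same treatment with page-split rate and checkpoint pressure in place of compaction backlog.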
Practical anti-patterns
“LSM is always faster for writes”
- True only until compaction debt catches up.
“B-tree is old, so it’s slower”
- For many real OLTP shapes, B-tree gives cleaner latency.
Benchmarking only average throughput
- p95/p99 under mixed read/write + maintenance windows decides production reality.
Ignoring deletion semantics in LSM
- Tombstones are not free; poor lifecycle handling quietly taxes reads and storage.
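The tombstone tax is easiest to see in a toy merged-iterator view, where each entry is a (key, value) pair and value None marks a tombstone. The classic queue-like anti-pattern (consume from the head, delete, repeat) piles deletes in front of live data:

```python
def entries_examined_for_first_live(entries):
    """How many entries a scan must skim before it can return one live row."""
    for examined, (_key, value) in enumerate(entries, start=1):
        if value is not None:
            return examined
    return len(entries)

# 10,000 tombstones queued ahead of a single live row:
log = [(k, None) for k in range(10_000)] + [(10_000, b"live")]
print(entries_examined_for_first_live(log))  # 10001
```

Until compaction drops those tombstones (which itself waits on grace periods in some systems), every head-of-range read pays this skim cost.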
A pragmatic evaluation protocol
Before final selection, test both engines with a workload harness that includes:
- Real key distribution (not purely uniform)
- Realistic read/write mix and burst profile
- TTL/delete patterns if applicable
- Tail-latency SLO checks (p95/p99/p999)
- Long-run test horizon to expose compaction/checkpoint cycles
Run each candidate long enough for background maintenance behavior to appear. Short tests mostly measure cache warmth, not operational truth.
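A harness can start from a skeleton like this. The engine interface (get/put), key-space size, and Zipf skew are all assumptions to replace with an adapter around each real candidate and your measured traffic model; the long-run requirement still means running it for hours, not the seconds shown here:

```python
import random
import time

def zipf_like_keys(n_keys: int, skew: float, count: int, seed: int = 0) -> list:
    """Skewed key stream via explicit Zipf-style weights (illustrative only)."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** skew) for rank in range(1, n_keys + 1)]
    return rng.choices(range(n_keys), weights=weights, k=count)

def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile over a sample list."""
    ordered = sorted(samples)
    return ordered[min(len(ordered) - 1, int(len(ordered) * p))]

def run_mixed_workload(engine, ops: int = 10_000, read_ratio: float = 0.8,
                       seed: int = 0) -> dict:
    """`engine` is any adapter exposing get(key) and put(key, value);
    wrap each candidate storage engine behind this interface."""
    rng = random.Random(seed)
    latencies = []
    for key in zipf_like_keys(1_000, skew=1.1, count=ops, seed=seed):
        start = time.perf_counter()
        if rng.random() < read_ratio:
            engine.get(key)
        else:
            engine.put(key, b"x" * 100)
        latencies.append(time.perf_counter() - start)
    return {"p95": percentile(latencies, 0.95),
            "p99": percentile(latencies, 0.99),
            "p999": percentile(latencies, 0.999)}
```

Add delete/TTL phases and forced maintenance windows (compaction triggers, checkpoints) before trusting any tail-latency numbers it produces.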
Bottom line
- B-tree: usually simpler and stronger for stable read-latency OLTP.
- LSM: usually better for sustained high write ingest, but only with mature compaction operations.
The right question is not “which engine is better?”
The right question is: “Which failure mode can we detect early and operate reliably at 2 a.m.?”