SQLite Replication Selection Playbook: LiteFS vs Litestream
Date: 2026-04-12
Category: knowledge
Domain: software / sqlite / distributed systems
Why this matters
A lot of teams want SQLite because it is fast, boring, embeddable, and cheap to operate—and then immediately run into the same question:
How do I make it survive machine loss, region loss, or multi-instance deployment without accidentally rebuilding a bad database cluster?
Two of the most interesting answers in the SQLite ecosystem are:
- Litestream = stream SQLite changes to replica storage for backup, restore, point-in-time recovery, and read-only follower patterns.
- LiteFS = replicate SQLite live across machines so every app node can keep a local copy, while one primary handles writes.
They are related, but they are not interchangeable.
If you pick the wrong one, you usually get one of these failure modes:
- you use LiteFS for a simple single-node app and inherit unnecessary distributed-systems complexity,
- you use Litestream for a multi-node app and later discover it does not give you automatic primary failover,
- you expect either tool to behave like a synchronous consensus database,
- or you confuse “read replica” with “authoritative failover path” and design an unpleasant recovery story.
The clean split is:
- Litestream is primarily about disaster recovery and replica storage.
- LiteFS is primarily about live multi-node replication for SQLite apps.
That one sentence is the shortcut.
1) Fast mental model
Litestream
Think of Litestream as:
- a background replication/backup process for SQLite,
- built around SQLite WAL replication,
- shipping snapshots + incremental changes to object/file storage,
- optimized for restoreability,
- and useful when you want SQLite to survive host failure without standing up Postgres.
Good default fit:
- single-server SQLite apps,
- low-ops SaaS or internal tools,
- edge/VPS deployments with object storage,
- backup + point-in-time recovery,
- warm read-only followers for analytics or reporting.
LiteFS
Think of LiteFS as:
- a distributed filesystem layer for SQLite,
- sitting in front of SQLite through FUSE,
- capturing committed transactions and replicating them live,
- keeping a current primary plus read-only replicas,
- and optimizing for multi-node local reads + automatic failover patterns.
Good default fit:
- one SQLite app deployed on multiple machines,
- apps that want local disk reads on every node,
- multi-region read distribution with a single write leader,
- small-to-medium write rates with strong single-writer discipline,
- “I want SQLite, but on more than one box.”
The core difference:
- Litestream protects the database.
- LiteFS distributes the database.
2) The decisive architectural question
The first question is not:
“Do I need replication?”
It is:
“Am I solving disaster recovery, or am I solving live multi-node serving?”
Pick Litestream when the center of gravity is recovery
Choose Litestream when you mainly want:
- durable off-host copies,
- point-in-time restore,
- very low operational complexity,
- a single writer node,
- or backup/restore workflows that feel more like storage engineering than cluster management.
Typical examples:
- a Rails, Node, Go, or Python app with one active instance,
- a small SaaS on a VPS,
- a Cloud VM that should be recoverable to object storage,
- cron/sidecar backup of SQLite without app changes,
- analytics snapshots restored elsewhere.
Pick LiteFS when the center of gravity is live topology
Choose LiteFS when you mainly want:
- several app instances with the same SQLite database,
- reads served from local disk on every node,
- one elected primary for writes,
- automatic promotion/handoff behavior,
- and an app topology that behaves more like streaming replication than backup shipping.
Typical examples:
- a Fly.io-style globally distributed app,
- an app with regional replicas serving local reads,
- a system where failover time matters more than perfect simplicity,
- a deployment where each instance should keep a hot local DB copy.
3) How Litestream actually works
Litestream continuously copies SQLite changes to replica storage. Historically people think of it as “WAL shipping,” which is directionally right, but the operationally important idea is:
- SQLite writes changes to the WAL,
- Litestream prevents ordinary checkpointing from breaking its replication chain,
- it maintains a shadow WAL / snapshot lineage,
- and restore works by combining a snapshot with subsequent change files.
Important operational traits:
- asynchronous replication,
- single active writer is the normal mental model,
- replicas usually live in object or remote storage,
- restore can target latest, TXID, or timestamp,
- retention policy determines how much time-travel window you really have.
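The snapshot-plus-changes restore model can be sketched as a toy. This illustrates the concept only; Litestream's real artifacts are snapshots plus WAL/LTX segment files, and the page/transaction structure here is invented for clarity:

```python
# Toy model of restore: a full snapshot plus an ordered chain of
# incremental change sets, replayed up to an optional target transaction.

def restore(snapshot, changes, target_txid=None):
    """Rebuild page state from a snapshot dict plus (txid, {page: data}) changes."""
    pages = dict(snapshot)
    for txid, delta in sorted(changes, key=lambda c: c[0]):
        if target_txid is not None and txid > target_txid:
            break  # point-in-time restore: stop before later transactions
        pages.update(delta)
    return pages

snapshot = {1: "a0", 2: "b0"}
changes = [(101, {2: "b1"}), (102, {3: "c1"}), (103, {1: "a1"})]

latest = restore(snapshot, changes)          # full replay
as_of_102 = restore(snapshot, changes, 102)  # time-travel to txid 102
```

Note how the time-travel window falls directly out of retention: you can only replay to points whose snapshot and change files still exist.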
Recent Litestream docs also make two useful points that are easy to miss:
- follow mode can keep a restored DB continuously updated as a read-only follower,
- the newer VFS replica path can serve read-only queries directly from replica storage without restoring the full DB to local disk first.
That means Litestream is no longer just “backup and pray later.” It can support read-only workflows more elegantly than older summaries imply.
But the important limit remains:
Litestream does not turn SQLite into an automatically failing-over primary/replica cluster.
4) How LiteFS actually works
LiteFS sits between SQLite and the underlying filesystem using FUSE. It intercepts SQLite file operations, captures committed page changes, packages them as LTX transactions, and replicates those transactions to other nodes.
Important operational traits:
- one node is the primary,
- replicas are read-only copies,
- reads are local on each node,
- write routing must go to the current primary,
- lease management determines who may be primary,
- replication is asynchronous.
LiteFS also tracks a replication position using TXID + checksum, which helps detect divergence and recover from split-brain-style drift by resnapshotting from the current authoritative primary.
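That position-tracking idea can be sketched as a toy. The checksum math here is an illustrative CRC fold, not LiteFS's actual per-database checksum, and the class names are invented:

```python
import zlib

class Position:
    """Replication position: transaction id plus a running checksum."""
    def __init__(self):
        self.txid = 0
        self.checksum = 0

    def apply(self, payload: bytes):
        self.txid += 1
        # toy running checksum: fold each transaction's CRC into the state
        self.checksum ^= zlib.crc32(payload)

def diverged(primary: Position, replica: Position) -> bool:
    # Same txid but different checksum means the replica drifted and
    # must resnapshot from the current primary.
    return primary.txid == replica.txid and primary.checksum != replica.checksum

p, r = Position(), Position()
for tx in [b"tx1", b"tx2"]:
    p.apply(tx)
    r.apply(tx)

r2 = Position()
for tx in [b"tx1", b"txX"]:  # replica applied a different transaction
    r2.apply(tx)
```

The useful property is that divergence is detectable from two small numbers, without comparing whole database files.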
The design is elegant, but it has real consequences:
- it is still a single-writer system,
- asynchronous replication means there is still a small data-loss window on catastrophic primary failure,
- failover is simpler than running a full database server cluster, but it is still real cluster behavior,
- and FUSE introduces throughput/latency tradeoffs that matter for write-heavy workloads.
The LiteFS docs explicitly call out a rough write-throughput ceiling around ~100 transactions/sec from the FUSE approach, which is a huge clue:
LiteFS is for “SQLite but distributed,” not “SQLite but suddenly a high-write consensus database.”
5) The most practical comparison
A) If you have one app instance
Use Litestream by default.
Why:
- fewer moving parts,
- no lease backend,
- no write forwarding logic,
- excellent DR story,
- point-in-time restore is straightforward.
This is the most common “don’t get cute” answer.
If the app is single-writer and your real problem is “I need backups that aren’t on the same disk,” Litestream is the clean winner.
B) If you have several app instances and want local reads everywhere
Use LiteFS.
Why:
- every node gets a local copy,
- reads avoid a remote DB hop,
- the cluster can hand off primacy,
- the model matches “one writer, many readers.”
This is where Litestream starts to feel stretched. It can support read-only replica patterns, but not the same live primary/replica serving model.
C) If you need automatic failover of the write node
Use LiteFS, but be honest about the semantics.
LiteFS can do primary handoff/election via Consul-based lease management. That gives you a real failover path, but not synchronous quorum durability.
So the promise is:
- better write availability than single-node Litestream,
- not zero-RPO consensus durability.
If you need “a committed write survives leader loss unless quorum is gone,” you are in rqlite / dqlite / Postgres / another client-server DB territory.
D) If you want the smallest operational surface area
Use Litestream.
The object-storage replication story is easier to reason about than FUSE mounts, lease coordination, primary routing, and replica-consistency behavior.
6) Selection table
| Situation | Better fit | Why |
|---|---|---|
| Single VM / single container SQLite app | Litestream | DR without cluster complexity |
| Simple SQLite app with S3/R2/GCS backups | Litestream | Strong backup/restore ergonomics |
| Need PITR / restore by timestamp | Litestream | Built around snapshot + WAL/LTX recovery |
| Multi-node deployment with one write leader | LiteFS | Live replication across instances |
| Multi-region app serving mostly local reads | LiteFS | Local disk reads on replicas |
| Need automatic primary failover | LiteFS | Lease/election model supports it |
| Need multi-writer cluster | Neither | SQLite single-writer model still rules |
| Need synchronous quorum durability | Neither | Use Raft-based or client/server DB |
| Need read-only analytics off replica storage | Litestream | Restore/follow mode or VFS can work well |
| Very write-heavy workload | Usually neither | LiteFS FUSE overhead and SQLite single-writer limits become painful |
7) Litestream’s strengths in real life
A) Disaster recovery is the product, not a side effect
Litestream is excellent when your top concern is:
“If this box disappears, how fast and how confidently can I reconstruct the DB?”
That maps well to:
- object storage replicas,
- timestamp-based recovery,
- idempotent bootstrap scripts,
- cold/warm standby workflows,
- dev/staging restore from prod backups.
The newer restore docs are especially nice operationally because they support:
- `-timestamp` point-in-time restore,
- `-txid` restore,
- `-f` follow mode for continuously updated read-only copies,
- flags like `-if-db-not-exists` and `-if-replica-exists` for idempotent startup automation.
That last bit matters more than it sounds:
good restore tooling reduces the amount of shell duct tape your deployment has to invent.
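A minimal bootstrap sketch, assuming the idempotency flags from the restore docs mentioned above; the bucket URL and paths are placeholders, and actually running the command requires `litestream` on the host:

```python
def restore_command(db_path: str, replica_url: str) -> list[str]:
    # Idempotency flags from the litestream restore docs:
    #   -if-db-not-exists: no-op when a local DB already exists
    #   -if-replica-exists: no-op on a fresh deployment with no backup yet
    return [
        "litestream", "restore",
        "-if-db-not-exists",
        "-if-replica-exists",
        "-o", db_path,
        replica_url,
    ]

cmd = restore_command("/data/app.db", "s3://my-bucket/app.db")  # placeholder bucket
# subprocess.run(cmd, check=True)  # run before starting the app; safe on every boot
```

Because the same command is a no-op when nothing needs restoring, it can run unconditionally in the startup path.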
B) It fits the “single primary app” shape naturally
Many teams do not actually need multi-node writes or local replicas on every server. They need:
- one cheap app node,
- one SQLite DB,
- reliable backups,
- simple recovery.
Litestream is almost purpose-built for this.
C) Read-only offload is possible without pretending it is HA
Follow mode and the VFS path open useful read-only patterns:
- dashboards,
- reporting,
- ad-hoc queries,
- backup validation,
- ephemeral analysis nodes.
That is valuable because it lets you separate serving writes from reading historical/replicated state without demanding a full database migration.
8) LiteFS’s strengths in real life
A) Local reads on every app node are the whole game
When you deploy an app globally, a remote database hop can dominate latency. LiteFS’s magic trick is simple but powerful:
- keep the DB local on each node,
- keep reads local,
- send writes to the elected primary.
This can feel dramatically better than remote-DB patterns for read-heavy web apps.
B) The built-in proxy solves a real class of pain
LiteFS includes an HTTP proxy that can:
- forward write requests to the primary,
- and help avoid reading stale data immediately after a write.
That sounds small, but it closes an annoying consistency gap for normal web applications.
The catch is that your app needs to respect the proxy’s assumptions:
- GET must not perform writes,
- clients should support cookies,
- and your traffic shape needs to look like a reasonably well-behaved web app.
If your app violates REST semantics or uses weird side effects on reads, the proxy becomes less helpful.
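The rule the proxy relies on can be stated in a few lines. This is a sketch of the contract your app is agreeing to, not the proxy's actual forwarding mechanics:

```python
# Methods that are assumed never to write may be served from the local
# replica; everything else must reach the current primary.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}

def route(method: str, is_primary: bool) -> str:
    if is_primary or method.upper() in SAFE_METHODS:
        return "local"    # serve from this node's local SQLite copy
    return "forward"      # send the write to the current primary
```

The whole scheme collapses if a GET handler sneaks in a write, which is exactly the failure mode described above.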
C) It gives you a real failover story while staying in SQLite land
LiteFS is compelling because it lets teams push SQLite farther before graduating to a client/server DB.
That can be the right choice when:
- the working set is still SQLite-sized,
- writes are moderate,
- read latency matters globally,
- and you want to postpone the complexity cost of Postgres/MySQL clusters.
9) The traps and failure modes
Litestream traps
1) Expecting HA when you only bought DR
This is the big one. Litestream gives you recovery, not automatic live-primary election.
If your app truly needs:
- fast automatic write failover,
- hot replicas that become primary without a restore/promote flow,
- multi-node serving as a first-class topology,
then Litestream alone is the wrong abstraction.
2) Retention that looks fine until you actually restore
Point-in-time recovery is only as good as:
- snapshot cadence,
- WAL/LTX retention,
- replica completeness,
- and restore drills.
If you never test restores, you do not have a DR system. You have a storage bill.
3) Misusing VFS for heavy analytics
The VFS feature is cool, but it is still read-only and network-bound. It is great for:
- moderate read fan-out,
- validation,
- lightweight dashboards,
- ad-hoc queries.
It is not the first thing I would choose for:
- huge scans,
- long-running analytical transactions during heavy write churn,
- or “let’s make object storage feel like OLAP.”
LiteFS traps
1) Forgetting it is still asynchronous
LiteFS can fail over, but asynchronous replication means a just-committed write can still be lost if the primary dies before replicas receive it.
That is not a bug. That is the durability tradeoff.
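The loss window can be made concrete with a toy, where the primary acknowledges commits immediately and a crash strands anything still in flight:

```python
primary_log = []
replica_log = []

def commit(tx, in_flight=False):
    primary_log.append(tx)        # ack'd to the client right away
    if not in_flight:
        replica_log.append(tx)    # replica only has it once it arrives

commit("tx1")
commit("tx2", in_flight=True)     # primary dies before this ships

# after failover, the promoted replica is missing tx2
lost = [tx for tx in primary_log if tx not in replica_log]
```

Sizing that window (replication lag under your real write rate) is the honest way to decide whether the tradeoff is acceptable.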
2) Underestimating write-path constraints
FUSE is clever, but it is not free. If your workload is chatty, write-heavy, or filled with tiny high-frequency transactions, LiteFS may become the thing you fight.
3) Pretending single-writer discipline will disappear
LiteFS does not change SQLite’s fundamental single-writer shape. It operationalizes it across nodes. That is very useful—but it is not multi-primary magic.
4) Ignoring lease/primary-routing details
If the application has no clean strategy for:
- finding the primary,
- routing writes,
- handling short failover windows,
- and avoiding stale read-after-write behavior,
then deployment pain will leak into the app.
10) A simple decision framework
Use this in order.
Choose Litestream if most of these are true
- I have one active write node.
- My main fear is disk/host loss, not live cluster failover.
- I want S3/R2/GCS-style replicas.
- I want point-in-time restore.
- I want the simplest ops story possible.
- I do not need automatic writer election.
Choose LiteFS if most of these are true
- I want multiple app nodes to share one SQLite database.
- I want reads to be local on each node.
- I can live with one write leader.
- I want automatic primary handoff/election.
- My write rate is modest enough for SQLite + FUSE realities.
- I am willing to own some distributed-systems behavior.
Choose neither if most of these are true
- I need high write throughput.
- I need synchronous replication / quorum commit.
- I need multi-writer behavior.
- I need mature DBA-grade tooling for a growing team.
- I already know this app is graduating out of SQLite.
At that point, stop being sentimental and use a client/server database.
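The checklists above can be collapsed into a rough first-pass function. The booleans mirror the bullets; treat it as a heuristic, not a verdict:

```python
def pick(single_writer_ok: bool, multi_node_local_reads: bool,
         needs_auto_failover: bool, needs_quorum_or_multi_writer: bool,
         write_heavy: bool) -> str:
    if needs_quorum_or_multi_writer or write_heavy:
        return "client/server DB"       # outgrowing SQLite's shape
    if multi_node_local_reads or needs_auto_failover:
        return "LiteFS"                 # live topology is the problem
    if single_writer_ok:
        return "Litestream"             # recovery is the problem
    return "client/server DB"
```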
11) Recommended deployment patterns
Pattern A: Single-node app, cheap and solid
Use:
- app + SQLite on one machine,
- Litestream replicating to object storage,
- regular restore drill in CI or staging,
- optional periodic full export for belt-and-suspenders recovery.
This is the default answer for a surprising number of real products.
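The restore drill can be as small as: restore to a scratch path, then run `PRAGMA integrity_check` before trusting the backup. A plain file copy stands in for the actual `litestream restore` step here:

```python
import os
import shutil
import sqlite3
import tempfile

def verify_restored(db_path: str) -> bool:
    """True if SQLite's integrity check passes on the restored file."""
    con = sqlite3.connect(db_path)
    try:
        (result,) = con.execute("PRAGMA integrity_check").fetchone()
        return result == "ok"
    finally:
        con.close()

workdir = tempfile.mkdtemp()
prod = os.path.join(workdir, "prod.db")
con = sqlite3.connect(prod)
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
con.execute("INSERT INTO t (v) VALUES ('hello')")
con.commit()
con.close()

restored = os.path.join(workdir, "restored.db")
shutil.copy(prod, restored)   # stand-in for `litestream restore -o restored.db ...`
ok = verify_restored(restored)
```

Running this on a schedule against the real replica is what turns "we have backups" into "we have recovery."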
Pattern B: Primary app node + DR standby
Use:
- one active app node,
- Litestream to replica storage,
- a bootstrap process that restores to standby on demand,
- documented promote/runbook for failover.
This is not fully automatic HA, but it is often plenty.
Pattern C: Multi-region read-heavy app on SQLite
Use:
- LiteFS across multiple nodes,
- a clear primary region,
- built-in proxy or equivalent write-routing logic,
- backup of LiteFS data path as a separate DR concern.
LiteFS solves live replication. You still need a backup story.
Pattern D: Read-only analytics from production SQLite
Use:
- Litestream replication to object storage,
- `restore -f` or a VFS-based read-only follower,
- query isolation away from the write-serving path.
This is a nice way to offload reads without forcing a full database migration.
12) My opinionated default
If you are unsure:
- start with Litestream for single-instance deployments,
- move to LiteFS only when you actually need multi-node local-read serving,
- and skip both if your write path is obviously outgrowing SQLite.
That bias exists because:
- Litestream fails in a simpler way,
- its recovery model is easier to reason about,
- and many teams overestimate how much “distributed SQLite” they really need.
LiteFS is impressive and real, but it should be chosen because the topology demands it, not because it sounds cool.
13) Bottom line
If you remember only one thing, remember this:
- Litestream is the right answer when your main problem is recoverability.
- LiteFS is the right answer when your main problem is live multi-node serving with one writer.
- Neither is the right answer when your main problem is high-write consensus-grade database behavior.
SQLite can go farther than many people think. But the smartest move is usually not “how far can I stretch it?” It is:
“Which failure mode am I actually buying, and am I buying it on purpose?”
References
- Fly.io Docs — LiteFS FAQ
- Fly.io Docs — How LiteFS Works
- Fly.io Docs — LiteFS Config Reference
- Fly.io Docs — Getting Started with LiteFS on Fly.io
- Litestream Docs — How It Works
- Litestream Docs — Alternatives
- Litestream Docs — `restore` command reference
- Litestream Docs — VFS read replicas guide