E-Graphs + Equality Saturation: Optimizer Design Playbook

Date: 2026-03-08
Category: knowledge
Domain: software / compiler engineering / optimization systems

Why this matters

Traditional optimizers are usually rewrite pipelines:

pass A rewrites program,
pass B sees only A’s output,
pass ordering becomes destiny.

This creates the classic phase-ordering problem: good rewrites can block later better rewrites depending on sequence.

Equality saturation (EqSat) changes the shape of the problem:

represent many equivalent programs at once,
apply rewrites without immediately committing to one path,
extract the best candidate at the end using a cost model.

If you build query planners, ML graph optimizers, symbolic simplifiers, or codegen superoptimizers, this is often a better architecture than hand-tuned pass ordering.

Core mental model

1) E-graph = compact set of equivalent expressions

An e-graph stores expressions by equivalence class:

e-node: one operator application (Add(x,y), Mul(a,b), ...)
e-class: set of e-nodes known equivalent

With congruence closure, if children are equivalent, parent expressions can become equivalent too.

2) EqSat loop

Start from initial expression/program.
Repeatedly apply rewrite rules to add equal forms into the e-graph.
Stop by budget/saturation limit.
Run extraction: choose best representative from root e-class under objective.

So optimization is no longer “which rewrite first?” but “what search space + what objective?”

Why modern EqSat became practical

The egg work (POPL 2021) pushed EqSat from theory into an engineering-friendly toolkit:

rebuilding: amortized invariant restoration for speed
e-class analyses: attach domain analyses (e.g., type/range/constant info) directly in e-graph workflow

This matters because real optimizers need more than purely syntactic rewrites; they need semantic side conditions and analysis-aware pruning.

Where this works well (with evidence)

A) Tensor graph superoptimization (Tensat)

EqSat was used to optimize DL tensor graphs and reported:

up to 16% runtime speedup over prior SOTA,
about 48x less optimization time on average (vs compared approaches in that paper).

Takeaway: EqSat is not only elegant; it can win on both quality and compile-time in graph-optimization regimes.

B) DSP vectorization (Diospyros)

Diospyros pipeline uses:

symbolic specification,
EqSat rewrite engine,
extraction to DSP intrinsic code.

Takeaway: EqSat is viable for low-level performance codegen, not just algebra simplification toys.

C) Datalog + EqSat unification (egglog)

egglog shows a useful direction when your optimizer needs:

incremental execution,
cooperating analyses,
lattice/fixpoint reasoning,
plus EqSat extraction.

Takeaway: for long-running or continuously updated optimization workloads, fixpoint-style systems can reduce ad-hoc glue code.

Practical architecture blueprint

1) IR boundary first

Before rewrites, decide the optimization IR:

expression IR (math DSL, query algebra, tensor ops)
graph IR (DAG-like compute graph)
typed vs untyped nodes

Bad IR design sinks EqSat projects faster than bad rewrite rules.

2) Rule system with explicit safety contracts

Each rule should carry:

pattern / replacement
side conditions (types, no-overflow assumptions, non-zero divisors, monotonicity domains, etc.)
validation tests (property-based and differential)

Do not put semantic assumptions only in comments.

3) Saturation controls

Use hard budgets from day one:

max iterations
max e-nodes / e-classes
per-rule fire limits
timeouts

EqSat can explode combinatorially; budget controls are product features, not temporary hacks.

4) Extraction objective

Use a multi-term cost function, e.g.:

[ J = w_1 \cdot \text{latency} + w_2 \cdot \text{memory} + w_3 \cdot \text{code size} + w_4 \cdot \text{risk penalty} ]

Where risk penalty can encode numerical stability loss, portability loss, or backend-specific hazards.

5) Explainability artifacts

Keep optimization explainable:

winning expression/program
top-k near-optimal candidates
rule-trace summary (which rule families mattered)
cost breakdown by term

Without this, debugging objective drift becomes painful.

Rollout strategy (minimal regret)

Phase 1 — Shadow mode

Run EqSat offline against production-like workloads:

compare outputs to incumbent optimizer
collect quality deltas and compile-time deltas
log rule explosions and extraction failures

Phase 2 — Guarded canary

Enable EqSat for low-risk segments only:

specific operators/models/queries
strict fallback to incumbent optimizer
enforce compile-time SLA cutoffs

Phase 3 — Objective tuning loop

Tune weights and rule sets with explicit KPI targets:

p50/p95 execution latency delta
compile-time overhead budget
correctness diff rate (must stay ~0)

Phase 4 — Expand + specialize

Split rule packs by domain:

numeric kernels
tensor fusion/layout
query algebra
backend-specific lowering rules

One giant global rule pack is usually slower and harder to govern.

Common failure modes

Unsound rewrites hidden behind missing side conditions
Cost model mismatch (extracts expressions fast in model, slow in reality)
No budget policy, leading to saturation blow-ups
Premature backend coupling (IR polluted with target-specific details too early)
No fallback path when extraction times out or confidence is low

10-point implementation checklist

Optimization IR has clear typing/shape semantics
Rewrite rules include machine-checkable side conditions
Property-based tests cover rule soundness
Saturation has strict node/time/iteration budgets
Extraction uses a documented multi-objective cost model
Cost model is calibrated with real benchmark telemetry
Optimizer emits top-k + cost breakdown for debugging
Fallback to incumbent optimizer is automatic
Canary rollout has explicit kill criteria
Rule packs are modular by domain/backend

One-line takeaway

EqSat turns optimization from brittle pass-order engineering into search-space + objective engineering—usually a better abstraction for complex optimizers.

References

Tate et al. — Equality Saturation: A New Approach to Optimization (arXiv:1012.1802)
https://arxiv.org/abs/1012.1802
Willsey et al. — egg: Fast and Extensible Equality Saturation (POPL 2021 / DOI 10.1145/3434304)
https://arxiv.org/abs/2004.03082
Yang et al. — Equality Saturation for Tensor Graph Superoptimization (arXiv:2101.01332)
https://arxiv.org/abs/2101.01332
Wang et al. — Vectorization for Digital Signal Processors via Equality Saturation (ASPLOS 2021)
https://dl.acm.org/doi/10.1145/3445814.3446707
Zhang et al. — Better Together: Unifying Datalog and Equality Saturation (arXiv:2304.04332)
https://arxiv.org/abs/2304.04332
EGRAPHS community / egg project homepage
https://egraphs-good.github.io/