E-Graphs + Equality Saturation: Optimizer Design Playbook
Date: 2026-03-08
Category: knowledge
Domain: software / compiler engineering / optimization systems
Why this matters
Traditional optimizers are usually rewrite pipelines:
- pass A rewrites program,
- pass B sees only A’s output,
- pass ordering becomes destiny.
This creates the classic phase-ordering problem: good rewrites can block later better rewrites depending on sequence.
Equality saturation (EqSat) changes the shape of the problem:
- represent many equivalent programs at once,
- apply rewrites without immediately committing to one path,
- extract the best candidate at the end using a cost model.
If you build query planners, ML graph optimizers, symbolic simplifiers, or codegen superoptimizers, this is often a better architecture than hand-tuned pass ordering.
Core mental model
1) E-graph = compact set of equivalent expressions
An e-graph stores expressions by equivalence class:
- e-node: one operator application (
Add(x,y),Mul(a,b), ...) - e-class: set of e-nodes known equivalent
With congruence closure, if children are equivalent, parent expressions can become equivalent too.
2) EqSat loop
- Start from initial expression/program.
- Repeatedly apply rewrite rules to add equal forms into the e-graph.
- Stop by budget/saturation limit.
- Run extraction: choose best representative from root e-class under objective.
So optimization is no longer “which rewrite first?” but “what search space + what objective?”
Why modern EqSat became practical
The egg work (POPL 2021) pushed EqSat from theory into an engineering-friendly toolkit:
- rebuilding: amortized invariant restoration for speed
- e-class analyses: attach domain analyses (e.g., type/range/constant info) directly in e-graph workflow
This matters because real optimizers need more than purely syntactic rewrites; they need semantic side conditions and analysis-aware pruning.
Where this works well (with evidence)
A) Tensor graph superoptimization (Tensat)
EqSat was used to optimize DL tensor graphs and reported:
- up to 16% runtime speedup over prior SOTA,
- about 48x less optimization time on average (vs compared approaches in that paper).
Takeaway: EqSat is not only elegant; it can win on both quality and compile-time in graph-optimization regimes.
B) DSP vectorization (Diospyros)
Diospyros pipeline uses:
- symbolic specification,
- EqSat rewrite engine,
- extraction to DSP intrinsic code.
Takeaway: EqSat is viable for low-level performance codegen, not just algebra simplification toys.
C) Datalog + EqSat unification (egglog)
egglog shows a useful direction when your optimizer needs:
- incremental execution,
- cooperating analyses,
- lattice/fixpoint reasoning,
- plus EqSat extraction.
Takeaway: for long-running or continuously updated optimization workloads, fixpoint-style systems can reduce ad-hoc glue code.
Practical architecture blueprint
1) IR boundary first
Before rewrites, decide the optimization IR:
- expression IR (math DSL, query algebra, tensor ops)
- graph IR (DAG-like compute graph)
- typed vs untyped nodes
Bad IR design sinks EqSat projects faster than bad rewrite rules.
2) Rule system with explicit safety contracts
Each rule should carry:
- pattern / replacement
- side conditions (types, no-overflow assumptions, non-zero divisors, monotonicity domains, etc.)
- validation tests (property-based and differential)
Do not put semantic assumptions only in comments.
3) Saturation controls
Use hard budgets from day one:
- max iterations
- max e-nodes / e-classes
- per-rule fire limits
- timeouts
EqSat can explode combinatorially; budget controls are product features, not temporary hacks.
4) Extraction objective
Use a multi-term cost function, e.g.:
[ J = w_1 \cdot \text{latency} + w_2 \cdot \text{memory} + w_3 \cdot \text{code size} + w_4 \cdot \text{risk penalty} ]
Where risk penalty can encode numerical stability loss, portability loss, or backend-specific hazards.
5) Explainability artifacts
Keep optimization explainable:
- winning expression/program
- top-k near-optimal candidates
- rule-trace summary (which rule families mattered)
- cost breakdown by term
Without this, debugging objective drift becomes painful.
Rollout strategy (minimal regret)
Phase 1 — Shadow mode
Run EqSat offline against production-like workloads:
- compare outputs to incumbent optimizer
- collect quality deltas and compile-time deltas
- log rule explosions and extraction failures
Phase 2 — Guarded canary
Enable EqSat for low-risk segments only:
- specific operators/models/queries
- strict fallback to incumbent optimizer
- enforce compile-time SLA cutoffs
Phase 3 — Objective tuning loop
Tune weights and rule sets with explicit KPI targets:
- p50/p95 execution latency delta
- compile-time overhead budget
- correctness diff rate (must stay ~0)
Phase 4 — Expand + specialize
Split rule packs by domain:
- numeric kernels
- tensor fusion/layout
- query algebra
- backend-specific lowering rules
One giant global rule pack is usually slower and harder to govern.
Common failure modes
- Unsound rewrites hidden behind missing side conditions
- Cost model mismatch (extracts expressions fast in model, slow in reality)
- No budget policy, leading to saturation blow-ups
- Premature backend coupling (IR polluted with target-specific details too early)
- No fallback path when extraction times out or confidence is low
10-point implementation checklist
- Optimization IR has clear typing/shape semantics
- Rewrite rules include machine-checkable side conditions
- Property-based tests cover rule soundness
- Saturation has strict node/time/iteration budgets
- Extraction uses a documented multi-objective cost model
- Cost model is calibrated with real benchmark telemetry
- Optimizer emits top-k + cost breakdown for debugging
- Fallback to incumbent optimizer is automatic
- Canary rollout has explicit kill criteria
- Rule packs are modular by domain/backend
One-line takeaway
EqSat turns optimization from brittle pass-order engineering into search-space + objective engineering—usually a better abstraction for complex optimizers.
References
- Tate et al. — Equality Saturation: A New Approach to Optimization (arXiv:1012.1802)
https://arxiv.org/abs/1012.1802 - Willsey et al. — egg: Fast and Extensible Equality Saturation (POPL 2021 / DOI 10.1145/3434304)
https://arxiv.org/abs/2004.03082 - Yang et al. — Equality Saturation for Tensor Graph Superoptimization (arXiv:2101.01332)
https://arxiv.org/abs/2101.01332 - Wang et al. — Vectorization for Digital Signal Processors via Equality Saturation (ASPLOS 2021)
https://dl.acm.org/doi/10.1145/3445814.3446707 - Zhang et al. — Better Together: Unifying Datalog and Equality Saturation (arXiv:2304.04332)
https://arxiv.org/abs/2304.04332 - EGRAPHS community / egg project homepage
https://egraphs-good.github.io/