BTC
ETH
SOL
BNB
GOLD
XRP
DOGE
ADA
Back to home
Tech

The acyclic e-graph: Cranelift’s mid-end optimizer

Cranelift, the code generator powering Wasmtime and other WebAssembly runtimes, overhauled its mid-end optimizer in 2022 with an acyclic e-graph—or aegraph.

Cranelift, the code generator powering Wasmtime and other WebAssembly runtimes, overhauled its mid-end optimizer in 2022 with an acyclic e-graph—or aegraph. This data structure unifies optimizations that previously fought over pass order, delivering measurable speedups like 5% on meshoptimizer benchmarks. It sidesteps the explosion of traditional e-graphs while enabling aggressive rewriting. Why does this matter? Compilers waste cycles on suboptimal code because passes like global value numbering (GVN) and redundant load elimination (RLE) interfere unless scheduled perfectly. Aegraphs fix that by representing code as a shared graph where equivalences propagate naturally.

The Pass-Ordering Trap

Compilers run fixpoint loops: repeat optimizations until no changes. But passes interact unpredictably. Take this IR snippet:

v2 = load.i64 v0+8
v3 = iadd v2, v1  ;; array indexing
v4 = load.i8 v3

... (no stores)

v10 = load.i64 v0+8
v11 = iadd v10, v1
v12 = load.i8 v11

RLE spots v10 aliases v2, replaces it, then GVN should merge v11 with v3. Run GVN first? It canonicalizes loads but misses the alias, so RLE later can’t chain the add elimination. Reverse order works once, but loops regress: GVN undoes RLE’s aliases. Repeat indefinitely or cap iterations arbitrarily. Cranelift hit this in 2022 adding alias analysis—great on isolates, but fused with LICM, constant prop, and rewrites? Chaos.

Traditional solutions: careful topological ordering or fused passes. Both brittle as rules grow. Enter e-graphs: equality saturation builds a graph of equivalent expressions, exhaustively applies rewrites, extracts optimal code. Egg and other tools prove this rewrites aggressively. But full e-graphs explode memory on large functions—cycles let enodes multiply unchecked.

Sea-of-Nodes First, Aegraph Second

Cranelift flips the script: start with sea-of-nodes, a flat intermediate where every operation is a node with direct value inputs. No dominator tree, no SSA upfront—values flow freely. Translate IR to sea-of-nodes, fuse simple opts (const prop, canonicalization) during translation. Reverse for extraction, fusing more.

This alone solves much: GVN becomes trivial union-find on values. No pass order needed—everything sees the shared graph. But sea-of-nodes lacks multi-representation: one node per value, no alternatives like iadd a b vs iadd b a or strength reductions.

Aegraph adds it scalably. Each value gets an e-class: a set of enodes (expressions). Crucially, acyclic: enodes point only to prior e-classes, enforced by numbering classes topologically. No cycles, no infinite growth. Union enodes into classes via rewrites, but extract picks the “best” enode per class based on heuristics like size or cost.

Efficiency hacks: limit enode fanout, batch rewrites, recycle during extraction. After one rewrite and community contributions (hundreds of rules by 2024), it runs in production. PLDI 2023 workshop slides and Dagstuhl talks refined it—e.g., “patches” like dynamic rule disabling prevent bloat.

Does It Deliver? Numbers and Tradeoffs

Yes, but selectively. On CoreMark, aegraph shrinks code 10-20% vs prior mid-end. Speedups: 3-7% geometric mean on SPECint-like wasm benchmarks, up to 15% on compute-heavy. Meshoptimizer: 5% faster. Vs LLVM’s mid-end? Cranelift generates tighter wasm—aegraph closes the gap further.

Skeptical lens: not universal. Memory doubles on hot loops (e-classes bloat), extraction heuristics matter (wrong cost model regresses). Alternatives like MLIR’s pattern rewrites scale differently—aegraph wins on interaction density. Why care? WebAssembly hits servers (Kubernetes), edges (Cloudflare Workers), browsers. 1-5% runtime savings compound: billions of cycles yearly across fleets.

Implications run deeper. Aegraphs generalize to GPU shaders, ML graphs (TVM uses similar). Cranelift’s open-source (Bytecode Alliance) invites ports—Rust crates already experiment. Pitfalls persist: rule authoring needs care, or saturation stalls. Still, it proves e-graphs productionize if you prune cycles and fuse smartly. For optimizer hackers, clone Cranelift, benchmark your IR. The sea-of-nodes pivot? Genius pedagogy—build simple, layer equivalences.

April 10, 2026 · 3 min · 11 views · Source: Lobsters

Related