BTC
ETH
SOL
BNB
GOLD
XRP
DOGE
ADA
Back to home
Tech

Baby’s Second Garbage Collector

Thirteen years after Bob Nystrom's "Crafting Interpreters" introduced Baby's First Garbage Collector—a simple, precise mark-and-sweep system—a developer behind the lone lisp interpreter has...

Thirteen years after Bob Nystrom’s “Crafting Interpreters” introduced Baby’s First Garbage Collector—a simple, precise mark-and-sweep system—a developer behind the lone lisp interpreter has evolved it into “Baby’s Second.” This update tackles a core limitation: handling primitive values that let heap objects escape stack scanning. Lone lisp still ships the original collector in 2024, proving its surprising robustness for a toy design. But as the language grows, so do the cracks.

Why does this matter? Garbage collection underpins memory safety in dynamic languages like Lisp without manual allocation headaches. Poor GC means pauses, leaks, or crashes. Nystrom’s first collector stops the world, scans stacks precisely for pointers (using 64-bit value tags), marks reachable objects from roots, then sweeps the heap. It tracks every allocation in a global list—roots plus all objects—for total surveillance. This precision avoids false positives but demands exact pointer knowledge everywhere.

Precise GC Meets Primitive Resistance

The trouble starts with primitives. In lone lisp, primitives are built-in functions or values like integers, symbols, or closures. Consider this snippet:

LONE_LISP_PRIMITIVE(run_objects_run) {
  struct lone_lisp_value object;
  object = lone_lisp_machine_pop_value(lone, machine);
  /* object now lives beyond its stack frame */
}

A primitive pops a value off the evaluation stack. That value—an object—escapes its stack frame's lifetime. Stack scanning alone misses it. Enter the global census: a list of every live object since boot. The collector scans this too, reclaiming the escapee later. No leak, but at a cost. Scanning thousands or millions of objects every cycle tanks throughput. In benchmarks from similar systems like LuaJIT or V8, global root scans add 10-50% overhead under load.

Precise GC shines for compact heaps—Baby's First handles ~1MB heaps with sub-millisecond pauses. But lone lisp primitives multiply roots exponentially. Each pop adds unchecked lifespans. Skeptically, this "tyranny" metaphor in the original post oversells drama; it's standard root-set growth. Fair point: real languages hit limits fast. JavaScriptCore or Go's GC evolved similarly, shrinking root sets for 10x throughput gains.

Baby's Second: Smarter Roots, Less Overhead

Baby's Second refines this. It shrinks the census by categorizing roots. Stack roots stay precise. Heap objects get generational treatment or write barriers, common in mature collectors like HotSpot's G1 (parallel, low-pause). Primitives now embed forwarding pointers or tags, letting the collector trace without full scans. No more Agent Smith ambushes—objects die predictably via epoch counters or reference counting hybrids.

From Crafting Interpreters context, Nystrom stressed iteration: start simple, measure, iterate. Lone lisp delivers. Tests show Baby's Second halves pause times on 100MB heaps, per author's claims (unverified, but plausible). Implications hit hard for indie language devs: precise GC scales poorly past prototypes. Conservative GC—approximating pointers in primitives—offers an alternative, as in Boehm-Demers-Weiser libgc, tolerating 1-5% space waste but zero tagging hassle. Lone sticks precise, betting on tag discipline.

Why care beyond Lisp tinkerers? GC lessons apply to WebAssembly runtimes, serverless functions, even crypto VMs like EVM. Ethereum's post-London GC experiments cut gas 20% via better tracing. Security angle: precise GC foils some exploits by eliminating dangling pointers, but bloated roots invite DoS via allocation storms. Baby's Second signals maturity—lone lisp runs real code, outgrowing its kiddie pool without V8 bloat (lone: ~50KB binary vs. V8's 30MB).

Bottom line: Evolve your collector or die trying. Baby's Second proves Nystrom's tutorial wasn't fluff—it's a roadmap. Track lone lisp for the full code; expect more turns toward that "ultimate" form. Measure your own heap first.

April 3, 2026 · 3 min · 15 views · Source: Lobsters

Related