
Compiling to Java as a target language

Java beats C as a compiler target for most new languages. Writing a compiler that emits Java takes roughly a third to half the time and code of one that emits C. A reference Scheme-to-Java compiler comes to about 400 lines; the C version runs to about 1,000. That gap lets teams prototype functional languages quickly without C’s pointer-chasing headaches.

Anonymous classes map directly to lexically scoped closures, a transformation so local that it needs no global analysis. The Java specification pins down the corner cases involved, which rarely matter in applications but are gold for compiler writers. The JVM also brings libraries, cross-platform execution, JIT optimization, and automatic memory management, so the generated code needs no hand-written allocator and no manual freeing.
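To make the closure mapping concrete, here is a minimal sketch of what a compiler might emit for `(lambda (x) (lambda (y) (+ x y)))`. The `Procedure` interface and the `ClosureDemo` wrapper are hypothetical names chosen for illustration, not taken from any reference compiler, and values are assumed to be boxed `Integer`s.

```java
final class ClosureDemo {
    // Hypothetical runtime interface: every compiled Scheme procedure implements it.
    interface Procedure {
        Object apply(Object... args);
    }

    // Scheme source: (lambda (x) (lambda (y) (+ x y)))
    static final Procedure MAKE_ADDER = new Procedure() {
        @Override
        public Object apply(Object... outerArgs) {
            final Object x = outerArgs[0];   // free variable of the inner lambda
            return new Procedure() {         // inner lambda becomes an anonymous class
                @Override
                public Object apply(Object... innerArgs) {
                    Object y = innerArgs[0];
                    return (Integer) x + (Integer) y;   // x is captured lexically
                }
            };
        }
    };

    // Helper that applies the curried procedure to two arguments.
    static int addViaClosure(int a, int b) {
        Procedure adder = (Procedure) MAKE_ADDER.apply(a);
        return (Integer) adder.apply(b);
    }

    public static void main(String[] args) {
        System.out.println(addViaClosure(3, 4)); // prints 7
    }
}
```

Each lambda is translated where it stands: the compiler does not need to know which variables are free elsewhere in the program, because Java's own scoping rules do the capture.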

Trade-offs exist. Java does not guarantee tail-call optimization, so recursion-heavy code, which is the norm in Scheme, risks stack overflows. One fix is trampolining: each tail call returns a thunk instead of calling directly, and a driver loop keeps invoking thunks until a final value appears. Less direct, but viable. Overall, Java sidesteps C’s low-level grind while delivering 80-90% of the speed for interpreters that are reworked into compilers.
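A trampoline can be sketched in a few lines. The names below (`Step`, `done`, `more`, `run`, `countDown`) are assumptions made for this example, and the Scheme function in the comment is a stand-in for any tail-recursive procedure.

```java
import java.util.function.Supplier;

final class TrampolineDemo {
    // A step is either a finished value (next == null) or a thunk producing the next step.
    static final class Step<T> {
        final T value;
        final Supplier<Step<T>> next;

        private Step(T value, Supplier<Step<T>> next) {
            this.value = value;
            this.next = next;
        }

        static <T> Step<T> done(T value) { return new Step<>(value, null); }
        static <T> Step<T> more(Supplier<Step<T>> next) { return new Step<>(null, next); }
    }

    // Driver loop: keeps bouncing until a finished step appears, using constant stack depth.
    static <T> T run(Step<T> step) {
        while (step.next != null) {
            step = step.next.get();
        }
        return step.value;
    }

    // Scheme source:
    // (define (count-down n acc) (if (= n 0) acc (count-down (- n 1) (+ acc 1))))
    static Step<Long> countDown(long n, long acc) {
        if (n == 0) {
            return Step.done(acc);
        }
        return Step.more(() -> countDown(n - 1, acc + 1)); // tail call returned as a thunk
    }

    // Convenience wrapper used by main.
    static long countTo(long n) {
        return run(countDown(n, 0L));
    }

    public static void main(String[] args) {
        System.out.println(countTo(1_000_000L)); // prints 1000000
    }
}
```

A directly recursive Java method with a million nested calls would likely exhaust the default thread stack; here every call returns to `run` before the next one starts. The cost is one heap-allocated thunk per tail call.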

The Compiler Ladder: Java’s Sweet Spot

Matt Might’s advanced compilers class drills trade-offs: start simple, climb only as needed. Step 1: Pure interpreter. Too slow? Step 2: SICP-style optimizer (environments, peephole tweaks). Still lagging? Step 3: Compile to Java.

Each rung roughly doubles code complexity but multiplies performance. Java sits at rung 3: the biggest leap over interpreters (often a 10-100x speedup, helped by the JIT) for minimal extra code. C is rung 4 and assembly is rung 5. Add optimizations such as dataflow analysis and inlining later, and only if benchmarks demand them.

This matters for language designers. Why grind through C for a proof of concept? Targeting Java unlocks its ecosystem: Swing GUIs, Spring backends, Hadoop-scale data processing. The output is portable across desktops, servers, and Android. A skeptical note: JVM startup is slow (GraalVM native images help), and bytecode size balloons without tuning. For iteration speed, though, it is hard to beat.

Scheme Core: Desugaring to Java

The example targets a tiny Scheme: numbers, variables, lambdas, conditionals, assignments, let and letrec bindings, begin sequences, and applications. Full Scheme desugars to this core via macros, an approach many production compilers take.

<expr> ::= <num> | <var> | (lambda (<var> ...) <expr>)
         | (if <expr> <expr> <expr>)
         | (set! <var> <expr>)
         | (let ((<var> <expr>) ...) <expr>)
         | (letrec ((<var> (lambda (<var> ...) <expr>)) ...) <expr>)
         | (begin <expr> ... <expr>)
         | (<expr> <expr> ... <expr>)
<var> ::= symbol
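As an illustration of desugaring into this core, the sketch below rewrites `let*`, which is not in the grammar, into nested `let` forms, which are. It is not taken from the reference compiler: S-expressions are modelled as nested `List`s of strings and integers, and `DesugarDemo` and `desugarLetStar` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class DesugarDemo {
    // Rewrites (let* ((v1 e1) (v2 e2) ...) body)
    // into     (let ((v1 e1)) (let ((v2 e2)) ... body)).
    static Object desugarLetStar(List<?> bindings, Object body) {
        Object result = body;
        // Wrap from the innermost binding outwards so earlier bindings scope over later ones.
        for (int i = bindings.size() - 1; i >= 0; i--) {
            Object binding = bindings.get(i);
            result = Arrays.asList("let", Collections.singletonList(binding), result);
        }
        return result;
    }

    public static void main(String[] args) {
        // (let* ((x 1) (y x)) (+ x y))
        List<?> bindings = Arrays.asList(
                Arrays.asList("x", 1),
                Arrays.asList("y", "x"));
        Object body = Arrays.asList("+", "x", "y");
        System.out.println(desugarLetStar(bindings, body));
        // prints [let, [[x, 1]], [let, [[y, x]], [+, x, y]]]
    }
}
```

Because the rewrite produces only core forms, the back end never needs to know that `let*` existed.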

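Compile letrec in two steps: allocate each binding on the heap first, then assign its lambda, so that recursive and mutually recursive closures can refer to one another. Java’s inner classes handle environment capture, but they can only capture effectively final locals, so any variable that is assigned after capture has to live in a mutable heap cell. No CPS conversion is needed up front; add it later if you need tail-call optimization. Compilers for Python-like or Ruby-like languages can follow a similar path: desugar loops to recursion and objects to classes.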
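The allocate-then-assign pattern for letrec might look like the following in emitted Java. This is a sketch under assumptions: `Cell` and `Procedure` are hypothetical runtime types, and the Scheme program in the comment is an example chosen here.

```java
final class LetrecDemo {
    // Hypothetical runtime interface for compiled procedures.
    interface Procedure {
        Object apply(Object... args);
    }

    // Hypothetical mutable box for variables that are assigned after being captured.
    static final class Cell {
        Object value;
    }

    // Scheme source:
    // (letrec ((fact (lambda (n) (if (= n 0) 1 (* n (fact (- n 1)))))))
    //   (fact n))
    static Object factorialViaLetrec(int n) {
        final Cell fact = new Cell();        // step 1: allocate the binding, still empty
        fact.value = new Procedure() {       // step 2: assign the closure, which captures the cell
            @Override
            public Object apply(Object... args) {
                int k = (Integer) args[0];
                if (k == 0) {
                    return 1;
                }
                Procedure self = (Procedure) fact.value;   // read through the cell at call time
                return k * (Integer) self.apply(k - 1);
            }
        };
        return ((Procedure) fact.value).apply(n);          // letrec body: (fact n)
    }

    public static void main(String[] args) {
        System.out.println(factorialViaLetrec(5)); // prints 120
    }
}
```

The reference to `fact` inside the anonymous class compiles because the local `fact` itself is never reassigned; only the cell's field changes. The same cell trick covers `set!` on any captured variable.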

The implications cut deep. Teams can build domain-specific languages on top of Java without LLVM toolchain pain. Finance models in Lisp-like syntax? Crypto protocols in secure subsets? Prototype in days, then deploy on JVM clusters. C’s performance edge shrinks with the JIT, and Java’s maturity wins in 90% of cases. Test it: grab Might’s code and run the benchmarks. You’ll see why this flips the script on language implementation.

April 16, 2026 · 3 min · 4 views · Source: Lobsters
