Show HN: Libretto – Making AI browser automations deterministic

Libretto tackles a core pain in browser automation: making AI-generated scripts reliable and inspectable. Developers at a healthcare startup built it after a year wrestling with EHR and payer portal integrations. Instead of runtime AI agents that guess at tasks, Libretto uses coding agents to produce actual Playwright scripts upfront. You inspect, debug, and run them deterministically. This matters because legacy websites—think clunky healthcare portals—break runtime tools constantly, wasting time and money.

The shift sounds simple but addresses real flaws. Runtime agents like Browser Use or Stagehand prompt an LLM on the fly to navigate sites. They parse DOM elements, which fails on outdated or complex pages. Healthcare sites, loaded with custom JavaScript and anti-bot measures, exacerbate this. Libretto skips heavy DOM reliance. It mixes Playwright for UI actions with direct network requests inside the browser session. Network calls tend to run faster than UI steps and are less likely to trip bot detection, because requests made from inside the session look like real user traffic.

Why Runtime Agents Fall Short

Runtime tools rack up costs from repeated AI calls. Complex workflows defy caching, so each run bills fresh tokens. A single EHR login might chain 20 steps; failures multiply expenses. Opaqueness compounds it—you prompt, hope, and pray. Inconsistent sites demand precise instructions, but legacy flows hide quirks like pop-up modals or session timeouts.
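To make the cost argument concrete, here is a back-of-the-envelope sketch. Every number in it (tokens per step, token price, run counts) is an illustrative assumption, not a measurement from Libretto or any runtime agent; only the 20-step figure comes from the example above.

```typescript
// Hypothetical cost model. All inputs are illustrative assumptions.
interface CostInputs {
  stepsPerRun: number;         // e.g. the 20-step login chain mentioned above
  tokensPerStep: number;       // prompt + completion tokens per LLM call (assumed)
  usdPerMillionTokens: number; // blended token price (assumed)
}

// Runtime agent: every run pays for every step again.
function runtimeCostUsd(c: CostInputs, runs: number): number {
  return (runs * c.stepsPerRun * c.tokensPerStep * c.usdPerMillionTokens) / 1_000_000;
}

// Dev-time generation: pay for a handful of generation passes once;
// later runs replay the script and spend no tokens.
function devTimeCostUsd(c: CostInputs, generationPasses: number): number {
  return (generationPasses * c.stepsPerRun * c.tokensPerStep * c.usdPerMillionTokens) / 1_000_000;
}

const inputs: CostInputs = { stepsPerRun: 20, tokensPerStep: 5_000, usdPerMillionTokens: 4 };

console.log(runtimeCostUsd(inputs, 1_000)); // 400 (USD for 1,000 runs)
console.log(devTimeCostUsd(inputs, 10));    // 4   (USD for 10 generation passes)
```

The point is the shape of the curve, not the exact figures: runtime cost grows linearly with the number of runs, while dev-time cost is paid once per script revision.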

Debugging? Minimal help. No code to step through. The startup tried these tools and ditched them for unreliability in high-stakes settings. One failed automation could delay claims processing, costing thousands in reimbursements or compliance fines. Healthcare processes 10 billion claims yearly in the US alone; portals from Epic or Cerner vary wildly, demanding robust tools.

Libretto flips this. Agents generate scripts at dev time. You own the code: version it in Git, tweak selectors, add error handling. Record manual sessions to train agents on real flows. Step-through debugging catches issues early. A read-only mode blocks accidental submits—critical for live data.
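A read-only mode can be approximated in plain Playwright with a route handler that aborts any request capable of changing server state. The sketch below is an assumption about how such a guard might look, not Libretto's actual implementation. `RouteLike` and `RequestLike` are cut-down stand-ins for Playwright's `Route` and `Request`, so the snippet compiles on its own.

```typescript
// Minimal structural types shaped after the parts of Playwright's
// Route/Request API used here (assumed stand-ins, not the real types).
interface RequestLike { method(): string; url(): string; }
interface RouteLike {
  request(): RequestLike;
  continue(): Promise<void>;
  abort(errorCode?: string): Promise<void>;
}

// HTTP methods that should not change server state.
const SAFE_METHODS = new Set(["GET", "HEAD", "OPTIONS"]);

function isReadOnlySafe(method: string): boolean {
  return SAFE_METHODS.has(method.toUpperCase());
}

// Route handler: let safe requests through, abort everything else.
async function readOnlyGuard(route: RouteLike): Promise<void> {
  const req = route.request();
  if (isReadOnlySafe(req.method())) {
    await route.continue();
  } else {
    console.warn(`read-only mode: blocked ${req.method()} ${req.url()}`);
    await route.abort("blockedbyclient");
  }
}

// With real Playwright this would be wired up as:
//   await page.route("**/*", readOnlyGuard);
```

One caveat: a method check only helps if the site follows HTTP conventions. Legacy portals sometimes mutate state on GET, so a real guard would likely need a URL denylist as well.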

How It Works in Practice

Install via CLI:

npm install -g @libretto/cli

Then run libretto generate with a task description. The agents output Playwright TypeScript that matches your repo's style. Run the result with libretto run script.ts. The docs cover setup in minutes, and a demo video shows Libretto handling a login flow.

Hybrid automation shines. Playwright handles clicks; network interception grabs API responses without DOM scraping. This can sidestep CAPTCHAs and rate limits more often than pure UI bots do. Security angle: inspectable code reduces the risk of rogue agent actions leaking data. In crypto or finance (Njalla's turf), similar portals guard wallets or trades. Determinism prevents surprise behaviors under stress.
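As a rough illustration of the hybrid pattern, the sketch below logs in through the UI and then calls a JSON endpoint directly. It is not output from Libretto. The portal URL, selectors, endpoint, and the `Claim` shape are all hypothetical placeholders, and `PageLike` is a cut-down stand-in for Playwright's `Page`. It relies on the fact that Playwright's `page.request` shares cookies with the browser context, so the API call rides on the logged-in session.

```typescript
// Cut-down structural types shaped after Playwright's Page and APIResponse
// (assumed stand-ins so the sketch compiles without Playwright installed).
interface ApiResponseLike { ok(): boolean; status(): number; json(): Promise<unknown>; }
interface PageLike {
  goto(url: string): Promise<unknown>;
  fill(selector: string, value: string): Promise<void>;
  click(selector: string): Promise<void>;
  waitForURL(url: string): Promise<void>;
  request: { get(url: string): Promise<ApiResponseLike> };
}

// Hypothetical payload shape for illustration.
interface Claim { id: string; status: string; }

// Validate the JSON payload instead of trusting it blindly.
function parseClaims(payload: unknown): Claim[] {
  if (!Array.isArray(payload)) throw new Error("expected an array of claims");
  return payload.map((item, i) => {
    const c = item as Partial<Claim> | null;
    if (!c || typeof c.id !== "string" || typeof c.status !== "string") {
      throw new Error(`claim ${i} has an unexpected shape`);
    }
    return { id: c.id, status: c.status };
  });
}

// Hypothetical portal: URLs, selectors, and endpoint are placeholders.
async function fetchClaims(page: PageLike, user: string, password: string): Promise<Claim[]> {
  // 1. UI step: log in the way a person would.
  await page.goto("https://portal.example.com/login");
  await page.fill("#username", user);
  await page.fill("#password", password);
  await page.click("button[type=submit]");
  await page.waitForURL("**/dashboard");

  // 2. Network step: reuse the session cookies to hit the JSON endpoint directly.
  const res = await page.request.get("https://portal.example.com/api/claims");
  if (!res.ok()) throw new Error(`claims request failed with status ${res.status()}`);
  return parseClaims(await res.json());
}
```

Because the script is ordinary code, the fragile part (the login selectors) and the stable part (the API call) can be debugged and versioned separately.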

Implications for Automation Builders

Browser automation powers RPA markets worth $2.9 billion in 2023, growing 40% yearly. AI promises efficiency, but runtime hype ignores maintenance hell. Libretto's dev-time model cuts that. Once a flow is scripted, iterations could run 5-10x faster, though that figure is an estimate. Agents bootstrap from recordings, easing new tasks.

Skeptical take: it's early, and it comes from one team's healthcare grind. Does it scale to the general web? Unclear without broader tests. The Playwright dependency is less limiting than it sounds, since Playwright drives Chromium, Firefox, and WebKit, but its WebKit build is not Safari itself, so Safari-specific edge cases persist. Still, fair props: it closes an observability gap. If you're scripting messy sites, try it. Site: libretto.sh.

Broader why: AI agents excel at code gen, flop at execution without guardrails. This hybrid owns that. In regulated fields, auditability trumps speed. Expect forks for Puppeteer or Selenium. Feedback loops could refine agents on niche domains like banking UIs. Bottom line: pragmatic step from agent chaos to engineer control.

