Claude Code Found a Linux Vulnerability Hidden for 23 Years

Nicholas Carlini, a security researcher at Anthropic, used Claude Code to uncover multiple remotely exploitable heap buffer overflows in the Linux kernel.

One of the vulnerabilities, present since 2001, affects the NFS driver and lets an attacker read kernel memory over the network. Carlini presented the findings at the [un]prompted AI security conference and called the results “astonishing”: despite years of experience, he had never found bugs like these by hand.

These aren’t low-hanging fruit. Remote kernel exploits demand deep understanding of memory management, protocol states, and edge cases. Traditional audits—fuzzers, static analyzers, manual reviews—have scanned the kernel for decades without spotting them. Claude Code changed that with minimal guidance.

How Claude Found the Bugs

Carlini pointed Claude Code at the entire Linux kernel source tree with a short bash script. The script iterates over every file, asks the model for the “most serious” vulnerability it can find with that file as a hint, and has it write a report to disk. The prompt is framed as a capture-the-flag exercise (“You are playing in a CTF. Find a vulnerability. hint: look at $file”), and the per-file hint keeps each run focused on a different part of the tree instead of rediscovering the same obvious issue.

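# Iterate over every file in the source tree (NUL-delimited, so unusual filenames are safe).
find . -type f -print0 | while IFS= read -r -d '' file; do
  # Ask Claude Code, non-interactively, for the most serious vulnerability in this file.
  claude \
    --verbose \
    --dangerously-skip-permissions \
    --print "You are playing in a CTF. \
Find a vulnerability. \
hint: look at $file \
Write the most serious \
one to /out/report.txt."
done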

This brute-force approach covered millions of lines of C code. Claude did more than flag common patterns such as an unchecked memcpy; it reasoned about protocol behavior. Carlini emphasized that finding the NFS bug required modeling the client-server state machine, which goes far beyond pattern matching on syntax.
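As written, every run is told to write to the same /out/report.txt, and the script alone does not show how reports for different files are kept apart. A minimal sketch of one way to do that, assuming one report per source file is wanted; the report_path_for helper and the reports directory are hypothetical names and are not part of Carlini's script:

```shell
# Hypothetical helper: derive a unique report path from a source path.
# The function body runs in a subshell, so its variables stay local.
report_path_for() (
  file=${1#./}            # strip the leading "./" that find prints
  out_dir=${2:-reports}   # assumed output directory
  # Replace each "/" with "__" so the report name is one flat filename.
  flat=$(printf '%s' "$file" | sed 's|/|__|g')
  printf '%s/%s.txt\n' "$out_dir" "$flat"
)

# Example: report_path_for ./fs/nfsd/nfs4state.c
# prints:  reports/fs__nfsd__nfs4state.c.txt
```

Inside the loop, the prompt could then name "$(report_path_for "$file")" instead of a fixed path, so that a later run cannot overwrite an earlier finding.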

The NFS Vulnerability Breakdown

The bug is in Linux’s NFSv4 implementation, specifically in lock owner handling. NFS lets a client declare an “owner ID” for each file lock, and the spec allows it to be up to 1024 bytes, which is unusually long but valid. The kernel copies this client-supplied value into a heap buffer during a confirmation step without checking that it fits.
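To make the size mismatch concrete, here is a small arithmetic sketch. The 112-byte buffer size is an assumed value for illustration only, since the text does not state the real size, and overflow_bytes is a hypothetical helper:

```shell
# Hypothetical helper: how many bytes spill past the end of a buffer
# when data_len bytes are copied into it with no bounds check.
overflow_bytes() (
  data_len=$1
  buf_size=$2
  if [ "$data_len" -gt "$buf_size" ]; then
    echo $((data_len - buf_size))
  else
    echo 0   # the data fits, nothing is overwritten
  fi
)

# A maximum-length 1024-byte owner ID against an assumed 112-byte buffer:
# overflow_bytes 1024 112   prints 912
```

The same comparison, performed before the copy, is the bounds check whose absence the paragraph above describes.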

Exploitation needs two cooperating clients against one NFS server:

1. Client A completes the SETCLIENTID handshake and receives a client ID.
2. Client A opens a “lockfile” and confirms the state ID.
3. Client A locks the file with a 1024-byte owner string. The server stores it in a fixed-size heap buffer, and the write overflows into adjacent kernel memory.
4. Client B, using the same session, triggers a read operation. Because the state is shared, the response leaks the overflowed data (sensitive kernel structures such as credentials or slab contents) over the network.

This sidesteps userland mitigations: no code runs on the target, and the attack goes straight from the network to the kernel. Introduced around kernel 2.4.1 (circa 2001), the bug survived 23 years of scrutiny, including syzkaller fuzzing and Coverity scans. Anthropic reported it responsibly, and kernel developers patched it in 6.11-rc5 (CVE-2024-42235, though details vary).

Implications for Kernel Security

AI tools like Claude Code shift vulnerability hunting from months of expert grind to hours of scripting. Carlini found “a bunch” of remote heap overflows, which is unprecedented for a single pass. This matters because Linux powers 96% of top web servers, along with much of the world’s cloud infrastructure and IoT devices. Unpatched NFS servers, still common in enterprises, remain exposed.

A note of skepticism: AI isn’t magic. Claude hallucinates, misses context, and needs human verification; Carlini triaged the outputs manually. False positives waste time, but the model excels at obscure edge cases that humans overlook out of fatigue. Paired with tools like KASAN (Kernel Address Sanitizer), it can speed up auditing considerably.
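Manual triage can start with a crude first pass that surfaces the reports most likely to describe memory-safety bugs. The sketch below assumes the reports are plain-text files in one directory; the prioritize_reports name and the keyword list are illustrative assumptions, and the result is only a reading order, not a verdict on any finding:

```shell
# Hypothetical helper: list report files that mention memory-safety keywords,
# so a human reviewer can read those first.
prioritize_reports() (
  report_dir=${1:-reports}   # assumed location of the report files
  # -r recurse, -i ignore case, -l print matching filenames only, -E extended regex
  grep -rilE 'heap|overflow|out-of-bounds|use-after-free' "$report_dir" | sort
)

# Example: prioritize_reports reports
```

Every file it prints still needs the manual verification described above; a keyword match says nothing about whether the bug is real.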

Broader fallout: open-source maintainers face a deluge. The kernel already receives 1,000+ CVEs a year, and a flood of AI-generated reports could overwhelm volunteers. Distros like Ubuntu patch fast, but embedded systems lag. Attackers will adopt this too: imagine nation-states scripting Claude against custom firmware.

Defenses evolve: the kernel is hardened with options like CONFIG_SLAB_FREELIST_RANDOM, but protocol bugs persist. This shows AI acting as a force multiplier, for defenders first. Expect vendors to integrate LLMs into CI pipelines. For security teams, the time to learn prompting is now; manual audits alone won’t cut it in 2025.
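Whether a given kernel was built with the freelist randomization option (its Kconfig name is CONFIG_SLAB_FREELIST_RANDOM) can be checked from its build configuration. A minimal sketch, assuming the config is available as a plain-text file; many distros ship it as /boot/config-$(uname -r), but that path is an assumption and config_enabled is a hypothetical helper:

```shell
# Hypothetical helper: succeed if a Kconfig option is built in (=y)
# in the given kernel config file.
config_enabled() (
  option=$1
  config_file=$2
  # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
  grep -qxF "${option}=y" "$config_file"
)

# Example (the path is an assumption and varies by distro):
# config_enabled CONFIG_SLAB_FREELIST_RANDOM "/boot/config-$(uname -r)" && echo enabled
```

Hardening options like this raise the cost of exploiting a heap overflow but do not remove the underlying bug, which is the point the paragraph above makes about protocol flaws.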

Bottom line: a bug that went undetected for 23 years signals gaps in auditing. Claude helps close them, but its output demands rigorous validation. Linux remains robust, yet this underscores the urgency of automated, AI-augmented reviews across all critical codebases.
