LangChain’s prompt template system has a flaw in its f-string validation. Attackers supplying untrusted templates can execute attribute access and indexing on objects passed to the formatter. This affects langchain-core, specifically DictPromptTemplate and ImagePromptTemplate, plus a bypass in format specifier handling. Rated medium severity, it demands attention if your app takes user-controlled templates.
The core problem hits two spots. First, PromptTemplate enforces checks on f-string attributes to block dangerous expressions like {obj.secret}. But DictPromptTemplate and ImagePromptTemplate skip this. They parse f-strings with indexing or dots—such as "{message.additional_kwargs[secret]}" or "https://example.com/{image.__class__.__name__}.png"—and evaluate them raw during format(). Python’s f-string machinery runs the expressions unchecked.
template = DictPromptTemplate.from_dict({"image_url": "https://example.com/{image.__class__.__name__}.png"})
formatted = template.format(image=some_object) # Evaluates image.__class__.__name__
Second issue: Validation scans top-level field names but misses expressions nested inside format specifiers. Take "{name:{name.__class__.__name__}}". The inner {name.__class__.__name__} slips past validation and only evaluates at runtime. Python often raises a formatting error afterward, but a crafted input can still probe attributes before that happens.
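You can see the blind spot with nothing but the standard library: string.Formatter reports only the top-level field name, while the dangerous expression hides inside the format spec (illustrative sketch, no LangChain required):

```python
from string import Formatter

template = "{name:{name.__class__.__name__}}"

# Formatter.parse yields (literal_text, field_name, format_spec, conversion)
# tuples -- the same view a naive top-level validator sees.
for literal, field, spec, conv in Formatter().parse(template):
    print("field:", field)  # 'name' -- looks harmless
    print("spec:", spec)    # '{name.__class__.__name__}' -- the hidden expression
```

A check that only inspects `field` passes this template; the traversal lives entirely in `spec`.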
Who Gets Hit
This bites apps accepting untrusted full templates, not just variable values. LangChain powers LLM chains for chatbots, agents, RAG—over 100k GitHub stars, millions of downloads. Most use hardcoded templates: safe. User-authored prompts with simple vars like {user_input}? Low risk, as basic strings/numbers crash on bad access.
Danger spikes with rich objects—think message dicts carrying metadata, user profiles, or DB rows. Pass a message with additional_kwargs={'secret': 'token'}, and {message.additional_kwargs[secret]} dumps it to the prompt. That prompt feeds your LLM, logs, or vector store. Multi-tenant SaaS? One bad template leaks across users.
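The leak is easy to reproduce outside LangChain, because plain str.format() supports the same attribute-and-index traversal. A minimal sketch (the Message class here is a stand-in for a rich message object, not LangChain's):

```python
class Message:
    """Stand-in for a chat message carrying rich metadata."""
    def __init__(self, content, additional_kwargs):
        self.content = content
        self.additional_kwargs = additional_kwargs

msg = Message("hi", additional_kwargs={"secret": "sk-live-token"})

# The field name walks message.additional_kwargs, then indexes key 'secret'.
leak = "User said: {message.additional_kwargs[secret]}".format(message=msg)
print(leak)  # User said: sk-live-token
```

Anything downstream of that formatted string, such as model context, logs, or a vector store, now holds the secret.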
Rare combo: Few apps mix user templates with fat objects. Devs lock templates, expose vars. But open platforms—custom AI builders, no-code tools—fit the bill. Check your stack: langchain-core before the patch (advisory doesn’t pin versions; scan your deps).
Why It Matters: Leak Paths and Costs
Impact? Confidentiality hits. Secrets in object attrs flow into model context: your API bill balloons on verbose leaks, or worse, the model echoes the exposed data back to users. Logs capture formatted prompts, so an attacker grabs internals via debug endpoints. In agents, this chains: a leaked token calls external APIs.
Quantify: LangChain apps handle finance (crypto trades), health (patient notes), enterprise (CRM data). A leaked AWS key in kwargs costs thousands in compute, or pivots to breaches. Medium tag fits—not RCE, but in LLM land, prompt injection already reigns; this adds attr traversal.
Skeptical lens: f-string templates are tempting power, but evaluating attribute expressions on live objects is eval-adjacent and begs for a sandbox. LangChain aimed to harden; half-measures leave gaps. Fair play: docs warn on untrusted inputs, and most deploys sidestep. Still, 2024’s AI boom means sloppy copies of tutorials amplify flaws.
Broader context: similar bugs plague templating libs (Jinja2’s SSTI history). LLMs amplify the stakes: prompts are the new SQL. Prompt injection already tops the OWASP Top 10 for LLM Applications. Fix now, or regret it in audit.
Fix It
Patch pronto: Update langchain-core to latest (advisory implies fixed; verify changelog). Audit code for user templates:
$ pip install --upgrade langchain-core
$ grep -r "PromptTemplate\|DictPromptTemplate" your_app/
Runtime guards: Whitelist templates server-side. Validate f-strings fully by parsing them and rejecting any AST containing Attribute or Subscript nodes beyond your declared input variables. Don’t reach for str.format() as the safe alternative: it supports the same {obj.attr} and {obj[key]} traversal. For plain substitution, string.Template allows only $name placeholders and evaluates no expressions at all.
import ast

def safe_template_check(template: str) -> None:
    # Re-parse the template as f-string source so Python builds the
    # expression AST for every replacement field, including expressions
    # nested inside format specifiers.
    tree = ast.parse("f" + repr(template), mode="eval")
    for node in ast.walk(tree):
        # Any attribute access or indexing is treated as hostile.
        if isinstance(node, (ast.Attribute, ast.Subscript)):
            raise ValueError("Unsafe expression in template")
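If your templates only ever need plain variable substitution, the standard library’s string.Template sidesteps the entire expression class. A minimal sketch (placeholder names are illustrative):

```python
from string import Template

# $name placeholders only: attribute access, indexing, and nested
# format specs are not even expressible in this syntax.
tmpl = Template("https://example.com/$image_name.png")
print(tmpl.safe_substitute(image_name="avatar"))
# https://example.com/avatar.png
```

safe_substitute leaves unknown placeholders in place instead of raising, which is often the right failure mode for user-facing templates.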
Sandbox format() if paranoid: heavyweight, but RestrictedPython can constrain what evaluated expressions may touch. Shift left: hardcode templates, accept only user-supplied variable values. Test with gadget-style payloads that probe the objects you actually pass in.
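That payload testing can be a simple smoke test. A sketch (redefining safe_template_check from above so it runs standalone; the payload list is illustrative, not exhaustive):

```python
import ast

def safe_template_check(template: str) -> None:
    # Parse the template as f-string source and reject any attribute
    # access or indexing anywhere in the expression tree.
    tree = ast.parse("f" + repr(template), mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, (ast.Attribute, ast.Subscript)):
            raise ValueError("Unsafe expression in template")

PAYLOADS = [
    "{x.__class__}",
    "{x.__class__.__mro__}",
    "{x[secret]}",
    "{x:{x.__class__.__name__}}",  # nested format-spec bypass
]

for payload in PAYLOADS:
    try:
        safe_template_check(payload)
        print("MISSED:", payload)
    except ValueError:
        print("blocked:", payload)
```

Every payload here should come back blocked; a MISSED line means your validator has a gap.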
Bottom line: LangChain fixes fast—good sign. But own your inputs. In security, “plausible deniability” kills; assume users evil, objects juicy. Patch, audit, sleep better.