[CRITICAL] Security Advisory: Kedro has Arbitrary Code Execution via Malicious Logging Configuration (kedro)

Kedro, the open-source Python framework for building reproducible data pipelines, ships with a critical arbitrary code execution vulnerability. An attacker who controls the KEDRO_LOGGING_CONFIG environment variable can run arbitrary commands when Kedro starts. Upgrade to version 1.3.0 now; it’s the only full fix.

This flaw affects all prior versions. Kedro reads its logging configuration from the file named by that environment variable, then passes it straight to Python’s logging.config.dictConfig() without validation. Python’s logging system lets a config entry include a () key naming a factory callable; dictConfig() imports that callable and calls it with the entry’s other keys as keyword arguments. Craft a YAML file like this, point KEDRO_LOGGING_CONFIG to it, and Kedro executes your payload on load:

version: 1
disable_existing_loggers: false
formatters:
  default:
    (): os.system
    command: 'id && whoami'
handlers:
  console:
    class: logging.StreamHandler
    formatter: default
    stream: ext://sys.stdout
root:
  level: INFO
  handlers: [console]

That payload runs os.system('id && whoami'); an attacker could just as easily drop a shell. No authentication is required if the attacker can set the env var.
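To see the mechanism without running anything dangerous, here is a minimal sketch that swaps the attacker's callable for a harmless stand-in. The names harmless_factory and calls are hypothetical and exist only for this illustration. In a YAML file the () value would be a dotted string that dictConfig() imports; passing the callable object directly, as below, exercises the same code path.

```python
import logging
import logging.config

calls = []  # records every invocation of the stand-in factory


def harmless_factory(**kwargs):
    """Hypothetical stand-in for an attacker-chosen callable.

    It only records the keyword arguments it receives, then returns a
    normal Formatter so the logging setup stays valid.
    """
    calls.append(kwargs)
    return logging.Formatter("%(message)s")


config = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "default": {
            # dictConfig() calls this object with the sibling keys as kwargs.
            "()": harmless_factory,
            "marker": "called during dictConfig",
        },
    },
}

# Merely loading the config is enough to invoke the factory.
logging.config.dictConfig(config)
print(calls)
```

The point of the sketch: no log message is ever emitted, yet the factory has already run by the time dictConfig() returns.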

Why This Hits Hard

Kedro targets data engineers and ML teams at firms like QuantumBlack (McKinsey-owned). It structures pipelines for reproducibility, often in CI/CD, Docker, or cloud runs. GitHub shows 1,200+ stars and 100k+ downloads monthly via PyPI. Teams deploy it in Kubernetes, AWS SageMaker, or GitHub Actions—environments where env vars are set dynamically.

Implications cut deep. An attacker with write access to the environment (say, via CI secrets, shared infra, or a compromised upstream) owns the host. In ML workflows, this means data exfiltration, model poisoning, or lateral movement. Finance and crypto shops using Kedro for analytics face amplified risk: leaked keys, tampered trades. We’ve seen similar Python logging RCEs before, like in Celery (CVE-2023-41258) or structlog misconfigs. Python’s flexibility bites when input goes unvalidated.

Skeptical take: Exposure depends on your setup. If you hardcode logging and lock env vars, risk drops. But defaults invite trouble—Kedro docs push KEDRO_LOGGING_CONFIG for overrides. No CVSS score yet, but it’s pre-auth RCE: critical by any metric.
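A quick way to gauge that exposure for one process is to check whether the variable is set and how writable its target is. The sketch below is an assumption-laden helper, not part of Kedro: the function name describe_logging_config_exposure and the report keys are made up here, and the permission check only makes sense on POSIX-style file modes.

```python
import os
import stat
from pathlib import Path


def describe_logging_config_exposure(environ=None):
    """Report whether KEDRO_LOGGING_CONFIG is set and how writable its target is.

    `environ` defaults to os.environ; pass a plain dict to inspect a
    hypothetical environment instead.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("KEDRO_LOGGING_CONFIG")
    if not raw:
        return {"set": False}

    path = Path(raw)
    report = {"set": True, "path": str(path), "exists": path.is_file()}
    if report["exists"]:
        mode = path.stat().st_mode
        # Group- or world-writable files can be altered by other local accounts.
        report["group_or_world_writable"] = bool(
            mode & (stat.S_IWGRP | stat.S_IWOTH)
        )
    return report


if __name__ == "__main__":
    print(describe_logging_config_exposure())
```

A result of {"set": False} only says this process is not using the override; it says nothing about who could set the variable for the next run.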

Fix It

The patch lands in Kedro 1.3.0 (released October 2024). It scans the config up front and rejects any () keys before calling dictConfig(). Install and verify with:

$ pip install kedro==1.3.0
$ kedro --version

Test your pipelines post-upgrade. No breaking changes reported.
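If you want the same version check inside a script or CI job, a rough sketch follows. It assumes the PyPI distribution name kedro (as in the pip command above) and takes 1.3.0 as the fixed version from this advisory. The helpers parse_release, is_patched, and installed_kedro_is_patched are hypothetical names, and the parser is deliberately simple: it ignores pre-release suffixes such as rc1, so treat it as a first-pass check only.

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_VERSION = (1, 3, 0)  # fixed release named in the advisory text


def parse_release(version_string):
    """Turn '1.3.0' or '1.3.0rc1' into a 3-part tuple of leading integers."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric segment, e.g. 'dev1'
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)  # pad so '1.3' compares equal to '1.3.0'
    return tuple(parts)


def is_patched(version_string, fixed=FIXED_VERSION):
    """True if the given version is at or above the fixed release."""
    return parse_release(version_string) >= fixed


def installed_kedro_is_patched():
    """Check the installed kedro distribution; None if it is not installed."""
    try:
        return is_patched(version("kedro"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_kedro_is_patched())
```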

Can’t upgrade? Layer defenses:

Block untrusted control of KEDRO_LOGGING_CONFIG—audit CI YAML, Dockerfiles, .env files.
Lock write access to config dirs (e.g., chmod 644 conf/logging.yml).
Skip dynamic configs; stick to built-ins.
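Validate YAML manually: grep for '()' or check the parsed file against a schema that forbids the key. yamllint on its own checks syntax and style, so it needs to be paired with something that inspects keys.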

These blunt the edge but leave gaps: dictConfig() has other routes to importing and calling code, like filters, nested callables, the class key on handlers, and ext:// references.
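The manual validation step can be scripted. The sketch below walks an already-parsed config (for example, the dict you get from loading the YAML) and reports every () key at any depth. It is not Kedro's actual patch code; find_factory_keys and reject_factory_keys are hypothetical names, and per the caveat above it catches only the () route, not class or ext:// entries.

```python
def find_factory_keys(node, path="config"):
    """Return the paths of every '()' key found anywhere in a parsed config."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "()":
                hits.append(child)
            # Keep descending: factories can be nested inside other entries.
            hits.extend(find_factory_keys(value, child))
    elif isinstance(node, (list, tuple)):
        for index, item in enumerate(node):
            hits.extend(find_factory_keys(item, f"{path}[{index}]"))
    return hits


def reject_factory_keys(config):
    """Raise ValueError if the config contains any '()' key; else return it."""
    hits = find_factory_keys(config)
    if hits:
        raise ValueError(
            "logging config contains '()' factory keys: " + ", ".join(hits)
        )
    return config


if __name__ == "__main__":
    suspicious = {
        "version": 1,
        "formatters": {"default": {"()": "os.system", "command": "id"}},
    }
    print(find_factory_keys(suspicious))
```

Calling reject_factory_keys() before dictConfig() fails closed: a config with a factory key never reaches the logging machinery. That also blocks legitimate custom formatters, which is the trade-off of a blanket ban.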

Broader lesson: Vet env vars in data tools. Kedro’s maintainers (the project is now hosted by the LF AI & Data Foundation) acted fast post-disclosure. Still, scan deps with pip-audit or Snyk. In security-conscious stacks such as crypto exchanges and banks, treat logging configuration as hostile input. This vuln underscores supply chain perils: one bad config in a pipeline cascades.

Bottom line: If Kedro’s in your stack, patch today. Delay invites exploits. Track the GitHub advisory for updates.
