Kubernetes control plane for MXL. mxl-k8s is the
cluster-side complement to dmf-mxl: it links upstream
libmxl and libmxl-fabrics and turns MXL's cross-node transport
into a cluster feature so individual media functions don't have to
carry it themselves.
MXL gives media functions intra-node, zero-copy flow exchange on tmpfs. The moment two functions land on different machines, crossing the wire is a separate stack: discover where the flow lives, open a libmxl-fabrics Target on the consumer node, run an Initiator on the producer node, exchange addresses and memory keys, register memory regions, drive a transfer loop, recover from drops. That is a lot to ask of a function whose job is to produce or consume grains.
mxl-k8s lifts that stack into the cluster. A producer or
consumer pod keeps the same shape it has on a single node: it
reads or writes against its local /run/mxl/domain and never
mentions libmxl-fabrics. The cluster makes sure the flow shows
up there.
Four pieces, all running inside the cluster:
- A per-node agent DaemonSet watches /run/mxl/domain via fanotify and publishes each flow on the Kubernetes API.
- A per-node gateway DaemonSet owns the libmxl-fabrics handles: FlowReader on producer nodes, FlowWriter on consumer nodes, Initiator/Target on the fabric, and the per-grain transfer loop.
- A cluster-scoped operator reconciles MxlReceiver intent ("this pod wants to consume that flow") into one MxlFlowMirror per (flow, target-node). Multiple consumers on the same node share a single mirror; the share is refcounted through the mirror's metadata.ownerReferences, so removing one consumer leaves the mirror in place for the others and the operator tears the mirror down only when the last owner ref is gone.
- An LD_PRELOAD shim in consumer pods turns the first libmxl probe (access, stat, open, ...) for a not-yet-materialised flow into a synchronous wait on the local agent. Consumer code calls mxlCreateFlowReader the same way it does on a single node.
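
The mirror-sharing rule described for the operator can be illustrated with a small in-memory model. This is only a sketch of the refcounting idea, not the operator's code: the types MirrorKey, Receiver, and MirrorSet are hypothetical, and the real operator tracks owners through metadata.ownerReferences on the MxlFlowMirror object rather than in a Go map.

```go
package main

import "fmt"

// MirrorKey identifies one mirror: a flow replicated onto one target node.
// Hypothetical type, used only for this illustration.
type MirrorKey struct {
	FlowID     string
	TargetNode string
}

// Receiver is a simplified, hypothetical stand-in for an MxlReceiver.
type Receiver struct {
	Name   string
	FlowID string
	Node   string
}

// MirrorSet records which receivers own each mirror.
type MirrorSet struct {
	owners map[MirrorKey]map[string]struct{}
}

// NewMirrorSet returns an empty set.
func NewMirrorSet() *MirrorSet {
	return &MirrorSet{owners: make(map[MirrorKey]map[string]struct{})}
}

// Add registers r as an owner of the mirror for (flow, node).
// It reports true when this call created the mirror.
func (s *MirrorSet) Add(r Receiver) bool {
	k := MirrorKey{FlowID: r.FlowID, TargetNode: r.Node}
	set, exists := s.owners[k]
	if !exists {
		set = make(map[string]struct{})
		s.owners[k] = set
	}
	set[r.Name] = struct{}{}
	return !exists
}

// Remove drops r from the owners of its mirror.
// It reports true when the last owner left and the mirror was torn down.
func (s *MirrorSet) Remove(r Receiver) bool {
	k := MirrorKey{FlowID: r.FlowID, TargetNode: r.Node}
	set, exists := s.owners[k]
	if !exists {
		return false
	}
	delete(set, r.Name)
	if len(set) > 0 {
		return false
	}
	delete(s.owners, k)
	return true
}

func main() {
	s := NewMirrorSet()
	a := Receiver{Name: "reader-a", FlowID: "flow-1", Node: "worker-2"}
	b := Receiver{Name: "reader-b", FlowID: "flow-1", Node: "worker-2"}

	fmt.Println("mirror created:", s.Add(a))       // first consumer creates the mirror
	fmt.Println("mirror created:", s.Add(b))       // second consumer shares it
	fmt.Println("mirror removed:", s.Remove(a))    // one owner remains
	fmt.Println("mirror removed:", s.Remove(b))    // last owner gone, mirror torn down
}
```

Keying on (flow, target-node) rather than on the receiver is what lets two consumers on the same node share one fabric transfer.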
The CRDs (MxlFlow, MxlReceiver, MxlFlowMirror,
MxlDomain, MxlNodeCapabilities) describe flows, who wants
them, and what each node can carry.
For the full architecture walkthrough (per-node anatomy,
control-plane and data-plane sequences, lifecycle diagrams) see
docs/architecture/.
mxl-k8s lives in the Media Exchange row of the
EBU Dynamic Media Facility Reference Architecture (V2.0,
April 2026).
It covers the cross-node flow lifecycle: discovery and
registration, the per-mirror libmxl-fabrics handshake, and recovery
on writer or gateway restarts. Container orchestration, identity, and
per-function on-node behaviour stay with Kubernetes, the cluster's
identity provider, and upstream dmf-mxl respectively.
libmxl and libmxl-fabrics are linked from
dmf-mxl/mxl through the
go-mxl bindings.
FlowReader / FlowWriter semantics, grain layout, and the shape
of flow_def.json remain upstream's design; mxl-k8s is the
cluster orchestration around them.
See ROADMAP.md for the feature roadmap.
For media-function authors:
- No libmxl-fabrics link, no libfabric link, no headers.
- No TargetInfo handshake, no memory-region registration, no per-grain transfer loop.
- No provider choice at code time. The fabric provider (tcp, verbs, efa, shm) is a YAML knob on the MxlReceiver.
- No reconnect path for producer or consumer restarts. The gateway rebuilds the fabric side and republishes the new address on its own; the function keeps reading from its local domain.
For cluster operators:
- One DaemonSet per node owns the fabric. Bandwidth scaling and failure isolation follow the standard Kubernetes affordances.
- Provider rollout and host-side prerequisites (/dev/infiniband, IPC_LOCK, RoCEv2 PFC/DSCP, EFA AMI configuration) are documented under docs/RDMA.md.
- Container images publish to ghcr.io/qvest-digital/mxl-k8s/<component> for every PR, every push to main (:dev plus :sha-<short>), and every per-component release tag (:vX.Y.Z plus :latest or :pre).
The control plane (operator, agent, gateway, CRDs, RBAC) ships as a
Helm chart at
oci://ghcr.io/qvest-digital/mxl-k8s/charts/mxl-k8s.

    helm install mxl oci://ghcr.io/qvest-digital/mxl-k8s/charts/mxl-k8s \
      --version 1.0.0-rc.2 \
      --namespace mxl-system --create-namespace

FluxCD users point an OCI-typed HelmRepository at the same URL.
See charts/mxl-k8s/README.md for the
full values reference, override examples (RDMA, private registry,
IRSA), and the FluxCD HelmRelease snippet.
The repo ships a KIND cluster that runs an end-to-end TCP flow
across two worker nodes. It requires Docker (or podman), kind >= 0.20,
kubectl, and helm, on a Linux host with a kernel >= 5.17 (the
agent's fanotify needs FAN_REPORT_DFID_NAME).
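
A preflight check for the kernel requirement could look like the sketch below. It is not part of the repo: the helper names parseKernelRelease and kernelAtLeast are hypothetical, and reading /proc/sys/kernel/osrelease is just one assumed way to obtain the release string on Linux. The check compares only the major and minor version against the 5.17 floor stated above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseKernelRelease extracts the major and minor version from a release
// string such as "6.8.0-45-generic" or "5.17-rc1".
func parseKernelRelease(release string) (major, minor int, err error) {
	parts := strings.SplitN(strings.TrimSpace(release), ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unexpected kernel release %q", release)
	}
	major, err = strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("bad major version in %q: %w", release, err)
	}
	// The minor field may carry a suffix ("17-rc1"); keep the leading digits.
	digits := parts[1]
	for i, c := range digits {
		if c < '0' || c > '9' {
			digits = digits[:i]
			break
		}
	}
	minor, err = strconv.Atoi(digits)
	if err != nil {
		return 0, 0, fmt.Errorf("bad minor version in %q: %w", release, err)
	}
	return major, minor, nil
}

// kernelAtLeast reports whether release is at least wantMajor.wantMinor.
func kernelAtLeast(release string, wantMajor, wantMinor int) (bool, error) {
	major, minor, err := parseKernelRelease(release)
	if err != nil {
		return false, err
	}
	if major != wantMajor {
		return major > wantMajor, nil
	}
	return minor >= wantMinor, nil
}

func main() {
	// Assumed source of the release string on a Linux host.
	raw, err := os.ReadFile("/proc/sys/kernel/osrelease")
	if err != nil {
		fmt.Println("cannot read kernel release:", err)
		return
	}
	ok, err := kernelAtLeast(string(raw), 5, 17)
	if err != nil {
		fmt.Println("cannot parse kernel release:", err)
		return
	}
	fmt.Printf("kernel %s meets >= 5.17: %v\n", strings.TrimSpace(string(raw)), ok)
}
```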
For docker, run:

    make kind-up

or, for podman:

    make kind-up CONTAINER_RUNTIME=podman

That builds every component image locally, brings up a
three-node KIND cluster (control plane plus two workers),
installs the mxl-k8s Helm chart against the overlay in
examples/kind/, applies the demo writer/
reader workload from examples/kind/demo/,
and waits for the MxlFlowMirror to reach Ready. After about
a minute the writer pod is producing grains on one worker and
the reader pod is consuming them on the other.

    kubectl --context kind-mxl-k8s-demo -n mxl-system logs pod/mxl-tcp-demo-reader

The reader prints one line per grain (idx=... size=... slices=.../...).
Use make kind-status for the converged state of the CRDs and
pods, and make kind-down to tear the cluster down.
docs/KIND.md walks through what each step does
and what to look at if convergence stalls.
The repo is a Go workspace with four modules:
| Module | Path | Purpose |
|---|---|---|
| api | github.com/qvest-digital/mxl-k8s/api | CRD types. |
| operator | github.com/qvest-digital/mxl-k8s/operator | Cluster operator that reconciles the CRDs. |
| agent | github.com/qvest-digital/mxl-k8s/agent | Per-node DaemonSet. Pure Go; watches the domain via fanotify, does not link libmxl. |
| gateway | github.com/qvest-digital/mxl-k8s/gateway | Per-node DaemonSet. Links libmxl + libmxl-fabrics via go-mxl. |
docs/USAGE.md covers the prerequisites for a
media function (container, libmxl link, capabilities) and how to
integrate it as a producer or consumer.
docs/BUILD.md covers local-build prerequisites
and the cgo lane for gateway.
CLAUDE.md carries the contributor rules.