Stateful firewall with native Junos configuration syntax.
Dataplane notice (#1373, complete): the eBPF dataplane retirement is done. The Rust AF_XDP userspace dataplane is the only runtime forwarding path. Explicit `system dataplane-type ebpf` is hard-rejected at commit (`ErrEBPFDataplaneRetired`) and at runtime (`ErrEBPFBackendRetired`); use `set system dataplane-type userspace`, or omit the knob for the default. The legacy BPF source (`bpf/xdp/*.c`, `bpf/tc/*.c`) was deleted in #1476; the only retained eBPF artifacts are the userspace XDP shim (`userspace-xdp/`) and the shared `bpf/headers/*.h` map/struct bootstrap.
xpf is a high-performance stateful firewall that replicates Juniper vSRX
capabilities. It uses the familiar Junos hierarchical configuration syntax and
provides a full interactive CLI with tab completion and ? help.
xpf has a single runtime forwarding path: the Rust AF_XDP userspace dataplane. It is driven by the Go control plane (config, HA, routing, CLI, APIs).
The userspace AF_XDP backend is selected by set system dataplane-type userspace, or by omitting the knob entirely (the default). The legacy eBPF
forwarding backend was retired in #1373/#1476: explicit system dataplane-type ebpf is hard-rejected at commit time with ErrEBPFDataplaneRetired and at
runtime with ErrEBPFBackendRetired. The parser still accepts the ebpf
token so that load merge/load override of a pre-retirement config does not
syntax-error during a rolling upgrade — but commit check then fails with the
retirement error, and the remediation is set system dataplane-type userspace.
If a persisted config still names ebpf on startup, the daemon runs in
config-only mode until the operator updates it. The current userspace admission
boundary is tracked in
docs/userspace-dataplane-gaps.md.
A Rust-based forwarding engine receives packets via AF_XDP sockets and processes them in userspace. A Rust XDP shim stamps metadata, redirects transit traffic into AF_XDP, and still hands proven local/control traffic back to the kernel when needed. If helper/XSK forwarding is degraded, non-local transit fails closed in both compat and strict modes instead of bypassing policy, NAT, or conntrack.
```
NIC → XDP shim (redirect transit, pass local/control, drop degraded transit)
    → AF_XDP socket
    → Rust worker thread (session → policy → NAT → FIB → TX)
    → AF_XDP TX ring → NIC
```
- Per-worker architecture: one worker per queue shard, with session/NAT/policy/FIB handled in Rust
- AF_XDP fast path: current code supports both copy and zero-copy modes depending on driver/path behavior
- Kernel pass-through: cpumap-assisted delivery keeps local/kernel-owned traffic out of the AF_XDP fast path
- Fail-closed admission: unsupported userspace configs are gated or fail closed rather than bypassing policy, NAT, or conntrack
- Degraded mode: when helper/XSK forwarding is unavailable, the shim keeps non-local transit out of the kernel forwarding path, passes only proven local/control traffic, and drops degraded transit
- Best for: all dataplane forwarding — there is no other runtime backend
- See: `docs/userspace-dataplane-architecture.md` for the current architecture and `docs/userspace-debug-map.md` for the active debugging map
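
The shim's three outcomes (redirect, pass, drop) can be summarized as a small decision function. This is a Go sketch for illustration only: the real shim is Rust, and the `Verdict` type, the `classify` function, and its two boolean inputs are hypothetical simplifications of the behavior described above.

```go
package main

import "fmt"

// Verdict is a hypothetical name for the shim's three outcomes.
type Verdict int

const (
	Redirect Verdict = iota // hand transit traffic to the AF_XDP socket
	Pass                    // hand proven local/control traffic to the kernel
	Drop                    // fail closed
)

func (v Verdict) String() string {
	switch v {
	case Redirect:
		return "redirect"
	case Pass:
		return "pass"
	default:
		return "drop"
	}
}

// classify sketches the fail-closed rule: transit traffic is never handed to
// the kernel forwarding path, even when helper/XSK forwarding is degraded.
func classify(provenLocal, forwardingHealthy bool) Verdict {
	if provenLocal {
		return Pass
	}
	if forwardingHealthy {
		return Redirect
	}
	return Drop
}

func main() {
	fmt.Println(classify(true, false))  // local/control traffic while degraded
	fmt.Println(classify(false, true))  // transit traffic, healthy
	fmt.Println(classify(false, false)) // transit traffic, degraded
}
```

The point of the ordering is that a degraded helper can only turn a redirect into a drop, never into a kernel bypass of policy, NAT, or conntrack.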
To tune the userspace dataplane:
```
system {
    dataplane {
        binary /usr/local/sbin/xpf-userspace-dp;
        workers 6;
        ring-entries 8192;
    }
}
```
The original dataplane ran in-kernel using 14 BPF programs chained via tail calls (XDP ingress main -> screen -> zone -> conntrack -> policy -> nat -> nat64 -> forward; TC egress main -> screen_egress -> conntrack -> nat -> forward) and reached 25+ Gbps on native XDP (mlx5, i40e, ice). That source (`bpf/xdp/*.c`, `bpf/tc/*.c`) was deleted in #1476; the pipeline is preserved only in git history (`git log -- bpf/xdp/ bpf/tc/`). It is no longer a selectable backend; see the hard-reject contract above.
| Capability | Userspace AF_XDP (the runtime path) |
|---|---|
| Stateful forwarding | Yes |
| Zone + global policies | Yes |
| Application matching | Yes |
| Source NAT (interface + pool) | Interface and pool mode yes; userspace address-persistent uses a documented userspace-v1 hash. Non-HA per-pool persistent-nat lease reuse and pool exhaustion counters are implemented in helper-local runtime state; HA/restart persistence and cross-backend new-flow parity remain outside the current contract |
| Destination NAT | Yes |
| Static NAT (1:1) | Yes |
| NAT64 (IPv6↔IPv4) | Yes |
| NPTv6 (RFC 6296) | Yes |
| Screen/IDS (11 checks) | Yes; userspace SYN-cookie runtime is wired |
| Firewall filters + policers | Filters yes; three-color policers admitted for the reviewed color-blind then discard slice; broader color-aware and non-drop action work is tracked as production hardening |
| TCP MSS clamping | Yes |
| GRE tunnel transit | Yes (passthrough) |
| IPsec / XFRM | Yes (passthrough) |
| VLANs (802.1Q) | Yes |
| Flow export (NetFlow v9) | Yes |
| HA cluster + session sync | Integrated; HA hardening tracked in open issues |
| SYN cookie flood protection | Yes |
| Throughput (25G mlx5) | See validation/perf docs for current results |
The userspace dataplane covers the transit feature set in native Rust.
SYN-cookie-dependent screen behavior runs in userspace with bounded
SYN-ACK/RST replies and userspace status counters (#1374 closed). Port
mirroring has bounded userspace runtime admission (#1376 closed). Three-color
policers are admitted for the bounded color-blind then discard runtime
slice (#1375 closed); remaining color-aware, non-drop action, and HA/restart
continuity work is production hardening tracked in open issues such as
#1614 (CoS regression) and
#1608 (cold-path hardening), not
the closed #1373 feature-gap trackers. Pool-mode SNAT is admitted,
#1385 added userspace-v1
address-persistent selection, and the runtime fails closed for unusable
or exhausted source-NAT pool rules before forwarding. Non-HA per-pool
persistent-nat lease reuse is helper-local userspace state; it does not
survive helper restart and HA persistent-NAT configs remain gated. The exact
admission boundary is documented in
docs/userspace-dataplane-gaps.md.
- Go control plane handles config compilation, session GC, management APIs, HA cluster, and routing
- Rust AF_XDP userspace dataplane owns the only packet-forwarding path
- Retained eBPF surface is the userspace XDP shim (`userspace-xdp/`) plus the shared `bpf/headers/*.h` map/struct bootstrap, not a forwarding backend
- Dual session entries (forward + reverse) in the shared conntrack hash map
- Three-phase config compilation: Junos AST → typed Go structs → userspace-dp control messages
- Zone-based policies with stateful inspection, address books, application matching, global policies
- NAT: source (interface + pool, userspace-v1 address-persistent), destination (with hit counters), static 1:1, NAT64, NPTv6 (RFC 6296 stateless prefix translation)
- Dual-stack: IPv4 + IPv6, DHCPv4/v6 clients, embedded Router Advertisement sender (replaces radvd), SLAAC
- Screen/IDS: 11 checks (land, SYN flood, ping of death, teardrop, SYN-FIN, no-flag, winnuke, FIN-no-ACK, rate-limiting), SYN cookie flood protection (userspace-minted/validated SYN-ACK cookies replied through the AF_XDP TX path)
- Firewall filters: policer (token bucket + three-color), lo0 filter, flexible match, port ranges, hit counters, logging, forwarding-class DSCP rewrite
- TCP MSS clamping in the userspace AF_XDP dataplane (all-tcp, ipsec-vpn, and GRE gre-in/gre-out)
- ALG control, allow-dns-reply, allow-embedded-icmp
- Configurable timeouts (per-application inactivity)
- Session management: filtered clearing, idle time tracking, brief tabular view, aggregation reporting
- FRR integration: static, OSPF, BGP, IS-IS, RIP, ECMP multipath, export/redistribute
- VRFs with inter-VRF route leaking (next-table + rib-group)
- GRE tunnels, XFRM interfaces, PBR (policy-based routing)
- VLANs: 802.1Q tagging, trunk ports
- IPsec: strongSwan config generation, IKE proposals, gateway compilation
- Full interface management: xpfd owns ALL interfaces: renames via `.link` files, configures addresses/DHCP via `.network` files, brings down unconfigured interfaces
- Chassis cluster with ~60ms failover (30ms VRRP intervals)
- Native VRRPv3: Go state machine, AF_PACKET, per-instance sockets, IPv6 NODAD, 30ms RETH advertisements, async GARP burst
- Bondless RETH: VRRP on physical member interfaces, per-node virtual MAC (`02:bf:72:CC:RR:NN`), no Linux bonding required
- Session sync: incremental 1s sweep + ring buffer + GC delete callbacks, TCP on fabric link
- Config sync: primary → secondary with `${node}` variable expansion, reverse-sync on reconnect
- IPsec SA sync: shared IKE/ESP state across cluster nodes
- Dual fabric links: independent fab0/fab1 for redundancy (no bonding)
- Fabric cross-chassis forwarding: `try_fabric_redirect()` redirects to peer when FIB fails for synced sessions
- Dataplane watchdogs: userspace heartbeat checks fail closed on daemon/helper failure; if a persisted config still names the retired `ebpf` backend, the daemon runs in config-only mode until it is updated
- Readiness gate: per-RG readiness (interfaces + VRRP) + hold timer gates election
- Planned shutdown: near-instant takeover (priority-0 burst), failback ~130ms
- ISSU: in-service software upgrade with rolling deploy
- RA lifecycle: goodbye RAs (lifetime=0) on failover/startup to prevent stale IPv6 ECMP routes
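
The per-node virtual MAC pattern `02:bf:72:CC:RR:NN` mentioned in the list above can be illustrated with a short Go sketch. The README does not define the placeholders, so reading `CC` as cluster ID, `RR` as redundancy group, and `NN` as node ID is an assumption here, and `virtualMAC` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net"
)

// virtualMAC builds an address in the 02:bf:72:CC:RR:NN pattern.
// Assumption: CC = cluster ID, RR = redundancy group, NN = node ID.
// The leading 0x02 octet sets the locally-administered bit.
func virtualMAC(cluster, rg, node uint8) net.HardwareAddr {
	return net.HardwareAddr{0x02, 0xbf, 0x72, cluster, rg, node}
}

func main() {
	fmt.Println(virtualMAC(1, 2, 0)) // node 0
	fmt.Println(virtualMAC(1, 2, 1)) // node 1, same cluster and group
}
```

Because the last octet differs per node under this reading, each node presents its own MAC on the physical member interface, which fits the "no Linux bonding required" design.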
- Syslog: facility/severity/category filtering, structured RT_FLOW format, TCP/TLS transport, event mode local file
- NetFlow v9: 1-in-N sampling
- Prometheus metrics (`/metrics` endpoint)
- SNMP: system + ifTable MIB
- RPM probes, dynamic address feeds
- Dataplane buffer utilization (`show system buffers`): AF_XDP UMEM/TX-ring capacity, CoS queued-byte capacity, helper-published session-table and flow-cache capacity
- LLDP: link layer discovery protocol
- Interactive CLI: Junos-style prefix matching, tab completion, `?` help, pipe filters (`| match`, `| count`, `| except`)
- Remote CLI: `cli` binary connects via gRPC with full tab/`?` parity
- gRPC API: 48+ RPCs (config, sessions, stats, routes, IPsec, DHCP, cluster)
- REST API: HTTP on port 8080 (health, Prometheus, config, full gRPC parity)
- Config management: candidate/active with commit model, 50 rollback slots, `load override`/`load merge`, `show | display set`
- Configure mode protection: blocked on secondary cluster nodes (RG0 primary is config authority)
- DHCP server: Kea integration with lease display
- DHCP relay: Option 82 support
- Event engine: event-driven automation
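
The CLI pipe filters listed above (`| match`, `| count`, `| except`) behave roughly like the Go sketch below. It is illustrative only: `applyPipe` is a hypothetical helper, it uses plain substring matching where a real CLI may use regular expressions, and the `Count: N lines` output format is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// applyPipe is a hypothetical helper, not xpf's implementation.
// "match" keeps lines containing arg, "except" drops them,
// and "count" replaces the output with a single summary line.
func applyPipe(lines []string, filter, arg string) ([]string, error) {
	switch filter {
	case "match", "except":
		keep := filter == "match"
		var out []string
		for _, l := range lines {
			if strings.Contains(l, arg) == keep {
				out = append(out, l)
			}
		}
		return out, nil
	case "count":
		return []string{fmt.Sprintf("Count: %d lines", len(lines))}, nil
	default:
		return nil, fmt.Errorf("unknown pipe filter %q", filter)
	}
}

func main() {
	output := []string{"trust0 up", "untrust0 up", "dmz0 down"}
	matched, _ := applyPipe(output, "match", "up")
	counted, _ := applyPipe(matched, "count", "")
	fmt.Println(matched, counted)
}
```

Because each filter takes lines and returns lines, filters can be chained in the same way as `show ... | match up | count`.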
```bash
make generate            # Generate the retained Rust AF_XDP userspace XDP shim object (post-#1476; no legacy bpf2go)
make build               # Build xpfd daemon (embeds version from git)
make build-ctl           # Build remote CLI client
make build-userspace-dp  # Build Rust AF_XDP dataplane binary (requires cargo)
make test                # Run 1020+ tests across 24 packages
```

xpf uses Junos-style configuration syntax:
```
interfaces {
    trust0 {
        unit 0 {
            family inet {
                address 10.0.1.1/24;
            }
        }
    }
}
security {
    zones {
        security-zone trust {
            interfaces {
                trust0;
            }
            host-inbound-traffic {
                system-services {
                    ssh;
                    ping;
                }
            }
        }
    }
    policies {
        from-zone trust to-zone untrust {
            policy allow-all {
                match {
                    source-address any;
                    destination-address any;
                    application any;
                }
                then {
                    permit;
                }
            }
        }
    }
}
```
The config supports both hierarchical { } blocks and flat set commands:
```
set interfaces trust0 unit 0 family inet address 10.0.1.1/24
set security zones security-zone trust interfaces trust0
set security policies from-zone trust to-zone untrust policy allow-all match source-address any destination-address any application any
set security policies from-zone trust to-zone untrust policy allow-all then permit
```
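
The relationship between the two forms is a tree walk: each leaf of the hierarchy becomes one `set` line whose words are the path from the root. The Go sketch below shows the idea under simplifying assumptions; the `Node` type and `setLines` function are hypothetical and unrelated to the parser in `pkg/config/`.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical config tree node. Words holds the statement text
// at this level (for example "unit 0"); a node with no children is a leaf.
type Node struct {
	Words    string
	Children []Node
}

// setLines walks the tree depth-first and emits one "set" line per leaf.
func setLines(n Node, prefix string) []string {
	path := strings.TrimSpace(prefix + " " + n.Words)
	if len(n.Children) == 0 {
		return []string{"set " + path}
	}
	var out []string
	for _, c := range n.Children {
		out = append(out, setLines(c, path)...)
	}
	return out
}

func main() {
	tree := Node{Words: "interfaces", Children: []Node{
		{Words: "trust0", Children: []Node{
			{Words: "unit 0", Children: []Node{
				{Words: "family inet", Children: []Node{
					{Words: "address 10.0.1.1/24"},
				}},
			}},
		}},
	}}
	for _, line := range setLines(tree, "") {
		fmt.Println(line)
	}
}
```

This sketch emits one line per leaf, so a `match` block with three statements would produce three `set` lines rather than the single combined line shown above.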
- Local CLI: run `xpfd` in a TTY for interactive Junos-style shell
- Remote CLI: `cli -addr <host>:50051` connects via gRPC
- gRPC API: 48+ RPCs on port 50051 (config, sessions, stats, routes, IPsec, DHCP, cluster)
- REST API: HTTP on port 8080 (health, Prometheus `/metrics`, config endpoints)
- Userspace dataplane (the runtime path)
  - AF_XDP-based forwarding with per-worker Rust session/NAT/policy/FIB processing
  - Copy or zero-copy mode depending on NIC driver/path behavior
  - Kernel pass-through via cpumap for local and other kernel-owned traffic
  - See `docs/userspace-ha-validation.md` and `docs/userspace-perf-compare.md` for current validation and profiling workflow
- Cluster / control plane
  - Hitless restarts with zero packet loss
  - ~60ms cluster failover (30ms VRRP, ~97ms masterDown interval)
  - Near-instant planned shutdown (priority-0 burst, peer takes over in ~1ms)
- Historical (retired eBPF dataplane, git history only)
  - 25+ Gbps with native XDP (i40e/ice PF passthrough), 15.6 Gbps with virtio-net
An Incus-based test environment provisions Debian VMs with FRR, strongSwan, and test containers:
```bash
# Single VM (standalone firewall)
make test-env-init       # One-time setup
make test-vm             # Create VM
make test-deploy         # Build + deploy + restart service
make test-logs           # View daemon logs

# Two-VM HA cluster (defaults to loss userspace cluster)
make cluster-init        # Create networks + profile
make cluster-create      # Launch xpf-userspace-fw0 + xpf-userspace-fw1 + LAN host
make cluster-deploy      # Rolling deploy: secondary first, then primary (preserves traffic)
```

Userspace dataplane testing (requires mlx5 NICs on loss cluster):

```bash
# Userspace HA cluster
make cluster-deploy
./scripts/userspace-ha-validation.sh --env test/incus/loss-userspace-cluster.env
./scripts/userspace-perf-compare.sh
```

`make cluster-deploy` performs a rolling deploy to maintain traffic continuity:
- Determines which node is currently secondary
- Deploys to the secondary (primary continues forwarding traffic)
- Waits for the secondary to sync sessions from the primary
- Deploys to the primary (upgraded secondary takes over via VRRP failover)
To deploy to a single node: make cluster-deploy NODE=0 or make cluster-deploy NODE=1.
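
The ordering rule behind the rolling deploy can be stated in a few lines of Go. This is a sketch of the logic only, assuming a two-node cluster with node IDs 0 and 1; `deployOrder` is a hypothetical function, and the actual deploy is driven by the Makefile and scripts.

```go
package main

import "fmt"

// deployOrder returns node IDs in the order they would be deployed.
// only < 0 means a full rolling deploy (secondary first, then primary);
// only >= 0 models NODE=<id> and targets that single node.
func deployOrder(primary, only int) []int {
	if only >= 0 {
		return []int{only}
	}
	secondary := 1 - primary // two-node assumption: IDs are 0 and 1
	return []int{secondary, primary}
}

func main() {
	fmt.Println(deployOrder(0, -1)) // node 0 is primary: deploy 1, then 0
	fmt.Println(deployOrder(1, -1)) // node 1 is primary: deploy 0, then 1
	fmt.Println(deployOrder(0, 1))  // NODE=1 only
}
```

Deploying the secondary first means the node that is forwarding traffic is touched only after an upgraded peer with synced sessions is ready to take over.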
| Test | Command | Description |
|---|---|---|
| Unit tests | `make test` | 1020+ Go tests across 24 packages |
| Connectivity | `make test-connectivity` | End-to-end IPv4/IPv6 routing and SNAT |
| Failover | `make test-failover` | iperf3 survives fw0 reboot (session sync + VRRP) |
| Hard crash | `make test-ha-crash` | Force-stop, daemon stop, multi-cycle crash recovery |
| Restart | `make test-restart-connectivity` | Zero packet loss during daemon restart |
| Private RG | `./test/incus/test-private-rg.sh` | VRRP elimination via private-rg-election |
| Path | Description |
|---|---|
| `bpf/headers/*.h` | Shared C structs/constants consumed by the retained Rust AF_XDP shim build and userspace-dp parity tests. The legacy `bpf/xdp/*.c` and `bpf/tc/*.c` sources were deleted in #1476 |
| `pkg/config/` | Junos parser, AST, typed config, compiler |
| `pkg/cmdtree/` | Single source of truth for all CLI command trees |
| `pkg/configstore/` | Candidate/active/commit/rollback, atomic DB persistence |
| `pkg/dataplane/` | Runtime contracts, retained userspace shim embed/loader, and eBPF/DPDK retirement-error sentinels (#1476/#1525) |
| `pkg/dataplane/userspace/` | Go manager for the Rust userspace dataplane |
| `userspace-xdp/` | Retained Rust XDP shim that redirects packets into the AF_XDP userspace runtime |
| `pkg/daemon/` | Daemon lifecycle, reconciliation, interface management |
| `pkg/cluster/` | Chassis cluster HA (state machine, session sync, config sync) |
| `pkg/vrrp/` | Native VRRPv3 state machine (30ms RETH advertisements) |
| `pkg/ra/` | Embedded RA sender (replaces radvd) |
| `pkg/cli/` | Interactive Junos-style CLI |
| `pkg/conntrack/` | Session garbage collection (with HA delete sync) |
| `pkg/logging/` | Ring buffer reader, event buffer, syslog client |
| `pkg/dhcp/` | DHCPv4/DHCPv6 clients |
| `pkg/frr/` | FRR config generation + managed section in frr.conf |
| `pkg/networkd/` | systemd-networkd .link/.network file generation |
| `pkg/routing/` | GRE tunnels, VRFs, XFRM interfaces, route leaking |
| `pkg/ipsec/` | strongSwan config + SA queries |
| `pkg/api/` | HTTP REST API + Prometheus collector |
| `pkg/grpcapi/` | gRPC server + protobuf bindings |
| `pkg/flowexport/` | NetFlow v9 exporter |
| `pkg/feeds/` | Dynamic address feed fetcher |
| `pkg/dhcpserver/` | Kea DHCP server management |
| `pkg/dhcprelay/` | DHCP relay with Option 82 |
| `pkg/eventengine/` | Event-driven automation engine |
| `pkg/rpm/` | RPM probe manager |
| `pkg/snmp/` | SNMP agent (system + ifTable MIB) |
| `pkg/lldp/` | LLDP protocol |
| `proto/xpf/v1/` | Protobuf service definition |
| `cmd/xpfd/` | Daemon main binary |
| `cmd/cli/` | Remote CLI client binary |
| `userspace-dp/` | Rust AF_XDP userspace dataplane binary |
| `docs/` | Protocol docs, test plans, feature gaps |
| `test/incus/` | Test environment scripts and configs |
See docs/ for detailed design documents:
- `sync-protocol.md`: Cluster session sync wire protocol and algorithms
- `fabric-cross-chassis-fwd.md`: Fabric link cross-chassis forwarding design
- `ha-cluster.conf`: Unified HA cluster config with `${node}` variable expansion
- `testing-procedures.md`: Test categories, procedures, and debugging tips
- `phases.md`: Development phase history (40+ sprints)
- `bugs.md`: Bug tracker with root cause analysis
- `optimizations.md`: Performance profiling and optimization notes
- `test_env.md`: Test topology and validation steps
- `feature-gaps.md`: vSRX feature parity tracking
- `userspace-dataplane-architecture.md`: Comprehensive userspace AF_XDP dataplane architecture
- `userspace-debug-map.md`: Active file/function map for userspace forwarding and debugging
- `xdp-io-uring-userspace-dataplane.md`: Original userspace dataplane design document
- `shared-umem-plan.md`: Cross-NIC shared UMEM design and validation plan
- `userspace-ha-validation.md`: HA failover validation procedures
- `userspace-perf-compare.md`: Throughput benchmarking methodology
- `userspace-dnat-plan.md`: Destination NAT implementation plan for userspace dataplane
- `userspace-dataplane-gaps.md`: Current userspace AF_XDP capability/admission boundary
- Linux kernel 6.12+ (6.18+ recommended for full NAT64 support)
- Go 1.22+
- clang/llvm (for generating the retained Rust AF_XDP userspace XDP shim object)
- Rust stable (for the primary userspace dataplane)
- FRR (for routing protocol integration)
- strongSwan (for IPsec, optional)
- Kea (for DHCP server, optional)