xpf

Stateful firewall with native Junos configuration syntax.

Dataplane notice (#1373, complete): the eBPF dataplane retirement is done. The Rust AF_XDP userspace dataplane is the only runtime forwarding path. Explicit system dataplane-type ebpf is hard-rejected at commit (ErrEBPFDataplaneRetired) and at runtime (ErrEBPFBackendRetired); use set system dataplane-type userspace, or omit the knob for the default. The legacy BPF source (bpf/xdp/*.c, bpf/tc/*.c) was deleted in #1476; the only retained eBPF artifacts are the userspace XDP shim (userspace-xdp/) and the shared bpf/headers/*.h map/struct bootstrap.

xpf is a high-performance stateful firewall that replicates Juniper vSRX capabilities. It uses the familiar Junos hierarchical configuration syntax and provides a full interactive CLI with tab completion and ? help.

Dataplane Architecture

xpf has a single runtime forwarding path: the Rust AF_XDP userspace dataplane. It is driven by the Go control plane (config, HA, routing, CLI, APIs).

The userspace AF_XDP backend is selected by set system dataplane-type userspace, or by omitting the knob entirely (the default). The legacy eBPF forwarding backend was retired in #1373/#1476: explicit system dataplane-type ebpf is hard-rejected at commit time with ErrEBPFDataplaneRetired and at runtime with ErrEBPFBackendRetired. The parser still accepts the ebpf token, so load merge/load override of a pre-retirement config does not fail with a syntax error during a rolling upgrade. Commit check then fails with the retirement error, and the remediation is set system dataplane-type userspace. If a persisted config still names ebpf on startup, the daemon runs in config-only mode until the operator updates it. The current userspace admission boundary is tracked in docs/userspace-dataplane-gaps.md.

Userspace Dataplane (the runtime forwarding path)

A Rust-based forwarding engine receives packets via AF_XDP sockets and processes them in userspace. A Rust XDP shim stamps metadata, redirects transit traffic into AF_XDP, and still hands proven local/control traffic back to the kernel when needed. If helper/XSK forwarding is degraded, non-local transit fails closed in both compat and strict modes instead of bypassing policy, NAT, or conntrack.

NIC → XDP shim (redirect transit, pass local/control, drop degraded transit)
    → AF_XDP socket
    → Rust worker thread (session → policy → NAT → FIB → TX)
    → AF_XDP TX ring → NIC
  • Per-worker architecture: one worker per queue shard, with session/NAT/policy/FIB handled in Rust
  • AF_XDP fast path: current code supports both copy and zero-copy modes depending on driver/path behavior
  • Kernel pass-through: cpumap-assisted delivery keeps local/kernel-owned traffic out of the AF_XDP fast path
  • Fail-closed admission: unsupported userspace configs are gated or fail closed rather than bypassing policy, NAT, or conntrack
  • Degraded mode: when helper/XSK forwarding is unavailable, the shim keeps non-local transit out of the kernel forwarding path, passes only proven local/control traffic, and drops degraded transit
  • Best for: all dataplane forwarding — there is no other runtime backend
  • See: docs/userspace-dataplane-architecture.md for the current architecture and docs/userspace-debug-map.md for the active debugging map
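
The shim's three outcomes (redirect transit, pass local/control, drop degraded transit) can be modelled as a small decision function. The real shim is written in Rust and runs as an XDP program; the Go model below is only a sketch, and the Verdict type, the classify function, and its two boolean inputs are hypothetical names.

```go
package main

import "fmt"

// Verdict is a hypothetical stand-in for the shim's XDP action.
type Verdict int

const (
	Redirect Verdict = iota // transit traffic handed to the AF_XDP socket
	Pass                    // proven local/control traffic handed to the kernel
	Drop                    // transit traffic while forwarding is degraded
)

func (v Verdict) String() string {
	switch v {
	case Redirect:
		return "redirect"
	case Pass:
		return "pass"
	default:
		return "drop"
	}
}

// classify models the fail-closed rule: local traffic always reaches the
// kernel, transit is redirected only while helper/XSK forwarding is healthy,
// and degraded transit is dropped rather than forwarded by the kernel.
func classify(local, healthy bool) Verdict {
	if local {
		return Pass
	}
	if healthy {
		return Redirect
	}
	return Drop
}

func main() {
	fmt.Println(classify(false, true))  // transit, healthy
	fmt.Println(classify(true, false))  // local, degraded
	fmt.Println(classify(false, false)) // transit, degraded
}
```

The point of the ordering is that no branch lets non-local traffic fall through to kernel forwarding, which would bypass policy, NAT, and conntrack.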

To tune the userspace dataplane:

system {
    dataplane {
        binary /usr/local/sbin/xpf-userspace-dp;
        workers 6;
        ring-entries 8192;
    }
}

Historical note: the retired eBPF dataplane

The original dataplane ran in-kernel using 14 BPF programs chained via tail calls (XDP ingress main -> screen -> zone -> conntrack -> policy -> nat -> nat64 -> forward; TC egress main -> screen_egress -> conntrack -> nat -> forward) and reached 25+ Gbps on native XDP (mlx5, i40e, ice). That source (bpf/xdp/*.c, bpf/tc/*.c) was deleted in #1476; the pipeline is preserved only in git history (git log -- bpf/xdp/ bpf/tc/). It is no longer a selectable backend — see the hard-reject contract above.

Userspace Dataplane Capabilities

| Capability | Userspace AF_XDP (the runtime path) |
|---|---|
| Stateful forwarding | Yes |
| Zone + global policies | Yes |
| Application matching | Yes |
| Source NAT (interface + pool) | Interface and pool mode yes; userspace address-persistent uses a documented userspace-v1 hash. Non-HA per-pool persistent-nat lease reuse and pool exhaustion counters are implemented in helper-local runtime state; HA/restart persistence and cross-backend new-flow parity remain outside the current contract |
| Destination NAT | Yes |
| Static NAT (1:1) | Yes |
| NAT64 (IPv6↔IPv4) | Yes |
| NPTv6 (RFC 6296) | Yes |
| Screen/IDS (11 checks) | Yes; userspace SYN-cookie runtime is wired |
| Firewall filters + policers | Filters yes; three-color policers admitted for the reviewed color-blind then discard slice; broader color-aware and non-drop action work is tracked as production hardening |
| TCP MSS clamping | Yes |
| GRE tunnel transit | Yes (passthrough) |
| IPsec / XFRM | Yes (passthrough) |
| VLANs (802.1Q) | Yes |
| Flow export (NetFlow v9) | Yes |
| HA cluster + session sync | Integrated; HA hardening tracked in open issues |
| SYN cookie flood protection | Yes |
| Throughput (25G mlx5) | See validation/perf docs for current results |

The userspace dataplane covers the transit feature set in native Rust. SYN-cookie-dependent screen behavior runs in userspace with bounded SYN-ACK/RST replies and userspace status counters (#1374 closed). Port mirroring has bounded userspace runtime admission (#1376 closed). Three-color policers are admitted for the bounded color-blind then discard runtime slice (#1375 closed); remaining color-aware, non-drop action, and HA/restart continuity work is production hardening tracked in open issues such as #1614 (CoS regression) and #1608 (cold-path hardening), not the closed #1373 feature-gap trackers. Pool-mode SNAT is admitted, #1385 added userspace-v1 address-persistent selection, and the runtime fails closed for unusable or exhausted source-NAT pool rules before forwarding. Non-HA per-pool persistent-nat lease reuse is helper-local userspace state; it does not survive helper restart and HA persistent-NAT configs remain gated. The exact admission boundary is documented in docs/userspace-dataplane-gaps.md.

Architecture

  • Go control plane handles config compilation, session GC, management APIs, HA cluster, and routing
  • Rust AF_XDP userspace dataplane owns the only packet-forwarding path
  • Retained eBPF surface is the userspace XDP shim (userspace-xdp/) plus the shared bpf/headers/*.h map/struct bootstrap — not a forwarding backend
  • Dual session entries (forward + reverse) in the shared conntrack hash map
  • Three-phase config compilation: Junos AST → typed Go structs → userspace-dp control messages

Features

Firewall & Security

  • Zone-based policies with stateful inspection, address books, application matching, global policies
  • NAT: source (interface + pool, userspace-v1 address-persistent), destination (with hit counters), static 1:1, NAT64, NPTv6 (RFC 6296 stateless prefix translation)
  • Dual-stack: IPv4 + IPv6, DHCPv4/v6 clients, embedded Router Advertisement sender (replaces radvd), SLAAC
  • Screen/IDS: 11 checks (including land, SYN flood, ping of death, teardrop, SYN-FIN, no-flag, winnuke, FIN-no-ACK, and rate-limiting), SYN cookie flood protection (userspace-minted/validated SYN-ACK cookies replied through the AF_XDP TX path)
  • Firewall filters: policer (token bucket + three-color), lo0 filter, flexible match, port ranges, hit counters, logging, forwarding-class DSCP rewrite

Flow Processing

  • TCP MSS clamping in the userspace AF_XDP dataplane (all-tcp, ipsec-vpn, and GRE gre-in/gre-out)
  • ALG control, allow-dns-reply, allow-embedded-icmp
  • Configurable timeouts (per-application inactivity)
  • Session management: filtered clearing, idle time tracking, brief tabular view, aggregation reporting
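
TCP MSS clamping rewrites the MSS option (kind 2, length 4) in SYN segments when the advertised value exceeds a configured ceiling. The Go sketch below shows only the option walk and rewrite; the clampMSS function is a hypothetical name, the real logic lives in the Rust dataplane, and the TCP checksum update that must follow a rewrite is deliberately omitted.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// clampMSS walks a TCP options area and lowers the MSS option to max if the
// advertised value is larger. It reports whether the options were rewritten.
// Checksum fix-up is out of scope for this sketch.
func clampMSS(opts []byte, max uint16) bool {
	for i := 0; i < len(opts); {
		kind := opts[i]
		switch kind {
		case 0: // End of Option List
			return false
		case 1: // No-Operation, a single byte
			i++
			continue
		}
		if i+1 >= len(opts) {
			return false // truncated option
		}
		l := int(opts[i+1])
		if l < 2 || i+l > len(opts) {
			return false // malformed length
		}
		if kind == 2 && l == 4 { // Maximum Segment Size
			if binary.BigEndian.Uint16(opts[i+2:]) > max {
				binary.BigEndian.PutUint16(opts[i+2:], max)
				return true
			}
			return false
		}
		i += l
	}
	return false
}

func main() {
	// NOP, NOP, MSS=1460, End of Option List
	opts := []byte{1, 1, 2, 4, 0x05, 0xb4, 0}
	fmt.Println(clampMSS(opts, 1360), binary.BigEndian.Uint16(opts[4:]))
}
```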

Routing & Networking

  • FRR integration: static, OSPF, BGP, IS-IS, RIP, ECMP multipath, export/redistribute
  • VRFs with inter-VRF route leaking (next-table + rib-group)
  • GRE tunnels, XFRM interfaces, PBR (policy-based routing)
  • VLANs: 802.1Q tagging, trunk ports
  • IPsec: strongSwan config generation, IKE proposals, gateway compilation
  • Full interface management: xpfd owns ALL interfaces — renames via .link files, configures addresses/DHCP via .network files, brings down unconfigured interfaces

High Availability

  • Chassis cluster with ~60ms failover (30ms VRRP intervals)
  • Native VRRPv3: Go state machine, AF_PACKET, per-instance sockets, IPv6 NODAD, 30ms RETH advertisements, async GARP burst
  • Bondless RETH: VRRP on physical member interfaces, per-node virtual MAC (02:bf:72:CC:RR:NN), no Linux bonding required
  • Session sync: incremental 1s sweep + ring buffer + GC delete callbacks, TCP on fabric link
  • Config sync: primary → secondary with ${node} variable expansion, reverse-sync on reconnect
  • IPsec SA sync: shared IKE/ESP state across cluster nodes
  • Dual fabric links: independent fab0/fab1 for redundancy (no bonding)
  • Fabric cross-chassis forwarding: try_fabric_redirect() redirects to peer when FIB fails for synced sessions
  • Dataplane watchdogs: userspace heartbeat checks fail closed on daemon/helper failure; if a persisted config still names the retired ebpf backend, the daemon runs in config-only mode until it is updated
  • Readiness gate: per-RG readiness (interfaces + VRRP) + hold timer gates election
  • Planned shutdown: near-instant takeover (priority-0 burst), failback ~130ms
  • ISSU: in-service software upgrade with rolling deploy
  • RA lifecycle: goodbye RAs (lifetime=0) on failover/startup to prevent stale IPv6 ECMP routes

Observability

  • Syslog: facility/severity/category filtering, structured RT_FLOW format, TCP/TLS transport, event mode local file
  • NetFlow v9: 1-in-N sampling
  • Prometheus metrics (/metrics endpoint)
  • SNMP: system + ifTable MIB
  • RPM probes, dynamic address feeds
  • Dataplane buffer utilization (show system buffers): AF_XDP UMEM/TX-ring capacity, CoS queued-byte capacity, helper-published session-table and flow-cache capacity
  • LLDP: link layer discovery protocol
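
The 1-in-N sampling used for NetFlow export can be illustrated with a deterministic counter. This is a sketch only: the Sampler type is a hypothetical name, and whether xpf samples deterministically or randomly is not stated in this README.

```go
package main

import "fmt"

// Sampler selects every n-th packet (deterministic 1-in-N sampling).
type Sampler struct {
	n     uint64
	count uint64
}

// NewSampler treats n == 0 as "sample everything" to avoid a stuck counter.
func NewSampler(n uint64) *Sampler {
	if n == 0 {
		n = 1
	}
	return &Sampler{n: n}
}

// Sample reports whether the current packet should be exported.
func (s *Sampler) Sample() bool {
	s.count++
	if s.count >= s.n {
		s.count = 0
		return true
	}
	return false
}

func main() {
	s := NewSampler(4)
	picked := 0
	for i := 0; i < 12; i++ {
		if s.Sample() {
			picked++
		}
	}
	fmt.Println(picked)
}
```

Exported byte and packet counts are usually scaled by N at the collector to estimate the true totals.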

Management

  • Interactive CLI: Junos-style prefix matching, tab completion, ? help, pipe filters (| match, | count, | except)
  • Remote CLI: cli binary connects via gRPC with full tab/? parity
  • gRPC API: 48+ RPCs (config, sessions, stats, routes, IPsec, DHCP, cluster)
  • REST API: HTTP on port 8080 (health, Prometheus, config, full gRPC parity)
  • Config management: candidate/active with commit model, 50 rollback slots, load override/load merge, show | display set
  • Configure mode protection: blocked on secondary cluster nodes (RG0 primary is config authority)
  • DHCP server: Kea integration with lease display
  • DHCP relay: Option 82 support
  • Event engine: event-driven automation
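
Junos-style prefix matching accepts any unambiguous abbreviation of a command. A minimal Go sketch of that resolution rule follows; the resolve function and the sample command list are hypothetical, and the real command tree lives in pkg/cmdtree/.

```go
package main

import (
	"fmt"
	"strings"
)

// resolve expands a typed prefix to the single command it identifies.
// An exact match wins; otherwise the prefix must match exactly one command.
func resolve(prefix string, cmds []string) (string, error) {
	var matches []string
	for _, c := range cmds {
		if c == prefix {
			return c, nil
		}
		if strings.HasPrefix(c, prefix) {
			matches = append(matches, c)
		}
	}
	switch len(matches) {
	case 0:
		return "", fmt.Errorf("unknown command %q", prefix)
	case 1:
		return matches[0], nil
	default:
		return "", fmt.Errorf("ambiguous command %q: %s", prefix, strings.Join(matches, ", "))
	}
}

func main() {
	cmds := []string{"show", "set", "commit", "configure"}
	for _, in := range []string{"sh", "com", "co", "x"} {
		got, err := resolve(in, cmds)
		fmt.Printf("%q -> %q, err=%v\n", in, got, err)
	}
}
```

The ambiguous case is also what tab completion and ? help would list as candidates.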

Quick Start

make generate           # Generate the retained Rust AF_XDP userspace XDP shim object (post-#1476; no legacy bpf2go)
make build              # Build xpfd daemon (embeds version from git)
make build-ctl          # Build remote CLI client
make build-userspace-dp # Build Rust AF_XDP dataplane binary (requires cargo)
make test               # Run 1020+ tests across 24 packages

Configuration

xpf uses Junos-style configuration syntax:

interfaces {
    trust0 {
        unit 0 {
            family inet {
                address 10.0.1.1/24;
            }
        }
    }
}
security {
    zones {
        security-zone trust {
            interfaces {
                trust0;
            }
            host-inbound-traffic {
                system-services {
                    ssh;
                    ping;
                }
            }
        }
    }
    policies {
        from-zone trust to-zone untrust {
            policy allow-all {
                match {
                    source-address any;
                    destination-address any;
                    application any;
                }
                then {
                    permit;
                }
            }
        }
    }
}
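
To show how a zone-pair policy like allow-all is evaluated, here is a small Go model of first-match lookup with an implicit deny. It assumes the usual Junos semantics (policies within a zone pair are evaluated in order and unmatched traffic is denied); the Policy and ZonePair types and the lookup function are hypothetical, and addresses and applications are compared as plain strings rather than resolved through address books.

```go
package main

import "fmt"

// Policy is a simplified rule; "any" matches every value in this sketch.
type Policy struct {
	Name          string
	Src, Dst, App string
	Permit        bool
}

// ZonePair identifies a from-zone/to-zone context.
type ZonePair struct{ From, To string }

func matches(rule, value string) bool { return rule == "any" || rule == value }

// lookup returns the first matching policy for the zone pair, or an
// implicit deny when nothing matches (assumed default behaviour).
func lookup(tbl map[ZonePair][]Policy, zp ZonePair, src, dst, app string) (string, bool) {
	for _, p := range tbl[zp] {
		if matches(p.Src, src) && matches(p.Dst, dst) && matches(p.App, app) {
			return p.Name, p.Permit
		}
	}
	return "default-deny", false
}

func main() {
	tbl := map[ZonePair][]Policy{
		{From: "trust", To: "untrust"}: {
			{Name: "allow-all", Src: "any", Dst: "any", App: "any", Permit: true},
		},
	}
	fmt.Println(lookup(tbl, ZonePair{"trust", "untrust"}, "10.0.1.10", "192.0.2.1", "junos-https"))
	fmt.Println(lookup(tbl, ZonePair{"untrust", "trust"}, "192.0.2.1", "10.0.1.10", "junos-ssh"))
}
```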

The config supports both hierarchical { } blocks and flat set commands:

set interfaces trust0 unit 0 family inet address 10.0.1.1/24
set security zones security-zone trust interfaces trust0
set security policies from-zone trust to-zone untrust policy allow-all match source-address any destination-address any application any
set security policies from-zone trust to-zone untrust policy allow-all then permit

Management Interfaces

  • Local CLI: run xpfd in a TTY for interactive Junos-style shell
  • Remote CLI: cli -addr <host>:50051 connects via gRPC
  • gRPC API: 48+ RPCs on port 50051 (config, sessions, stats, routes, IPsec, DHCP, cluster)
  • REST API: HTTP on port 8080 (health, Prometheus /metrics, config endpoints)

Performance

  • Userspace dataplane (the runtime path)
    • AF_XDP-based forwarding with per-worker Rust session/NAT/policy/FIB processing
    • Copy or zero-copy mode depending on NIC driver/path behavior
    • Kernel pass-through via cpumap for local and other kernel-owned traffic
    • See docs/userspace-ha-validation.md and docs/userspace-perf-compare.md for current validation and profiling workflow
  • Cluster / control plane
    • Hitless restarts with zero packet loss
    • ~60ms cluster failover (30ms VRRP, ~97ms masterDown interval)
    • Near-instant planned shutdown (priority-0 burst, peer takes over in ~1ms)
  • Historical (retired eBPF dataplane, git history only)
    • 25+ Gbps with native XDP (i40e/ice PF passthrough), 15.6 Gbps with virtio-net

Test Environment

An Incus-based test environment provisions Debian VMs with FRR, strongSwan, and test containers:

# Single VM (standalone firewall)
make test-env-init   # One-time setup
make test-vm         # Create VM
make test-deploy     # Build + deploy + restart service
make test-logs       # View daemon logs

# Two-VM HA cluster (defaults to the "loss" userspace cluster)
make cluster-init    # Create networks + profile
make cluster-create  # Launch xpf-userspace-fw0 + xpf-userspace-fw1 + LAN host
make cluster-deploy  # Rolling deploy: secondary first, then primary (preserves traffic)

Userspace dataplane testing (requires mlx5 NICs on the "loss" cluster):

# Userspace HA cluster
make cluster-deploy
./scripts/userspace-ha-validation.sh --env test/incus/loss-userspace-cluster.env
./scripts/userspace-perf-compare.sh

Cluster Deployment

make cluster-deploy performs a rolling deploy to maintain traffic continuity:

  1. Determines which node is currently secondary
  2. Deploys to the secondary (primary continues forwarding traffic)
  3. Waits for the secondary to sync sessions from the primary
  4. Deploys to the primary (upgraded secondary takes over via VRRP failover)

To deploy to a single node: make cluster-deploy NODE=0 or make cluster-deploy NODE=1.

Test Suite

| Test | Command | Description |
|---|---|---|
| Unit tests | make test | 1020+ Go tests across 24 packages |
| Connectivity | make test-connectivity | End-to-end IPv4/IPv6 routing and SNAT |
| Failover | make test-failover | iperf3 survives fw0 reboot (session sync + VRRP) |
| Hard crash | make test-ha-crash | Force-stop, daemon stop, multi-cycle crash recovery |
| Restart | make test-restart-connectivity | Zero packet loss during daemon restart |
| Private RG | ./test/incus/test-private-rg.sh | VRRP elimination via private-rg-election |

Code Layout

| Path | Description |
|---|---|
| bpf/headers/*.h | Shared C structs/constants consumed by the retained Rust AF_XDP shim build and userspace-dp parity tests. The legacy bpf/xdp/*.c and bpf/tc/*.c source was deleted in #1476 |
| pkg/config/ | Junos parser, AST, typed config, compiler |
| pkg/cmdtree/ | Single source of truth for all CLI command trees |
| pkg/configstore/ | Candidate/active/commit/rollback, atomic DB persistence |
| pkg/dataplane/ | Runtime contracts, retained userspace shim embed/loader, and eBPF/DPDK retirement-error sentinels (#1476/#1525) |
| pkg/dataplane/userspace/ | Go manager for the Rust userspace dataplane |
| userspace-xdp/ | Retained Rust XDP shim that redirects packets into the AF_XDP userspace runtime |
| pkg/daemon/ | Daemon lifecycle, reconciliation, interface management |
| pkg/cluster/ | Chassis cluster HA (state machine, session sync, config sync) |
| pkg/vrrp/ | Native VRRPv3 state machine (30ms RETH advertisements) |
| pkg/ra/ | Embedded RA sender (replaces radvd) |
| pkg/cli/ | Interactive Junos-style CLI |
| pkg/conntrack/ | Session garbage collection (with HA delete sync) |
| pkg/logging/ | Ring buffer reader, event buffer, syslog client |
| pkg/dhcp/ | DHCPv4/DHCPv6 clients |
| pkg/frr/ | FRR config generation + managed section in frr.conf |
| pkg/networkd/ | systemd-networkd .link/.network file generation |
| pkg/routing/ | GRE tunnels, VRFs, XFRM interfaces, route leaking |
| pkg/ipsec/ | strongSwan config + SA queries |
| pkg/api/ | HTTP REST API + Prometheus collector |
| pkg/grpcapi/ | gRPC server + protobuf bindings |
| pkg/flowexport/ | NetFlow v9 exporter |
| pkg/feeds/ | Dynamic address feed fetcher |
| pkg/dhcpserver/ | Kea DHCP server management |
| pkg/dhcprelay/ | DHCP relay with Option 82 |
| pkg/eventengine/ | Event-driven automation engine |
| pkg/rpm/ | RPM probe manager |
| pkg/snmp/ | SNMP agent (system + ifTable MIB) |
| pkg/lldp/ | LLDP protocol |
| proto/xpf/v1/ | Protobuf service definition |
| cmd/xpfd/ | Daemon main binary |
| cmd/cli/ | Remote CLI client binary |
| userspace-dp/ | Rust AF_XDP userspace dataplane binary |
| docs/ | Protocol docs, test plans, feature gaps |
| test/incus/ | Test environment scripts and configs |

Documentation

See docs/ for detailed design documents:

  • sync-protocol.md — Cluster session sync wire protocol and algorithms
  • fabric-cross-chassis-fwd.md — Fabric link cross-chassis forwarding design
  • ha-cluster.conf — Unified HA cluster config with ${node} variable expansion
  • testing-procedures.md — Test categories, procedures, and debugging tips
  • phases.md — Development phase history (40+ sprints)
  • bugs.md — Bug tracker with root cause analysis
  • optimizations.md — Performance profiling and optimization notes
  • test_env.md — Test topology and validation steps
  • feature-gaps.md — vSRX feature parity tracking
  • userspace-dataplane-architecture.md — Comprehensive userspace AF_XDP dataplane architecture
  • userspace-debug-map.md — Active file/function map for userspace forwarding and debugging
  • xdp-io-uring-userspace-dataplane.md — Original userspace dataplane design document
  • shared-umem-plan.md — Cross-NIC shared UMEM design and validation plan
  • userspace-ha-validation.md — HA failover validation procedures
  • userspace-perf-compare.md — Throughput benchmarking methodology
  • userspace-dnat-plan.md — Destination NAT implementation plan for userspace dataplane
  • userspace-dataplane-gaps.md — Current userspace AF_XDP capability/admission boundary

Requirements

  • Linux kernel 6.12+ (6.18+ recommended for full NAT64 support)
  • Go 1.22+
  • clang/llvm (for generating the retained Rust AF_XDP userspace XDP shim object)
  • Rust stable (for the userspace dataplane, the only runtime forwarding path)
  • FRR (for routing protocol integration)
  • strongSwan (for IPsec, optional)
  • Kea (for DHCP server, optional)
