This repository was archived by the owner on Feb 4, 2019. It is now read-only.
Allen Hubbe edited this page Oct 21, 2016 · 9 revisions

NTRDMA

Welcome to the NTRDMA wiki! Let's get started.

Overview

The NTRDMA driver was written to provide an efficient RDMA implementation over PCIe Non-Transparent Bridge hardware.

The Drivers

The NTRDMA driver has several module dependencies in the kernel.

  • ntb: PCIe Non-Transparent Bridge bus driver
  • ntb_hw_intel: Intel NTB hardware driver
  • ioatdma: Intel IO Acceleration Technology DMA engine driver
  • ntc: Abstract "non-transparent channel" bus driver
  • ntc_tcp: TCP NTC driver for development without hardware
  • ntc_ntb_msi: NTB NTC driver with some hardware-specific workarounds
  • ntrdma: Infiniband verbs driver for PCIe non-transparent bridge
  • ib_uverbs: Infiniband user space verbs support driver
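To see which of these modules are currently loaded, the list above can be compared against /proc/modules. The following is a minimal sketch: the function name missing_modules is hypothetical, and it assumes none of the drivers are built into the kernel, since built-in drivers do not appear in /proc/modules.

```shell
# Print each module from the list above that is NOT found in a modules file.
# The optional first argument (default /proc/modules) only exists so the
# function can be tried against a saved copy; missing_modules is an
# illustrative name, not part of NTRDMA.
missing_modules() {
    modules_file="${1:-/proc/modules}"
    for mod in ntb ntb_hw_intel ioatdma ntc ntc_tcp ntc_ntb_msi ntrdma ib_uverbs; do
        # Each line of /proc/modules starts with the module name and a space
        grep -q "^${mod} " "$modules_file" || echo "$mod"
    done
}
```

Only one of ntc_tcp and ntc_ntb_msi is needed for a given setup, so some entries are expected to be reported as missing.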

Quick Start: User Space Library and Dependencies

# Install the libibverbs development package and utilities
# Red Hat
sudo yum install libibverbs-devel libibverbs-utils
# Fedora
sudo dnf install libibverbs-devel libibverbs-utils
# Debian
sudo apt-get install libibverbs-dev ibverbs-utils

# Build and install libntrdma
git clone https://github.com/ntrdma/libntrdma.git
cd libntrdma
libtoolize
aclocal
autoconf
autoheader
automake
./configure --libdir=/usr/lib64
make
sudo make install
sudo cp ntrdma.driver /etc/libibverbs.d

Optionally, for basic performance testing, download qperf and netperf and follow their included instructions to install them.

wget https://www.openfabrics.org/downloads/qperf/qperf-0.4.9.tar.gz
wget ftp://ftp.netperf.org/netperf/netperf-2.7.0.tar.bz2

Quick Start: NTRDMA over TCP

The NTRDMA driver can be loaded and used without any NTB hardware present. This allows driver and application development and functional testing on generic hardware or virtual machine instances.

Installation

git clone https://github.com/ntrdma/ntrdma.git
cd ntrdma
make olddefconfig
cat >> .config <<EOF
CONFIG_NTC=m
CONFIG_NTC_TCP=m
CONFIG_NTRDMA=m
CONFIG_INFINIBAND=m
CONFIG_INFINIBAND_USER_ACCESS=m
EOF
make olddefconfig
make all
sudo make modules_install
sudo make install
# boot into the new kernel
sudo reboot

Module Loading

Replace 192.168.0.7 with an IP address already configured on one of the computers; that computer acts as the server. Both the client-side and the server-side configuration should specify the server-side IP address.

# On the server
sudo tee /etc/modprobe.d/ntc_tcp.conf <<EOF
options ntc_tcp config='s:192.168.0.7:3000'
EOF
sudo modprobe ntc_tcp
sudo modprobe ntrdma
sudo modprobe ib_uverbs

# On the client
sudo tee /etc/modprobe.d/ntc_tcp.conf <<EOF
options ntc_tcp config='c:192.168.0.7:3000'
EOF
sudo modprobe ntc_tcp
sudo modprobe ntrdma
sudo modprobe ib_uverbs
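
The two configurations differ only in the leading role letter, so the option line can be generated rather than typed twice. This is a minimal sketch, assuming (as the examples suggest) that s selects the server role and c the client role, and that the value has the form role:address:port. The helper name ntc_tcp_options is hypothetical.

```shell
# Print the ntc_tcp option line for one side of the link.
# $1 is the role ("s" or "c"), $2 is the server's IP address (the same value
# on both sides), $3 is the port (defaults to 3000, as in the examples).
# ntc_tcp_options is an illustrative helper, not part of the driver.
ntc_tcp_options() {
    role="$1"; addr="$2"; port="${3:-3000}"
    case "$role" in
        s|c) printf "options ntc_tcp config='%s:%s:%s'\n" "$role" "$addr" "$port" ;;
        *)   echo "role must be 's' or 'c'" >&2; return 1 ;;
    esac
}
```

For example, `ntc_tcp_options s 192.168.0.7 | sudo tee /etc/modprobe.d/ntc_tcp.conf` would write the server-side file.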

Quick test

# On the server
ibv_rc_pingpong
# On the client, passing the server's IP address
ibv_rc_pingpong 192.168.0.7

Quick Start: NTRDMA over PCIe NTB

Hardware setup

TODO: BIOS configuration for NTB

Installation

git clone https://github.com/ntrdma/ntrdma.git
cd ntrdma
make olddefconfig
cat >> .config <<EOF
CONFIG_DMA_ENGINE=y
CONFIG_INTEL_IOATDMA=m
CONFIG_NTB=m
CONFIG_NTB_INTEL=m
CONFIG_NTC=m
CONFIG_NTC_NTB_MSI=m
CONFIG_INFINIBAND=m
CONFIG_INFINIBAND_USER_ACCESS=m
CONFIG_NTRDMA=m
EOF
make olddefconfig
make all
sudo make modules_install
sudo make install
# boot into the new kernel
sudo reboot

Module Loading

The ntc_ntb_msi module requires that the ntb_hw_intel module has been loaded with specific parameters. The parameters are the same on both sides.

sudo tee /etc/modprobe.d/ntb_hw_intel.conf <<EOF
# mw configured for ntb_transport
#options ntb_hw_intel b2b_mw_idx=-1 b2b_mw_share=0

# mw configured for ntb_perf
#options ntb_hw_intel b2b_mw_idx=0 b2b_mw_share=0

# mw configured for ntrdma
options ntb_hw_intel b2b_mw_idx=0 b2b_mw_share=1
options ntb_hw_intel xeon_b2b_usd_bar4_addr64=0 xeon_b2b_dsd_bar4_addr64=0
options ntb_hw_intel no_msix=1

# debugging
#options ntb_hw_intel dyndbg=+pmfl
EOF

sudo modprobe ioatdma
sudo modprobe ntb_hw_intel
sudo modprobe ntc_ntb_msi
sudo modprobe ntrdma
sudo modprobe ib_uverbs

Quick test

# On the server
ibv_rc_pingpong
# On the client; 192.168.0.7 stands for the server's IP address
ibv_rc_pingpong 192.168.0.7

RDMA Basic Performance Testing

Performance Testing with Qperf

# On the server
numactl -N 1 -m 1 -- qperf
# On the client; replace peer with the server's host name or IP address
numactl -N 1 -m 1 -- qperf peer rc_lat rc_rdma_write_lat rc_rdma_read_lat rc_bw rc_rdma_write_bw rc_rdma_read_bw

In the commands above, numactl binds the process and its memory allocations to NUMA node 1, which is usually the second socket of a dual-socket motherboard. If two non-transparent bridges are enabled, the second one usually appears first in the list of RDMA devices. If your hardware configuration differs, try changing 1 to 0 on both the client and the server.
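
Rather than guessing the node number, the kernel's view can be read from the numa_node attribute that sysfs exposes for each PCI device. The sketch below assumes you already know the PCI address of the NTB device (for example from lspci). The address used in the usage note is made up, and pci_numa_node is a hypothetical helper name.

```shell
# Print the NUMA node the kernel reports for a PCI device.
# $1 is the PCI address in domain:bus:device.function form.
# $2 optionally overrides the sysfs root, which is only useful for testing.
# pci_numa_node is an illustrative name.
pci_numa_node() {
    sysfs_root="${2:-/sys}"
    cat "${sysfs_root}/bus/pci/devices/$1/numa_node"
}
```

Usage might look like `node=$(pci_numa_node 0000:80:03.0)` followed by `numactl -N "$node" -m "$node" -- qperf`. A value of -1 means the kernel has no NUMA information for the device, in which case the trial-and-error approach above still applies.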

Ethernet Basic Performance Testing

The NTRDMA driver creates a network device as well as an RDMA device. After loading the NTRDMA driver, look for some new network interfaces to configure. Here, we will assume the new device is eth0.
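
One way to spot the new interfaces is to snapshot the interface list before and after loading the driver and compare the two. This is a minimal sketch. The helper names list_ifaces and new_ifaces are hypothetical, and it assumes the interface appears as an entry under /sys/class/net once the driver has created it.

```shell
# List network interface names, one per line, sorted.
# The optional argument overrides /sys/class/net (useful for testing).
list_ifaces() {
    ls "${1:-/sys/class/net}" | sort
}

# Print names that appear in the second snapshot file but not in the first.
# Both files must be sorted, which list_ifaces guarantees.
new_ifaces() {
    comm -13 "$1" "$2"
}
```

For example: run `list_ifaces > /tmp/before`, load the modules, run `list_ifaces > /tmp/after`, then `new_ifaces /tmp/before /tmp/after`.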

Configure Interfaces

To do almost anything with the network interfaces, we first need to assign them IP addresses.

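# On the first side
sudo ip addr add 169.254.0.1/24 dev eth0
# On the second side
sudo ip addr add 169.254.0.2/24 dev eth0
# From the second side, check that the first side is reachable
ping 169.254.0.1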

For optimal throughput, increase the MTU to something much larger than the default Ethernet MTU. NTRDMA does not place any maximum limit on the MTU size.

sudo ip link set dev eth0 mtu 4096
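
To confirm that the change took effect on each side, the current MTU can be read back from sysfs. This is a minimal sketch, and mtu_is is a hypothetical helper name.

```shell
# Succeed if the interface's current MTU equals the expected value.
# $1 is the interface name, $2 the expected MTU,
# $3 optionally overrides the sysfs root (for testing only).
# mtu_is is an illustrative name.
mtu_is() {
    sysfs_root="${3:-/sys}"
    [ "$(cat "${sysfs_root}/class/net/$1/mtu")" = "$2" ]
}
```

For example, `mtu_is eth0 4096 && echo "eth0 MTU ok"` prints the message only when the interface reports an MTU of 4096.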

After setting the MTU on both sides, flush out all the previously allocated network buffers of the wrong size. This can be done by flooding the link with small ICMP ping traffic for a few seconds.

sudo ping -f 169.254.0.1

Throughput Testing with iperf3

# On the server
numactl -N 1 -m 1 -- iperf3 -s
# On the client
numactl -N 1 -m 1 -- iperf3 -c 169.254.0.1

Latency Testing with Netperf

# On the server
numactl -N 1 -m 1 -- netserver -D
# On the client
numactl -N 1 -m 1 -- netperf -H 169.254.0.1 -t TCP_RR

Troubleshooting

If something isn't working, start by turning on dynamic debugging:

# Increase the console log level
sudo sysctl kernel.printk=8

# Enable debugging when loading modules
sudo tee -a /etc/modprobe.d/ntb_hw_intel.conf <<EOF
options ntb_hw_intel dyndbg=+pmfl
EOF
sudo tee -a /etc/modprobe.d/ntc_ntb_msi.conf <<EOF
options ntc_ntb_msi dyndbg=+pmfl
EOF
sudo tee -a /etc/modprobe.d/ntc_tcp.conf <<EOF
options ntc_tcp dyndbg=+pmfl
EOF
sudo tee -a /etc/modprobe.d/ntrdma.conf <<EOF
options ntrdma dyndbg=+pmfl
EOF

# Enable debugging on already loaded modules
sudo tee /sys/kernel/debug/dynamic_debug/control <<EOF
module ntb_hw_intel +pmfl
module ntc_ntb_msi +pmfl
module ntc_tcp +pmfl
module ntrdma +pmfl
EOF

# Monitor output on the system console, or read the kernel log with dmesg
sudo dmesg
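
The control-file lines above follow a regular pattern, so they can be generated for any set of modules, including the reverse operation that turns the messages off again once you are done. This is a minimal sketch, and dyndbg_lines is a hypothetical helper name. It relies on the dynamic debug syntax in which a leading - removes flags in the same way that + adds them.

```shell
# Emit one dynamic debug control line per module.
# $1 is the flag change (for example +pmfl to enable or -pmfl to disable);
# the remaining arguments are module names.
# dyndbg_lines is an illustrative name.
dyndbg_lines() {
    flags="$1"; shift
    for mod in "$@"; do
        printf 'module %s %s\n' "$mod" "$flags"
    done
}
```

For example, `dyndbg_lines -pmfl ntb_hw_intel ntc_ntb_msi ntc_tcp ntrdma | sudo tee /sys/kernel/debug/dynamic_debug/control` would disable the messages enabled earlier.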

Detailed state of the drivers can also be found in the following directories:

  • /sys/kernel/debug/ntrdma/
  • /sys/kernel/debug/ntc_ntb_msi/
  • /sys/kernel/debug/ntb_hw_intel/

TODO: copy from various emails