PowerScale Multipath Client Driver Configuration and Management

As discussed earlier in this series of articles, the multipath driver allows Linux clients to mount a PowerScale cluster's NFS exports over NFSv3, NFSv4.1, or NFSv4.2, using either TCP or RDMA as the transport.

The principal NFS mount options of interest with the multipath client driver are:

Mount option          Description
------------          -----------
nconnect              Specifies the number of TCP connections the client can establish between itself and the NFS server. Works in conjunction with 'remoteports' to spread load across multiple target interfaces.
localports            Allows a client to use its multiple NICs to multiplex I/O.
localports_failover   Allows transports to temporarily move off local client interfaces that are unable to serve NFS connections.
proto                 The underlying transport protocol that the NFS mount will use. Typically either TCP or RDMA.
remoteports           Allows a client to target multiple servers/NICs to multiplex I/O. Spreads the load across multiple file handles, rather than thrashing on the locks of a single file handle.
version               The version of the NFS protocol to be used. The multipath driver supports NFSv3, NFSv4.1, and NFSv4.2. Note that NFSv4.0 is unsupported.

These options allow the multipath driver to be configured such that the I/O stream to a single NFS mount can be spread across a number of local (client) and remote (cluster) network interfaces (ports). The 'nconnect' option specifies the total number of socket connections to open, which the driver distributes across the specified combinations of local and remote ports.

Below are some example topologies and NFS client configurations using the multipath driver.

  1. Here, NFSv3 with RDMA is used to spread traffic across all the front-end interfaces (remoteports) on the PowerScale cluster:
# mount -o proto=rdma,port=20049,vers=3,nconnect=18,remoteports=10.231.180.95-10.231.180.98 10.231.180.98:/ifs/data /mnt/test

The above client NFS mount configuration opens five socket connections to two of the specified 'remoteports' (cluster node) IP addresses, and four socket connections to each of the other two, for a total of 18.

As you can see, this driver can be incredibly powerful given its ability to multipath comprehensively. Clearly, there are many combinations of local and remote ports and socket connections that can be configured.

  2. This next example uses NFSv3 with RDMA across three 'localports' (client) and with 8 socket connections to each:
# mount -o proto=rdma,port=20049,vers=3,nconnect=24,localports=10.219.57.225-10.219.57.227,remoteports=10.231.180.95-10.231.180.98 10.231.180.98:/ifs/data /mnt/test

  3. This final config specifies NFSv4.1 with RDMA using a high connection count to target multiple nodes (remoteports) on the cluster:
# mount -t nfs -o proto=rdma,port=20049,vers=4.1,nconnect=64,remoteports=10.231.180.95-10.231.180.98 10.231.180.98:/ifs/data /mnt/test

In this case, 16 socket connections will be opened to each of the four specified remote (cluster) ports for a total of 64 connections.
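Once mounted, the negotiated protocol version and transport can be verified from the client with the standard 'nfsstat' utility (whether options such as 'nconnect' appear in the output depends on the client kernel):

# nfsstat -m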

Note that the Dell multipath driver has a hard-coded limit of 64 nconnect socket connections per mount.

Behind the scenes, the driver uses a network map to store the local port, remote port, and nconnect socket configuration.

The multipath driver supports both IPv4 and IPv6 addresses for local and remote port specification. If a specified IP address is unresponsive, the driver will remove the offending address from its network map.

Note that the Dell multipath driver supports NFSv3, NFSv4.1, and NFSv4.2 but is incompatible with NFSv4.0.

On Ubuntu 20.04, for example, NFSv3 and NFSv4.1 are both fully featured. In addition, the 'remoteports' behavior is easier to observe with NFSv4.1 because client state is tracked:

# mount -t nfs -o vers=4.1,nconnect=8,remoteports=10.231.180.95-10.231.180.98 10.231.180.95:/ifs /mnt/test

And from the cluster:

cluster# isi_for_array 'isi_nfs4mgmt'
ID                    Vers   Conn   SessionId   Client Address   Port   O-Owners   Opens   Handles   L-Owners
2685351452398170437   4.1    tcp    5           10.219.57.229    872    0          0       0         0
5880148330078271256   4.1    tcp    11          10.219.57.229    680    0          0       0         0
2685351452398170437   4.1    tcp    5           10.219.57.229    872    0          0       0         0
6230063502872509892   4.1    tcp    1           10.219.57.229    895    0          0       0         0
6786883841029053972   4.1    tcp    1           10.219.57.229    756    0          0       0         0

With a single mount, the client has created many connections across the cluster.
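The per-node connection spread can also be confirmed from the client by counting established NFS sessions per remote address. A minimal sketch for the TCP mount above (field positions assume standard Linux 'netstat' output):

# netstat -an | grep ':2049' | grep ESTABLISHED | awk '{print $5}' | sort | uniq -c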

This also works with RDMA:

# mount -t nfs -o vers=4.1,proto=rdma,nconnect=4 10.231.180.95:/ifs /mnt/test

# isi_for_array 'isi_nfs4mgmt'
ID                    Vers   Conn   SessionId   Client Address   Port    O-Owners   Opens   Handles   L-Owners
6786883841029053974   4.1    rdma   2           10.219.57.229    54807   0          0       0         0
6230063502872509894   4.1    rdma   2           10.219.57.229    34194   0          0       0         0
5880148330078271258   4.1    rdma   12          10.219.57.229    43462   0          0       0         0
2685351452398170443   4.1    rdma   8           10.219.57.229    57401   0          0       0         0





Once the Linux client NFS mounts have been configured, their correct functioning can be easily verified by generating read and/or write traffic to the PowerScale cluster and viewing the OneFS performance statistics. Running a load generator like 'iozone' is a useful way to generate traffic on the NFS mount. The iozone utility can be invoked with its '-a' flag to select full automatic mode. This produces output that covers all tested file operations for record sizes of 4k to 16M and file sizes of 64k to 512M.

For example:

# iozone -a
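For more targeted load generation, iozone's throughput mode can drive several parallel streams against the NFS mount. A minimal sketch, assuming the '/mnt/test' mount point from the earlier examples (the thread count, record size, file size, and file names are illustrative):

# iozone -i 0 -i 1 -r 1m -s 4g -t 4 -F /mnt/test/f1 /mnt/test/f2 /mnt/test/f3 /mnt/test/f4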

From the PowerScale cluster, if the multipath driver is working correctly, the 'isi statistics client' CLI command output will show the Linux client connecting and generating traffic to all of the cluster nodes specified in the 'remoteports' option of the NFS mount.

For example:

# isi statistics client

Alternatively, the 'netstat' CLI command can be used from the Linux client to query the number of TCP connections established:

# netstat -an | grep 2049 | grep EST | sort -k 5

On Linux systems, the ‘netstat’ command line utility typically requires the ‘net-tools’ package to be installed.
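On systems without 'net-tools', the 'ss' utility from the 'iproute2' package offers an equivalent query:

# ss -tn state established '( dport = :2049 )'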

Since NFS is a network protocol, when it comes to investigating, troubleshooting, and debugging multipath driver issues, one of the most useful and revealing troubleshooting tools is a packet capturing device or sniffer. These provide visibility into the IP packets as they are transported across the Ethernet network between the Linux client and the PowerScale cluster nodes.

Packet captures (PCAPs) of traffic between client and cluster can be filtered and analyzed by tools such as Wireshark to ensure that requests, authentication, and transmission are occurring as expected across the desired NICs.

Gathering PCAPs is best performed on the Linux client side, and on each of the specified interfaces if the client is using the 'localports' NFS mount option.
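For example, 'tcpdump' can write NFS traffic on a given client interface to a PCAP file for later analysis in Wireshark. A minimal sketch (the interface name 'eth0' and output path are assumptions; repeat per interface when multiple 'localports' are in use, and note that RDMA traffic typically bypasses the kernel network stack and so may not appear in a standard capture):

# sudo tcpdump -i eth0 -w /tmp/nfs-client.pcap port 2049 or port 20049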

In addition to PCAPs, the following three client-side logs are another useful place to check when debugging a multipath driver issue:

Log file             Description
--------             -----------
/var/log/kern.log    Most client-side logging is written to this log file.
/var/log/auth.log    Authentication logging.
/var/log/messages    Error-level messages appear here.
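For instance, recent NFS-related entries can be pulled from the client's kernel log as follows (the 'nfs' search string is illustrative):

# grep -i nfs /var/log/kern.log | tail -20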


Verbose logging can be enabled on the Linux client with the following CLI syntax:

# sudo rpcdebug -m nfs -s all

Conversely, the following command will revert logging back to the default level:

# sudo rpcdebug -m nfs -c all
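The debug output lands in the kernel log, where it can be followed live with 'dmesg'. The RPC layer can be instrumented in the same way via rpcdebug's 'rpc' module:

# sudo rpcdebug -m rpc -s all

# dmesg -w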


Additionally, a 'dellnfs-ctl' CLI tool comes packaged with the multipath driver module and is automatically available on the Linux client once the driver is installed.

The command usage syntax for the ‘dellnfs-ctl’ tool is as follows:

# dellnfs-ctl

syntax: /usr/bin/dellnfs-ctl [reload/status/trace]

Note that to operate it in trace mode, the ‘dellnfs-ctl’ tool requires the ‘trace-cmd’ package to be installed.

For example, the trace-cmd package can be installed on an Ubuntu Linux system using the ‘apt install’ package utility command:

# sudo apt install trace-cmd
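With the package in place, tracing can then be invoked via the tool's 'trace' subcommand (output varies by driver version and is not shown here):

# dellnfs-ctl trace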

The current version of the dellnfs-ctl tool, plus the associated services and kernel modules, can be queried with the following CLI command syntax:

# dellnfs-ctl status

version: 4.0.22-dell
kernel modules: sunrpc rpcrdma compat_nfs_ssc lockd nfs_acl auth_rpcgss rpcsec_gss_krb5 nfs nfsv3 nfsv4
services: rpcbind.socket rpcbind rpc-gssd
rpc_pipefs: /run/rpc_pipefs

With the 'reload' option, the 'dellnfs-ctl' tool uses 'modprobe' to reload the kernel modules and restart the NFS RPC services. For example:

# dellnfs-ctl reload

dellnfs-ctl: stopping service rpcbind.socket
dellnfs-ctl: umounting fs /run/rpc_pipefs
dellnfs-ctl: unloading kmod nfsv3
dellnfs-ctl: unloading kmod nfs
dellnfs-ctl: unloading kmod nfs_acl
dellnfs-ctl: unloading kmod lockd
dellnfs-ctl: unloading kmod compat_nfs_ssc
dellnfs-ctl: unloading kmod rpcrdma
dellnfs-ctl: unloading kmod sunrpc
dellnfs-ctl: loading kmod sunrpc
dellnfs-ctl: loading kmod rpcrdma
dellnfs-ctl: loading kmod compat_nfs_ssc
dellnfs-ctl: loading kmod lockd
dellnfs-ctl: loading kmod nfs_acl
dellnfs-ctl: loading kmod nfs
dellnfs-ctl: loading kmod nfsv3
dellnfs-ctl: mounting fs /run/rpc_pipefs
dellnfs-ctl: starting service rpcbind.socket
dellnfs-ctl: starting service rpcbind

If a problem is detected, it is important to run the reload script before uninstalling or reinstalling the driver. Because the script runs 'modprobe', the kernel modules modified by the driver are reloaded.

Note that this will affect existing NFS mounts. As such, any active mounts will need to be re-mounted after a reload is performed.
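A minimal remount sketch, assuming the export is defined in /etc/fstab and uses the '/mnt/test' mount point from the earlier examples:

# umount /mnt/test

# mount /mnt/test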


As we have seen throughout this series of articles, the core benefits of the Dell multipath driver include:

  • Better single NFS mount performance.
  • Increased bandwidth for a single NFS mount with multiple R/W streaming files.
  • Improved performance for heavily used NICs connecting to a single PowerScale node.

The Dell multipath driver allows NFS clients to direct I/O to multiple PowerScale nodes through a single NFS mount point for higher single-client throughput. This enables Dell to deliver the first Ethernet storage solution validated on NVIDIA’s DGX SuperPOD.
