In the previous article in this series, we examined the ‘what’ and ‘why’ of front-end InfiniBand on a PowerScale cluster. Now we turn our attention to the ‘how’: the configuration and management of front-end IB.
The networking portion of the OneFS WebUI has seen some changes in OneFS 9.10, and cluster admins now have the flexibility to create either Ethernet or InfiniBand (IB) subnets. Depending on this choice, the interface list, pool, and rule details automatically adjust to match the selected link layer type. If IB is selected, the interface list and pool details update to reflect IB-specific settings, including a new ‘green pill’ icon indicating the presence of IB subnets and pools in the external network table. For example:
Similarly, the subnet1 view from the CLI, with the ‘Interconnect’ field reporting ‘Infiniband’:
# isi network subnets view subnet1
              ID: groupnet0.subnet1
            Name: subnet1
        Groupnet: groupnet0
           Pools: pool-infiniband
     Addr Family: ipv4
       Base Addr: 10.205.228.0
            CIDR: 10.205.228.0/23
     Description: Initial subnet
         Gateway: 10.205.228.1
Gateway Priority: 10
    Interconnect: Infiniband
             MTU: 2044
       Prefixlen: 23
         Netmask: 255.255.254.0
SC Service Addrs: 1.2.3.4
 SC Service Name: cluster.tme.isilon.com
    VLAN Enabled: False
         VLAN ID: -
Alternatively, if Ethernet is chosen, the relevant subnet, pool, and rule options for that topology are displayed.
This dynamic adjustment ensures that only the relevant options and settings for the configured network type are displayed, making the configuration process more intuitive and streamlined.
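From the CLI, the link layer of an existing subnet can be confirmed at any time via its ‘Interconnect’ field, as seen in the full view output above. For instance, a quick filter against the same subnet shown earlier:

# isi network subnets view subnet1 | grep -i interconnect
    Interconnect: Infiniband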
For example, to create an IB subnet under Cluster management > Network configuration > External network > Create subnet:
Or from the CLI:
# isi network subnets create groupnet0.subnet1 ipv4 255.255.254.0 --gateway 10.205.228.1 --gateway-priority 10 --linklayer infiniband
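Once created, the new subnet should appear alongside any existing subnets in the listing:

# isi network subnets list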
Similarly, editing an InfiniBand subnet:
Note that an MTU configuration option is not available when configuring an InfiniBand subnet. Also, if NFS over RDMA has not already been enabled, the WebUI displays a banner warning that NFS over InfiniBand will operate at reduced speed.
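To avoid that reduced-speed path, NFS over RDMA should be enabled before serving NFS clients over the IB front-end. A minimal CLI sketch follows, assuming a recent release where the global NFS settings expose an RDMA toggle; the exact option name can vary between OneFS versions, so verify it against ‘isi nfs settings global modify --help’ first:

# isi nfs settings global view
# isi nfs settings global modify --nfs-rdma-enabled=true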
In contrast, editing an Ethernet subnet provides the familiar MTU frame-size configuration options:
A front-end network IP pool can easily be created under a subnet. For example, from the CLI, using the ‘<groupnet>.<subnet>.<pool>’ notation:
# isi network pools create groupnet0.infiniband1.ibpool1
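In practice, the pool also needs an IP address range, and typically member interfaces, before it can serve clients. The following sketch extends the create command with those options; the address range is illustrative, and both the LNN range and ‘<ib-interface>’ are placeholders for the actual node numbers and front-end IB interface names on the cluster in question:

# isi network pools create groupnet0.infiniband1.ibpool1 --ranges=10.205.228.50-10.205.228.60 --ifaces=1-4:<ib-interface>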
Or via the WebUI:
Adding an InfiniBand subnet is permitted on any cluster, regardless of its network configuration. However, the above messages are displayed when attempting to create a pool under an InfiniBand subnet on a cluster or node without any configured front-end IB interfaces.
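To confirm whether a cluster actually has front-end IB interfaces available for pool membership, the per-node interface listing can be consulted (the exact output columns vary by release):

# isi network interfaces list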
The ‘isi_hw_status’ CLI utility can also be used to quickly verify a node’s front-end and back-end networking link layer types. For example, take the following F710 configuration:
The ‘isi_hw_status’ command output reports the front-end network type via the ‘FEType’ parameter, in this case ‘Infiniband’:
# isi_hw_status
  SerNo: FD7LRY3
 Config: PowerScale F710
ChsSerN: FD7LRY3
ChsSlot: n/a
FamCode: F
ChsCode: 1U
GenCode: 10
PrfCode: 7
   Tier: 16
  Class: storage
 Series: n/a
Product: F710-1U-Dual-512GB-2x1GE-2x100GE QSFP28-2x200GE QSFP56-38TB SSD
  HWGen: PSI
Chassis: POWEREDGE (Dell PowerEdge)
    CPU: GenuineIntel (2.60GHz, stepping 0x000806f8)
   PROC: Dual-proc, 24-HT-core
    RAM: 549739036672 Bytes
   Mobo: 071PXR (PowerScale F710)
  NVRam: NVDIMM (SDPM VOSS Module) (8192MB card) (size 8589934592B)
 DskCtl: NONE (No disk controller) (0 ports)
 DskExp: None (No disk expander)
PwrSupl: PS1 (type=AC, fw=00.1D.9C)
PwrSupl: PS2 (type=AC, fw=00.1D.9C)
  NetIF: bge0,bge1,lagg0,mce0,mce1,mce2,mce3,mce4
 BEType: 200GigE
 FEType: Infiniband
 LCDver: IsiVFD3 (Isilon VFD V3)
 Midpln: NONE (No FCB Support)
Power Supplies OK
In contrast, the back-end network on this F710 is 200Gb Ethernet, as reported by the ‘BEType’ parameter.
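When only the network types are of interest, those two fields can be pulled straight from the utility’s output. For example:

# isi_hw_status | egrep 'FEType|BEType'
 BEType: 200GigE
 FEType: Infiniband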
From the node cabling perspective, the interface assignments on the rear of the F710 are as follows:
Additionally, the ‘mlxfwmanager’ CLI utility can be helpful for gleaning considerably more detail on a node’s NICs, including firmware versions, MAC addresses, GUIDs, part numbers, etc. For example:
# mlxfwmanager
Querying Mellanox devices firmware ...

Device #1:
----------
  Device Type:      ConnectX6
  Part Number:      0RRM24_Ax
  Description:      Nvidia ConnectX-6 VPI adapter card; HDR IB (200Gb/s) and 200GbE; dual-port QSFP56; PCIe4.0 x16
  PSID:             DEL0000000052
  PCI Device Name:  pci0:13:0:0
  Base MAC:         59a2e18dfdac
  Versions:         Current        Available
     FW             20.39.1002     N/A
     PXE            3.7.0201       N/A
     UEFI           14.32.0012     N/A
  Status:           No matching image found

Device #2:
----------
  Device Type:      ConnectX6DX
  Part Number:      0F6FXM_08P2T2_Ax
  Description:      Mellanox ConnectX-6 Dual Port 100 GbE QSFP56 Network Adapter
  PSID:             DEL0000000027
  PCI Device Name:  pci0:139:0:0
  Base GUID:        e8ebd30300060684
  Base MAC:         e8ebd3060684
  Versions:         Current        Available
     FW             22.36.1010     N/A
     PXE            3.6.0901       N/A
     UEFI           14.29.0014     N/A
  Status:           No matching image found

Device #3:
----------
  Device Type:      ConnectX6
  Part Number:      0RRM24_Ax
  Description:      Nvidia ConnectX-6 VPI adapter card; HDR IB (200Gb/s) and 200GbE; dual-port QSFP56; PCIe4.0 x16
  PSID:             DEL0000000052
  PCI Device Name:  pci0:181:0:0
  Base MAC:         a088c2ec499e
  Base GUID:        a088c20300ec499a
  Versions:         Current        Available
     FW             22.39.1002     N/A
     PXE            3.7.0201       N/A
     UEFI           14.32.0012     N/A
  Status:           No matching image found
In the example above, ‘Device #1’ is the back-end NIC, ‘Device #2’ is the 100Gb Ethernet ConnectX-6 DX NIC in the PCIe4 slot, and ‘Device #3’ is the front-end InfiniBand ConnectX-6 VPI NIC in the primary PCIe5 slot.
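If only one adapter is of interest, mlxfwmanager can also be pointed at a specific device rather than querying them all. For example, to query just the front-end IB NIC identified above (the ‘-d’ device-selection option is standard in the Mellanox/NVIDIA firmware tools, but check the installed version’s help output):

# mlxfwmanager -d pci0:181:0:0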
There are a few caveats to be aware of when using front-end InfiniBand on F710 and F910 node pools:
- Upon upgrading to OneFS 9.10, any front-end InfiniBand interfaces are only enabled once the new release has been committed.
- Network pools created within InfiniBand subnets have their ‘aggregation mode’ set to ‘unset’ by default, and this parameter cannot be modified.
- Since VLANs are not supported on InfiniBand, OneFS includes validation logic to prevent VLAN tagging from being enabled on IB subnets, as illustrated below.
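For example, attempting to enable VLAN tagging on the InfiniBand subnet created earlier will be rejected by this validation (the exact error text is release-dependent, so it is not reproduced here):

# isi network subnets modify groupnet0.subnet1 --vlan-enabled=true --vlan-id=100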