OneFS Routing and SBR – Part 2

As we saw in the previous article in this series, the primary effect of OneFS source-based routing (SBR) is helping to ensure that the cluster replies on the same interface as the ingress packet came in on. This happens automatically in conjunction with FlexNet’s NIC affinity.

Each of a cluster’s front-end subnets contains one or more pools of IP addresses which can be bound to external interfaces of nodes in the cluster. Pools also bind to a groupnet and associated access zone for multi-tenant authentication management, etc.

A cluster’s network pools each include a range of addresses, a list of interfaces, an aggregation mode, and a list of static routes. Static Routes can be configured on a per-pool basis. Unlike SBR, static routes provide a mechanism to force all traffic for a specific destination to use a specific address and a specific gateway. This means static routes, unlike SBR, can support client services without making those services zone-aware.

OneFS SBR will often simply just do the right thing with little or no additional configuration required. Therefore it is generally the preferred option, and indeed is the default for new clusters running OneFS 9.8 or later. That said, in order for SBR to create its IPFW  rule for a gateway, there must have been a session initiated from the source subnet in order to initiate it. If no traffic has been originated or received from a network that’s unreachable via the default gateway, OneFS will transmit traffic it originates through the default gateway. Static routes are an option in this case.

Static routes are also an alternative when SBR cannot do what is required – for example, if different subnets must be treated differently, or a customer actually requires the route from A to B to be different from the route from B to A.

Static routes can be easily added from the CLI with the following syntax:

 # isi network pools <pool> --static-routes<subnet_ip_address>/<CIDR_netmask>-<gateway_ip_address>

Where the first address and the integer form a netmask, and the second address is a gateway. Static routes are configured on a per-pool basis. For example:

# isi network pools groupnet0.subnet0.pool0 -–static-routes 10.30.1.0/22-10.30.1.1

Similarly, an individual static route can be removed as follows:

# isi network pools groupnet0.subnet0.pool0 –-remove-static-routes 10.30.1.0/22-10.30.1.1

Static routes are not mutually exclusive to SBR, but they do operate slightly differently when SBR is enabled. SBR just changes the way an egress packet is handled. Instead of matching a packet to a route based off the destination IP in the packet header, SBR uses the source IP of the packet instead.

Before changing the current SBR state on a cluster, the following CLI syntax can be used to confirm whether there are static routes configured in any IP address pools:

# isi network pools list –v | grep -i routes

If needed, all or a pool’s static routes can be easily removed as follows:

# isi network pools modify <pool_id> --clear-static-routes

When SBR parses the ipfw rule list in order to set the route, static routes take priority and are evaluated first. Under SBR, the narrowest route is preferred. This means that a CIDR /30 route that matches will be selected before a matching /28, etc. If no match is found, SBR then tries the subnet routes.

Clearly, static routes do have some notable limitations. By definition, they would need to include every destination address range in order to properly direct traffic – and this may be a large and changing set of information. Additionally, static routes can only direct traffic based on remote IP address. If multiple workflows use the same remote IP addresses, static routing cannot treat them differently.

Take the following multi-subnet topology example, where a client is three hops from a PowerScale cluster:

If the source IP address and the destination IP address both reside within the same subnet (ie. within the same ‘layer 2’ broadcast domain), the packet goes directly to the client’s IP address. Conversely, if the destination IP address is in a different subnet from the source, the packet is sent to the next-hop gateway.

In the above example, the client initiates a connection to the cluster at IP address 10.30.1.200:

First, the client determines that the destination IP address is not on its local network, and that it does not have a static route defined for that address. It then sends the packet to its default gateway (Gateway C) for more processing.

Next, the router at gateway C receives the client’s packet, examines the destination IP in the packet header, and determines that it has a route to the destination through router A at 10.30.1.1.

Since Router A has a direct connection on the 100GbE subnet to the destination IP address, it sends the packet directly to the cluster’s 10.30.1.200 interface.

At this point, OneFS must send a response packet to the client.

1. If SBR is disabled, the node determines that the client (10.2.1.50) is not on the same subnet and does not have a static route defined for that address. OneFS determines which gateway it must send the response packet to based upon the routing table.

Gateways with higher priority (lower value) have precedence over those with lower priority (higher value). For example, 1 has a higher priority than 10. The PowerScale node has one default gateway, which is the highest priority subnet the node is configured in. Since there is no static route configured on the node, OneFS chooses the default gateway 10.10.1.1 (router B) via the 10GbE interface.

The reply packet sent by the 10GbE interface has a source IP header of 10.30.1.200. Note that some firewalls or packet filters may interpret this as packet ‘spoofing’ and automatically block this type of behavior. Additionally, perceived performance asymmetry may also be an issue, since the connection may be bandwidth constrained because of the 10GbE link. In this case, a user may anticipate 100GbE performance but in actuality will limited to only a 10GbE connection.

2. Conversely, if SBR is enabled, the cluster node’s routing decisions are no longer predicated on the client’s destination IP address. Instead, OneFS parses the egress packet source IP header, then sends the packet via the gateway associated with the source IP subnet.

The node’s reply packet has a source IP address of 10.30.1.200 and, as such, the SBR routing rules indicate the preferred gateway is A (10.30.1.1) on the 100GbE subnet. When the response reaches gateway A, it is routed back to Gateway C across the core network to the client.

Note, however, that SBR will not override any statically configured routes. If SBR is enabled and a static route is created, a new ipfw rule is added. Since SBR only acts upon ‘reply’ packets, any traffic initiated by the node is unaffected. For example, when a node contacts a DNS or AD server, traditional routing rules (as though SBR is disabled) apply. Also, be aware that enabling SBR is a global (cluster-wide) action, and OneFS does not currently allow for SBR configuration at a groupnet, subnet, or pool-level granularity.

In addition to SBR, a number of other FlexNet networking configurations are either cluster-wide, or effectively amount to it:

Networking Component Description
DNS DNS, including the DNS caching daemon, operates cluster-wide.
Default gateway While the default gateway as a routing mechanism may appear to be subnet-specific in the UI, it behaves globally.
NTP The network time protocol (NTP) is configured and runs global to maintain time synchronization between the cluster’s nodes, domain controllers, and other servers and networked devices.
SBR While SBR is a cluster-wide setting, the routing changes will be specific to the routing tables on each individual node (each node’s routing table will be specific to the network pools it is a part of).

Note that, being global configuration, enabling SBR can affect other workflows as well.

Within Flexnet, each subnet has its own address space, with is specified by a base and netmask, gateway, VLAN tag, SmartConnect service address, and aggregation options (DSR return addresses).

While SBR is a cluster-wide setting, though the routing changes will be specific to the rules/routing tables on each individual node (each node’s routing table will be specific to the network pools it is a part of).

One quirk of subnet configuration is that, while each subnet can have a different default gateway configured, normally OneFS only uses the highest priority gateway configured in any of its subnets – falling back to lower-priority only if it is unreachable.

SBR aims to mitigate this idiosyncrasy, enforcing subnet settings by automatically creating IPFW rules based on subnet settings. This allows connections associated with a given subnet to use the specified gateway for that subnet, but only for connections bound to a specific local address within that subnet. This means they work only for incoming connections or outgoing connections that are made in a tenant-aware way; the common practice of clients binding to INADDR_ANY and letting the network stack choose which local address to use prevents SBR from working. Most client services running under OneFS (e.g. integration with authentication servers like LDAP and AD) therefore cannot currently use SBR.

OneFS Routing and SBR

The previous article on this topic generated several questions, which suggested that a more thorough exploration of OneFS source-based routing (SBR) is likely warranted. So here goes…

At its essence, network routing is the process of selecting a path for data traffic, either within a network or traversing multiple networks. The aim is to endure efficient data flow across subnets, while maintaining bandwidth and minimizing congestion. Routers, layer 3 switches, multi-homed system, etc, make routing decisions based on packet header addresses and routing tables, which record the paths packets should take to reach their destinations.

IP packet headers have the following form, with the source and destination addresses located towards the end of the header section, before the packet’s payload.

Routing is typically either static, using manually enter routing statements and rules, or dynamic, via routing protocols such as RIP, OSPF, etc.

While the nomenclature might suggest that OneFS source-based routing would route traffic based on a source IP address, instead SBR actually operates by dynamically creating per-subnet default routes. The gateway is derived from the subnet configuration, and, as such, gateways need to be defined for each subnet.

New cluster deployments running OneFS 9.8 and automatically have SBR enabled, whereas legacy clusters upgrading to 9.8 preserve their existing SBR configuration, whether on or off. While SBR is disabled by default in OneFS 9.7 and earlier releases, it can, if desired, be easily enabled from either the CLI or WebUI.

SBR is configured globally and, as such, is either on or off across the entire cluster and its network pools and subnets. OneFS 9.7 and earlier supports only the IPv4 protocol, whereas OneFS 9.8 and later also accommodate IPv6 subnets.

SBR can be instantly enabled on a PowerScale cluster by running the following CLI command:

# isi network external modify --sbr 1

# isi network external view | grep -i source

Source Based Routing: True

Or from the WebUI under Cluster management > Network configuration > Settings:

Similarly, SBR can be disabled as follows:

# isi network external modify --sbr 0

# isi network external view | grep -i source

Source Based Routing: False

Under the hood, SBR uses the FreeBSD ‘ipfw’ utility (as does the OneFS firewall) to record and manage its routing rules.

For example, with SBR disabled, querying ipfw on a cluster shows a single ‘any to any’ rule:

# isi network external view | grep -i source

Source Based Routing: False

# ipfw show

65535 11839927994 7560033188891 allow ip from any to any

By way of contrast, when SBR is enabled, a number of new, higher priority ‘allow’ rules for each NIC and gateway ‘fwd’ rules are added above the ‘any to any’ rule:

# isi network external view | grep -i source

Source Based Routing: True

# ipfw show

60000          16         33391 allow ip from any to any via lo0 out

60001           0             0 allow ip from any to ff02::1:ff00:0/104 out

60002      116082     112914089 allow ip from any to any via mce0 out

60003      217150     138771611 allow ip from any to any via mce1 out

60004           0             0 allow ip from any to any via ue0 out

60005           0             0 allow ip from any to fe80::/10 out

60006           0             0 allow ip from any to ff02::1 out

62000           0             0 fwd 2620:0:170:7c0f::1 ip from 2620:0:170:7c0f::/64 to not 2620:0:170:7c0f::/64 out

62001         121         94788 fwd 10.30.1.1 ip from 10.30.1.0/22 to not 10.30.1.0/22 out

65535 11842048952 7561181109905 allow ip from any to any

In this example node’s case, on a cluster running OneFS 9.8, there is one IPv4 subnet and one IPv6 subnet:

# isi network subnets list

ID                Subnet    Gateway|Priority      Pools     SC Service Addrs     Firewall Policy

------------------------------------------------------------------------------------------------

groupnet0.subnet0 10.30.1.0/22   10.30.1.1|10     pool0     10.30.1.100-10.30.1.110               default_subnets_policy

groupnet0.subnet1 2620:0:170:7c0f::/64 2620:0:170:7c0f::1|20 ipv6pool  2620:0:170:7c0f::4-2620:0:170:7c0f::4 default_subnets_policy

------------------------------------------------------------------------------------------------

Total: 2

So enabling SBR on this cluster results in the creation of a ‘fwd’ rule for each subnet:

# ipfw show | grep fwd

62000          33          2640 fwd 2620:0:170:7c0f::1 ip from 2620:0:170:7c0f::/64 to not 2620:0:170:7c0f::/64 out

62001      145794     140490002 fwd 10.30.1.1 ip from 10.30.1.0/22 to not 10.30.1.0/22 out

Please note that the ‘ipfw’ command should not be used to modify the OneFS routing rules (or firewall table) directly!

By way of a OneFS packet routing example, take the following network topology where three clients, each on separate subnets, are connecting to a PowerScale cluster:

The default gateway is the path for all traffic intended for clients not on the local subnet and not covered by a routing table entry. Utilizing SBR does not negate the need for a default gateway, since SBR effectively overrides the default gateway (but not static routes).

Note that SBR is not simple packet reflection. Instead, it’s the dynamic creation of per-subnet default routes. The router used as the gateway is derived from the FlexNet subnet definitions within the subnet configuration. As such, a gateway needs to be specified for each subnet.

 In addition to a gateway address, each subnet also has a defined priority. For example:

Or via the CLI:

# isi network subnets modify groupnet0.subnet1 --gateway 10.30.1.1 --gateway-priority 10

With SBR disabled, the highest priority gateway (ie. the gateway with the lowest reachable value) is used as the default route.

Once SBR is enabled, OneFS examines the FlexNet config for each subnet, and then creates ipfw rules that look at the source IP address from the cluster side and force the next-hop to be the gateway IP defined for the subnet which contains that IP address.

In the previous example with three clients on separate subnets connecting to a cluster, when traffic arrives from a subnet that is unreachable via the default gateway, the following routing rules will be added via ipfw:

The mechanism for adding ipfw rules is stateless, and SBR relies on the source IP address that transmits traffic to the cluster.

A session must be initiated from the source subnet for a corresponding ipfw rule to be created. Also, unless the cluster has received traffic that originated from a subnet has no route to the default gateway, OneFS transmits traffic it originates through the default gateway.

In the next article in this series. We’ll take a look at SBR and its interrelationship with static routes and other OneFS networking components.