OneFS QoS and DSCP Tagging

As more applications contend for shared network links with finite bandwidth, ensuring Quality of Service (QoS) becomes more critical. Each application or workload can have different QoS requirements to deliver not only service availability, but also an optimal client experience. Associating each application with an appropriate QoS marking lets network devices prioritize certain packets across a shared network, which in turn helps meet SLAs.

QoS can be implemented using a variety of methods, but the most common is the Differentiated Services Code Point, or DSCP, a six-bit value in the IP packet header that maps to a traffic effort level.

OneFS 9.9 introduces support for DSCP marking. The configuration is cluster-wide and based on the class of network traffic. Once configured, OneFS inserts the DSCP marking into the Type of Service field of the IPv4 header or the Traffic Class field of the IPv6 header, and away you go.

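The pertinent part of each packet header is the same size in both protocols: in IPv4, the DSCP occupies the upper six bits of the Type of Service byte, and in IPv6, the upper six bits of the Traffic Class byte. The remaining two bits of each byte carry Explicit Congestion Notification (ECN).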
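To make the bit layout concrete, here is a minimal Python sketch of how a DSCP value sits inside those header bytes. It relies only on the standard IPv4 and IPv6 header layouts. The function names are illustrative and are not part of OneFS.

```python
import struct


def dscp_to_tos(dscp: int, ecn: int = 0) -> int:
    """Pack a 6-bit DSCP and a 2-bit ECN value into one ToS / Traffic Class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in six bits (0-63)")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN must fit in two bits (0-3)")
    return (dscp << 2) | ecn


def dscp_from_ipv4(header: bytes) -> int:
    """Read the DSCP from a raw IPv4 header (byte 1 is Type of Service)."""
    return header[1] >> 2


def dscp_from_ipv6(header: bytes) -> int:
    """Read the DSCP from a raw IPv6 header.

    The first 32-bit word holds version (4 bits), Traffic Class (8 bits),
    and flow label (20 bits), in that order.
    """
    (first_word,) = struct.unpack("!I", header[:4])
    traffic_class = (first_word >> 20) & 0xFF
    return traffic_class >> 2
```

For example, a DSCP of 18 with no ECN bits set appears on the wire as the byte 0x48 (decimal 72), which is what a packet capture would show in the ToS or Traffic Class field.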

OneFS QoS tagging separates network traffic into four default classes, each with an associated DSCP value, plus configurable source and destination ports. The four classes OneFS provides are ‘transactional’, ‘network management’, ‘bulk data’, and ‘catch all’:

| Class | Traffic | Default DSCP Value | Source Ports | Destination Ports |
|---|---|---|---|---|
| Transactional | File access and sharing protocols: NFS, FTP, HTTPS data, HDFS, S3, RoCE. Security and authentication protocols: Kerberos, LDAP, LSASS, DCE/RPC. RPC and inter-process communication protocols: rpc.bind, mountd, statd, lockd, quotd, mgmntd. Naming services protocols: NetBIOS, Microsoft-DS. | 18 | 20, 21, 80, 88, 111, 135, 137, 138, 139, 300, 302, 304, 305, 306, 389, 443, 445, 585, 636, 989, 990, 2049, 3268, 3269, 8020, 8082, 8440, 8441, 8443, 9020, 9021 | Not defined by default, but administrator may configure. |
| Network Management | WebUI, SSH, SMTP, syslog, DNS, NTP, SNMP, Perf collector, CEE, alerts | 16 | 22, 25, 53, 123, 161, 162, 514, 6514, 6567, 8080, 9443, 12228 | Not defined by default, but administrator may configure. |
| Bulk Data | SmartSync, SyncIQ, NDMP | 10 | 2097, 2098, 3148, 3149, 5667, 5668, 7722, 8470, 10000 | Not defined by default, but administrator may configure. |
| Catch-All | All other traffic that does not match any of the above | 0 | All | Not defined by default, but administrator may configure. |

The default DSCP value for each class was specifically chosen to meet US government requirements and satisfy Fed APL needs. While destination ports are undefined in the classes by default, cluster admins can customize the DSCP values, source ports, and destination ports per site requirements.
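Network teams often refer to DSCP values by their standard DiffServ names rather than by number. The sketch below converts a numeric DSCP into its conventional name using the standard numbering: Class Selector CSn is 8n, and Assured Forwarding AFxy is 8x + 2y. The source does not state these names for the OneFS defaults. The mapping is simply the standard one applied to the values 18, 16, 10, and 0, and the helper name is illustrative.

```python
def dscp_name(dscp: int) -> str:
    """Return the conventional DiffServ name for a numeric DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    if dscp == 0:
        return "DF"  # default forwarding, i.e. best effort
    if dscp == 46:
        return "EF"  # expedited forwarding
    if dscp % 8 == 0:
        return f"CS{dscp // 8}"  # class selector: CSn = 8n
    af_class, remainder = divmod(dscp, 8)
    if 1 <= af_class <= 4 and remainder in (2, 4, 6):
        return f"AF{af_class}{remainder // 2}"  # AFxy = 8x + 2y
    return f"unnamed ({dscp})"


# The four default values from the table above.
for value in (18, 16, 10, 0):
    print(value, dscp_name(value))
```

Under that numbering, the transactional default of 18 corresponds to AF21, network management's 16 to CS2, bulk data's 10 to AF11, and the catch-all 0 to default (best-effort) forwarding.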

Under the hood, QoS tagging is built upon the OneFS firewall (ipfw).

As such, QoS tagging is only functional when both the firewall and the DSCP features are enabled.

The firewall inspects outgoing network traffic on the front-end ports and assigns it to the appropriate QoS class. Outbound IP packets are matched against the cluster’s four DSCP rules, one by one from top to bottom, using the source ports and, if configured, the destination ports.

When a match is found, the firewall engine marks the packet’s DSCP bits as specified by that rule. If no match is found, the final catch-all (‘Best Effort’) rule picks up all outgoing IP packets that did not match any of the other three DSCP rules.
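The top-to-bottom, first-match behavior can be sketched in a few lines of Python. This is an illustrative model only, not the ipfw rule set OneFS actually generates. It assumes the rules are evaluated in the order the classes are listed in the table, and that when destination ports are configured they must match in addition to the source ports. The article confirms neither detail. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence


@dataclass(frozen=True)
class DscpRule:
    name: str
    dscp: int
    src_ports: Optional[FrozenSet[int]] = None  # None means "any port"
    dst_ports: Optional[FrozenSet[int]] = None  # undefined by default


# Default source ports and DSCP values, taken from the table above.
DEFAULT_RULES = (
    DscpRule("transactional", 18, frozenset({
        20, 21, 80, 88, 111, 135, 137, 138, 139, 300, 302, 304, 305, 306,
        389, 443, 445, 585, 636, 989, 990, 2049, 3268, 3269, 8020, 8082,
        8440, 8441, 8443, 9020, 9021})),
    DscpRule("network management", 16, frozenset({
        22, 25, 53, 123, 161, 162, 514, 6514, 6567, 8080, 9443, 12228})),
    DscpRule("bulk data", 10, frozenset({
        2097, 2098, 3148, 3149, 5667, 5668, 7722, 8470, 10000})),
    DscpRule("catch-all", 0),  # matches everything left over
)


def classify(src_port: int, dst_port: int,
             rules: Sequence[DscpRule] = DEFAULT_RULES) -> DscpRule:
    """Return the first rule, top to bottom, that matches the packet's ports."""
    for rule in rules:
        if rule.src_ports is not None and src_port not in rule.src_ports:
            continue
        # Assumption: a configured destination-port set must also match.
        if rule.dst_ports is not None and dst_port not in rule.dst_ports:
            continue
        return rule
    raise LookupError("no rule matched; a catch-all rule should be last")
```

With the defaults, an outbound NFS reply from source port 2049 would be classified as transactional and marked with DSCP 18, while a packet from an unlisted source port falls through to the catch-all rule and is marked 0.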

The firewall assigns the DSCP value based on the QoS class, and the DSCP configuration and values are cluster-wide and preserved across upgrades.

Note, though, that the DSCP feature does not currently allow the creation of additional or custom DSCP rules. Additionally, DSCP tagging is disabled by default in both STIG hardening and compliance modes.

Also, consider that in order to provide QoS, the firewall has to inspect and filter the outgoing packets, which comes with a performance cost. Although this overhead should be fairly minimal, the recommendation is to test DSCP tagging in a lab environment first, to confirm workloads are not significantly impacted, before letting it loose on a production cluster.

In the next article in this series, we’ll look at the DSCP configuration and management, plus some basic troubleshooting tools.
