OneFS Firewall Configuration – Part 2

In the previous article in this OneFS firewall series, we reviewed the upgrade, activation, and policy selection components of the firewall provisioning process.

Now, we turn our attention to the firewall rule configuration step of the process.

As stated previously, role-based access control (RBAC) explicitly limits who has access to manage the OneFS firewall. So ensure that the user account which will be used to enable and configure the OneFS firewall belongs to a role with the ‘ISI_PRIV_FIREWALL’ write privilege.

  4. Configuring Firewall Rules

Once the desired policy is created, the next step is to configure its rules. The first task here is to decide which ports and services need securing or opening, beyond the defaults.

The following CLI syntax will return a list of all the firewall’s default services, plus their respective ports, protocols, and aliases, sorted by ascending port number:

# isi network firewall services list

Service Name     Port  Protocol  Aliases

---------------------------------------------

ftp-data         20    TCP       -

ftp              21    TCP       -

ssh              22    TCP       -

smtp             25    TCP       -

dns              53    TCP       domain

                       UDP

http             80    TCP       www

                                 www-http

kerberos         88    TCP       kerberos-sec

                       UDP

rpcbind          111   TCP       portmapper

                       UDP       sunrpc

                                 rpc.bind

ntp              123   UDP       -

dcerpc           135   TCP       epmap

                       UDP       loc-srv

netbios-ns       137   UDP       -

netbios-dgm      138   UDP       -

netbios-ssn      139   UDP       -

snmp             161   UDP       -

snmptrap         162   UDP       snmp-trap

mountd           300   TCP       nfsmountd

                       UDP

statd            302   TCP       nfsstatd

                       UDP

lockd            304   TCP       nfslockd

                       UDP

nfsrquotad       305   TCP       -

                       UDP

nfsmgmtd         306   TCP       -

                       UDP

ldap             389   TCP       -

                       UDP

https            443   TCP       -

smb              445   TCP       microsoft-ds

hdfs-datanode    585   TCP       -

asf-rmcp         623   TCP       -

                       UDP

ldaps            636   TCP       sldap

asf-secure-rmcp  664   TCP       -

                       UDP

ftps-data        989   TCP       -

ftps             990   TCP       -

nfs              2049  TCP       nfsd

                       UDP

tcp-2097         2097  TCP       -

tcp-2098         2098  TCP       -

tcp-3148         3148  TCP       -

tcp-3149         3149  TCP       -

tcp-3268         3268  TCP       -

tcp-3269         3269  TCP       -

tcp-5667         5667  TCP       -

tcp-5668         5668  TCP       -

isi_ph_rpcd      6557  TCP       -

isi_dm_d         7722  TCP       -

hdfs-namenode    8020  TCP       -

isi_webui        8080  TCP       apache2

webhdfs          8082  TCP       -

tcp-8083         8083  TCP       -

ambari-handshake 8440  TCP       -

ambari-heartbeat 8441  TCP       -

tcp-8443         8443  TCP       -

tcp-8470         8470  TCP       -

s3-http          9020  TCP       -

s3-https         9021  TCP       -

isi_esrs_d       9443  TCP       -

ndmp             10000 TCP       -

cee              12228 TCP       -

nfsrdma          20049 TCP       -

                       UDP

tcp-28080        28080 TCP       -

---------------------------------------------

Total: 55

Similarly, the following CLI command will generate a list of existing rules and their associated policies, sorted in alphabetical order. For example, to show the first 5 rules:

# isi network firewall rules list --limit 5

ID                                            Index  Description                                                                             Action

----------------------------------------------------------------------------------------------------------------------------------------------------

default_pools_policy.rule_ambari_handshake    41     Firewall rule on ambari-handshake service                                               allow

default_pools_policy.rule_ambari_heartbeat    42     Firewall rule on ambari-heartbeat service                                               allow

default_pools_policy.rule_catalog_search_req  50     Firewall rule on service for global catalog search requests                             allow

default_pools_policy.rule_cee                 52     Firewall rule on cee service                                                            allow

default_pools_policy.rule_dcerpc_tcp          18     Firewall rule on dcerpc(TCP) service                                                    allow

----------------------------------------------------------------------------------------------------------------------------------------------------

Total: 5

Both the 'isi network firewall rules list' and 'isi network firewall services list' commands also have a '-v' verbose option, and can return their output in csv, list, table, or json format via the '--format' flag.
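
For example, assuming the output option is the standard '--format' flag, the services list can be rendered as JSON, or a single rule viewed verbosely, with syntax along these lines:

# isi network firewall services list --format json

# isi network firewall rules list -v --limit 1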

The detailed info for a given firewall rule, in this case the default SMB rule, can be viewed with the following CLI syntax:

# isi network firewall rules view default_pools_policy.rule_smb

          ID: default_pools_policy.rule_smb

        Name: rule_smb

       Index: 3

 Description: Firewall rule on smb service

    Protocol: TCP

   Dst Ports: smb

Src Networks: -

   Src Ports: -

      Action: allow

Existing rules can be modified and new rules created and added into an existing firewall policy with the ‘isi network firewall rules create’ CLI syntax. Command options include:

Option Description
--action Allow, which means pass packets.

Deny, which means silently drop packets.

Reject, which means reply with an ICMP error code.

id Specifies the ID of the new rule to create. The rule must be added to an existing policy. The ID can be up to 32 alphanumeric characters long and can include underscores or hyphens, but cannot include spaces or other punctuation. Specify the rule ID in the following format:

<policy_name>.<rule_name>

The rule name must be unique in the policy.

--index The rule index in the policy. Valid values are between 1 and 99, with lower values having higher priority. If not specified, the rule is automatically assigned the next available index (before the default rule at index 100).
--live The live option may only be used when issuing a command to create, modify, or delete a rule in an active policy. Such changes take effect immediately on all network subnets and pools associated with that policy. Using the live option on a rule in an inactive policy will be rejected, and an error message returned.
--protocol Specifies the protocol matched for inbound packets. Available values are tcp, udp, icmp, and all. If not configured, the default protocol 'all' is used.
--dst-ports Specifies the network ports/services provided by the storage system, identified by destination port(s). The protocol specified by --protocol is applied to these destination ports.
--src-networks Specifies one or more IP addresses with corresponding netmasks that are allowed by this rule. The correct format for this parameter is address/netmask, for example 192.0.2.128/25. Multiple address/netmask pairs should be separated with commas. Use the value 0.0.0.0/0 for 'any'.
--src-ports Specifies the network ports/services identified by source port(s). The protocol specified by --protocol is applied to these source ports.

Note that, unlike for firewall policies, there is no provision for cloning individual rules.

The following CLI syntax can be used to create new firewall rules. For example, to add ‘allow’ rules for the HTTP and SSH protocols, plus a ‘deny’ rule for port TCP 9876, into firewall policy fw_test1:

# isi network firewall rules create  fw_test1.rule_http  --index 1 --dst-ports http --src-networks 10.20.30.0/24,20.30.40.0/24 --action allow

# isi network firewall rules create  fw_test1.rule_ssh  --index 2 --dst-ports ssh --src-networks 10.20.30.0/24,20.30.40.0/16 --action allow

# isi network firewall rules create fw_test1.rule_tcp_9876 --index 3 --protocol tcp --dst-ports 9876 --src-networks 10.20.30.0/24,20.30.40.0/24 --action deny

When a new rule is created in a policy, if the index value is not specified, it will automatically inherit the next available number in the series (i.e. index=4 in this case).

# isi network firewall rules create fw_test1.rule_2049 --protocol udp --dst-ports 2049 --src-networks 30.1.0.0/16 --action deny

For a more draconian approach, a ‘deny’ rule could be created using the match-everything ‘*’ wildcard for destination ports and a 0.0.0.0/0 network and mask, which would silently drop all traffic:

# isi network firewall rules create fw_test1.rule_1234 --index 100 --dst-ports * --src-networks 0.0.0.0/0 --action deny

When modifying existing firewall rules, the following CLI syntax can be used, in this case to change the source network of an HTTP allow rule (index 1) in firewall policy fw_test1:

# isi network firewall rules modify fw_test1.rule_http --index 1 --protocol ip --dst-ports http --src-networks 10.1.0.0/16 --action allow

Or to modify an SSH rule (index 2) in firewall policy fw_test1, changing the action from ‘allow’ to ‘deny’:

# isi network firewall rules modify fw_test1.rule_ssh --index 2 --protocol tcp --dst-ports ssh --src-networks 10.1.0.0/16,20.2.0.0/16 --action deny

Or, to re-order the custom TCP 9876 rule from the earlier example from index 3 to index 7 in firewall policy fw_test1:

# isi network firewall rules modify fw_test1.rule_tcp_9876 --index 7

Note that any existing rules with an index value of 7 or higher will have their index values incremented by one.
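
To confirm the re-ordering, the policy's rules and their updated index values can simply be listed again. For example:

# isi network firewall rules list | grep fw_test1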

When deleting a rule from a firewall policy, any rule reordering is handled automatically. If the policy has been applied to a network pool, the '--live' option can be used to force the change to take effect immediately. For example, to delete the HTTP rule from the firewall policy 'fw_test1':

# isi network firewall rules delete fw_test1.rule_http --live

Firewall rules can also be created, modified and deleted within a policy from the WebUI by navigating to Cluster management > Firewall Configuration > Firewall Policies. For example, to create a rule that permits SupportAssist and Secure Gateway traffic on the 10.219.0.0/16 network:

Once saved, the new rule is then displayed in the Firewall Configuration page:
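
An equivalent rule can also be scripted from the CLI. The following is a minimal sketch, assuming the rule is added to a custom policy named 'fw_test1' and that TCP port 9443 (the isi_esrs_d / Secure Connect Gateway port listed in the default services table) is the service being permitted:

# isi network firewall rules create fw_test1.rule_supportassist --protocol tcp --dst-ports 9443 --src-networks 10.219.0.0/16 --action allow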

  5. Firewall management and monitoring

In the next and final article in this series, we’ll turn our attention to managing, monitoring, and troubleshooting the OneFS firewall (step 5).

OneFS Firewall Configuration – Part 1

The new firewall in OneFS 9.5 enhances the security of the cluster and helps prevent unauthorized access to the storage system. When enabled, the default firewall configuration allows remote systems access to a specific set of default services for data, management, and inter-cluster interfaces (network pools).

The basic OneFS firewall provisioning process is as follows:

Note that role-based access control (RBAC) explicitly limits who has access to manage the OneFS firewall. In addition to the ubiquitous ‘root’, the cluster’s built-in SystemAdmin role has write privileges to configure and administer the firewall.

  1. Upgrade to OneFS 9.5

First, the cluster must be running OneFS 9.5 in order to provision the firewall.

If upgrading from an earlier release, the OneFS 9.5 upgrade must be committed before enabling the firewall.

Also, be aware that configuration and management of the firewall in OneFS 9.5 requires the new ISI_PRIV_FIREWALL administration privilege. This can be granted to a role with either  read-only or read-write privileges.

# isi auth privilege | grep -i firewall

ISI_PRIV_FIREWALL                   Configure network firewall

By default, the built-in 'SystemAdmin' role is granted write privileges to administer the firewall:

# isi auth roles view SystemAdmin | grep -A2 -i firewall

             ID: ISI_PRIV_FIREWALL

     Permission: w

Additionally, the built-in ‘AuditAdmin’ role has read permission to view the firewall configuration and logs, etc:

# isi auth roles view AuditAdmin | grep -A2 -i firewall

             ID: ISI_PRIV_FIREWALL

     Permission: r

Ensure that the user account which will be used to enable and configure the OneFS firewall belongs to a role with the ‘ISI_PRIV_FIREWALL’ write privilege.
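
For example, the following is a minimal sketch of creating a dedicated firewall administration role and granting it the write privilege. The role name 'FirewallAdmin' and the user 'fwadmin1' are purely illustrative, and the '--add-user' option is assumed to be available in your release:

# isi auth roles create FirewallAdmin --description "Firewall administration"

# isi auth roles modify FirewallAdmin --add-priv-write ISI_PRIV_FIREWALL --add-user fwadmin1

# isi auth roles view FirewallAdmin | grep -A2 -i firewall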

  2. Activate Firewall

As mentioned previously, the OneFS firewall can be either ‘enabled’ or ‘disabled’, with the latter as the default state. The following CLI syntax will display the firewall’s global status – in this case ‘disabled’ (the default):

# isi network firewall settings view

Enabled: False

Firewall activation can be easily performed from the CLI as follows:

# isi network firewall settings modify --enabled true

# isi network firewall settings view

Enabled: True

Or from the WebUI under Cluster management > Firewall Configuration > Settings:

Note that the firewall is automatically enabled when STIG hardening is applied to a cluster.

  3. Pick policies

A cluster’s existing firewall policies can be easily viewed from the CLI with the following command:

# isi network firewall policies list

ID        Pools                    Subnets                   Rules
-----------------------------------------------------------------------------
fw_test1  groupnet0.subnet0.pool0  groupnet0.subnet1         test_rule1
-----------------------------------------------------------------------------
Total: 1

Or from the WebUI under Cluster management > Firewall Configuration > Firewall Policies:

The OneFS firewall offers four main strategies when it comes to selecting a firewall policy. These include:

  1. Retaining the default policy
  2. Reconfiguring the default policy
  3. Cloning the default policy and reconfiguring
  4. Creating a custom firewall policy

We’ll consider each of these strategies in order:

a.  Retaining the default policy

In many cases, the default OneFS firewall policy value will provide acceptable protection for a security conscious organization. In these instances, once the OneFS firewall has been enabled on a cluster, no further configuration is required, and the cluster administrators can move on to the management and monitoring phase.

By default, all front-end cluster interfaces (network pools) use the 'default' firewall policy. While this default policy can be modified, be aware that it is global: any change made to it will impact all network pools that use it.

The following table describes the default firewall policies that are assigned to each interface:

Policy Description
Default pools policy Contains rules for the inbound default ports for TCP and UDP services in OneFS.
Default subnets policy Contains rules for:

·         DNS port 53

·         Rule for ICMP

·         Rule for ICMP6

These can be viewed from the CLI as follows:

# isi network firewall policies view default_pools_policy

            ID: default_pools_policy

          Name: default_pools_policy

   Description: Default Firewall Pools Policy

Default Action: deny

     Max Rules: 100

         Pools: groupnet0.subnet0.pool0, groupnet0.subnet0.testpool1, groupnet0.subnet0.testpool2, groupnet0.subnet0.testpool3, groupnet0.subnet0.testpool4, groupnet0.subnet0.poolcava

       Subnets: -

         Rules: rule_ldap_tcp, rule_ldap_udp, rule_reserved_for_hw_tcp, rule_reserved_for_hw_udp, rule_isi_SyncIQ, rule_catalog_search_req, rule_lwswift, rule_session_transfer, rule_s3, rule_nfs_tcp, rule_nfs_udp, rule_smb, rule_hdfs_datanode, rule_nfsrdma_tcp, rule_nfsrdma_udp, rule_ftp_data, rule_ftps_data, rule_ftp, rule_ssh, rule_smtp, rule_http, rule_kerberos_tcp, rule_kerberos_udp, rule_rpcbind_tcp, rule_rpcbind_udp, rule_ntp, rule_dcerpc_tcp, rule_dcerpc_udp, rule_netbios_ns, rule_netbios_dgm, rule_netbios_ssn, rule_snmp, rule_snmptrap, rule_mountd_tcp, rule_mountd_udp, rule_statd_tcp, rule_statd_udp, rule_lockd_tcp, rule_lockd_udp, rule_nfsrquotad_tcp, rule_nfsrquotad_udp, rule_nfsmgmtd_tcp, rule_nfsmgmtd_udp, rule_https, rule_ldaps, rule_ftps, rule_hdfs_namenode, rule_isi_webui, rule_webhdfs, rule_ambari_handshake, rule_ambari_heartbeat, rule_isi_esrs_d, rule_ndmp, rule_isi_ph_rpcd, rule_cee, rule_icmp, rule_icmp6, rule_isi_dm_d




# isi network firewall policies view default_subnets_policy

            ID: default_subnets_policy

          Name: default_subnets_policy

   Description: Default Firewall Subnets Policy

Default Action: deny

     Max Rules: 100

         Pools: -

       Subnets: groupnet0.subnet0

         Rules: rule_subnets_dns_tcp, rule_subnets_dns_udp, rule_icmp, rule_icmp6

Or from the WebUI under Cluster Management > Firewall Configuration > Firewall Policies:

 

b.  Reconfiguring the default policy

Depending on an organization’s threat levels or security mandates, there may be a need to restrict access to certain additional IP addresses and/or management service protocols.

If the default policy is deemed insufficient, reconfiguring the default firewall policy can be a good option if only a small number of rule changes are required. The specifics of creating, modifying, and deleting individual firewall rules are covered in the next article in this series (step 4).

Note that if new rule changes behave unexpectedly, or configuring the firewall generally goes awry, OneFS does provide a 'get out of jail free' card. In a pinch, the global firewall policy can be quickly and easily restored to its default values. This can be achieved with the following CLI syntax:

# isi network firewall reset-global-policy

This command will reset the global firewall policies to the original system defaults. Are you sure you want to continue? (yes/[no]):

 

Alternatively, the default policy can also be easily reverted from the WebUI too, by clicking the ‘Reset default policies’ button:

c.  Cloning the default policy and reconfiguring

Another option is cloning, which can be useful when batch modification or a large number of changes to the current policy are required. By cloning the default firewall policy, an exact copy of the existing policy and its rules is generated, but with a new policy name. For example:

# isi network firewall policies clone default_pools_policy clone_default_pools_policy

# isi network firewall policies list | grep -i clone

clone_default_pools_policy -

Cloning can also be initiated from the WebUI under Firewall Configuration > Firewall Policies > More Actions > Clone Policy:

Enter the desired name of the clone in the ‘Policy Name’ field in the pop-up window and click ‘Save’:

Once cloned, the policy can then be easily reconfigured to suit. For example, to modify the policy ‘fw_test1’ and change its default-action from deny-all to allow-all:

# isi network firewall policies modify fw_test1 --default-action allow-all

When modifying a firewall policy, the '--live' CLI option can be used to force the change to take effect immediately. Note that the '--live' option is only valid when issuing a command to modify or delete an active custom policy, or to modify a default policy. Such changes take effect immediately on all network subnets and pools associated with the policy. Using the live option on an inactive policy will be rejected, and an error message returned.

Options for creating or modifying a firewall policy include:

Option Description
--default-action Automatically adds a rule to 'deny all' or 'allow all' at the bottom of the rule set for the created policy (index = 100).
--max-rule-num By default, each policy can contain a maximum of 100 rules (including the one default rule), allowing 99 user-configurable rules. This maximum can be expanded to a specified value, currently limited to 200 (that is, 199 user-configurable rules).
--add-subnets Specifies the network subnet(s) to add to the policy, separated by commas.
--remove-subnets Specifies the network subnets to remove from the policy, which then fall back to the global policy.
--add-pools Specifies the network pool(s) to add to the policy, separated by commas.
--remove-pools Specifies the network pools to remove from the policy, which then fall back to the global policy.

When modifying firewall policies, OneFS prints the following warning to verify the changes and help avoid the risk of a self-induced denial-of-service:

# isi network firewall policies modify --pools groupnet0.subnet0.pool0 fw_test1

Changing the Firewall Policy associated with a subnet or pool may change the networks and/or services allowed to connect to OneFS. Please confirm you have selected the correct Firewall Policy and Subnets/Pools. Are you sure you want to continue? (yes/[no]): yes

Once again, having the following CLI command handy, plus console access to the cluster is always a prudent move:

# isi network firewall reset-global-policy

Note that adding network pools or subnets to a firewall policy will cause the previous policy to be removed from them. Similarly, adding network pools or subnets back to the global default policy will revert any custom policy configuration they might have. For example, to apply the firewall policy fw_test1 to IP pools groupnet0.subnet0.pool0 and groupnet0.subnet0.pool1:

# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall

      Firewall Policy: default_pools_policy

# isi network firewall policies modify fw_test1 --add-pools groupnet0.subnet0.pool0,groupnet0.subnet0.pool1

# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall

      Firewall Policy: fw_test1

Or to apply the firewall policy fw_test1 to IP Pool groupnet0.subnet0.pool0 and groupnet0.subnet0:

# isi network firewall policies modify fw_test1 --add-pools groupnet0.subnet0.pool0 --add-subnets groupnet0.subnet0

# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall

 Firewall Policy: fw_test1

# isi network subnets view groupnet0.subnet0 | grep -i firewall

 Firewall Policy: fw_test1

To reapply global policy at any time, either add the pools to the default policy:

# isi network firewall policies modify default_pools_policy --add-pools groupnet0.subnet0.pool0,groupnet0.subnet0.pool1

# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall

 Firewall Policy: default_pools_policy

# isi network subnets view groupnet0.subnet1 | grep -i firewall

 Firewall Policy: default_subnets_policy

Or remove the pool from the custom policy:

# isi network firewall policies modify fw_test1 --remove-pools groupnet0.subnet0.pool0,groupnet0.subnet0.pool1

Firewall policies can also be managed on the desired network pool in the OneFS WebUI by navigating to Cluster configuration > Network configuration > External network > Edit pool details. For example:

Be aware that cloning is also not limited to the default policy, as clones can be made of any custom policies too. For example:

# isi network firewall policies clone clone_default_pools_policy fw_test1

d.  Creating a custom firewall policy

Alternatively, a custom firewall policy can also be created from scratch. This can be accomplished from the CLI using the following syntax, in this case to create a firewall policy named ‘fw_test1’:

# isi network firewall policies create fw_test1 --default-action deny

# isi network firewall policies view fw_test1

            ID: fw_test1

          Name: fw_test1

   Description:

Default Action: deny

     Max Rules: 100

         Pools: -

       Subnets: -

         Rules: -

Note that if a ‘default-action’ is not specified in the CLI command syntax, it will automatically default to deny.

Firewall policies can also be configured via the OneFS WebUI by navigating to Cluster management > Firewall Configuration > Firewall Policies > Create Policy:

However, in contrast to the CLI, if a ‘default-action’ is not specified when creating a policy in the WebUI, it will automatically default to ‘Allow’ instead, since the drop-down list works alphabetically.

If and when a firewall policy is no longer required, it can be swiftly and easily removed. For example, the following CLI syntax will delete the firewall policy ‘fw_test1’, clearing out any rules within this policy container:

# isi network firewall policies delete fw_test1

Are you sure you want to delete firewall policy fw_test1? (yes/[no]): yes

Note that the default global policies cannot be deleted.

# isi network firewall policies delete default_subnets_policy

Are you sure you want to delete firewall policy default_subnets_policy? (yes/[no]): yes

Firewall policy: Cannot delete default policy default_subnets_policy.
  4. Configuring Firewall Rules

In the next article in this series, we’ll turn our attention to configuring the OneFS firewall rule(s) (step 4).

OneFS Firewall

Among the array of security features introduced in OneFS 9.5 is a new host-based firewall. This firewall allows cluster administrators to configure policies and rules on a PowerScale cluster in order to meet the network and application management needs and security mandates of an organization.

The OneFS firewall protects the cluster's external, or front-end, network and operates as a packet filter for inbound traffic. It is available upon installation or upgrade to OneFS 9.5, but is disabled by default in both cases. In addition to manual activation, applying the OneFS STIG hardening profile automatically enables the firewall with its default policies.

The firewall manages IP packet filtering in accordance with the OneFS Security Configuration Guide, particularly with regard to network port usage. Packet control is governed by firewall policies, each of which comprises one or more individual rules.

Item Description Match Action
Firewall Policy A set of firewall rules. Rules are matched by index in ascending order. Each policy has a default action.
Firewall Rule Specifies which network packets should be matched by the firewall engine and what action should be taken upon them. Matching criteria include protocol, source ports, destination ports, and source network address. Options are 'allow', 'deny', or 'reject'.

A security best practice is to enable the OneFS firewall using the default policies, with any adjustments as required. The recommended configuration process is as follows:

Step Details
1.  Access Ensure that the cluster uses a default SSH or HTTP port before enabling. The default firewall policies block all nondefault ports until you change the policies.
2.  Enable Enable the OneFS firewall.
3.  Compare Compare your cluster network port configurations against the default ports listed in Network port usage.
4.  Configure Edit the default firewall policies to accommodate any non-standard ports in use in the cluster. NOTE: The firewall policies do not automatically update when port configurations are changed.
5.  Constrain Limit access to the OneFS Web UI to specific administrator terminals

Under the hood, the OneFS firewall is built upon the ubiquitous ‘ipfirewall’, or ‘ipfw’, which is FreeBSD’s native stateful firewall, packet filter and traffic accounting facility.

Firewall configuration and management is performed via the CLI, platform API, or WebUI, and OneFS 9.5 introduces a new Firewall Configuration WebUI page to support this. Note that the firewall is only available once a cluster is running OneFS 9.5 and the feature has been manually enabled, which activates the isi_firewall_d service. The firewall's configuration is split between gconfig, which handles the settings and policies, and the ipfw table, which stores the rules themselves.
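
Once the feature has been enabled, a quick sanity check that the firewall daemon is running can be performed from any node. For example (assuming the standard 'isi services -a' listing includes the isi_firewall_d service):

# isi services -a | grep -i firewall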

The firewall gracefully handles any SmartConnect dynamic IP movement between nodes since firewall policies are applied per network pool. Additionally, being network pool based allows the firewall to support OneFS access zones and shared/multitenancy models.

The individual firewall rules, which are essentially simplified wrappers around ipfw rules, work by matching packets via the 5-tuples that uniquely identify an IPv4 UDP or TCP session:

  • Source IP address
  • Source port
  • Destination IP address
  • Destination port
  • Transport protocol

The rules are then organized within a firewall policy, which can be applied to one or more network pools.

Note that each pool can only have a single firewall policy applied to it. If there is no custom firewall policy configured for a network pool, it automatically uses the global default firewall policy.
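
To verify which policy is in effect for a particular pool, the pool's firewall policy attribute can be queried, as is also shown later in this series. For example:

# isi network pools view groupnet0.subnet0.pool0 | grep -i firewall

      Firewall Policy: default_pools_policy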

When enabled, the OneFS firewall function is cluster wide, and all inbound packets from external interfaces will go through either the custom policy or default global policy before reaching the protocol handling pathways. Packets passed to the firewall are compared against each of the rules in the policy, in rule-number order. Multiple rules with the same number are permitted, in which case they are processed in order of insertion. When a match is found, the action corresponding to that matching rule is performed. A packet is checked against the active ruleset in multiple places in the protocol stack, and the basic flow is as follows:

  1. Get the logical interface for incoming packets
  2. Find all network pools assigned to this interface
  3. Compare these network pools one by one with destination IP address to find the matching pool (either custom firewall policy, or default global policy).
  4. Compare each rule in the matching pool's policy with the service (protocol and destination ports) and source IP address, in order of ascending index value. If a rule matches, perform the action associated with that rule.
  5. If no rule matches, go to the final rule (deny all or allow all) which is specified upon policy creation.

The OneFS firewall automatically reserves 20,000 rules in the ipfw table for its custom and default policies and rules. By default, each policy can have a maximum of 100 rules, including one default rule. This translates to an effective maximum of 99 user-defined rules per policy, because the default rule is reserved and cannot be modified. Similarly, a maximum of 198 additional policies can be applied to pools or subnets, since the default-pools-policy and default-subnets-policy are reserved and cannot be deleted.

Additional firewall bounds and limits to keep in mind include:

Name Value Description
MAX_INTERFACES 500 Maximum number of Layer 2 interfaces per node (including Ethernet, VLAN, LAGG interfaces).
MAX_SUBNETS 100 Maximum number of subnets within a OneFS cluster
MAX_POOLS 100 Maximum number of network pools within a OneFS cluster
DEFAULT_MAX_RULES 100 Default value of maximum rules within a firewall policy
MAX_RULES 200 Upper limit of maximum rules within a firewall policy
MAX_ACTIVE_RULES 5000 Upper limit of total active rules across the whole cluster
MAX_INACTIVE_POLICIES 200 Maximum number of policies which are not applied to any network subnet or pool. They will not be written into ipfw table.
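
If a policy needs to grow beyond DEFAULT_MAX_RULES, its per-policy limit can be raised as far as MAX_RULES. A hedged example, assuming the option is '--max-rule-num' as described in the firewall policy options earlier in this document:

# isi network firewall policies modify fw_test1 --max-rule-num 200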

The firewall default global policy is ready to use out of box and, unless a custom policy has been explicitly configured, all network pools use this global policy. Custom policies can be configured by either cloning and modifying an existing policy or creating one from scratch.

Component Description
Custom policy A user-defined container with a set of rules. A policy can be applied to multiple network pools, but a network pool can only apply one policy.

 

Firewall rule An ipfw-like rule which can be used to restrict remote access. Each rule has an index which is valid within its policy. Index values range from 1 to 99, with lower numbers having higher priority. Source networks are described by IP address and netmask, and services can be expressed either by port number (e.g. 80) or service name (e.g. http, ssh, smb). The '*' wildcard can also be used to denote all services. Supported actions are 'allow', 'deny', and 'reject'.
Default policy A global policy that manages all the default services, used for maintaining basic OneFS operation and management. While 'deny any' is the default action of the policy, the predefined service rules default to allowing all remote access. Any packets that do not match one of the rules are automatically dropped.

Two default policies: 

·         default-pools-policy

·         default-subnets-policy

Note that these two default policies cannot be deleted, but individual rule modification is permitted in each.

Default services The firewall’s default pre-defined services include the usual suspects, such as: DNS, FTP, HDFS, HTTP, HTTPS, ICMP, NDMP, NFS, NTP, S3, SMB, SNMP, SSH, etc. A full listing is available via the ‘isi network firewall services list’ CLI command output.

For a given network pool, either the global policy or a custom policy is assigned and takes effect. Additionally, all configuration changes to either policy type are managed by gconfig and are persistent across cluster reboots.

In the next article in this series we’ll take a look at the configuration and management of the OneFS firewall.

OneFS Snapshot Security

In this era of elevated cyber-crime and data security threats, there is increasing demand for immutable, tamper-proof snapshots. Often this need arises as part of a broader security mandate, ideally proactively, but oftentimes as a response to a security incident. OneFS addresses this requirement in the following ways:

On-cluster Off-cluster
·         Read-only snapshots

·         Snapshot locks

·         Role-based administration

·         SyncIQ snapshot replication

·         Cyber-vaulting

 

  1. Read-only snapshots

At its core, OneFS SnapshotIQ generates read-only, point-in-time, space efficient copies of a defined subset of a cluster’s data.

Only the changed blocks of a file are stored when updating OneFS snapshots, ensuring efficient storage utilization. They are also highly scalable and typically take less than a second to create, while generating little performance overhead. As such, the RPO (recovery point objective) and RTO (recovery time objective) of a OneFS snapshot can be very small and highly flexible, with the use of rich policies and schedules.

OneFS snapshots are created manually, via a schedule, or automatically generated by OneFS to facilitate system operations. But whatever the generation method, once a snapshot has been taken, its contents cannot be manually altered.

  2. Snapshot Locks

In addition to snapshot contents immutability, for an enhanced level of tamper-proofing, SnapshotIQ also provides the ability to lock snapshots with the ‘isi snapshot locks’ CLI syntax. This prevents snapshots from being accidentally or unintentionally deleted.

For example, a manual snapshot, ‘snaploc1’ is taken of /ifs/test:

# isi snapshot snapshots create /ifs/test --name snaploc1

# isi snapshot snapshots list | grep snaploc1

79188 snaploc1                                     /ifs/test

A lock is then placed on it (in this case lock ID=1):

# isi snapshot locks create snaploc1

# isi snapshot locks list snaploc1

ID

----

1

----

Total: 1

Attempts to delete the snapshot fail because the lock prevents its removal:

# isi snapshot snapshots delete snaploc1

Are you sure? (yes/[no]): yes

Snapshot "snaploc1" can't be deleted because it is locked

The CLI command ‘isi snapshot locks delete <lock_ID>’ can be used to clear existing snapshot locks, if desired. For example,  to remove the only lock (ID=1) from snapshot ‘snaploc1’:

# isi snapshot locks list snaploc1

ID

----

1

----

Total: 1

# isi snapshot locks delete snaploc1 1

Are you sure you want to delete snapshot lock 1 from snaploc1? (yes/[no]): yes

# isi snap locks view snaploc1 1

No such lock

Once the lock is removed, the snapshot can then be deleted:

# isi snapshot snapshots delete snaploc1

Are you sure? (yes/[no]): yes

# isi snapshot snapshots list| grep -i snaploc1 | wc -l

       0

Note that a snapshot can have up to a maximum of sixteen locks on it at any time. Also, lock numbers are continually incremented and not recycled upon deletion.

Like snapshot expiry, snapshot locks can also have an expiry time configured. For example, to set a lock on snapshot 'snaploc1' that expires at 1am on April 1st, 2024:

# isi snap lock create snaploc1 --expires '2024-04-01T01:00:00'

# isi snap lock list snaploc1

ID

----

36

----

Total: 1

# isi snap lock view snaploc1 36

     ID: 36

Comment:

Expires: 2024-04-01T01:00:00

  Count: 1

Note that if the duration period of a particular snapshot lock expires but others remain, OneFS will not delete that snapshot until all the locks on it have been deleted or expired.

The following table provides an example snapshot expiration schedule, with monthly locked snapshots to prevent deletion:

Snapshot Frequency Snapshot Time Snapshot Expiration Max Retained Snapshots
Every other hour Start at 12:00AM

End at 11:59AM

1 day 27
Every day At 12:00AM 1 week
Every week Saturday at 12:00AM 1 month
Every month First Saturday of month at 12:00AM Locked

  3. Role-based Access Control

Read-only snapshots plus locks provide secure, tamper-resistant snapshots on a cluster. However, anyone able to log in to the cluster with the required elevated administrator privileges can still remove locks and/or delete snapshots.

Since data security threats come from inside an environment as well as out, such as from a disgruntled IT employee or other internal bad actor, another key to a robust security profile is to constrain the use of all-powerful 'root', 'administrator', and 'sudo' accounts as much as possible. Instead of granting cluster admins full rights, a preferred security best practice is to leverage the comprehensive authentication, authorization, and accounting framework that OneFS natively provides.

OneFS role-based access control (RBAC) can be used to explicitly limit who has access to manage and delete snapshots. This granular control allows administrative roles to be crafted which can create and manage snapshot schedules, but prevent their unlocking and/or deletion. Similarly, lock removal and snapshot deletion can be isolated to a specific security role (or to root only).

A cluster security administrator selects the desired access zone, creates a zone-aware role within it, assigns privileges, and then assigns members.

For example, from the WebUI under Access > Membership and roles > Roles:
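
Alternatively, a comparable role can be sketched out from the CLI. The role name 'SnapshotAdmin', zone, user, and chosen privilege below are purely illustrative, and the '--zone' and '--add-user' options are assumed to be available in your release:

# isi auth roles create SnapshotAdmin --zone zone1 --description "Snapshot schedule management"

# isi auth roles modify SnapshotAdmin --zone zone1 --add-priv-write ISI_PRIV_SNAPSHOT_SCHEDULES --add-user snapadmin1

# isi auth roles view SnapshotAdmin --zone zone1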

When these members log in to the cluster via a configuration interface (WebUI, platform API, or CLI), they inherit their assigned privileges.

The specific privileges that can be used to segment OneFS snapshot management include:

Privilege Description
ISI_PRIV_SNAPSHOT_ALIAS Aliasing for snapshots
ISI_PRIV_SNAPSHOT_LOCKS Locking of snapshots from deletion
ISI_PRIV_SNAPSHOT_PENDING Upcoming snapshot based on schedules
ISI_PRIV_SNAPSHOT_RESTORE Restoring directory to a particular snapshot
ISI_PRIV_SNAPSHOT_SCHEDULES Scheduling for periodic snapshots
ISI_PRIV_SNAPSHOT_SETTING Service and access settings
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT Manual snapshots and locks
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY Snapshot summary and usage details

Each privilege can be assigned one of four permission levels for a role, including:

Permission Indicator Description
No permission.
R Read-only permission.
X Execute permission.
W Write permission.

The ability for a user to delete a snapshot is governed by the ‘ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT’ privilege.  Similarly, the ‘ISI_PRIV_SNAPSHOT_LOCKS’ governs lock creation and removal.

In the following example, the ‘snap’ role has ‘read’ rights for the ‘ISI_PRIV_SNAPSHOT_LOCKS’ privilege, allowing a user associated with this role to view snapshot locks:

# isi auth roles view snap | grep -i -A 1 locks

             ID: ISI_PRIV_SNAPSHOT_LOCKS

     Permission: r

--

# isi snapshot locks list snaploc1

ID

----

1

----

Total: 1

However, attempts to remove the lock ‘ID 1’ from the ‘snaploc1’ snapshot fail without write privileges:

# isi snapshot locks delete snaploc1 1

Privilege check failed. The following write privilege is required: Snapshot locks (ISI_PRIV_SNAPSHOT_LOCKS)

Write privileges are then added to 'ISI_PRIV_SNAPSHOT_LOCKS' in the 'snap' role:

# isi auth roles modify snap --add-priv-write ISI_PRIV_SNAPSHOT_LOCKS

# isi auth roles view snap | grep -i -A 1 locks

             ID: ISI_PRIV_SNAPSHOT_LOCKS

     Permission: w

--

This allows the lock ‘ID 1’ to be successfully deleted from the ‘snaploc1’ snapshot:

# isi snapshot locks delete snaploc1 1

Are you sure you want to delete snapshot lock 1 from snaploc1? (yes/[no]): yes

# isi snap locks view snaploc1 1

No such lock

Using OneFS RBAC, an enhanced security approach for a site could be to create three OneFS roles on a cluster, each with an increasing realm of trust:

a.  First, an IT ops/helpdesk role with ‘read’ access to the snapshot attributes would permit monitoring and troubleshooting, but no changes:

Snapshot Privilege Permission
ISI_PRIV_SNAPSHOT_ALIAS Read
ISI_PRIV_SNAPSHOT_LOCKS Read
ISI_PRIV_SNAPSHOT_PENDING Read
ISI_PRIV_SNAPSHOT_RESTORE Read
ISI_PRIV_SNAPSHOT_SCHEDULES Read
ISI_PRIV_SNAPSHOT_SETTING Read
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT Read
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY Read

b.  Next, a cluster admin role, with 'read' privileges for 'ISI_PRIV_SNAPSHOT_LOCKS' and 'ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT', would prevent snapshot and lock deletion, but provide 'write' access for schedule configuration, restores, and so on:

Snapshot Privilege Permission
ISI_PRIV_SNAPSHOT_ALIAS Write
ISI_PRIV_SNAPSHOT_LOCKS Read
ISI_PRIV_SNAPSHOT_PENDING Write
ISI_PRIV_SNAPSHOT_RESTORE Write
ISI_PRIV_SNAPSHOT_SCHEDULES Write
ISI_PRIV_SNAPSHOT_SETTING Write
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT Read
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY Write

c.  Finally, a cluster security admin role (root equivalence) would provide full snapshot configuration and management, lock control, and deletion rights:

Snapshot Privilege Permission
ISI_PRIV_SNAPSHOT_ALIAS Write
ISI_PRIV_SNAPSHOT_LOCKS Write
ISI_PRIV_SNAPSHOT_PENDING Write
ISI_PRIV_SNAPSHOT_RESTORE Write
ISI_PRIV_SNAPSHOT_SCHEDULES Write
ISI_PRIV_SNAPSHOT_SETTING Write
ISI_PRIV_SNAPSHOT_SNAPSHOTMANAGEMENT Write
ISI_PRIV_SNAPSHOT_SNAPSHOT_SUMMARY Write

Note that when configuring OneFS RBAC, remember to remove the 'ISI_PRIV_AUTH' and 'ISI_PRIV_ROLE' privileges from all but the most trusted administrators.

Additionally, enterprise security management tools such as CyberArk can also be incorporated to manage authentication and access control holistically across an environment. These can be configured to frequently change passwords on trusted accounts (i.e. every hour or so), require multi-level approvals prior to retrieving passwords, and track and audit password requests and trends.

While this article focuses exclusively on OneFS snapshots, the expanded use of RBAC granular privileges for enhanced security is germane to most key areas of cluster management and data protection, such as SyncIQ replication, etc.

  4. Snapshot replication

In addition to utilizing snapshots for its own checkpointing system, SyncIQ, the OneFS data replication engine, supports snapshot replication to a target cluster.

OneFS SyncIQ replication policies contain an option for triggering a replication policy when a snapshot of the source directory is completed. Additionally, at the onset of a new policy configuration, when the “Whenever a Snapshot of the Source Directory is Taken” option is selected, a checkbox appears to enable any existing snapshots in the source directory to be replicated. More information is available in this SyncIQ paper.
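
As an illustration, a snapshot-triggered replication policy might be created along the following lines. This is a hedged sketch only: the policy name, paths, and target host are hypothetical, and the 'when-snapshot-taken' schedule value is assumed to correspond to the WebUI option described above:

# isi sync policies create snap_repl sync /ifs/test target-cluster.yourdomain.com /ifs/test-dr --schedule when-snapshot-taken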

  5. Cyber-vaulting

File data is arguably the most difficult to protect, because:

  • It is the only type of data where potentially all employees have a direct connection to the storage (with other types of storage, access is via an application).
  • File data is linked (or mounted) to the operating system of the client. This means that it’s sufficient to gain file access to the OS to get access to potentially critical data.
  • Users are the most common breach point.

The Cyber Security Framework (CSF) from the National Institute of Standards and Technology (NIST) categorizes the process from threat identification through recovery:

Within the ‘Protect’ phase, there are two core aspects:

  • Applying all the core protection features available on the OneFS platform, namely:
Feature Description
Access control Where the core data protection functions are being executed. Assess who actually needs write access.
Immutability Having immutable snapshots, replica versions, etc. Augmenting the backup strategy with an archiving strategy using SmartLock WORM.
Encryption Encrypting both data in-flight and data at rest.
Anti-virus Integrating with anti-virus/anti-malware protection that does content inspection.
Security advisories Dell Security Advisories (DSA) inform about fixes to common vulnerabilities and exposures.

 

  • Data isolation provides a last resort copy of business critical data, and can be achieved by using an air gap to isolate the cyber vault copy of the data. The vault copy is logically separated from the production copy of the data. Data syncing happens only intermittently by closing the airgap after ensuring there are no known issues.

The combination of OneFS snapshots and SyncIQ replication allows for granular data recovery. This means that only the affected files are recovered, while the most recent changes are preserved for the unaffected data. While an on-prem air-gapped cyber vault can still provide secure network isolation, in the event of an attack, the ability to failover to a fully operational ‘clean slate’ remote site provides additional security and peace of mind.

We’ll explore PowerScale cyber protection and recovery in more depth in a future article.

OneFS SupportAssist Management and Troubleshooting

In this final article in the OneFS SupportAssist series, we turn our attention to management and troubleshooting.

Once the provisioning process described in the previous articles is complete, the 'isi supportassist settings view' CLI command reports the status and health of SupportAssist operations on the cluster.

# isi supportassist settings view

        Service enabled: Yes

       Connection State: enabled

      OneFS Software ID: xxxxxxxxxx

          Network Pools: subnet0:pool0

        Connection mode: direct

           Gateway host: -

           Gateway port: -

    Backup Gateway host: -

    Backup Gateway port: -

  Enable Remote Support: Yes

Automatic Case Creation: Yes

       Download enabled: Yes

This can also be obtained from the WebUI by navigating to Cluster management > General settings > SupportAssist:

There are some caveats and considerations to keep in mind when upgrading to OneFS 9.5 and enabling SupportAssist, including:

  • SupportAssist is disabled when STIG hardening is applied to the cluster
  • Using SupportAssist on a hardened cluster is not supported
  • Clusters with the OneFS network firewall enabled (‘isi network firewall settings’) may need to allow outbound traffic on port 9443.
  • SupportAssist is supported on a cluster that’s running in Compliance mode
  • Secure keys are held in Key manager under the RICE domain

Also, note that ESRS can no longer be used after SupportAssist has been provisioned on a cluster.

SupportAssist has a variety of components that gather and transmit various pieces of OneFS data and telemetry to Dell Support and backend services through the Embedded Service Enabler (ESE). These workflows include CELOG events; In-product activation (IPA) information; CloudIQ telemetry data; isi_gather_info (IGI) logsets; and provisioning, configuration, and authentication data to ESE and the various backend services.

Activity Information
Events and alerts SupportAssist can be configured to send CELOG events.
Diagnostics The OneFS isi diagnostics gather and isi_gather_info logfile collation and transmission commands have a SupportAssist option.
Healthchecks HealthCheck definitions are updated using SupportAssist.
License Activation The isi license activation start command uses SupportAssist to connect.
Remote Support Remote Support uses SupportAssist and the Connectivity Hub to assist customers with their clusters.
Telemetry CloudIQ telemetry data is sent using SupportAssist.

CELOG

Once SupportAssist is up and running, it can be configured to send CELOG events and attachments via ESE to CLM. This can be managed by the ‘isi event channels’ CLI command syntax. For example:

# isi event channels list

ID   Name                Type          Enabled

-----------------------------------------------

1    RemoteSupport       connectemc    No

2    Heartbeat Self-Test heartbeat     Yes

3    SupportAssist       supportassist No

-----------------------------------------------

Total: 3

# isi event channels view SupportAssist

     ID: 3

   Name: SupportAssist

   Type: supportassist

Enabled: No

Or from the WebUI:
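
To enable the SupportAssist channel from the CLI, syntax along the following lines can be used (assuming the 'isi event channels modify' command accepts an '--enabled' flag, consistent with the listing above):

# isi event channels modify SupportAssist --enabled true

# isi event channels view SupportAssist | grep -i Enabled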

CloudIQ Telemetry

In OneFS 9.5, SupportAssist provides an option to send telemetry data to CloudIQ. This can be enabled from the CLI as follows:

# isi supportassist telemetry modify --telemetry-enabled 1 --telemetry-persist 0

# isi supportassist telemetry view

        Telemetry Enabled: Yes

        Telemetry Persist: No

        Telemetry Threads: 8

Offline Collection Period: 7200

Or via the SupportAssist WebUI:

 

Diagnostics Gather

Also in OneFS 9.5, the 'isi diagnostics gather' and 'isi_gather_info' CLI commands both now include a '--supportassist' upload option for log gathers, which also allows them to continue to function when the cluster is unhealthy via a new 'Emergency mode'. For example, to start a gather from the CLI that will be uploaded via SupportAssist:

# isi diagnostics gather start --supportassist 1

Similarly, for the isi_gather_info utility:

# isi_gather_info --supportassist

Or, to explicitly avoid using SupportAssist for isi_gather_info log upload:

# isi_gather_info --nosupportassist

This can also be configured from the WebUI via Cluster management > General configuration > Diagnostics > Gather:

 

License Activation

A cluster’s product licenses can also be managed through SupportAssist in OneFS 9.5.

PowerScale License Activation (previously known as In-Product Activation) facilitates the management of the cluster’s entitlements and licenses by communicating directly with Software Licensing Central via SupportAssist.

To activate OneFS product licenses through the SupportAssist WebUI, navigate to Cluster management > Licensing. For example, on a new cluster without any signed licenses:

Click the button Update & Refresh in the License Activation section. In the ‘Activation File Wizard’, select the desired software modules.

Next select ‘Review changes’, review, click ‘Proceed’, and finally ‘Activate’.

Note that it can take up to 24 hours for the activation to occur.

Alternatively, cluster License activation codes (LAC) can also be added manually.

Troubleshooting

When it comes to troubleshooting SupportAssist, the basic process flow is as follows:

The OneFS components and services above are:

Component Info
ESE Embedded Service Enabler.
isi_rice_d Remote Information Connectivity Engine (RICE).
isi_crispies_d Coordinator for RICE Incidental Service Peripherals including ESE Start.
Gconfig OneFS centralized configuration infrastructure.
MCP Master Control Program – starts, monitors, and restarts OneFS services.
Tardis Configuration service and database.
Transaction journal Task manager for RICE.

 

Of these, ESE, isi_crispies_d, isi_rice_d, and the Transaction Journal are new in OneFS 9.5 and exclusive to SupportAssist. In contrast, Gconfig, MCP, and Tardis are all legacy services that are used by multiple other OneFS components.

For its connectivity, SupportAssist elects a single leader node within the subnet pool, and NANON nodes are automatically avoided. Ports 443 and 8443 are required to be open for bi-directional communication between the cluster and Connectivity Hub, and port 9443 is used for communicating with a gateway. The SupportAssist ESE component communicates with a number of Dell backend services, including SRS, Connectivity Hub, ELMS licensing, CloudIQ, etc.

As such, debugging backend issues may involve one or more services, and Dell Support can assist with this process.

The main log files for investigating and troubleshooting SupportAssist issues and idiosyncrasies are isi_rice_d.log and isi_crispies_d.log. There is also an ESE log, which can be useful too. These can be found at the following locations:

Component Logfile Location Info
Rice /var/log/isi_rice_d.log Per node
Crispies /var/log/isi_crispies_d.log Per node
ESE /ifs/.ifsvar/ese/var/log/ESE.log Cluster-wide for single-instance ESE

 

Debug level logging can be configured from the CLI as follows:

# isi_for_array isi_ilog -a isi_crispies_d --level=debug+

# isi_for_array isi_ilog -a isi_rice_d --level=debug+

Note that the OneFS log gathers (such as the output from the isi_gather_info utility) will capture all the above log files, plus the pertinent SupportAssist Gconfig contexts and Tardis namespaces, for later analysis.

If needed, the Rice and ESE configurations can also be viewed as follows:

# isi_gconfig -t ese

[root] {version:1}

ese.mode (char*) = direct

ese.connection_state (char*) = disabled

ese.enable_remote_support (bool) = true

ese.automatic_case_creation (bool) = true

ese.event_muted (bool) = false

ese.primary_contact.first_name (char*) =

ese.primary_contact.last_name (char*) =

ese.primary_contact.email (char*) =

ese.primary_contact.phone (char*) =

ese.primary_contact.language (char*) =

ese.secondary_contact.first_name (char*) =

ese.secondary_contact.last_name (char*) =

ese.secondary_contact.email (char*) =

ese.secondary_contact.phone (char*) =

ese.secondary_contact.language (char*) =

(empty dir ese.gateway_endpoints)

ese.defaultBackendType (char*) = srs

ese.ipAddress (char*) = 127.0.0.1

ese.useSSL (bool) = true

ese.srsPrefix (char*) = /esrs/{version}/devices

ese.directEndpointsUseProxy (bool) = false

ese.enableDataItemApi (bool) = true

ese.usingBuiltinConfig (bool) = false

ese.productFrontendPrefix (char*) = platform/16/supportassist

ese.productFrontendType (char*) = webrest

ese.contractVersion (char*) = 1.0

ese.systemMode (char*) = normal

ese.srsTransferType (char*) = ISILON-GW

ese.targetEnvironment (char*) = PROD

And for Rice:

# isi_gconfig -t rice

[root] {version:1}

rice.enabled (bool) = false

rice.ese_provisioned (bool) = false

rice.hardware_key_present (bool) = false

rice.supportassist_dismissed (bool) = false

rice.eligible_lnns (char*) = []

rice.instance_swid (char*) =

rice.task_prune_interval (int) = 86400

rice.last_task_prune_time (uint) = 0

rice.event_prune_max_items (int) = 100

rice.event_prune_days_to_keep (int) = 30

rice.jnl_tasks_prune_max_items (int) = 100

rice.jnl_tasks_prune_days_to_keep (int) = 30

rice.config_reserved_workers (int) = 1

rice.event_reserved_workers (int) = 1

rice.telemetry_reserved_workers (int) = 1

rice.license_reserved_workers (int) = 1

rice.log_reserved_workers (int) = 1

rice.download_reserved_workers (int) = 1

rice.misc_task_workers (int) = 3

rice.accepted_terms (bool) = false

(empty dir rice.network_pools)

rice.telemetry_enabled (bool) = true

rice.telemetry_persist (bool) = false

rice.telemetry_threads (uint) = 8

rice.enable_download (bool) = true

rice.init_performed (bool) = false

rice.ese_disconnect_alert_timeout (int) = 14400

rice.offline_collection_period (uint) = 7200

The ‘-q’ flag can also be used in conjunction with the isi_gconfig command to identify any values that are not at their default settings. For example, the stock (default) Rice gconfig context will not report any configuration entries:

# isi_gconfig -q -t rice

[root] {version:1}

OneFS SupportAssist Provisioning – Part 2

In the previous article in this OneFS SupportAssist series, we reviewed the off-cluster prerequisites for enabling OneFS SupportAssist:

  1. Upgrading the cluster to OneFS 9.5.
  2. Obtaining the secure access key and PIN.
  3. Selecting either direct connectivity or gateway connectivity.
  4. If using gateway connectivity, installing Secure Connect Gateway v5.x.

In this article, we turn our attention to step 5 – provisioning SupportAssist on the cluster.

Note that, as part of this process, we’ll be using the access key and PIN credentials previously obtained from the Dell Support portal in step 2 above.

Provisioning SupportAssist on a cluster

SupportAssist can be configured from the OneFS 9.5 WebUI by navigating to ‘Cluster management > General settings > SupportAssist’. To initiate the provisioning process on a cluster, click on the ‘Connect SupportAssist’ link, as below:

Note that if SupportAssist is unconfigured, the Remote Support page displays the following banner warning of the future deprecation of SRS:

Similarly, when unconfigured, the SupportAssist WebUI page also displays verbiage recommending the adoption of SupportAssist:

There is also a ‘Connect SupportAssist’ button to begin the provisioning process.

  1. Accepting the telemetry notice.

Selecting the ‘Configure SupportAssist’ button initiates the following setup wizard. The first step requires checking and accepting the Infrastructure Telemetry Notice:

 

  2. Support Contacts.

For the next step, enter the details for the primary support contact, as prompted:

Or from the CLI using the ‘isi supportassist contacts’ command set. For example:

# isi supportassist contacts modify --primary-first-name=Nick --primary-last-name=Trimbee --primary-email=trimbn@isilon.com

 

  3. Establish Connections.

Next, complete the ‘Establish Connections’ page.

This involves the following steps:

  • Selecting the network pool(s).
  • Adding the secure access key and PIN.
  • Configuring either direct or gateway access.
  • Selecting whether to allow remote support, CloudIQ telemetry, and auto case creation.

a.  Select network pool(s).

At least one statically-allocated IPv4 network subnet and pool is required for provisioning SupportAssist. As of OneFS 9.5, SupportAssist does not support IPv6 networking for remote connectivity; however, IPv6 support is planned for a future release.
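
The candidate pools, and whether they use static IP allocation, can be confirmed from the CLI beforehand. For example (a quick sketch; the pool name below is illustrative and will vary by cluster):

# isi network pools list

# isi network pools view groupnet0.subnet0.pool0 | grep -i alloc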

Select one or more network pools or subnets from the options displayed. For example, in this case ‘subnet0.pool0’:

Or, from the CLI, select one or more static subnets/pools for outbound communication via the following syntax:

# isi supportassist settings modify --network-pools="subnet0.pool0"

Additionally, if the cluster has the OneFS 9.5 network firewall enabled (‘isi network firewall settings’), ensure that outbound traffic is allowed on port 9443.
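
For example, a quick way to check the firewall state and look for a rule covering that port (a sketch; rule and policy names will vary):

# isi network firewall settings view

# isi network firewall rules list | grep -i 9443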

b.  Add secure access key and PIN.

In this next step, add the secure access key and PIN. These should have been obtained in an earlier step of the provisioning procedure from the following Dell Support site: https://www.dell.com/support/connectivity/product/isilon-onefs.

Alternatively, if configuring SupportAssist via the OneFS CLI, add the key and PIN via the following syntax:

# isi supportassist provision start --access-key <key> --pin <pin>

 

c.  Configure access.

  i.  Direct access.

From the WebUI, under ‘Cluster management > General settings > SupportAssist’ select the ‘Connect directly’ button:

Or from the CLI. For example, to configure direct access (the default), ensure the following parameter is set:

# isi supportassist settings modify --connection-mode direct

# isi supportassist settings view | grep -i "connection mode"

        Connection mode: direct

  ii.  Gateway access.

Alternatively, to connect via a gateway, select the ‘Connect via Secure Connect Gateway’ button:

Complete the ‘gateway host’ and ‘gateway port’ fields as appropriate for the environment.

Alternatively, to set up a gateway configuration from the CLI, use the ‘isi supportassist settings modify’ syntax. For example, to configure using the gateway FQDN ‘secure-connect-gateway.yourdomain.com’ and the default port ‘9443’:

# isi supportassist settings modify --connection-mode gateway

# isi supportassist settings view | grep -i "connection mode"

        Connection mode: gateway

# isi supportassist settings modify --gateway-host secure-connect-gateway.yourdomain.com --gateway-port 9443

When setting up the gateway connectivity option, Secure Connect Gateway v5.0 or later must be deployed within the data center. Note that SupportAssist is incompatible with either ESRS gateway v3.52 or SAE gateway v4. However, Secure Connect Gateway v5.x is backwards compatible with PowerScale OneFS ESRS, which allows the gateway to be provisioned and configured ahead of a cluster upgrade to OneFS 9.5.

 

d.  Configure support options.

Finally, configure the desired support options:

When complete, the WebUI will confirm that SupportAssist is successfully configured and enabled, as follows:

Or from the CLI:

# isi supportassist settings view

        Service enabled: Yes

       Connection State: enabled

      OneFS Software ID: ELMISL0223BJJC

          Network Pools: subnet0.pool0, subnet0.testpool1, subnet0.testpool2, subnet0.testpool3, subnet0.testpool4

        Connection mode: gateway

           Gateway host: eng-sea-scgv5stg3.west.isilon.com

           Gateway port: 9443

    Backup Gateway host: eng-sea-scgv5stg.west.isilon.com

    Backup Gateway port: 9443

  Enable Remote Support: Yes

Automatic Case Creation: Yes

       Download enabled: Yes

OneFS SupportAssist Provisioning – Part 1

In OneFS 9.5, several OneFS components also now leverage SupportAssist as their secure off-cluster data retrieval and communication channel. These include:

Component           Details
------------------  ----------------------------------------------------------------------
Events and Alerts   SupportAssist can send CELOG events and attachments via ESE to CLM.
Diagnostics         Logfile gathers can be uploaded to Dell via SupportAssist.
License activation  License activation uses SupportAssist for the ‘isi license activation start’ CLI command.
Telemetry           Telemetry is sent through SupportAssist to CloudIQ for analytics.
Health check        Health check definition downloads will now leverage SupportAssist.
Remote Support      Remote Support now uses SupportAssist along with Connectivity Hub.

For existing clusters, SupportAssist supports the same basic workflows as its predecessor, ESRS. This makes the transition from old to new generally pretty seamless.

As such, the overall process for enabling OneFS SupportAssist is as follows:

  1. Upgrade the cluster to OneFS 9.5.
  2. Obtain the secure access key and PIN.
  3. Select either direct connectivity or gateway connectivity.
  4. If using gateway connectivity, install Secure Connect Gateway v5.x.
  5. Provision SupportAssist on the cluster.

We’ll go through each of the configuration steps above in order:

  1. Upgrading to OneFS 9.5.

First, the cluster must be running OneFS 9.5 in order to configure SupportAssist.

There are some additional considerations and caveats to bear in mind when upgrading to OneFS 9.5 and planning on enabling SupportAssist. These include the following:

  • SupportAssist is disabled when STIG hardening is applied to the cluster.
  • Using SupportAssist on a hardened cluster is not supported.
  • Clusters with the OneFS network firewall enabled (‘isi network firewall settings’) may need to allow outbound traffic on ports 443 and 8443, plus 9443 if gateway (SCG) connectivity is configured.
  • SupportAssist is supported on a cluster that’s running in Compliance mode.

If upgrading from an earlier release, the OneFS 9.5 upgrade must be committed before SupportAssist can be provisioned.
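
For example, to confirm the cluster’s current release and, where applicable, commit a completed upgrade (the commit step only applies if an upgrade has finished but not yet been committed):

# isi version

# isi upgrade cluster commit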

Also, ensure that the user account that will be used to enable SupportAssist belongs to a role with the ‘ISI_PRIV_REMOTE_SUPPORT’ read and write privilege:

# isi auth privileges | grep REMOTE
ISI_PRIV_REMOTE_SUPPORT                           Configure remote support

For example, the ‘ese’ user account below:

# isi auth roles view SupportAssistRole
       Name: SupportAssistRole
Description: -
    Members: ese
 Privileges
             ID: ISI_PRIV_LOGIN_PAPI
     Permission: r
             ID: ISI_PRIV_REMOTE_SUPPORT
     Permission: w
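
To create a comparable role from the CLI, something along the following lines can be used (a sketch; the exact privilege flags may vary slightly between OneFS releases):

# isi auth roles create SupportAssistRole

# isi auth roles modify SupportAssistRole --add-user ese --add-priv ISI_PRIV_LOGIN_PAPI --add-priv ISI_PRIV_REMOTE_SUPPORT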

 

  2. Obtaining the secure access key and PIN.

An access key and PIN are required in order to provision SupportAssist, and these secure keys are held in Key manager under the RICE domain. The access key and PIN can be obtained from the following Dell Support site: https://www.dell.com/support/connectivity/product/isilon-onefs.

In the Quick link navigation bar, select the ‘Generate Access key’ link:

On the following page, select the appropriate button:

The credentials required to obtain an access key and pin vary depending on prior cluster configuration. Sites that have previously provisioned ESRS will need their OneFS Software ID (SWID) to obtain their access key and pin.

The ‘isi license list’ CLI command can be used to determine a cluster’s SWID. For example:

# isi license list | grep "OneFS Software ID"

OneFS Software ID: ELMISL999CKKD

However, customers with new clusters, or who have not previously provisioned ESRS or SupportAssist, will require their Site ID in order to obtain the access key and PIN.

Note that any new cluster hardware shipping after January 2023 will already have a built-in key, so this key can be used in place of the Site ID above.

For example, if this is the first time registering this cluster and it does not have a built-in key, select ‘Yes, let’s register’:

Enter the Site ID, site name, and location information for the cluster:

Choose a 4-digit PIN and save it for future reference. After that, click the ‘Create My Access Key’ button:

Next, the access key is generated:

An automated email is sent from the ‘Dell | Services Connectivity Team’ containing the pertinent key info. For example:

Note that this access key is valid for one week, after which it automatically expires.

 

  3. Direct or gateway topology decision.

A topology decision will need to be made between implementing either direct connectivity or gateway connectivity, depending on the needs of the environment:

  • Direct Connect:

  • Gateway Connect:

SupportAssist uses ports 443 and 8443 by default for bi-directional communication between the cluster and Connectivity Hub. As such, these ports will need to be open across any firewalls or packet filters between the cluster and the corporate network edge to allow connectivity to Dell Support.

Additionally, port 9443 is used for communicating with a gateway (SCG).

# grep -i esrs /etc/services

isi_esrs_d      9443/tcp  #EMC Secure Remote Support outbound alerts
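
A simple TCP reachability test from a cluster node can also help verify that these ports are open end-to-end. For example, checking port 9443 against a placeholder gateway FQDN (assuming the nc utility is available, and substituting the actual gateway hostname for the environment):

# nc -z -w 5 secure-connect-gateway.yourdomain.com 9443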

 

  4. Optional Secure Connect Gateway installation.

This step is only required when deploying a Secure Connect gateway. If a direct connect topology is desired, go directly to step 5 below.

When configuring SupportAssist with the gateway connectivity option, Secure Connect Gateway v5.0 or later must be deployed within the data center.

Dell Secure Connect Gateway (SCG) is available for Linux, Windows, Hyper-V, and VMware environments, and as of writing, the latest version is 5.14.00.16. The installation binaries can be downloaded from: https://www.dell.com/support/home/en-us/product-support/product/secure-connect-gateway/drivers

The procedure to download SCG is as follows:

  1. Sign in to www.dell.com/SCG-App. The Secure Connect Gateway – Application Edition page is displayed. If you have issues signing in using your business account, or are unable to access the page even after signing in, contact Dell Administrative Support.
  2. In the Quick links section, click Generate Access key.
  3. On the Generate Access Key page, perform the following steps:
  4. Select a site ID, site name, or site location.
  5. Enter a four-digit PIN and click Generate key. An access key is generated and sent to your email address. NOTE: The access key and PIN must be used within seven days and cannot be used to register multiple instances of secure connect gateway.
  6. Click Done.
  7. On the Secure Connect Gateway – Application Edition page, click the Drivers & Downloads tab.
  8. Search and select the required version.
  9. In the ACTION column, click Download.

The following steps are required in order to set up SCG:

https://dl.dell.com/content/docu105633_secure-connect-gateway-application-edition-quick-setup-guide.pdf?language=en-us

Pertinent resources for installing SCG include:

User’s guide, for system and network requirements, steps to create a business account, and installation instructions: https://www.dell.com/SCG-App-docs

Support matrix, for supported devices, protocols, firmware versions, and operating systems: https://www.dell.com/SCG-App-docs

The SCG dashboard page displays a variety of device and network information, status, and metrics. For example:

Another useful source of SCG installation, configuration, and troubleshooting information is the Dell support forum: https://www.dell.com/community/Secure-Connect-Gateway/bd-p/SCG

 

  5. Provisioning SupportAssist on the cluster.

At this point, the off-cluster pre-staging work should be complete.

In the next article in this series, we turn our attention to the SupportAssist provisioning process on the cluster itself (step 5).

OneFS SupportAssist Architecture and Operation

The previous article in this series looked at an overview of OneFS SupportAssist. Now, we’ll turn our attention to its core architecture and operation.

Under the hood, SupportAssist relies on the following infrastructure and services:

Service              Name
-------------------  ------------------------------------------------------------------------
ESE                  Embedded Service Enabler.
isi_rice_d           Remote Information Connectivity Engine (RICE).
isi_crispies_d       Coordinator for RICE Incidental Service Peripherals including ESE Start.
Gconfig              OneFS centralized configuration infrastructure.
MCP                  Master Control Program – starts, monitors, and restarts OneFS services.
Tardis               Configuration service and database.
Transaction journal  Task manager for RICE.

Of these, ESE, isi_crispies_d, isi_rice_d, and the Transaction Journal are new in OneFS 9.5 and exclusive to SupportAssist. In contrast, Gconfig, MCP, and Tardis are all legacy services that are used by multiple other OneFS components.

The Remote Information Connectivity Engine (RICE) represents the new SupportAssist ecosystem for OneFS to connect to the Dell backend, and the high level architecture is as follows:

Dell’s Embedded Service Enabler (ESE) is at the core of the connectivity platform and acts as a unified communications broker between the PowerScale cluster and Dell Support. ESE runs as a OneFS service and, on startup, looks for an on-premises gateway server. If none is found, it connects back to the connectivity pipe (SRS). The collector service then interacts with ESE to send telemetry, obtain upgrade packages, transmit alerts and events, etc.

Depending on the available resources, ESE provides base functionality with additional optional capabilities to enhance serviceability. ESE is multithreaded, and each payload type is handled by specific threads. For example, events are handled by event threads, binary and structured payloads are handled by web threads, etc. Within OneFS, ESE is installed to /usr/local/ese and runs as the ‘ese’ user and group.
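
This can be confirmed directly on a node, for example by checking the install directory and the running ESE processes (output will vary):

# ls -ld /usr/local/ese

# ps -auxw | grep /usr/local/ese | grep -v grep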

The responsibilities of isi_rice_d include listening for network changes, getting eligible nodes elected for communication, monitoring notifications from CRISPIES, and engaging Task Manager when ESE is ready to go.

The Task Manager is a core component of the RICE engine. Its responsibility is to watch the incoming tasks that are placed into the journal and assign workers to step through the task state machine until completion. It controls resource utilization (Python threads) and distributes waiting tasks on a priority basis.

The ‘isi_crispies_d’ service exists to ensure that ESE is only running on the RICE active node, and nowhere else. It acts, in effect, like a specialized MCP just for ESE and RICE-associated services, such as IPA. This entails starting ESE on the RICE active node, re-starting it if it crashes on the RICE active node, and stopping it and restarting it on the appropriate node if the RICE active instance moves to another node. We are using ‘isi_crispies_d’ for this, and not MCP, because MCP does not support a service running on only one node at a time.

The core responsibilities of ‘isi_crispies_d’ include:

  • Starting and stopping ESE on the RICE active node.
  • Monitoring ESE and restarting it, if necessary. ‘isi_crispies_d’ restarts ESE on the node if it crashes, retrying a couple of times and then notifying RICE if it is unable to start ESE.
  • Listening for gconfig changes and updating ESE; stopping ESE and notifying RICE if unable to make a change.
  • Monitoring other related services.

The state of ESE, and of other RICE service peripherals, is stored in the OneFS tardis configuration database so that it can be checked by RICE. Similarly, ‘isi_crispies_d’ monitors the OneFS Tardis configuration database to see which node is designated as the RICE ‘active’ node.

The ‘isi_telemetry_d’ daemon is started by MCP and runs when SupportAssist is enabled. It does not have to be running on the same node as the active RICE and ESE instance. Only one instance of ‘isi_telemetry_d’ will be active at any time, and the other nodes will be waiting for the lock.
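
To see which node is currently running the telemetry daemon, the process table can be checked cluster-wide. For example, using the isi_for_array utility (output will vary):

# isi_for_array -s 'ps -auxw | grep isi_telemetry_d | grep -v grep'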

The current status and setup of SupportAssist on a PowerScale cluster can be queried via the ‘isi supportassist settings view’ CLI command. For example:

# isi supportassist settings view

        Service enabled: Yes

       Connection State: enabled

      OneFS Software ID: ELMISL08224764

          Network Pools: subnet0:pool0

        Connection mode: direct

           Gateway host: -

           Gateway port: -

    Backup Gateway host: -

    Backup Gateway port: -

  Enable Remote Support: Yes

Automatic Case Creation: Yes

       Download enabled: Yes

This can also be obtained from the WebUI by navigating to Cluster management > General settings > SupportAssist:

SupportAssist can be enabled and disabled via the ‘isi services’ CLI command set. For example:

# isi services isi_supportassist disable

The service 'isi_supportassist' has been disabled.

# isi services isi_supportassist enable

The service 'isi_supportassist' has been enabled.

# isi services -a | grep supportassist

   isi_supportassist    SupportAssist Monitor                    Enabled

The core services can be checked as follows:

# ps -auxw | grep -e 'rice' -e 'crispies' | grep -v grep

root    8348    9.4  0.0 109844  60984  -  Ss   22:14        0:00.06 /usr/libexec/isilon/isi_crispies_d /usr/bin/isi_crispies_d

root    8183    8.8  0.0 108060  64396  -  Ss   22:14        0:01.58 /usr/libexec/isilon/isi_rice_d /usr/bin/isi_rice_d

Note that once a cluster is provisioned with SupportAssist, ESRS can no longer be used. However, customers that have not previously connected their clusters to Dell Support may still provision ESRS, but will be presented with a message encouraging them to adopt the best practice of using SupportAssist.

Additionally, SupportAssist in OneFS 9.5 does not currently support IPv6 networking, so clusters deployed in IPv6 environments should continue to use ESRS until SupportAssist IPv6 integration is introduced in a future OneFS release.

OneFS SupportAssist

Amongst the myriad of new features that are introduced in the OneFS 9.5 release is SupportAssist, Dell’s next-gen remote connectivity system.

Dell SupportAssist helps rapidly identify, diagnose, and resolve cluster issues, and provides the following key benefits:

  • Improve productivity by replacing manual routines with automated support.
  • Accelerate resolution, or avoid issues completely, with predictive issue detection and proactive remediation.
  • SupportAssist is included with all support plans (features vary based on service level agreement).

Within OneFS, SupportAssist is intended for transmitting events, logs, and telemetry from PowerScale to Dell support. As such, it provides a full replacement for the legacy ESRS.

Delivering a consistent remote support experience across the Dell storage portfolio, SupportAssist is intended for all sites that can send telemetry off-cluster to Dell over the internet. SupportAssist integrates the Dell Embedded Service Enabler (ESE) into PowerScale OneFS along with a suite of daemons to allow its use on a distributed system.

SupportAssist                                              ESRS
---------------------------------------------------------  ----------------------------------------------
Dell’s next generation remote connectivity solution.       Being phased out of service.
Can either connect directly, or via supporting gateways.   Can only use gateways for remote connectivity.
Uses Connectivity Hub to coordinate support.               Uses ServiceLink to coordinate support.
Requires access key and pin, or hardware key, to enable.   Uses customer username and password to enable.

SupportAssist uses Dell Connectivity Hub and can either interact directly, or through a Secure Connect gateway.

SupportAssist comprises a variety of components that gather and transmit various pieces of OneFS data and telemetry to Dell Support, via the Embedded Service Enabler (ESE).  These workflows include CELOG events, In-product activation (IPA) information, CloudIQ telemetry data, Isi-Gather-info (IGI) logsets, and provisioning, configuration and authentication data to ESE and the various backend services.

Operation           Details
------------------  --------------------------------------------------------------------------
Event Notification  In OneFS 9.5, SupportAssist can be configured to send CELOG events and attachments via ESE to CLM. CELOG has a ‘supportassist’ channel that, when active, will create an EVENT task for SupportAssist to propagate.
License Activation  The ‘isi license activation start’ command uses SupportAssist to connect.
Provisioning        SupportAssist must register with backend services in a process known as provisioning. This process must be executed before the Embedded Service Enabler (ESE) will respond on any of its other available API endpoints. Provisioning can only successfully occur once per installation, and subsequent provisioning tasks will fail. SupportAssist must be configured via the CLI or WebUI before provisioning. The provisioning process uses authentication information that was stored in the key manager upon the first boot.
Diagnostics         The OneFS ‘isi diagnostics gather’ and ‘isi_gather_info’ logfile collation and transmission commands have a --supportassist option.
Healthchecks        HealthCheck definitions are updated using SupportAssist.
Telemetry           CloudIQ telemetry data is sent using SupportAssist.
Remote Support      Remote Support uses SupportAssist and the Connectivity Hub to assist customers with their clusters.

Several pieces of PowerScale and OneFS functionality require licenses, and must register and communicate with the Dell backend services in order to activate those cluster licenses. In OneFS 9.5, SupportAssist is the preferred mechanism to send those license activations via the Embedded Service Enabler (ESE) to the Dell backend. License information can be generated via the ‘isi license generate’ CLI command, and then activated via the ‘isi license activation start’ syntax.

SupportAssist requires an access key and PIN, or hardware key, in order to be enabled, with most customers likely using the access key and pin method. Secure keys are held in Key manager under the RICE domain.

In addition to the transmission of data from the cluster to Dell, Connectivity Hub also allows inbound remote support sessions to be established for remote cluster troubleshooting.

In the next article in this series, we’ll take a deeper look at the SupportAssist architecture and operation.

OneFS SmartQoS Monitoring and Troubleshooting

The previous articles in this series have covered the SmartQoS architecture, configuration, and management. Now, we’ll turn our attention to monitoring and troubleshooting.

The ‘isi statistics workload’ CLI command can be used to monitor the dataset’s performance. The ‘Ops’ column displays the current protocol operations per second. In the following example, Ops stabilize at around 9.8, which is just below the configured limit value of 10 Ops.

# isi statistics workload --dataset ds1

Similarly, this next example from the SmartQoS WebUI shows a small NFS workflow performing 497 protocol OPS in a pinned workload with a limit of 500 OPS:

Multiple paths and protocols can be pinned by selecting the ‘Pin Workload’ option for a given dataset. Here, four directory path workloads are each configured with different protocol Ops limits:

When it comes to troubleshooting SmartQoS, there are a few areas that are worth checking right away, including the SmartQoS Ops limit configuration, isi_pp_d and isi_stats_d daemons, and the protocol service(s).

  1. For suspected Ops limit configuration issues, first confirm that the SmartQoS limits feature is enabled:
# isi performance settings view
Top N Collections: 1024
Time In Queue Threshold (ms): 10.0
Target read latency in microseconds: 12000.0
Target write latency in microseconds: 12000.0
Protocol Ops Limit Enabled: Yes

Next, verify that the workload-level protocol Ops limit is correctly configured:

# isi performance workloads view <workload>

Check whether any errors are reported in the isi_tardis_d configuration log:

# cat /var/log/isi_tardis_d.log
  2. To investigate isi_pp_d, first check that the service is enabled:
# isi services -a isi_pp_d

Service 'isi_pp_d' is enabled.

If necessary, the isi_pp_d service can be restarted as follows:

# isi services isi_pp_d disable

Service 'isi_pp_d' is disabled.

# isi services isi_pp_d enable

Service 'isi_pp_d' is enabled.

There’s also an isi_pp_d debug tool, which can be helpful in a pinch:

# isi_pp_d -h

Usage: isi_pp_d [-ldhs]

-l Run as a leader process; otherwise, run as a follower. Only one leader process on the cluster will be active.

-d Run in debug mode (do not daemonize).

-s Display pp_leader node (devid and lnn)

-h Display this help.
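
For example, the ‘-s’ flag reports which node is currently acting as the pp_leader:

# isi_pp_d -s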

Debugging can be enabled on the isi_pp_d log file with the following command syntax:

# isi_ilog -a isi_pp_d -l debug, /var/log/isi_pp_d.log

For example, the following log snippet shows a typical isi_pp_d.log message communication between the isi_pp_d leader and isi_pp_d followers:

/ifs/.ifsvar/modules/pp/comm/SETTINGS

[090500b000000b80,08020000:0000bfddffffffff,09000100:ffbcff7cbb9779de,09000100:d8d2fee9ff9e3bfe,09000100:0000000075f0dfdf]

100,,,,20,1658854839  < in the format of <workload_id, cputime, disk_reads, disk_writes, protocol_ops, timestamp>

Here, extracts from the /var/log/isi_pp_d.log logfiles on nodes 1 and 2 of a cluster illustrate the different stages of protocol Ops limit enforcement and usage:

  3. To investigate isi_stats_d, first confirm that the isi_stats_d service is enabled:
# isi services -a isi_stats_d
Service 'isi_stats_d' is enabled.

If necessary, the isi_stats_d service can be restarted as follows:

# isi services isi_stats_d disable

# isi services isi_stats_d enable

The workload level statistics can be viewed with the following command:

# isi statistics workload list --dataset=<name>

Debugging can be enabled on the isi_stats_d log file with the following command syntax:

# isi_stats_tool --action set_tracelevel --value debug

# cat /var/log/isi_stats_d.log
  4. To investigate protocol issues, the ‘isi services’ and ‘lwsm’ CLI commands can be useful. For example, to check the status of the S3 protocol:
# /usr/likewise/bin/lwsm list | grep -i protocol
hdfs                       [protocol]    stopped
lwswift                    [protocol]    running (lwswift: 8393)
nfs                        [protocol]    running (nfs: 8396)
s3                         [protocol]    stopped
srv                        [protocol]    running (lwio: 8096)

# /usr/likewise/bin/lwsm status s3
stopped

# /usr/likewise/bin/lwsm info s3
Service: s3
Description: S3 Server
Categories: protocol
Path: /usr/likewise/lib/lw-svcm/s3.so
Arguments:
Dependencies: lsass onefs_s3 AuditEnabled?flt_audit_s3
Container: s3

The above CLI output confirms that the S3 protocol is inactive. First, verify that the S3 service is enabled:

# isi services -a | grep -i s3
s3                   S3 Service                               Enabled

If so, the S3 service can be restarted as follows:

# /usr/likewise/bin/lwsm restart s3
Stopping service: s3
Starting service: s3

To investigate further, the protocol’s log level verbosity can be increased. For example, to set the S3 log to ‘debug’:

# isi s3 log-level view
Current logging level is 'info'

# isi s3 log-level modify debug

# isi s3 log-level view
Current logging level is 'debug'

Next, view and monitor the appropriate protocol log. For example, for the S3 protocol:

# cat /var/log/s3.log

# tail -f /var/log/s3.log

Beyond the above, /var/log/messages can also be monitored for pertinent errors, since the main partition performance (PP) modules log to this file. Debug-level logging can be enabled for the various PP modules as follows:

Dataset:

# sysctl ilog.ifs.acct.raa.syslog=debug+ 
ilog.ifs.acct.raa.syslog: error,warning,notice (inherited) -> error,warning,notice,info,debug

Workload:

# sysctl ilog.ifs.acct.rat.syslog=debug+
ilog.ifs.acct.rat.syslog: error,warning,notice (inherited) -> error,warning,notice,info,debug

Actor work:

# sysctl ilog.ifs.acct.work.syslog=debug+
ilog.ifs.acct.work.syslog: error,warning,notice (inherited) -> error,warning,notice,info,debug
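
With the debug sysctls above set, /var/log/messages can then be followed for the relevant partition performance output. For example (adjust the filter pattern as appropriate):

# tail -f /var/log/messages | grep -i -e acct -e workload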

When finished, the default logging levels for the above modules can be restored as follows:

# sysctl ilog.ifs.acct.raa.syslog=notice+

# sysctl ilog.ifs.acct.rat.syslog=notice+

# sysctl ilog.ifs.acct.work.syslog=notice+