OneFS SmartFail

OneFS protects data stored on failing nodes or drives in a cluster through a process called Smartfail. During the process, OneFS places a device into quarantine and, depending on the severity of the issue, the data on it into a read-only state. While a device is quarantined, OneFS reprotects the data on the device by distributing the data to other devices.

After all data eviction or reconstruction is complete, OneFS logically removes the device from the cluster and the node or drive can be physically replaced. OneFS only automatically Smartfails devices as a last resort. Nodes and/or drives can also be manually Smartfailed. However, it is strongly recommended to first consult Dell Technical Support.

Occasionally a device might fail before OneFS detects a problem. If a drive fails without being Smartfailed, OneFS automatically starts rebuilding the data to available free space on the cluster. However, because a node might recover from a transient issue, if a node fails, OneFS does not start rebuilding data unless it is logically removed from the cluster.

A node that is unavailable and reported by isi status as ‘D’, or down, can be Smartfailed. If the node is hard down, likely with a significant hardware issue, the Smartfail process will take longer because data has to be recalculated from the FEC protection parity blocks. That said, it’s well worth attempting to bring the node up if at all possible – especially if the cluster and/or node pools is at the default +2D:1N protection.  The concern here is that, with a node down, there is a risk of data loss if a drive or other component goes bad during the Smartfail process.

If possible, and assuming the disk content is still intact, it can often be quicker to have the node hardware repaired. In this case, the entire node’s chassis (or compute module in the case of Gen 6 hardware) could be replaced and the old disks inserted with original content on them. This should only be performed at the recommendation and under the supervision of Dell Technical support. If the node is down as a result of a journal inconsistency, it will have to be Smartfailed out. In this case,  Support should be engaged to determine an appropriate action plan.

The recommended procedure for Smartfailing a node is as follows. In this example, we’ll assume that node 4 is down:

  1. From the CLI of any node except node 4, run the following command to Smartfail out the node:
# isi devices node smartfail --node-lnn 4
  1. Verify that node is removed from the cluster.
# isi status –q

(An ‘—S-’ will appear in node 4’s ‘Health’ column to indicate it has been Smartfailed).

  1. Monitor the successful completion of the job engine’s MultiScan, FlexProtect/FlexProtectLIN jobs:
# isi job status
  1. Un-cable and remove the node from the rack for disposal

As mentioned above, there are two primary Job Engine jobs that run as a result of a Smartfail:

  • MultiScan
  • FlexProtect or FlexProtectLIN

MultiScan performs the work of the both AutoBalance and Collect jobs simultaneously, and it is triggered after every group change. The reason is that new file layouts and file deletions that happen during a disruption to the cluster may be imperfectly balanced or, in the case of deletions, simply lost.

The Collect job reclaims free space from previously unavailable nodes or drives. A mark and sweep garbage collector, it identifies everything valid on the filesystem in the first phase, then in the second phase scans the drives freeing anything that isn’t marked valid.

AutoBalance ensures that, when node and drive usage across the cluster are out of balance. This job scans through all the drives looking for files to re-layout to make use of the less filled devices.

The purpose of the FlexProtect job is to scan the file system after a device failure to ensure that all files remain protected. Incomplete protection levels are fixed, in addition to missing data or parity blocks caused by drive or node failures. This job is started automatically after Smartfailing a drive or node. If a Smartfailed device was the reason the job started, the device is marked gone (completely removed from the cluster) at the end of the job.

Although a new node can be added to a cluster at any time, it’s best to avoid major group changes during a Smartfail operation. This helps avoid any unnecessary interruptions of a critical job engine data reprotection job. However, since a node is down, there is a window of risk while the cluster is rebuilding the data from that cluster.  Under pressing circumstances the Smartfail operation can be paused, the node added, and then Smartfail resumed once the new node has happily joined the cluster.

Be aware that, if the node you are adding is the same node that was Smartfailed, the cluster maintains a record of that node and may prevent the re-introduction of that node until the Smartfail is complete.  To mitigate risk, Dell Technical Support should definitely be involved to ensure data integrity.

The time for a Smartfail to complete is hard to predict with any accuracy, and is dependent on:

Attribute Description
OneFS release Determines OneFS job engine version and how efficiently it operates.
System hardware Drive types, CPU, RAM, etc.
File system Quantity and type of data (ie. small vs large files), protection, tunables, etc.
Cluster load Processor and CPU utilization, capacity utilization, etc.

Typical Smartfail runtimes range from minutes for fairly empty, idle nodes with SSD and SAS drives to days for nodes with large SATA drives and a high capacity utilization. The FlexProtect job already runs at the highest job engine priority (value=1) and medium impact by default. As such, there isn’t much that can be done to speed up this job, beyond reducing cluster load.

Smartfail is also a valuable tool for proactive cluster node replacement, for example during a hardware refresh.  Provided cluster quorum is not broken, a Smartfail can be initiated on multiple nodes concurrently – but never more than n/2 – 1 nodes (rounded up)!

If replacing an entire node-pool as part of a tech refresh, a SmartPools filepool policy can be crafted to migrate the data to another nodepool across the back-end network. When complete, the nodes can then be Smartfailed out, which should progress swiftly since they are now empty.

If multiple nodes are Smartfailed simultaneously, at the final stage of the process the node remove is serialized with around 60 seconds pause between each. The Smartfail job places the selected nodes in read-only mode while it copies the protection stripes to the cluster’s free space. Using SmartPools to evacuate data from a node or set of nodes in preparation to remove them is generally a good idea, and is usually a relatively fast process.

SmartPools’ Virtual Hot Spare (VHS) functionality helps ensure that node pools maintain enough free space to successfully re-protect data in the event of a Smartfail. Though configured globally, VHS actually operates at the node pool level so that nodes with different size drives reserve the appropriate VHS space. This helps ensures that, while data may move from one disk pool to another during repair, it remains on the same class of storage. VHS reservations are cluster wide and configurable as either a percentage of total storage (0-20%) or as a number of virtual drives (1-4), with the default being 10%.

Note, a Smartfail is not guaranteed to remove all data on a node. Any pool in a cluster that’s flagged with the ‘System’ flag can store /ifs/.ifsvar data. A filepool policy to move the regular data won’t address this data. Also, since SmartPools ‘spillover’ may have occurred at some point, there are no guarantees that an ‘empty’ node is completely devoid of data. For this reason, OneFS still has to search the tree for files that may have blocks residing on the node.

Nodes can be easily Smartfailed via the OneFS WebUI by navigating to Cluster Management > Hardware Configuration and selecting ‘Actions > More > Smartfail Node’ for the desired node(s):

Similarly, the following CLI commands initiate and halt the node Smartfail process respectively. Firstly, the ‘isi devices node smartfail’ command kicks off the Smartfail process on a node and removes it from the cluster.

# isi devices node smartfail -h

Syntax

# isi devices node smartfail

[--node-lnn <integer>]

[--force | -f]

[--verbose | -v]

If necessary, the ‘isi devices node stopfail’ command can be used to discontinue the Smartfail process on a node.

# isi devices node stopfail -h

Syntax

isi devices node stopfail

[--node-lnn <integer>]

[--force | -f]

[--verbose | -v]

Similarly, individual drives within a node can be Smartfailed with the ‘isi devices drive smartfail’ CLI command.

# isi devices drive smartfail { <bay> | --lnum <integer> | --sled <string> }

        [--node-lnn <integer>]

        [{--force | -f}]

        [{--verbose | -v}]

        [{--help | -h}]

When it comes to Smartfailing PowerScale chassis-based nodes, there are a couple of other things to be aware of regarding the mirrored journal:

  • When you smartfail a node in a node pair, you do not have to smartfail its partner node.
  • A node will still run indefinitely with its partner missing. However, this significantly increases the window of risk since there’s no journal mirror to rely on (in addition to lack of redundant power supply, etc).
  • If you do smartfail a single node in a pair, the journal is still protected by the vault and powerfail memory persistence.

OneFS Drain-based Upgrade

In the previous blog article, we looked at the mechanism by which OneFS enables non-disruptive upgrades. For NFS users, rolling node upgrades is typically a pretty seamless event since client connections can be dynamically moved and rebalanced across the other nodes. However, for SMB clients, rolling upgrades can be more impactful.

During an upgrade workflow, nodes will reboot, and the SMB protocol service must be stopped temporarily. This results in a brief disruption for Windows clients  connected to the rebooting node. To solve this, OneFS 9.2 introduces a drain-based upgrade feature, which provides a mechanism by which nodes are prevented from rebooting or restarting protocol services until all SMB clients have disconnected from the node. A single SMB client that does not disconnect can cause the upgrade to be delayed indefinitely, so the cluster administrator is provided with options to reboot the node despite persisting clients.

The drain-based upgrade supports both OneFS, firmware and combined upgrades, and can be configured and managed via the OneFS WebUI, CLI, and RESTful platform API.

  • SMB protocol
  • OneFS upgrades
  • Firmware upgrades
  • Cluster reboots
  • Combined upgrades (OneFS and Firmware)

Drain-based upgrade is predicated upon the parallel upgrade workflow, covered in a previous article, and which offers accelerated upgrades for large clusters by working across OneFS neighborhoods, or fault domains. By concurrently upgrading a node per neighborhood, the more node neighborhoods there are within a cluster the more parallel activity can occur.

For simplicities sake, assume a PowerScale cluster comprising five H700 chassis, divided into two neighborhoods, each containing ten nodes.

The following CLI command can be used to identify the correlation between the cluster’s nodes and OneFS neighborhoods, or failure domains:

# sysctl efs.lin.lock.initiator.coordinator_weights

Once the drain-based upgrade is started, a maximum of one node from each neighborhood will get the reservation which allows the nodes to upgrade simultaneously. OneFS will not reboot these nodes until the number of SMB clients is “0”. In this example Node 1 and Node 8 get the reservation for upgrading at the same time. However, there is one SMB connection to Node 1 and two SMB connections to Node 8. Neither of these nodes will be able to reboot until their SMB connection count gets to “0”. At this point, there are three options available:

Drain Action Description
Wait Wait until the SMB connection count reaches “0” or it hits the drain timeout value. The drain timeout value isa configurable parameter for each upgrade process. It is the maximum waiting period. If drain timeout is set to “0”, it means wait forever.
Delay drain Add the node into the delay list to delay client draining. The upgrade process will continue on another node in this neighborhood. After all the non-delayed nodes are upgraded, OneFS will rewind to the node in the delay list.
Skip drain Stop waiting for clients to migrate away from the draining node and reboot immediately.

 

The following CLI command can be used to confirm whether there are any active SMB connections. In this case, node 1 has one connected Windows client:

# isi statistics query current --keys=node.clientstats.connected.smb

 Node  node.clientstats.connected.smb

-------------------------------------

    1                               1

-------------------------------------

The ‘isi upgrade’ CLI command syntax can be used to perform the drain-based upgrade, and now includes flags for configuring drain-timeout and alert-timeout. In this example setting each to value 60 minutes and 45 minutes respectively. As such, if there is still an SMB connection after 45 minutes, a CELOG alert will be triggered to notify the cluster administrator. And after an hour, any remaining SMB connections will be dropped and the node upgrade reboot will continue.

# isi upgrade start --parallel --skip-optional --install-image-path=/ifs /data/<installation-file-name> --drain-timeout=60m --alert-timeout=45m

From the OneFS WebUI, the same can be achieved by navigating to Upgrade under Cluster management. In the example below, the WebUI indicates that node 1 is waiting for a draining SMB client. The response can be either to Skip or Delay.

If ‘Delay’ is selected, node one pauses to allow the remaining active client connection to drain:

After ‘Skip’ is chosen, the active client count is reduced to 0 and upgrade continues:

Here, the WebUI reports that node 2 has completed upgraded and is in the process of rebooting:

Once all nodes have completed, the upgrade can be committed by running the following command:

# isi upgrade cluster commit

Or from the WebUI:

Finally, confirm that the current version of OneFS is correct by running the following command:

# isi version

OneFS Non-disruptive Upgrades

When it comes to updating the OneFS version on a cluster, there are three primary options:

Of these, the simultaneous reboot is fast but disruptive, in that all the cluster’s nodes are upgraded and restarted in unison.

The other two options, rolling and parallel, are non-disruptive upgrades (NDUs), which allow a storage admin to upgrade a cluster while their end users continue to access data.

During the rolling upgrade process, one node at a time is updated to the new code, and the active clients attached to it are automatically migrated to other nodes in the cluster. Partial upgrade is also permitted, whereby a subset of cluster nodes can be upgraded, and the subset of nodes may also be grown during the upgrade. OneFS also allows an upgrade to be paused and resumed enabling customers to span upgrades over multiple smaller Maintenance Windows.

However, for larger clusters, OneFS also offers a parallel upgrade option. Parallel upgrade provides upgrade efficiency within node pools on clusters with multiple neighborhoods (availability zones), allowing the simultaneous upgrading of a node per neighborhood until the pool is complete . By doing this, the upgrade duration is dramatically reduced, while ensuring that end-users still continue to have full access to their data.

The parallel upgrade option avoids rebooting nodes unless a Diskpools DB reservation can be taken on that node. Each node runs the pre-upgrade optional and mandatory steps in lockstep. Nodes will not proceed to the MarkUpgrading state until the pre-upgrade checks have run successfully on all nodes. Once a node has reached the MarkUpgrading state, it will proceed through the upgrade hooks without regard for the completion state of hook on other nodes (ie not in lockstep).

Given that OneFS’ parallel upgrade option can dramatically improve the OneFS upgrade efficiency without impacting the data availability, the following formula can be used to estimate the duration of the parallel upgrade:

𝐸𝑠𝑡𝑖𝑚𝑎𝑡𝑖𝑜𝑛 𝑡𝑖𝑚𝑒 = (𝑝𝑒𝑟 𝑛𝑜𝑑𝑒 𝑢𝑝𝑔𝑟𝑎𝑑𝑒 𝑑𝑢𝑟𝑎𝑡𝑖𝑜𝑛) × (ℎ𝑖𝑔ℎ𝑒𝑠𝑡 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑛𝑜𝑑𝑒𝑠 𝑝𝑒𝑟 𝑛𝑒𝑖𝑔ℎ𝑏𝑜𝑟ℎ𝑜𝑜𝑑)

In the above formula:

  • The first parameter – per node upgrade duration – is around 20 minutes on average.
  • The second parameter – the highest number of nodes per neighborhood – can be obtained by running the following CLI command:
# sysctl efs.lin.lock.initiator.coordinator_weights

For example, consider a 150 node OneFS cluster. In an ideal layout, there would be 15 neighborhoods, each containing ten nodes. Neighborhood 1 would comprise nodes 1 to 10, neighborhood 2, nodes 11 to 20, and so on and so forth.

During the parallel upgrade, the upgrade framework will pick at most one node from each neighborhood, to run the upgrading job simultaneously. So in this case, node 1 from neighborhood 1st, node 11 from neighborhood 2nd, node 21 from neighborhood 3rd and etc will be upgraded at the same time. Considering, they are all in different neighborhoods or failure domain, it will not impact the current running workload.  After the first pass completes, it will go to the 2nd pass and then 3rd and etc.

So, in the 150 node example above, the estimated duration of the parallel upgrade is 200 minutes:

𝐸𝑠𝑡𝑖𝑚𝑎𝑡𝑖𝑜𝑛 𝑡𝑖𝑚𝑒 = (𝑝𝑒𝑟 𝑛𝑜𝑑𝑒 𝑢𝑝𝑔𝑟𝑎𝑑𝑒 𝑑𝑢𝑟𝑎𝑡𝑖𝑜𝑛) × (ℎ𝑖𝑔ℎ𝑒𝑠𝑡 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑛𝑜𝑑𝑒𝑠 𝑝𝑒𝑟 𝑛𝑒𝑖𝑔ℎ𝑏𝑜𝑟ℎ𝑜𝑜𝑑) = 20 × 10 = 200 𝑚𝑖𝑛𝑢𝑡𝑒𝑠

Under the hood, the OneFS non-disruptive upgrade system consists an UpgradeAgent and UpgradeSupervisor components.

The UpgradeAgent is a daemon that runs on every node. The UpgradeAgent’s role is to continually attempt to advance the upgrade process through to completion. It accomplishes this doing two things:

  1. Ensuring that an UpgradeSupervisoris running somewhere on the cluster by (a) checking to see if an upgrade is in progress and (b) waiting for its time slot, grabbing a lock file and then attempting to launch a supervisor.
  2. Receiving messages from any actively running UpgradeSupervisorand taking action on those messages.

The UpgradeSupervisor is a short-lived process which assesses the current state of the cluster and then takes action to advance the progress of the upgrade. The UpgradeSupervisor is stateless. It collects the persistent state of each node from that node’s UpgradeAgent using a status message. It also collects any information persistent on a cluster-wide basis. After reconstructing the current state of the upgrade process, it will then take action to affect the progress of the upgrade by dispatching an action message to the appropriate UpgradeAgent.

Since isi upgrade is an asynchronous process, the nodes in the cluster take turns running the controlling process. As such, the process that starts the upgrade does not run the upgrade but only sets it up. So when an ‘isi upgrade’ CLI command is run it will return fairly quickly. This also means that can’t stop the upgrade by stopping one process. Instead, a stop and restart option is provided using the ‘isi upgrade pause’ and ‘isi upgrade resume’ CLI commands.

Parallel upgrades are easily configured from the OneFS CLI by navigating to Cluster Management > Upgrade, and selecting ‘Parallel upgrade’ from the Upgrade type drop-down menu:

This can also be kicked-off from the OneFS command line using the following CLI syntax:

# isi upgrade start --parallel <upgrade_image>

Similarly, to start a rolling upgrade, which is the default, run:

# isi upgrade cluster start <upgrade_image>

The following CLI syntax will initiate a simultaneous upgrade:

# isi upgrade cluster start --simultaneous <upgrade_image>

Note that the upgrade framework always defaults to a rolling upgrade. Caution is advised when using the CLI to perform a simultaneous upgrade and the scheduling ‘type’ must be specified, i.e., –rolling, –simultaneous or –parallel

For example:

# isi upgrade cluster start /ifs/install.tar isi upgrade cluster start <code_path>

Since OneFS supports the ability to roll back to the previous version, in-order to complete an upgrade it must be committed.

isi upgrade cluster commit

Up until the time an upgrade is committed, an upgrade can be rolled back to the prior version as follows.

isi upgrade cluster rollback

The isi upgrade view CLI command can be used to monitor how the upgrade is progressing:

# isi upgrade viewisi upgrade view -i/--interactive

The following command will provide more detailed/verbose output:

# isi_upgrade_status

A faster, simpler version of isi_upgrade_status is also available:

# isi_upgrade_node_state-a (aggregate the latest hook update for each node)-devid=<X,Y,E-F>  (filter and display by devid)-lnn=<X-Y,A,C> (filter and display by LNN)-ts (time sort entries)

If the end of a maintenance window is reached but the cluster is not fully upgraded, the upgrade process can be quiesced and then restarted using the following CLI commands:

# isi upgrade pause
# isi upgrade resume

For example:

# isi upgrade pause

You are about to pause the running process, are you sure?  (yes/[no]):

yes

The process will be paused once the current step completes.

The current operation can be resumed with the command:

# isi upgrade resume

Note that pausing is not immediate: The upgrade will remain in a “Pausing” state until the currently
upgrading node is completed. Additional nodes will not be upgraded until the upgrade process is resumed.

The ‘pausing’ state can be viewed with the following commands: ‘isi upgrade view’ and ‘isi_upgrade_status’. Note that a rollback can be initiated either during ‘Pausing’ or ‘Paused’ states. Also, be aware that the ‘isi upgrade pause’ command has no effect when performing a simultaneous OneFS upgrade.

A rolling reboot can be initiated from the CLI on a subset of cluster nodes using the ‘isi upgrade rolling-reboot’ syntax and the ‘–nodes’ flag specifying the desired LNNs for upgrade:

# isi upgrade rolling-reboot --help

Description:

    Perform a Rolling Reboot of cluster.

Required Privileges:

    ISI_PRIV_SYS_UPGRADE

Usage:

    isi upgrade cluster rolling-reboot

        [--nodes <integer_range_list>]

        [--force]

        [{--help | -h}]

Options:

    --nodes <integer_range_list>

        List of comma (1,3,7) or dash (1-7) specified node LNNs to select. "all"

        can also be used to select all the cluster nodes at any given time.

  Display Options:

    --force

        Do not ask confirmation.

    --help | -h

        Display help for this command.

This ‘isi upgrade view’ syntax provides better visibility, status and progress of the rolling reboot process. For example:

# isi upgrade view

Upgrade Status:

Current Upgrade Activity: RollingReboot

   Cluster Upgrade State: committed

   Upgrade Process State: Not started

      Current OS Version: 9.2.0.0

      Upgrade OS Version: N/A

        Percent Complete: 0%

Nodes Progress:

     Total Cluster Nodes: 3

       Nodes On Older OS: 3

          Nodes Upgraded: 0

Nodes Transitioning/Down: 0

LNN  Progress  Version  Status

---------------------------------

1    100%        9.2.0.0  committed

2    rebooting   Unknown  non-responsive

3    0%          9.2.0.0  committed

Due to the duration of OneFS upgrades on larger clusters, it can sometimes be unclear if an OS upgrade is actually progressing or has stalled. To address this, if an upgrade is not making progress after fifteen minutes, the upgrade framework automatically sends a SW_UPGRADE_NODE_NON_RESPONSIVE alert via CELOG. For example:

# isi event events list

ID        Occurred       Sev    Lnn   Eventgroup ID  Message                                                                     

---------------------------------------------------------------------------------------------------------------

2.1805  06/14 04:33  C       2      1087               Excessive Time executing a Hook on Node: 3
# isi status

...

Critical Events:

Time                 LNN  Event                          

---------------     ----  ------------------------------ 

06/14 05:16:30  2    Excessive Time executing a ... 

...

The isi_upgrade_logs command also provides detailed upgrade tracking and debugging data.

Usage: isi_upgrade_logs [-a|--assessment][--lnn][--process {process name}][--level {start level,end level][--time {start time,end time][--guid {guid} | --devid {devid}]

 + No parameter this utility will pull error logs for the current upgrade process

 + -a or --assessment - will interrogate the last upgrade assessment run and display the results

The following arguments enable filtering to help extract the desired upgrade information:

Filter CMD Flag Description
–guid dump the logs for the node with the supplied guid
–devid dump the logs for the node/s with the supplied devid/s
–lnn dump the logs for the node/s with the supplied lnn/s
–process dump the logs for the node with the supplied process name
–level dump the logs for the supplied level range
–time dump the logs for the supplied time range
–metadata dump the logs matching the supplied regex

For example, to display all of the logs generated by isi_upgrade_agent_d on the node with LNN1:

# isi_upgrade_logs --lnn 1 --process /usr/sbin/isi_upgrade_agent_d  
…
1  2021-06-14T18:06:15  /usr/sbin/isi_upgrade_agent_d  Debug  Starting /usr/share/upgrade/event-actions/pre-upgrade-optional/read_only_node_check.py

1  2021-06-14T23:59:59  /usr/sbin/isi_upgrade_agent_d  Debug  Starting /usr/share/upgrade/event-actions/pre-upgrade-optional/isi_upgrade_checker

1  2021-06-14T18:06:15  /usr/sbin/isi_upgrade_agent_d  Debug  Starting /usr/share/upgrade/event-actions/pre-upgrade-optional/volcopy_check

1  2021-06-14T18:06:15  /usr/sbin/isi_upgrade_agent_d  Debug  Starting /usr/share/upgrade/event-actions/pre-upgrade-optional/empty

1  2021-06-14T18:06:15  /usr/sbin/isi_upgrade_agent_d  Debug  Starting Hook [/usr/share/upgrade/event-actions/pre-upgrade-optional/read_only_node_check.py]
 …

Note that the ‘–process’ flag requires the full name including path to be specified, as it is displayed in the logs.

For example, the following CLI syntax displays a list all of the Upgrade-related process names that have logged to LNN 1:

# isi_upgrade_logs --lnn 1 | awk ‘{print $3}’ | sort | uniq

These process names can then be added to the ‘–process’ argument.

OneFS CELOG Alerts and Events WebUI

The OneFS 9.2 release introduced a number of OneFS usability enhancements for managing cluster events and alerts. This new functionality makes it considerably simpler to filter events chronologically, categorize by their status, filter by the severity, search the event history, resolve, suppress or ignore bulk events, and manage scheduled maintenance windows.

For example, you can easily categorize, identify, and filter events by using the following criteria:

Action Detail
Show ·         Show events for:

–    Today

–    This week

–    This month

–    Custom range/

–    All

Categorize ·         Categorize events by their status:

–    Active

–    Ignored

–    Resolved

–    All

Filter ·         Filter events by severity:

–    Emergency

–    Critical

–    Warning

–    Information

Search ·         Search for specific event(s) in the event history
Resolve ·         Resolve bulk events
Ignore ·         Ignore bulk events

The new WebUI page for event group history can be accessed by navigating to Cluster management > Events and Alerts. For example:

With OneFS 9.2, CELOG maintenance mode can also easily be manually enabled and disabled. During a maintenance window, the system will continue to log events but not generate alerts. However, all events that occurred during the maintenance window can then be reviewed upon disabling maintenance mode. Active event groups will automatically resume generating alerts when the scheduled maintenance period ends. For example, to enable CELOG maintenance mode, in the OneFS WebUI select Cluster Management > Events and Alerts > Alert management tab and click on the ‘Enable CELOG maintenance mode’ button. In the prompt window, select ‘Enable CELOG maintenance mode’ as follows:

Create an Alert channel. Either SMTP or SNMP can be configured for the alert channel communication, and can be created by selecting ‘Create channel’:

To create an alert rule, click the tab Alerting rule and then click the button Create alert rule. In the prompt window, fill the Rule name, set the Rule condition to NEW, apply it to all the Alert categories, and attached it to the channel you have just created.

Events can be created using CLI syntax similar to the following:

# /usr/bin/isi_celog/celog_send_events.py -o 940100002

Heap looptimes: [-1]

running -1 [940100002]

1612871342 :: Sending eventids [940100002] with specifier None

940100002 message is OneFS {version} is currently running on unsupported nodes (devid(s) {devids}). {msg}.

1.195 (70368744177859) corresponds to eventid 940100002

Out of events to run. Exiting.
# /usr/bin/isi_celog/celog_send_events.py -o 940100001

Heap looptimes: [-1]

running -1 [940100001]

1612872343 :: Sending eventids [940100001] with specifier None

940100001 message is OneFS {version} is currently running and is not supported on this hardware: {msg}.

1.196 (70368744177860) corresponds to eventid 940100001

Out of events to run. Exiting.

During maintenance mode, OneFS will still show the event but there will be no associated alert. In this example, there is no SMTP alert email triggered.

The following CLI syntax can also be used to filter all the events which happened while the cluster was in CELOG maintenance mode:

# isi event groups list --maintenance-mode=true

ID   Started     Ended       Causes Short                           Lnn  Events  Severity

------------------------------------------------------------------------------------------

16   02/09 11:49 --          HW_CLUSTER_ONEFS_VERSION_NOT_SUPPORTED 1    1       critical

17   02/09 12:05 02/09 12:19 HW_ONEFS_VERSION_NOT_SUPPORTED         1    1       critical

------------------------------------------------------------------------------------------

Click the “Disable CELOG maintenance mode’ button. and select one of the following from the display window:

    1. View event details
    2. Ignore event
    3. Resolve event

In this example, the event HW_ONEFS_VERSION_NOT_SUPPORTED is marked resolved by clicking Action and Resolve event.

After the CELOG maintenance mode is disabled, you will get the email notification only for HW_CLUSTER_ONEFS_VERSION_NOT_SUPPORTED. The event which has been marked resolved will not trigger any notification.

When an event type is suppressed, it prevents an event from alerting on all configured CELOG channels. However, the event will still be displayed in the event group history.

To suppress an event type, click the button Suppress for a specific event under Event type ID tab. In this example, both 930100006 and 930100005 have been suppressed.

Create several events, for example by using CLI commands such as the following:

# /usr/bin/isi_celog/celog_send_events.py -o 930100006

Heap looptimes: [-1]

running -1 [930100006]

1612873812 :: Sending eventids [930100006] with specifier None

930100006 message is {sensor} out of spec in chassis {chassis} slot {slot}.

1.200 (70368744177864) corresponds to eventid 930100006

Out of events to run. Exiting.
# /usr/bin/isi_celog/celog_send_events.py -o 930100005

Heap looptimes: [-1]

running -1 [930100005]

1612873817 :: Sending eventids [930100005] with specifier None

930100005 message is {sensor} out of spec in chassis {chassis} slot {slot}.

1.201 (70368744177865) corresponds to eventid 930100005

Out of events to run. Exiting.

To list all the events in the suppressed list, use the following CLI syntax:

# isi event suppress list

ID        Name

—————————-

930100005 HWMON_ANY_DISCRETE

930100006 HWMON_ANY_METERS

—————————-

These suppressed events will only show in the event history and will not trigger any notification in any channels.

The desired event types can be un-suppressed by clicking pertinent the Un-suppress button(s).

 

OneFS External Key Management for Data Encryption

Data-at-rest data is inactive content that is physically housed on a cluster, or other storage medium. Protecting this data with cryptography ensures that it’s guarded against theft, in the event that drives or nodes are removed from a PowerScale cluster. Data-at-rest encryption (DARE) is a requirement for federal and industry regulations ensuring data is encrypted when it is stored. OneFS has provided DARE solutions for many years through self-encrypting drives (SEDs) and, until now, an internal key management system.

The new OneFS 9.2 release introduces External Key Management (EKM) support for encrypted clusters, through the key management interoperability protocol, or KMIP. This enables offloading of the Master Key from a node to an External Key Manager, such as SKLM, SafeNet or Vormetric. Enhanced security is inherently provided through the separation of key manager from cluster, since node(s) cannot be rebooted without the keys. It also supports the secure transport of nodes. Additionally, centralized key management is also made available for multiple SED clusters, and provides the option to migrate existing keys from a cluster’s internal key store.

EKM provides enhanced security through the separation of the key manager from the cluster, enabling the secure transport of nodes, and helping organizations to meet regulatory compliance and corporate data at rest security requirements.

In order to use the OneFS 9.2 External Key Manager feature, clearly a cluster with self-encrypting drives is needed. Additionally, for the SED drives to be unlocked and user data made available, each node in the cluster must first contact the KMIP server to obtain the master encryption key from the server. Nodes in the cluster cannot boot without contacting the KMIP server first. Note that clusters without all their nodes connected to a front-end network (NANON) do not support External Key Management.

For the server, a KMIP Compliant Server supporting KMIP version 1.2 or greater is needed, such as:

Vendor Key Manager & Version
Dell EMC CloudLink Center 6.0
Gemalto Gemalto KeySecure 8.7 k150v

Gemalto KeySecure 8.7 k170v

IBM Secure Key Lifecycle Manager (SKLM) 2.6.0.2

Secure Key Lifecycle Manager (SKLM) 2.7.0.0

Secure Key Lifecycle Manager (SKLM) 3.0.0

Thales E-Security KeyAuthority 4.0

Also:

  • KMIP Storage Array with SEDS Profile Version 1.0
  • KMIP server host/port information
  • 509 PKI for TLS mutual authentication
    • Certificate authority bundle
    • Client certificate and private key

External Key Management can be configured on OneFS 9.2 as follows:

  1. Obtain the KMIPs Server and Client Certificates. Copy both certificates to the cluster and make a note of the file names and location.
  2. From the OneFS web interface, select Access > Key Management

Alternatively, this can also be accomplished via the OneFS CLI:

# isi keymanager kmip servers create
  1. From the WebUI Key Management page, enter the KMIP server information and specify the filename with the server/client certificates’ location. If the KMIP has a client certificate password specified, enter that here. Check the “Enable Key Management” box and click Submit.

  1. Next, OneFS contacts the KMIP and confirms the connection or displays any errors.
  2. Once the KMIP server is added, the keys can now be migrated. Click the Keys tab to display all current Master Keys on the cluster. Click on Migrate all to migrate the keys to the KMIP server. From the “Migrate all” pop-up, click Migrate to start the migration.

  1. The key migration process may take several minutes or more to complete depending on the cluster and network utilization. During this time, a “Migration in process” message is displayed.

  1. Once the process is complete, and “Migration Successful” message is displayed:

The following OneFS key management CLI commands are also available:

To configure external KMIP servers:

# isi keymanager kmip servers -h     

Description:

    Configure external KMIP servers.

Required Privileges:

    ISI_PRIV_KEY_MANAGER

Usage:

    isi keymanager kmip servers <action>

        [--timeout <integer>]

        [{--help | -h}]

Actions:

    create    Configure a new external KMIP server.

    delete    Delete a external KMIP server.

    modify    Modify a external KMIP server.

    list      View a list of configured external KMIP servers.

    view      View a single external KMIP server.

Options:

  Display Options:

    --timeout <integer>

        Number of seconds for a command timeout (specified as 'isi --timeout NNN <command>').

    --help | -h

        Display help for this command.

See 'isi keymanager kmip servers <subcommand> --help' for more information on a specific subcommand.

To manage SED keystore settings:

# isi keymanager sed settings -h

Description:

    Manage self-encrypting drive keystore settings.

Required Privileges:

    ISI_PRIV_KEY_MANAGER

Usage:

    isi keymanager sed settings <action>

        [{--help | -h}]

Actions:

    modify    Modify SED settings

    view      View current SED settings.

Options:

  Display Options:

    --help | -h

        Display help for this command.

See 'isi keymanager sed settings <subcommand> --help' for more information on a specific subcommand.

And to report keymanager SED status:

# isi keymanager sed status    

 Node Status  Location  Remote Key ID  Error Info(if any)
----------------------------------------------------------
  1   REMOTE  Server    F84B50640CABD44B9D5F75427C2B5E

  2   REMOTE  Server    24285969BD8804A9A61EE39D99573B

  3   REMOTE   Server    7D561B1CA89B72B891B21BF834097F  
-----------------------------------------------------------
Total: 3