Monday, August 11, 2014

What is the meaning of the "jeopardy" status of a cluster node, and when will a Service Group go into an autodisabled state?

Issue

There are times when a cluster node is reported in a jeopardy membership, and we also see messages about service groups going into an autodisabled state. This article explains what the special jeopardy membership in a VCS cluster is and under what scenarios a service group goes into an autodisabled state.

Solution

From the Veritas Cluster Server (VCS) point of view, when a system in the cluster loses all but the last of its LLT interconnect links, that system is placed in a special cluster membership called "jeopardy". The Service Groups running on the system continue to be online or offline and their states are not changed; however, the node is now in this special jeopardy state and continues running in it until one of the following three things occurs (a command-line view of the jeopardy state is sketched just after this list):

1.) The loss of the last available interconnect link
In this case, the cluster cannot reliably determine whether the last interconnect link was lost or the system itself went down, so the cluster forms a network partition, resulting in two or more mini clusters depending on where the partition falls. At this point, every Service Group that is not online in its own mini cluster, but may be online in the other mini cluster, is marked "autodisabled" in that mini cluster until the interconnect links start communicating normally again.

2.) The loss of a system which is currently in the jeopardy state
In this case, the situation is exactly the same as explained in scenario 1: two or more mini clusters are formed.

3.) The LLT interconnect links become connected again, so that the cluster node has more than one accessible interconnect link
In this scenario, the cluster notes the LLT membership change caused by the additional interconnect link(s) becoming available and removes the node from jeopardy membership, placing every node that again has more than one working interconnect link back in normal cluster membership.
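In all three cases the membership and link state can be checked from the operating system with the GAB and LLT utilities. The output below is only an illustration (generation numbers, device names and exact column layout vary by platform and VCS version); a node in jeopardy typically shows a "jeopardy" entry against the GAB ports and one of its LLT links DOWN:

    # gabconfig -a
    GAB Port Memberships
    ===============================================================
    Port a gen a36e0003 membership 01
    Port a gen a36e0003 jeopardy    ;1
    Port h gen fd570002 membership 01
    Port h gen fd570002 jeopardy    ;1

    # lltstat -nvv | more
    Node          State    Link   Status    Address
      0 nodeA     OPEN
                           eth1   UP        00:0C:29:xx:xx:01
                           eth2   DOWN
    * 1 nodeB     OPEN
                           eth1   UP        00:0C:29:xx:xx:02
                           eth2   UP        00:0C:29:xx:xx:03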
For example, here is the expected behavior in a two-node cluster when GAB ports a and h are configured but NO I/O fencing port b is configured:
Test: node A (Service Group A online on this node) and node B (Service Group B online on this node), with two private LLT interconnect links configured and no low-priority LLT links.
On loss of one link from node A, GAB reports a change in membership and places the node in jeopardy membership, but there is no impact on or change to the Service Group status.

Let's assume that after 5 minutes we also lose the second LLT interconnect on node A. At this point Service Group A goes into the autodisabled state on node B, where it reports OFFLINE or OFFLINE|FAULTED. When the LLT interconnects reconnect and the LLT links are visible to both nodes, node A rejoins the regular cluster membership and the Service Groups are no longer autodisabled.
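From node B, this can be seen in the group state summary; the output below is trimmed and purely illustrative (group and node names are placeholders):

    # hastatus -sum

    -- GROUP STATE
    -- Group        System    Probed    AutoDisabled    State

    B  SG_A         nodeB     Y         Y               OFFLINE
    B  SG_B         nodeB     Y         N               ONLINE

If a group remains autodisabled because the other node is genuinely down, it can be cleared manually with hagrp, but only after confirming that the group is not online anywhere else:

    # hagrp -autoenable SG_A -sys nodeB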

In the case where both LLT interconnect links disconnect at the same time and no low-priority links are configured, the cluster cannot reliably determine whether only the interconnects have disconnected, and each side will assume that the other system is down and unavailable. Hence, in this scenario, the cluster treats it like a system fault, and each mini cluster attempts to bring the Service Groups online according to the AutoStartList defined for each Service Group. This may lead to possible data corruption, with applications on different systems writing to the same underlying data on storage at the same time.
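One common way to reduce the chance of all heartbeats disappearing at once is to add a low-priority LLT link over the public network. Below is a rough /etc/llttab sketch for a Linux node; the node name, cluster ID and interface names are placeholders, and the exact link syntax differs between platforms and VCS versions, so follow the format already in use on your systems:

    set-node    nodeA
    set-cluster 1001
    link        eth1 eth1 - ether - -
    link        eth2 eth2 - ether - -
    link-lowpri eth0 eth0 - ether - -

Under normal conditions the low-priority link carries only heartbeats at a reduced rate, but it keeps the nodes in the same membership if both private links are lost at the same time.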
 
At this stage, if we are in a network-partitioned state and the LLT interconnect links get reconnected, the cluster will identify that failover Service Groups are online on more than one node at the same time and will report a "Concurrency Violation". The cluster will then initiate an offline of the Service Group on all but one of the nodes.
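The condition being flagged can be seen by querying the group state on all systems once the partition heals; a failover group reporting ONLINE on more than one system is what triggers the violation (the group and node names below are illustrative):

    # hagrp -state SG_A
    #Group       Attribute    System    Value
    SG_A         State        nodeA     |ONLINE|
    SG_A         State        nodeB     |ONLINE|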

Depending on how long the LLT interconnect links stayed disconnected before connectivity resumed, the possibility of data corruption is quite high. The above scenario can only occur if I/O fencing is not configured, and hence we always recommend configuring I/O fencing to provide a high level of data protection in such scenarios and avoid possible data corruption.

In the scenarios explained above, the behavior will be slightly different if I/O fencing is enabled in SCSI-3 mode. Please refer to the product documentation to understand clearly how I/O fencing protects customers from the possible risks of data corruption.
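For reference, SCSI-3 based fencing is enabled by setting the UseFence attribute at the cluster level in main.cf (in addition to configuring the coordination points and the vxfen driver as described in the documentation), and the active fencing mode can be verified with vxfenadm. The cluster name, user name and password value below are placeholders:

    cluster myclus (
            UserNames = { admin = <encrypted-password> }
            Administrators = { admin }
            UseFence = SCSI3
            )

    # vxfenadm -d        (reports the fencing mode in use, e.g. SCSI3, and the fencing membership)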
