Wednesday, March 18, 2015

What is a quorum?

When you create a cluster in Windows, the Failover Cluster Manager indicates which quorum configuration best suits your setup. The importance of this choice is often overlooked.
Before we go into the possible configurations, it’s helpful to understand the function of a quorum first.
What is a quorum?
A quorum is nothing more than the minimum number of votes required for a majority.

Cluster nodes communicate with each other over the network (port 3343). When nodes are unable to communicate, each of them assumes that the resources of the other (unreachable) nodes have to be brought online. Because the same resource can then be brought online on multiple nodes at the same time, data corruption can occur. This situation is called a "Split Brain."
To prevent a Split Brain, the cluster resource must be brought online on a single node (rather than on multiple nodes). This has to be done in a 'democratic' way so the cluster can keep functioning and repair itself: a node may only bring resources online when it is backed by a majority (quorum) of votes.
This democracy is maintained by giving each node in the cluster a vote. If the number of nodes in the cluster is even, the quorum disk (or file share) also gets a vote to prevent a tie (more on this later in this post).
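As a back-of-the-envelope illustration, the vote counting described above can be sketched in a few lines of Python. This is only a simplified model (the function names are made up for this post), not how the cluster service itself is implemented:

    # Simplified model: every node has one vote; with an even number of nodes
    # a witness (quorum disk or file share) adds a tie-breaking vote.
    def total_votes(node_count, has_witness=None):
        if has_witness is None:
            has_witness = (node_count % 2 == 0)  # add a witness to avoid a tie
        return node_count + (1 if has_witness else 0)

    def majority(node_count, has_witness=None):
        # The quorum is the smallest number of votes that forms a majority.
        return total_votes(node_count, has_witness) // 2 + 1

    print(majority(5))  # 3 votes needed in a five-node cluster
    print(majority(6))  # 4 votes needed in a six-node cluster plus witness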
If the majority of nodes fail, the cluster fails. The steps the cluster takes to repair itself are roughly as follows:
  1. Each node attempts to communicate with the other nodes.
  2. Once this communication has been established, the nodes consult with each other about the state of the cluster until there is a consensus.
  3. Each node knows how many nodes (and therefore votes) the cluster contains, whether those nodes are active or not. Combined with the number of active nodes, this lets them calculate how many votes are required for a majority.
  4. If no majority is achieved, the nodes wait until more nodes come online.
  5. When a majority is achieved, the cluster becomes active and brings the resources online. Note: This is a severe over-simplification of the steps performed; a full description would require a considerably longer article.
With this method the cluster is active only when a majority of the votes is present, which avoids the Split Brain scenario.
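To see why the majority rule prevents a Split Brain, consider a toy sketch along these lines (again a simplified model, not the actual cluster service logic): at most one partition of the cluster can ever hold more than half of the votes, so at most one side can bring the resources online.

    def has_quorum(votes_in_partition, total_vote_count):
        # A partition may bring resources online only if it holds a majority.
        return votes_in_partition > total_vote_count // 2

    # A five-node cluster that splits into groups of three and two nodes:
    total = 5
    print(has_quorum(3, total))  # True  -> this side stays active
    print(has_quorum(2, total))  # False -> this side goes offline
    # The two sides can never both hold a majority, so the same resource is
    # never brought online on both partitions at once.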
Quorum configurations
When you configure the cluster quorum, the Failover Cluster Manager offers advice based on best practices; the configuration can be adjusted later (if needed) through the Failover Cluster Manager on a per-cluster level.
Even though the proposal made by the Failover Cluster Manager is usually the most sensible one, it is worth understanding the advantages and disadvantages of each option.
Node Majority
The Node Majority option is recommended for clusters with an odd number of nodes. This configuration can survive the failure of half the cluster nodes, rounded down. For example, a cluster of five nodes can handle the failure of two nodes.
In a five-node cluster where three of the nodes can communicate with each other but the last two cannot, the three nodes have enough votes for a majority and the cluster remains active despite the failure of two nodes. The online nodes remove the active cluster membership of the other two nodes until they can be reached again and are reported healthy. If one more node fails, however, only two nodes are left and the cluster fails, because two votes do not add up to a majority of five.
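The number of failures a Node Majority cluster tolerates follows directly from the majority rule. A quick sketch, again using the simplified vote model rather than anything cluster-specific:

    def tolerated_failures_node_majority(node_count):
        # Votes come from the nodes only; the cluster stays up as long as the
        # surviving nodes still form a majority of all votes.
        majority_needed = node_count // 2 + 1
        return node_count - majority_needed

    for n in (3, 5, 7):
        print(n, "nodes ->", tolerated_failures_node_majority(n), "node failures tolerated")
    # 3 nodes -> 1, 5 nodes -> 2, 7 nodes -> 3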
No Majority: Disk Only
This option requires only one active cluster node to keep the cluster active; the quorum is stored on a disk. That disk becomes a single point of failure and therefore a risk, which is why this option is generally not recommended.
Even if all cluster nodes are healthy and the network has no problems, the cluster fails as soon as the disk is no longer available (or becomes corrupted).
Node and Disk Majority
This option is recommended for clusters with an even number of nodes. The maximum number of nodes that may fail before the cluster goes down depends on the availability of the disk (also called the "Witness Disk Resource"). Aside from holding a vote, the disk also serves as storage for the most recent cluster database.
If the disk is available, this configuration can handle the failure of half the cluster nodes, because the disk provides an extra vote: in a six-node cluster, three nodes can fail and the cluster remains online. If the disk is not available, this configuration can handle the failure of half the nodes minus one, so the same six-node cluster can handle the failure of only two nodes and still remain online.
In the diagram below, the "Witness" is the Witness Disk Resource.
[Diagram: Node and Disk Majority with the Witness Disk Resource]
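The six-node numbers above can be checked with the same kind of arithmetic. A hedged sketch, reusing the simplified vote model (the function name is invented for this example):

    def tolerated_failures_node_and_disk(node_count, witness_available):
        total = node_count + 1               # one vote per node plus the witness disk
        majority_needed = total // 2 + 1
        # The votes that can still be cast come from the surviving nodes,
        # plus the witness disk when it is reachable.
        available = node_count + (1 if witness_available else 0)
        return available - majority_needed

    print(tolerated_failures_node_and_disk(6, witness_available=True))   # 3
    print(tolerated_failures_node_and_disk(6, witness_available=False))  # 2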

Node and File Share Majority
The principle of this configuration is similar to "Node and Disk Majority", but in this case the disk is replaced by a file share (also called the "File Share Witness Resource").
There is one important difference: unlike the witness disk, the file share does not hold a copy of the cluster database. After a complete cluster failure, one of the nodes must therefore have a local copy of the most recent cluster configuration; if none is present (or only an outdated version is available), the cluster service must be started manually on one of the nodes.
This option is often chosen in the case of SAP systems or when the nodes are in different physical locations.

In the diagram below, the "Witness" is the File Share Witness Resource.
[Diagram: Node and File Share Majority with the File Share Witness Resource]
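The practical consequence of that difference can be summarized in a small decision helper. This is purely illustrative (the witness-type strings and the flag are made up for this post); the real recovery procedure is handled by the cluster service itself:

    def can_recover_automatically(witness_type, node_has_latest_config):
        # Disk witness: the most recent cluster database is also stored on the
        # witness disk, so the cluster can come back up from that copy.
        if witness_type == "disk":
            return True
        # File share witness: the share holds no copy of the cluster database,
        # so a surviving node must have the latest configuration locally.
        if witness_type == "file_share":
            return node_has_latest_config
        raise ValueError("unknown witness type")

    print(can_recover_automatically("file_share", node_has_latest_config=False))
    # False -> the cluster service has to be started manually on one of the nodes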

Which option fits which situation?
For convenience, the table below summarizes the options per number of nodes: X means recommended, I means possible, and O means not recommended.


Quorum \ nodes                  1  2  3  4  5  6  7  8
Node Majority                   X  O  X  O  X  O  X  O
Disk only                       O  O  O  O  O  O  O  O
Node and disk majority          O  X  O  X  O  X  O  X
Node and file share majority    O  I  O  I  O  I  O  I
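For completeness, the table can also be expressed as a tiny lookup. This simply encodes the table above under the assumption that a shared disk may or may not be available; it is not an official rule:

    def recommended_quorum(node_count, shared_disk_available=True):
        # Encodes the summary table above: odd node counts -> Node Majority,
        # even node counts -> Node and Disk Majority, or the file share
        # variant when no shared disk is available.
        if node_count % 2 == 1:
            return "Node Majority"
        if shared_disk_available:
            return "Node and Disk Majority"
        return "Node and File Share Majority"

    print(recommended_quorum(5))                               # Node Majority
    print(recommended_quorum(6, shared_disk_available=False))  # Node and File Share Majority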
Conclusion
Each quorum option has advantages and disadvantages. Although the Failover Cluster Manager recommends the logical choice when you create your cluster, you do not have to take its advice. In my experience the recommendation of the Failover Cluster Manager is usually the right one, and "Node Majority" or "Node and Disk Majority" have generally been the best options.
