Original Publication Date: September 8, 2014
What are the criteria for choosing the total number of Zookeeper nodes (1, 3, 5, 7, 9, or 11) in a cluster?
When setting up the Zookeeper quorum, the reasonable numbers are 1, 3, and 5 nodes.
1 is useful if you don't want redundancy at all. This happens, for instance, on the sandbox version, where you have only a single node in the cluster.
3 is useful for failure tolerance, since it survives the loss of one node. It is, however, exposed during maintenance: with one machine already taken down, a hardware failure on a second machine leaves the cluster without a quorum.
5 is used in large, high-value clusters which need to stay up at all costs.
It is very rare to use more than 5 Zookeeper nodes in a cluster.
Why does the number need to be an odd number?
The rationale for having an odd number is that Zookeeper always requires a majority of the nodes in the Zookeeper cluster to agree on any change. This means that any number of nodes less than half can fail and Zookeeper can still function. For example, if you have 4 nodes configured, you can only suffer one failure and still continue operating. If you have two failures, then the 2 remaining nodes can never be greater than half of the nodes (since they are exactly half), so no majority exists. This is no better than with three nodes, which also tolerate exactly one failure. Also, since two failures are more likely among four nodes than among three, the four-node setup is actually less favorable.
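The comparison between three and four nodes can be checked with a short calculation. The sketch below is illustrative only: it assumes each node fails independently with the same probability, and both the function name `quorum_loss_probability` and the value 0.01 are made up for the example, not taken from Zookeeper.

```python
from math import comb


def quorum_loss_probability(nodes: int, p_fail: float) -> float:
    """Probability that too many nodes are down for a strict majority to remain.

    Assumes independent failures with identical probability p_fail
    (a simplification made only for this illustration).
    """
    majority = nodes // 2 + 1          # votes needed to agree on a change
    max_tolerated = nodes - majority   # failures the cluster can survive
    # Sum the binomial probabilities of every failure count above the limit.
    return sum(
        comb(nodes, k) * p_fail**k * (1 - p_fail) ** (nodes - k)
        for k in range(max_tolerated + 1, nodes + 1)
    )


if __name__ == "__main__":
    p = 0.01  # hypothetical per-node failure probability
    for n in (3, 4):
        print(f"{n} nodes: P(quorum lost) = {quorum_loss_probability(n, p):.6f}")
```

Under these assumptions the four-node cluster loses its quorum more often than the three-node cluster, because there are more ways for two out of four machines to fail than two out of three.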
This same logic applies to any even number. Since an even number of nodes adds no assurance relative to the odd number with one less ZK server, the standard practice is to always use an odd number.
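The majority rule can be written out as simple arithmetic. The following is a minimal sketch, with hypothetical helper names `majority` and `tolerated_failures`, that lists how many failures each cluster size survives.

```python
def majority(nodes: int) -> int:
    """Smallest number of nodes that is strictly more than half."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Nodes that can fail while a majority is still reachable."""
    return nodes - majority(nodes)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} nodes: majority = {majority(n)}, "
              f"tolerated failures = {tolerated_failures(n)}")
```

Each even size tolerates the same number of failures as the odd size just below it: 4 matches 3, and 6 matches 5.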