Planning a Large Cluster

In a large cluster layout of 120 nodes or more, nodes form correlated failure groups, each governed by its control node: the node that runs the control messaging service (spread). If a control node fails, all nodes in its host group fail with it.
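
To make that dependency concrete, the following minimal sketch models a host group in Python. The cluster map and node names are hypothetical and only illustrate the relationship; they are not Vertica's internal data model.

```python
# Hypothetical cluster map: each control node relays spread messages
# for the nodes in its host group.
cluster = {
    "control1": ["node4", "node5", "node6"],
    "control2": ["node7", "node8", "node9"],
}

def nodes_lost(failed_control: str) -> list[str]:
    """A control node failure takes its whole host group down with it."""
    return [failed_control, *cluster[failed_control]]

print(nodes_lost("control1"))  # ['control1', 'node4', 'node5', 'node6']
```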

This topic provides tips on how to plan for a large cluster arrangement. See Installing a Large Cluster and Large Cluster Best Practices for more information.

Planning the number of control nodes

Configuring a large cluster requires careful and thorough network planning. You must have a solid understanding of your network topology before you configure the cluster.

To assess how many cluster nodes should be control nodes, start with the square root of the total number of nodes you expect in the database cluster; this helps satisfy both data K-safety and rack fault tolerance. Depending on the result, you might need to adjust the number of control nodes to match your physical hardware and rack count. For example, if you have 121 nodes, the square root is 11; if those nodes are distributed across 8 racks, you might increase the number of control nodes to 16 so that each rack has two control nodes.
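
This arithmetic is easy to capture in a small helper. The sketch below is illustrative only; the function name and the round-up-to-a-multiple-of-the-rack-count policy are assumptions based on the example above, not a Vertica tool.

```python
import math

def recommended_control_nodes(total_nodes: int, rack_count: int) -> int:
    """Square-root guideline, rounded up so that every rack gets the
    same number of control nodes."""
    baseline = math.isqrt(total_nodes)           # 121 nodes -> 11
    per_rack = math.ceil(baseline / rack_count)  # 11 over 8 racks -> 2
    return per_rack * rack_count                 # 2 per rack * 8 racks = 16

print(recommended_control_nodes(121, 8))  # 16
```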

Specifying the number of control nodes

Vertica provides different tools to help you define the number of control nodes, depending on your current configuration. Consider the following scenarios, in which cluster nodes are distributed among three racks in different configurations:

Scenario: Three control nodes; all other nodes are evenly distributed among the three racks.
Setup: Specify one control node per rack.

Scenario: Five control nodes and three racks.
Setup: Specify two control nodes on each of two racks and one control node on the remaining rack.

Scenario: Four control nodes; one rack has twice as many nodes as the other racks.
Setup: Specify two control nodes on the larger rack and one control node on each of the other two racks.
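
The placement logic behind these scenarios can be sketched as a short function: give every rack one control node, then assign any remaining control nodes to the largest racks first. The helper below is hypothetical and only reproduces the reasoning of the scenarios above.

```python
def assign_control_nodes(rack_sizes: dict[str, int], control_count: int) -> dict[str, int]:
    """Give every rack one control node, then hand out the remainder
    to the largest racks first."""
    assert control_count >= len(rack_sizes), "need at least one control node per rack"
    assignment = {rack: 1 for rack in rack_sizes}
    by_size = sorted(rack_sizes, key=rack_sizes.get, reverse=True)
    for i in range(control_count - len(rack_sizes)):
        assignment[by_size[i % len(by_size)]] += 1
    return assignment

# Scenario 3: four control nodes, one rack twice the size of the others
print(assign_control_nodes({"rack1": 20, "rack2": 10, "rack3": 10}, 4))
# {'rack1': 2, 'rack2': 1, 'rack3': 1}
```

For the second scenario (five control nodes, three equal racks), the same function yields two control nodes on each of two racks and one on the third.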