Large Cluster Best Practices

Keep the following best practices in mind when you are planning and managing a large cluster implementation.

Planning the number of control nodes

To assess how many cluster nodes should be control nodes, start with the square root of the total number of nodes you expect to have in the database cluster. This value helps satisfy both data K-safety and rack fault tolerance for the cluster. Depending on the result, you might need to adjust the number of control nodes to match your physical hardware and rack count. For example, if you have 121 nodes (a square root of 11) distributed across 8 racks, you might increase the number of control nodes to 16 so that each rack has two control nodes.
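
As a minimal sketch, the following query derives a starting value from the V_CATALOG.NODES system table on a running database; round the result up or down afterward to fit your rack layout, as described above:

    -- Starting point: square root of the node count, rounded up.
    -- Adjust the result to your rack count (for example, 121 nodes across
    -- 8 racks: 11 rounds up to 16 so that each rack gets two control nodes).
    SELECT CEILING(SQRT(COUNT(*))) AS suggested_control_nodes
    FROM v_catalog.nodes;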

See Planning a Large Cluster Arrangement.

Control node assignment/realignment

After you specify the number of control nodes, you must update the control hosts' Spread configuration files to reflect the catalog change. Certain cluster management functions might require that you run other functions, restart the database, or both.

If, for example, you drop a control node, the cluster nodes that point to it must be reassigned to another control node. If a control node fails, all nodes assigned to it also fail, so you need to use the Administration Tools to restart the database. In this scenario, call the REALIGN_CONTROL_NODES() and RELOAD_SPREAD(true) functions, which notify nodes of the changes and realign fault groups. Calling RELOAD_SPREAD(true) connects existing cluster nodes to their newly assigned control nodes.
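
For example, the realignment described above might be run as the following sequence (a sketch only; whether a database restart is required depends on the scenario):

    -- Reassign cluster nodes whose control node was dropped and realign
    -- fault groups.
    SELECT REALIGN_CONTROL_NODES();

    -- Rewrite the Spread configuration and connect nodes to their newly
    -- assigned control nodes.
    SELECT RELOAD_SPREAD(true);

    -- If the scenario requires it, restart the database through the
    -- Administration Tools.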

On the other hand, running REALIGN_CONTROL_NODES() multiple times in a row does not change the layout beyond the initial realignment, so you do not need to restart the database. However, if you add or drop a node and then run REALIGN_CONTROL_NODES(), the call could change many node assignments.

Adding or dropping nodes, whether they are control nodes or non-control nodes, can change the control node assignments of other nodes in the cluster.

For more information, see Defining and Realigning Control Nodes on an Existing Cluster and Rebalancing Data Across Nodes.

Allocate standby nodes

Have as many standby nodes available as you can, ideally on racks you are already using in the cluster. If a node suffers a non-transient failure, use the Administration Tools "Replace Host" utility to swap in a standby node.

Standby node availability is especially important for control nodes. If the node you are replacing is a control node, all nodes assigned to that control node's host group must be taken offline while you swap in the standby node. For details on node replacement, see Replacing Nodes.
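
Before replacing a host, you can identify nodes that are down with an illustrative query like the following against the V_CATALOG.NODES system table; the actual replacement steps are in Replacing Nodes:

    -- List nodes that are not UP to identify candidates for replacement.
    SELECT node_name, node_state
    FROM v_catalog.nodes
    WHERE node_state <> 'UP';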

Plan for cluster growth

If you plan to expand an existing cluster to 120 or more nodes, you can configure the number of control nodes for the cluster after you add the new nodes. See Defining and Realigning Control Nodes.
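
One possible sequence after adding nodes is sketched below. It assumes your Vertica version provides the SET_CONTROL_SET_SIZE and REBALANCE_CLUSTER meta-functions; see Defining and Realigning Control Nodes for the authoritative procedure, and the value 16 is only a placeholder:

    -- Set the number of control nodes for the enlarged cluster.
    SELECT SET_CONTROL_SET_SIZE(16);

    -- Realign control node assignments and rewrite the Spread configuration.
    SELECT REALIGN_CONTROL_NODES();
    SELECT RELOAD_SPREAD(true);

    -- After restarting the database, rebalance data across the nodes.
    SELECT REBALANCE_CLUSTER();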

Write custom fault groups

When you deploy a large cluster, Vertica automatically creates fault groups around control nodes, placing nodes that share a control node into the same fault group. Alternatively, you can specify which cluster nodes should reside in a particular correlated failure group and share a control node. See High Availability With Fault Groups in Vertica Concepts.
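
For example, the following statements sketch a rack-aligned fault group; the group and node names are placeholders for your own cluster:

    -- Create a fault group for the nodes that share a rack.
    CREATE FAULT GROUP rack1;

    -- Add the rack's nodes to the group so they share a control node.
    ALTER FAULT GROUP rack1 ADD NODE v_exampledb_node0001;
    ALTER FAULT GROUP rack1 ADD NODE v_exampledb_node0002;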

Use segmented projections

On large-cluster setups, minimize the use of unsegmented projections in favor of segmented projections. When you use segmented projections, Vertica creates buddy projections and distributes copies of segmented projections across database nodes. If a node fails, data remains available on the other cluster nodes.
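
For example, a segmented projection over a hypothetical sales table might look like the following sketch; specifying KSAFE tells Vertica to create the buddy projections:

    -- Segment the projection across all nodes; table and column names are
    -- placeholders. KSAFE 1 creates buddy projections for fault tolerance.
    CREATE PROJECTION sales_seg AS
    SELECT order_id, customer_id, amount
    FROM sales
    SEGMENTED BY HASH(order_id) ALL NODES KSAFE 1;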

Use the Database Designer

OpenText recommends that you use the Database Designer to create your physical schema. If you choose to design projections manually, segment large tables across all database nodes and create unsegmented (replicated) projections for small tables on all database nodes.
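
For example, a replicated projection for a small, hypothetical dimension table might be sketched as follows:

    -- Replicate a small dimension table on every node instead of segmenting
    -- it; table and column names are placeholders.
    CREATE PROJECTION region_dim_rep AS
    SELECT region_id, region_name
    FROM region_dim
    UNSEGMENTED ALL NODES;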