Installing a Large Cluster

Whether you are forming a new large cluster (adding all nodes for the first time) or expanding an existing cluster to a large cluster, Vertica provides two methods that let you specify the number of control nodes (the nodes that run control messaging). See the following sections for details:

If you want to install a new large cluster

To configure Vertica for a new large cluster, pass the --large-cluster <integer> argument to the install_vertica script. Vertica selects the first <integer> hosts from the comma-separated --hosts host_list as control nodes. It then assigns all other hosts in the --hosts argument to a control node based on a round-robin model.
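The selection and round-robin assignment described above can be sketched as follows. This is an illustration only, not Vertica's implementation: the function name assign_round_robin and the host names are hypothetical, and the sketch assumes the remaining hosts are dealt out in list order starting with the first control node.

```python
def assign_round_robin(hosts, num_control):
    """Map each control node to the hosts assigned to it (illustrative only)."""
    if not 1 <= num_control <= len(hosts):
        raise ValueError("num_control must be between 1 and len(hosts)")
    # The first num_control hosts in the list become control nodes.
    controls = hosts[:num_control]
    groups = {control: [] for control in controls}
    # Assumption: the remaining hosts are dealt out in list order, round-robin.
    for i, host in enumerate(hosts[num_control:]):
        groups[controls[i % num_control]].append(host)
    return groups


# Hypothetical host names, in the order they would appear in --hosts.
hosts = "a1,b1,c1,a2,b2,c2,a3".split(",")
groups = assign_round_robin(hosts, 3)
for control, members in groups.items():
    print(control, members)
```

Under these assumptions, the order of the --hosts list alone decides which hosts end up grouped with which control node.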

The large cluster layout is determined by the number of hosts you include in the --hosts argument, not by the number of nodes you later include in the database. If you specify 120 or more hosts in the --hosts list but do not explicitly enable large cluster with the --large-cluster argument, Vertica automatically enables large cluster and configures control nodes for you.

To give control nodes and the nodes assigned to them the highest possible fault tolerance, you must list hosts in the --hosts host_list in a specific order. For example, if your hosts are spread across four racks, the first four hosts in the list must be one host from each rack, so that each rack gets one control node. The rest of the list must then follow in groups of four, one host from each rack, in the same rack order as the first four hosts. Continue this pattern for all targeted hosts. See Sample rack-based cluster hosts topology below for an example.
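One way to produce such an ordering is to interleave per-rack host lists. The following is a minimal sketch, assuming every rack holds the same number of hosts. The helper order_hosts_by_rack and the host names are hypothetical and are not part of any Vertica tooling.

```python
def order_hosts_by_rack(racks):
    """Interleave equally sized racks: one host per rack, always in the same rack order."""
    sizes = {len(rack) for rack in racks}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes every rack holds the same number of hosts")
    # zip(*racks) yields one host from each rack per pass, preserving rack order.
    return [host for row in zip(*racks) for host in row]


# Hypothetical inventory: four racks with three hosts each.
racks = [
    ["r1-a", "r1-b", "r1-c"],
    ["r2-a", "r2-b", "r2-c"],
    ["r3-a", "r3-b", "r3-c"],
    ["r4-a", "r4-b", "r4-c"],
]
hosts_arg = ",".join(order_hosts_by_rack(racks))
print(hosts_arg)
```

The resulting string starts with one host from each rack, which become the control nodes when the number of control nodes equals the number of racks.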

If you pass the --large-cluster argument a DEFAULT value instead of an <integer> value, Vertica calculates a number of control nodes based on the total number of nodes specified in the --hosts host_list argument. If you want a specific number of control nodes on the cluster, you must use the <integer> value.

Sample rack-based cluster hosts topology

This example shows a simple, multi-rack cluster layout, in which cluster nodes are evenly distributed across three racks. Each rack has one control node.

In the rack-based example:

  • Rack-1, Rack-2, and Rack-3 are managed by a single network switch
  • Host-1-1, Host-1-2, and Host-1-3 are control nodes
  • All hosts on Rack-1 are assigned to control node Host-1-1
  • All hosts on Rack-2 are assigned to control node Host-1-2
  • All hosts on Rack-3 are assigned to control node Host-1-3

In the following install_vertica script fragment, note the order of the hosts in the --hosts list argument. The final arguments specifically enable large cluster and provide the number of control nodes (3):

... install_vertica --hosts Host-1-1,Host-1-2,Host-1-3,
    Host-2-1,Host-2-2,Host-2-3,Host-3-1,Host-3-2,Host-3-3,
    Host-4-1,Host-4-2,Host-4-3,Host-5-1,Host-5-2,Host-5-3 --rpm
    <vertica-package-name> <other-required-options>
    --large-cluster 3

After the installation process completes, use the Administration Tools to create a database. The result is a Vertica cluster with three control nodes, each associated with the hosts that reside on the same rack as that control node.
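As a sanity check on the sample host order, the sketch below rebuilds the 15-host list and confirms that a round-robin assignment keeps every host with the control node on its own rack. It rests on two assumptions that the text implies but does not state outright: the trailing number in each host name is the rack number, and the remaining hosts are dealt out in list order. The helper rack_of is hypothetical.

```python
# Rebuild the sample list: Host-1-1, Host-1-2, Host-1-3, Host-2-1, ...
hosts = [f"Host-{n}-{rack}" for n in range(1, 6) for rack in range(1, 4)]
num_control = 3
controls = hosts[:num_control]  # Host-1-1, Host-1-2, Host-1-3


def rack_of(host):
    """Assumption: the trailing number in the host name is the rack number."""
    return host.rsplit("-", 1)[1]


# Assumption: remaining hosts are assigned to control nodes in list order, round-robin.
assignment = {
    host: controls[i % num_control]
    for i, host in enumerate(hosts[num_control:])
}
same_rack = all(rack_of(host) == rack_of(control) for host, control in assignment.items())
print(same_rack)
```

If the host list were ordered rack by rack instead (all of Rack-1 first, and so on), the same check would fail, which is why the interleaved order matters.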

If you want to expand an existing cluster

When you add a node to an existing cluster, Vertica places the new node in an appropriate location within the cluster ring. Vertica then assigns the newly added node to a control node, based on the cluster's current allocations.

To give you more flexibility and control over which nodes run Spread, you can use the SET_CONTROL_SET_SIZE(integer) function. This function works like the installation script's --large-cluster <integer> option. See Defining and Realigning Control Nodes on an Existing Cluster for details.

The Vertica installation script cannot alter the database cluster.