Subclusters

Because Eon Mode separates compute and storage, you can create subclusters within your cluster to isolate work. For example, you might want to dedicate some nodes to loading data and others to executing queries. Or you might want to create subclusters for dedicated groups of users (who might have different priorities). You can also use subclusters to organize nodes into groups for easily scaling your cluster up and down.

Every node in your Eon Mode database must belong to a subcluster. This requirement means your database must always have at least one subcluster. When you create a new Eon Mode database, Vertica creates a subcluster named default_subcluster that contains the nodes you create when initially creating your database. If you add nodes to your database and do not specify a subcluster to add them to, Vertica adds them to the default subcluster. You can choose to designate another subcluster as the default subcluster, or rename default_subcluster to something more descriptive. See Altering Subcluster Settings for more information.

Fault Group Conversions when Upgrading to 9.3.0 or Beyond

In versions of Vertica prior to version 9.3.0, you defined subclusters in an Eon Mode database using a feature called fault groups. Vertica treated each fault group you defined in an Eon Mode database as a subcluster. Starting in Vertica version 9.3.0, you explicitly define subclusters. When you upgrade an Eon Mode database from a prior version to 9.3.0 or later, the upgrade script automatically converts fault groups defined in the database to subclusters.

During the upgrade, Vertica:

  • converts the fault groups to subclusters. Unlike fault groups, subclusters do not have any form of nesting. Therefore, any nested fault groups become individual subclusters.
  • makes all of the converted subclusters into primary subclusters.
  • assigns nodes that are part of a fault group to the corresponding converted subcluster.
  • assigns nodes that are not in a fault group to the default subcluster that Vertica creates during the upgrade process.
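The conversion rules above can be modeled roughly as follows. This is an illustrative Python sketch, not the upgrade script itself: the FaultGroup class, the convert_fault_groups function, and the node names are hypothetical. It also assumes that the default subcluster created during the upgrade is named default_subcluster.

```python
from dataclasses import dataclass, field


@dataclass
class FaultGroup:
    """Hypothetical stand-in for a pre-9.3.0 fault group (may be nested)."""
    name: str
    nodes: list = field(default_factory=list)
    children: list = field(default_factory=list)


def convert_fault_groups(fault_groups, all_nodes, default_name="default_subcluster"):
    """Model the upgrade: return (subcluster -> nodes, names of converted primaries)."""
    subclusters = {}

    def flatten(group):
        # Subclusters cannot nest, so every fault group, nested or not,
        # becomes its own subcluster holding the nodes assigned to it.
        subclusters[group.name] = list(group.nodes)
        for child in group.children:
            flatten(child)

    for group in fault_groups:
        flatten(group)

    # All converted subclusters become primary subclusters.
    converted_primaries = set(subclusters)

    # Nodes that were in no fault group go to the default subcluster.
    # The name used here is an assumption made for this sketch.
    assigned = {node for members in subclusters.values() for node in members}
    subclusters[default_name] = [n for n in all_nodes if n not in assigned]
    return subclusters, converted_primaries
```

For example, a fault group rack1 that contains node n1 and a nested group shelf1 with nodes n2 and n3 would yield two separate subclusters, rack1 and shelf1, while an ungrouped node n4 would land in the default subcluster.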

In Vertica 9.3.0, the connection load balancing policy has been updated to allow for load balancing groups based on subclusters. See About Connection Load Balancing Policies for more information about this feature. When Vertica upgrades an Eon Mode database to version 9.3.0 or beyond, it does not convert load balancing groups based on fault groups into groups based on the converted subclusters. You must redefine these load balancing groups yourself so that they are based on the newly created subclusters.

Using Subclusters for Work Isolation

Database administrators are often concerned about workload management. Intense analytics queries can consume so many resources that they interfere with other important database tasks, such as data loading. Subclusters help you prevent resource issues by isolating workloads from one another.

In Eon Mode, by default, queries only run on nodes in the subcluster that contains the initiator node. For example, consider the two subclusters shown in the following diagram. If you are connected to Node 4, your queries would run on nodes 4 through 9.

Image showing two subclusters. One, labeled "Load Subcluster", contains nodes 1 through 3. The other, named "Query Subcluster", contains nodes 4 through 9.

Similarly, queries started on Node 1 only run on nodes 1 through 3.
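The default isolation rule can be expressed as a simple lookup. The following Python sketch is only a model of the behavior described above, not a Vertica API: the SUBCLUSTERS mapping, the subcluster names, and the nodes_for_query function are hypothetical, and the node numbers follow the example in the text (nodes 1 through 3 and 4 through 9).

```python
# Hypothetical layout that follows the example in the text.
SUBCLUSTERS = {
    "load_subcluster": [1, 2, 3],
    "query_subcluster": [4, 5, 6, 7, 8, 9],
}


def nodes_for_query(initiator, subclusters=SUBCLUSTERS):
    """Return the nodes that can take part in a query started on `initiator`."""
    for members in subclusters.values():
        if initiator in members:
            # By default, only the initiator's own subcluster runs the query.
            return list(members)
    raise ValueError(f"node {initiator} does not belong to any subcluster")


print(nodes_for_query(4))  # [4, 5, 6, 7, 8, 9]
print(nodes_for_query(1))  # [1, 2, 3]
```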

This isolation lets you configure your database cluster to prevent workloads from interfering with each other. You can assign subclusters to specific tasks such as loading data, performing in-depth analytics, and short-running dashboard queries. You can also create subclusters for different groups in your organization, so their workloads do not interfere with one another.

Subcluster Types

There are two types of subclusters: primary and secondary. In most cases, primary subclusters form the core of your Vertica database. Your primary subclusters should always be running. They are best suited for tasks such as running DDL statements and data loading, as they can perform these tasks more efficiently than secondary subclusters. You usually do not dynamically change the size of primary subclusters or stop them to scale your cluster.

Secondary subclusters are designed for dynamic scaling: you add and remove or start and stop these subclusters based on your analytic needs. They are optimized for running queries, rather than DDL or data loading workloads. Their query performance should be better than that of primary subclusters. While you can perform DDL and data loading on a secondary subcluster, these operations run more slowly there, because they must rely on a primary subcluster to finish database commits.

The nodes in the subcluster inherit their primary or secondary status from the subcluster that contains them; primary subclusters contain primary nodes and secondary subclusters contain secondary nodes.

Subcluster Types and Elastic Scaling

The most important difference between primary and secondary subclusters is their impact on how Vertica determines whether the database is K-safe and has a quorum. Vertica only considers the nodes in primary subclusters (primary nodes) when determining whether all of the shards in the database have a subscribing node. It also only considers primary nodes when determining whether more than half of those nodes are running (also known as having a quorum of primary nodes). If either of these conditions is not met, the database shuts down to prevent data corruption. See Maintaining Data Integrity and High Availability in an Eon Mode Database for more information about how Vertica maintains data integrity.

Vertica does not consider the secondary nodes when determining whether the database has shard coverage or a quorum of nodes. This fact makes secondary subclusters perfect for managing groups of nodes that you plan to expand and reduce dynamically. You can stop or remove an entire subcluster of secondary nodes without triggering a safety shutdown.
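The two conditions described above can be sketched as a pair of checks over the primary nodes. This Python model is a simplification for illustration only. The Node class and the function names are hypothetical, and it assumes that a shard counts as covered when at least one running primary node subscribes to it.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node record used only in this sketch."""
    name: str
    primary: bool
    up: bool
    shards: frozenset  # shards this node subscribes to


def has_shard_coverage(nodes, all_shards):
    """True if every shard has a subscribing primary node that is up."""
    covered = set()
    for node in nodes:
        if node.primary and node.up:
            covered |= node.shards
    return covered >= set(all_shards)


def has_quorum(nodes):
    """True if more than half of the primary nodes are up."""
    primaries = [n for n in nodes if n.primary]
    up_count = sum(1 for n in primaries if n.up)
    return up_count > len(primaries) / 2


def database_can_stay_up(nodes, all_shards):
    """Secondary nodes never influence the result of either check."""
    return has_shard_coverage(nodes, all_shards) and has_quorum(nodes)
```

In this model, marking every secondary node as down leaves both checks unchanged, which is why an entire secondary subcluster can be stopped safely.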

If you stop or lose too many primary nodes, your database may shut down to maintain data integrity. For example, suppose you have a 3-node cluster that you use primarily to load data. When it comes time to create reports based on that data, you create a 6-node subcluster to perform analytics, bringing the total number of nodes in the cluster to 9. Once you are done with your analytics, you naturally want to stop the added subcluster to save money. If you added these nodes to a primary subcluster, stopping the subcluster (which stops all 6 nodes at once) causes your database to lose quorum because more than half of the primary nodes are no longer available. This loss of quorum causes your database to shut down.

To avoid this issue, use a secondary subcluster for the new nodes. The secondary subcluster makes the new nodes secondary nodes. Vertica does not count the secondary nodes when determining quorum of primary nodes. Therefore, you can shut down this subcluster without causing your database to shut down.
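The arithmetic behind this example can be checked with a few lines of Python. The quorum_after_stop function is a hypothetical helper written for this illustration. It applies only the "more than half of the primary nodes are up" rule and ignores shard coverage.

```python
def quorum_after_stop(primary_total, primary_stopped):
    """Return True if a quorum of primary nodes remains after stopping some."""
    primary_up = primary_total - primary_stopped
    return primary_up > primary_total / 2


# New nodes added as primary: 9 primary nodes, 6 of them stopped.
# 3 nodes remain up, which is not more than 4.5, so quorum is lost.
print(quorum_after_stop(primary_total=9, primary_stopped=6))  # False

# New nodes added as secondary: still 3 primary nodes, none stopped.
# Stopping the 6 secondary nodes does not change the count.
print(quorum_after_stop(primary_total=3, primary_stopped=0))  # True
```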

Additional Differences Between Primary and Secondary Nodes

In addition to not being considered when determining whether the database can continue to run safely, Vertica places several other restrictions on secondary nodes:

  • They cannot be the primary subscriber of a shard.
  • They cannot take part in cluster formation when the database starts. Only the primary nodes count when Vertica determines whether there are enough nodes that are up to start the database.
  • They are not responsible for committing a transaction. Primary nodes have the final say on whether a transaction is committed, even if the transaction ran on a secondary subcluster. This is the reason you should avoid running DDL statements or performing data loading on secondary subclusters. Primary subclusters are always more efficient for these workloads.
  • They do not persist their transaction logs to disk at the end of every commit. Persisting for each transaction is unnecessary because the primary nodes are responsible for commits. Secondary nodes do not write their logs to disk for each checkpoint. Because they spend less time persisting data, you should see higher query performance on secondary subclusters versus primary subclusters.

These restrictions are mainly necessary to ensure that Vertica can efficiently remove secondary nodes from the database.

Minimum Subcluster Size for K-Safe Databases

In a K-safe database, subclusters must have at least three nodes in order to operate. Each subcluster tries to maintain subscriptions to all shards in the database. If a subcluster has fewer than three nodes, it cannot maintain shard coverage. Vertica returns an error if you attempt to rebalance shards in a subcluster with fewer than three nodes in a K-safe database.

Subclusters and Shards

You can add new nodes to your database to help increase the number of queries that Vertica runs simultaneously (also called throughput scaling). The best practice to follow when adding these nodes is to create a new subcluster for them. Isolating the new nodes in a separate subcluster improves the database's ability to process queries independently.

Having more nodes in a subcluster than the number of shards in the database is less efficient than having separate subclusters. You will see the best throughput performance when you have multiple subclusters with the same number of nodes as you have shards.

When there are more nodes than shards in a subcluster, you may see some query throughput improvements. However, using subclusters to group nodes helps Vertica to more efficiently process the queries in parallel.
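As a rough planning aid, the guidance above (several subclusters, each with as many nodes as the database has shards) can be turned into a small calculation. The plan_subclusters function below is a hypothetical helper, not a Vertica tool. It only splits a node count into shard-sized groups and reports any nodes left over.

```python
def plan_subclusters(total_nodes, shard_count):
    """Split nodes into subclusters of `shard_count` nodes each.

    Returns (sizes, leftover): a list with one entry per full subcluster,
    plus the number of nodes that do not fill another subcluster.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    if total_nodes < 0:
        raise ValueError("total_nodes cannot be negative")
    full_subclusters, leftover = divmod(total_nodes, shard_count)
    return [shard_count] * full_subclusters, leftover


print(plan_subclusters(12, 4))  # ([4, 4, 4], 0)
print(plan_subclusters(10, 4))  # ([4, 4], 2)
```

In the second case, the sketch reports the two extra nodes rather than assigning them. How best to use them depends on your workload.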

It is unlikely that you will have a use case where you want more nodes in a subcluster than you have shards in the database. However, if you do, contact Vertica technical support to discuss configuration options that can improve performance in this case.

See Also