Configuring Your Vertica Cluster for Eon Mode

Running Vertica in Eon Mode decouples the cluster size from the data volume and lets you configure for your compute needs independently from your storage needs. There are a number of factors you must consider when deciding what sorts of instances to use and how large your Eon Mode cluster will be.

Before You Begin

Vertica Eon Mode works both in the cloud and on-premises. As a Vertica administrator setting up your production cluster running in Eon Mode, you must make decisions about the virtual or physical hardware you will use to meet your needs. This topic provides guidelines and best practices for selecting server types and cluster sizes for a Vertica database running in Eon Mode. It assumes that you have a basic understanding of the Eon Mode architecture and concepts such as communal storage, depot, and shards. If you need to refresh your understanding, see Eon Mode Architecture.

Cluster Size Overview

Because Eon Mode separates data storage from the computing power of your nodes, choosing a cluster size is more complex than for an Enterprise Mode database. Usually, you choose a base number of nodes that will form one or more primary subclusters. These subclusters contain nodes that are always running in your database. You usually use them for workloads such as data loading and executing DDL statements. You rarely alter the size of these subclusters dynamically. As you need additional compute resources to execute queries, you add one or more subclusters (usually secondary subclusters) of nodes to your database.

When choosing your instances and sizing your cluster for Eon Mode, consider the working data size that your database will be dealing with. This is the amount of data that most of your queries operate on. For example, suppose your database stores sales data. If most of the queries running on your database analyze the last month or two of sales to create reports on sales trends, then your working data size is the amount of data you typically load over a few months.

Choosing Instances or Physical Hardware

Depending on the complexity of your workload and expected concurrency, choose physical or virtual hardware that has sufficient CPU and memory. For production clusters Vertica recommends the following minimum configuration for either virtual or physical nodes in an Eon Mode database:

16 cores
128 GB RAM
2 TB of local storage

The above specifications are just a minimum recommendation. You should consider increasing these specifications based on your workloads and the amount of data you are processing.

You must have a minimum of 3 nodes in the an Eon Mode database cluster.

For specific recommendations of instances for cloud-based Eon Mode database, see:

Determining Local Storage Requirements

For both virtual and physical hardware, you must decide how much local storage your nodes need. In Eon Mode, the definitive copy of your database's data resides in the communal storage. This storage is provided by either a cloud-based object store such as AWS S3, or by an on-premises object store, such as a Pure Storage FlashBlade appliance.

Even though your database's data is stored in communal storage, your nodes still need some local storage. A node in an Eon Mode database uses local storage for three purposes:

Depot storage: To get the fastest response time for frequently executed queries, provision a depot large enough to hold your working data set after data compression. Divide the working data size by the number of nodes you will have in your subcluster to estimate the size of the depot you need for each node in a subcluster. See Choosing the Number of Shards and the Initial Node Count below to get an idea of how many nodes you want in your initial database subcluster. In cases where you expect to dynamically scale your cluster up, estimate the depot size based on the minimum number of nodes you anticipate having in the subcluster.

Also consider how much data you will load at once when sizing your depot. When loading data, Vertica defaults to writing uncommitted ROS files into the depot before uploading the files to communal storage. If the free space in the depot is not sufficient, Vertica evicts files from the depot to make space for new files.

Your data load fails if the amount of data you try to load in a single transaction is larger the total sizes of all the depots in the subcluster. To load more data than there is space in the subcluster's combined depots, set UseDepotForWrites to 0. This configuration parameter tells Vertica to load the data directly into communal storage.
Data storage: The data storage location holds files that belong to temporary tables and temporary data from sort operators that spill to disk. When loading data into Vertica, the sort operator may spill to disk. Depending on the size of the load, Vertica may perform the sort in multiple merge phases. The amount of data concurrently loaded into Vertica cannot be larger than the sum of temporary storage location sizes across all nodes divided by 2.
Catalog storage. The catalog size depends on the number of database objects per shard and the number of shard subscriptions per node.

Vertica recommends a minimum local storage capacity of 2 TB per node, out of which 60% is reserved for the depot and the other 40% is shared between the catalog and data location. If you determine that you need a depot larger than 1.2TB per node (which is 60% of 2TB) then add more storage than this minimum recommendation. You can calculate the space you need using this equation:

Disk SPace Per Node = Working Data Size / # of Nodes in subluster * 1.40

For example, suppose you have a compressed working data size of 24TB, and you want to have a initial primary subcluster of 3 nodes. Using these values in the equation results in 13.33TB:

24TB / 3 Nodes * 1.40 = 11.2TB

Choosing the Number of Shards and the Initial Node Count

Shards are how Vertica divides the responsibility for the data in communal storage among nodes. Each node in a subcluster subscribes to at least one shard in communal storage. During queries, data loads, and other database tasks, each node is responsible for the data in the shards it subscribes to. See Shards and Subscriptions for more information.

The relation between shards and nodes means that when picking the number of shards for your database, you must consider the number of nodes you will have in your database, and how you expect to expand your database in the future.

You set the number of shards when you create your database. Once created, you cannot change the number shards. Therefore, choose a shard count that will allow your database to grow over time.

The following table shows the recommended shard count and initial node count base on the working data size. The initial node count is the number of nodes you will have in your core primary subcluster. Note that the number of shards is always a multiple of the node count. This ensures that the shards are evenly divided between the nodes. For best performance, always make the number of shards a multiple or even divisor of the number of nodes in a subcluster.

Cluster Type	Working Data Size	Number of Shards	Initial Node Count
Small	Up to 24 TB	6	3
Medium	Up to 48 TB	12	6
Large	Up to 96 TB	24	12
Extra large	Up to 192 TB	48	24

A 2:1 ratio for shards to nodes is a performance recommendation, rather than a hard limit. If you attempt to go higher than 3:1, MC offers a warning to make sure you have taken all aspects of shard count into consideration because, once set, the shard count cannot be changed.

How Shard Count Affects Scaling Your Cluster

The number of shards you choose for your database impacts your ability to scale your database in the future. If you have too few shards, your ability to efficiently scale your database can be limited. If you have too many shards, your database performance can suffer. A larger number of shards increases inter-node communication and catalog complexity.

One key factor in deciding your shard count is determining how you want to scale your database. There are two strategies you can use when adding nodes to your database. Each of these strategies let you improve different types of database performance:

To increase the performance of complex, long-running queries, add nodes to an existing subcluster. These additional nodes improve the overall performance of these complex queries by splitting the load across more compute nodes. You can add more nodes to a subcluster than you have shards in the database. In this case, nodes in the subcluster that subscribe to the same shard will split up the data in the shard when performing a query. See Elasticity for more information.
To increase the throughput of multiple short-term queries (often called "dashboard queries"), improve your cluster's parallelism by adding additional subclusters. Subclusters work independently and in parallel on these shorter queries. See Subclusters for more information.

These two approaches have an impact on the number of shards you choose to start your database with. Complex analytic queries perform better on subclusters with more nodes, which means that 6 nodes with 6 shards perform better than 3 nodes and 6 shards. Having more nodes than shards can increase performance further, but the performance gain is not linear. For example, a subcluster containing 12 nodes in a 6-shard database is not quite as efficient as as 12-node subcluster in a 12-shard database. Dashboard-type queries operating on smaller data sets may not see much difference between a 3-node subcluster in a 6-shard database and 6-node subcluster in a 6-shard database.

In general, choose a shard count that matches your expected working data size 6–12 months in the future. For more information about scaling your database, see Elasticity.

Use Cases

Let’s look at some use cases to learn how to size your Eon Mode cluster to meet your own particular requirements.

Use Case 1: Save Compute by Provisioning When Needed, Rather than for Peak Times

This use case highlights increasing query throughput in Eon Mode by scaling a cluster from 6 to 18 nodes with 3 subclusters of 6 nodes each. In this use case, you need to support a high concurrent, short query workload on a 24 TB or less working data set. You create an initial database with 6 nodes and 6 shards. You scale your database for concurrent throughput on demand by adding one or more subclusters during certain days of the week or for specific date ranges when you are expecting a peak load. You can then shut down or terminate the additional subclusters when your database experiences lower demand. With Vertica in Eon Mode, you save money by provisioning on demand, rather than provisioning for the peak times.

Use Case 2: Complex analytic workload requires more compute nodes

This use case showcases the idea that complex analytic workloads on large working data sets benefit from high shard count and node count. You create an initial subcluster with 24 nodes and 24 shards. As needed, you can add an additional 24 nodes to your initial subcluster. These additional nodes enable the subcluster to use elastic crunch scaling to reduce the time it takes to complete complex analytic queries.

Use Case 3: Workload isolation

This use case showcases the idea of having separate subclusters to isolate ETL and report workloads. You create an initial primary subcluster with 6 nodes and 6 shards for servicing ETL workloads. Then add another 6-node secondary subcluster for executing query workloads. To separate the two workloads, you can configure a network load balancer or create connection load balancing policies in Vertica to direct clients to the correct subcluster based on the type of workloads they need to execute.