Sizing Your Vertica Cluster for an Eon Mode Database

This blog post was authored by Shrirang Kamat.

Vertica in Eon Mode is a new architecture that separates compute and storage, allowing users to take advantage of cloud economics by rapidly scaling and shrinking clusters in response to a variable workload. Eon Mode decouples cluster size from data volume and lets you size the cluster for your compute needs instead. While a Vertica cluster can host an Eon Mode database or an Enterprise Mode database, this document focuses on Eon Mode. Currently, Eon Mode works on AWS. For more information, see CloudFormation Template (CFT) Overview in the Vertica documentation.

As a Vertica administrator setting up your production cluster running in Eon Mode, you have to make important decisions about picking the correct EC2 instances and cluster size to meet your needs. This document provides guidelines and best practices for selecting instance types and cluster sizes for a Vertica database running in Eon Mode.

This document assumes that you have a basic understanding of the Eon Mode architecture and references Eon Mode concepts such as communal storage, depot, and shards. Make sure you are familiar with these concepts. You can find details about the Eon Mode architecture in Eon Mode Architecture in the Vertica documentation.

Cluster sizing guidelines

In Enterprise Mode, sizing your cluster depends largely on the total compressed data size. Most Vertica implementations achieve 2:1 compression or better on disk. To estimate the number of nodes, divide the total compressed data size by the storage capacity of each node. Vertica recommends that you put no more than 10TB of compressed data on each node. Depending on the complexity of your workload and the expected concurrency, pick instance types that have sufficient CPU and memory. For production clusters, Vertica recommends a minimum of 16 cores, 128GB of RAM, and at least 3 nodes for high availability.
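The Enterprise Mode arithmetic above can be sketched as a small function. The 2:1 compression ratio, 10TB-per-node cap, and 3-node minimum come from the guidelines in this document; the 100TB raw data size is a hypothetical example.

```python
import math

def enterprise_node_count(raw_data_tb, compression_ratio=2.0,
                          max_tb_per_node=10.0, min_nodes=3):
    """Estimate Enterprise Mode node count: compressed data size divided by
    per-node capacity, with a floor of 3 nodes for high availability."""
    compressed_tb = raw_data_tb / compression_ratio
    return max(min_nodes, math.ceil(compressed_tb / max_tb_per_node))

# 100 TB raw -> 50 TB compressed -> 5 nodes at 10 TB per node
print(enterprise_node_count(100))  # 5
# Small data sets still get the 3-node high-availability minimum
print(enterprise_node_count(10))   # 3
```

This is only a first-cut estimate; workload complexity and concurrency may push you toward more nodes or larger instances.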

In Eon Mode, communal storage is like a data lake that can store unlimited data. Sizing for Eon Mode depends on the following factors:

Working Data Size: The amount of data on which most of your queries will operate.

Depot Size: To get the fastest response time for frequently executed short queries, you want the most frequently read data from your working data set to be in your depot at all times. Performance of queries that run directly against communal storage depends on the amount of data those queries read from communal storage. Vertica has optimizations such as predicate pushdown to read only the required data blocks, whether the data is in the depot or in communal storage. Our internal testing found that TPC-H queries ran 2 times faster against data in the depot than against data in communal storage. Each Vertica node needs local storage for the depot, catalog, and temp space required for query execution. Vertica recommends a minimum local storage capacity of 600GB per node, of which 60% can be reserved for the depot and the other 40% shared between catalog and temp space. The depot on each node must be large enough to hold the uncommitted data loaded into Vertica plus the data concurrently being loaded, divided by the number of nodes in the cluster. The temp space must be large enough to hold the temporary files written during query processing plus two times the size of the data concurrently being loaded into Vertica, divided by the number of nodes in the cluster.
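The storage split and sizing formulas above can be sketched as follows. The 600GB default and 60/40 split come from this document; the uncommitted-data, concurrent-load, and query-temp volumes are hypothetical examples, and the temp formula below assumes one reading of the guideline (only the load portion is divided across nodes).

```python
def local_storage_split(total_gb=600.0):
    """Recommended per-node split: 60% depot, 40% shared catalog + temp."""
    return {"depot": total_gb * 0.6, "catalog_and_temp": total_gb * 0.4}

def min_depot_per_node_gb(uncommitted_gb, concurrent_load_gb, nodes):
    # Depot must hold uncommitted data plus data being loaded,
    # spread across all nodes in the cluster.
    return (uncommitted_gb + concurrent_load_gb) / nodes

def min_temp_per_node_gb(query_temp_gb, concurrent_load_gb, nodes):
    # Temp must hold per-node query temp files plus 2x the data
    # being loaded, spread across all nodes (assumed interpretation).
    return query_temp_gb + (2 * concurrent_load_gb) / nodes

print(local_storage_split())               # {'depot': 360.0, 'catalog_and_temp': 240.0}
print(min_depot_per_node_gb(200, 100, 5))  # 60.0
print(min_temp_per_node_gb(50, 100, 5))    # 90.0
```

If the computed minimums exceed the 60/40 split of your instance's local storage, provision larger volumes rather than letting the depot shrink below your hot working set.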

Concurrency and throughput scaling: Pick an instance type based on the complexity of the queries in your workload and the expected concurrency. In Eon Mode, you can achieve elastic throughput scaling by adding more nodes and creating subclusters. To create a subcluster, you define a fault group with a number of nodes equal to or greater than the number of shards.

To pick an instance type and the number of nodes for a Vertica cluster running in Eon Mode, you must know your working data set size. The number of shards that you pick at database creation determines the maximum number of compute nodes that can execute a query in parallel. The shard count cannot be changed in a Vertica database running in Eon Mode, so Vertica recommends that you select it carefully based on the following table.
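The shard/parallelism relationship above can be sketched as a one-line function: a query is split across at most one node per shard, so the shard count fixed at creation caps per-query parallelism. The shard and node counts below are hypothetical examples.

```python
def max_nodes_per_query(shard_count, subcluster_nodes):
    """A single query can run on at most one node per shard, and never
    on more nodes than the subcluster has."""
    return min(shard_count, subcluster_nodes)

# 6 shards on 3 nodes: each node covers 2 shards, all 3 nodes participate
print(max_nodes_per_query(6, 3))   # 3
# 6 shards on 12 nodes: per-query parallelism is capped at 6 by the shard count
print(max_nodes_per_query(6, 12))  # 6
```

This is why undersizing the shard count limits future scaling: adding nodes beyond the shard count increases concurrency (more queries at once) but not the parallelism of any single query.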

The following are recommended instance types based on the working data size:

Complex analytic queries perform better on clusters with more nodes: for example, 6 nodes with 6 shards performs better than 3 nodes with 6 shards. Dashboard-type queries operating on smaller data sets may not see much difference between 3 nodes with 6 shards and 6 nodes with 6 shards.

Depending on cost and availability, you may choose instance types that use ephemeral instance storage or EBS volumes for your depot. An EBS-backed depot is not mandatory because in Eon Mode a copy of the data is safely stored in communal storage.

The following table has information to help you decide between instances with ephemeral instance storage and instances with EBS-only storage. Check with AWS for the latest prices.

Let’s take a look at some use cases to figure out how to size an Eon Mode cluster.

Use Case 1: Save compute by provisioning close to need, rather than peak times

This example highlights the elastic throughput scaling feature of Eon Mode, scaling a cluster from 5 to 25 nodes with 5 subclusters of 5 nodes each. In this use case, we want to support a highly concurrent, short-query workload on a medium-sized working data set. We create an initial cluster of type medium with 5 nodes and 5 shards. We can scale out throughput on demand by adding one or more subclusters on certain days of the week, or for specific date ranges when we expect a peak load. The cluster can then be shrunk back to its initial size by dropping nodes for normal workloads. With Vertica in Eon Mode, you save compute by provisioning close to the need rather than provisioning for peak times.

Use Case 2: Complex analytic workload requires more compute nodes

This example showcases the idea that complex analytic workloads on large working data sets benefit from an initial cluster of type large with a high shard count. We will create an initial cluster of type large with 20 nodes and 20 shards. As needed, you can add and remove nodes to improve throughput scaling.

Use Case 3: Workload isolation

This example showcases the idea of having separate subclusters to isolate ETL and reporting workloads. We create an initial cluster of type large with 10 nodes and 10 shards to service queries, then add another 10-node subcluster of type medium to support ETL workloads. You may need to configure an AWS network load balancer to separate the ETL workload from SELECT queries. Workload isolation can also be useful for isolating users with varying Vertica skills.

Use Case 4: Shrink your cluster to save costs

To shrink the cluster, drop nodes and Vertica automatically rebalances the shards among the remaining nodes. When you shrink the cluster to a size smaller than the initial cluster size, nodes may subscribe to more than two shards, which has the following impacts:

• The catalog size will be larger because nodes are subscribing to more shards.

• The depot will be shared by more shard subscriptions, which may lead to files being evicted.

• Each node will process more data, which may impact query performance.
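The effect of shrinking below the initial size can be sketched as follows: each node must cover at least ceil(shards/nodes) shard subscriptions (ignoring any extra secondary subscriptions kept for availability). The 20-shard database and node counts are hypothetical examples matching Use Case 2.

```python
import math

def shards_per_node(shard_count, node_count):
    """Minimum shard subscriptions each node must cover so that all
    shards remain subscribed after the cluster shrinks."""
    return math.ceil(shard_count / node_count)

# A 20-shard database at its initial 20-node size:
print(shards_per_node(20, 20))  # 1 subscription per node
# The same database shrunk to 5 nodes:
print(shards_per_node(20, 5))   # 4 subscriptions per node
```

At 4 subscriptions per node, each node carries 4 times the catalog metadata and its depot is contended by 4 shards' worth of files, which illustrates the catalog-growth and depot-eviction impacts listed above.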

For more information, see Using Eon Mode in the Vertica documentation.