
Workload Management Metrics – A Golden Triangle

Modern databases are often required to process many different kinds of workloads, ranging from short/tactical queries, to medium-complexity ad-hoc queries, to long-running batch ETL jobs, to extremely complex data-mining jobs (see my previous blog on workload classification for more information). DBAs must ensure that all concurrent workloads, each with its own Service Level Agreement (SLA), can co-exist well with one another while maximizing a system's overall performance.

So what is concurrency? Why should a customer care about concurrency?

Concurrency is a term used to describe having multiple jobs running in overlapping time intervals in a system. It doesn't necessarily mean that the jobs are, or ever will be, running at the same instant. Concurrency is synonymous with multi-tasking, and it is fundamentally different from parallelism, which is a common point of confusion. Parallelism represents a state in which two or more jobs are running at the exact same instant. The simplest example is a single-CPU computer. On such a computer you can, in theory, run multiple jobs by context-switching between them, which gives the user the illusion of parallelism: it appears that multiple jobs are running on the single CPU at the same time. However, if you take a snapshot at any given instant, you will find one and only one job running. In contrast, actual parallel processing is enabled by multiple working units (e.g., the multiple CPUs/cores in a modern database server such as the HP DL380p). Because Vertica is an MPP columnar database and an inherently multi-threaded application, it can take advantage of this multi-CPU/core server architecture to process queries both concurrently and in parallel.

Most customers do not usually care about concurrency directly. Rather, they have a specific requirement to execute a certain workload in a database, governed by a set of throughput and response-time (latency) objectives. Throughput (TP) is defined as the number of queries/jobs that a database can perform in a unit of time, and it is the most commonly used metric for measuring a database's performance. Response time (or latency) is the sum of queuing time and runtime; as such, it depends on both concurrency (a proxy for overall system load) and query performance (the inverse of runtime).

For a given workload, the three metrics throughput (TP), concurrency, and performance are related to each other by the simple equation:
Throughput = Concurrency * Performance

Knowing any two of these three metrics, you can derive the third. This relationship can be visually illustrated by the following Workload Management Metrics Triangle:

[Figure: Workload Management Metrics Triangle]

Concurrency is often NOT a direct customer requirement, because it depends on query performance and the throughput SLA. Customer requirements usually take a form like this: “We need to process 10K queries in one hour with an average response time of 1 minute or less.” So throughput (TP) is often the metric the customer is interested in, and concurrency is a derived metric.

Let’s consider a hypothetical customer POC requirement of processing twelve hundred queries in one minute and assume that there are two competing systems, X and Y.

On System X, executing such a workload would require a concurrency level of 40, given an average query runtime of 2 seconds.

On System Y, assuming an average query response time of 100 ms, executing the same workload requires a concurrency level of only 2 (because 20 queries/s = 2 * (1 query / 100 ms)).
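To make the arithmetic behind these numbers explicit, rearrange the triangle equation and express performance as queries per second (the required throughput is 1,200 queries / 60 s = 20 queries/s):

Concurrency = Throughput / Performance
System X: (20 queries/s) / (1 query / 2 s) = 20 / 0.5 = 40
System Y: (20 queries/s) / (1 query / 0.1 s) = 20 / 10 = 2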

What does this mean for the customer? Clearly System Y with its superior query processing capability needs far less concurrency to satisfy the SLA than System X and hence it is a better platform (from a purely technical perspective).

To summarize, for a given throughput (TP) SLA, the better the query/job performance, the less concurrency is needed. Less concurrency generally means lower or more efficient resource usage and better overall system performance (since more spare system resources remain to process other workloads). The goal of any workload performance tuning exercise should never be to increase concurrency. Instead, it should focus on minimizing a query's resource usage, improving its performance, and applying the lowest possible concurrency level that satisfies the customer's throughput (TP) and response-time (latency) requirements.

Po Hong is a senior pre-sales engineer in HP Vertica’s Corporate Systems Engineering (CSE) group with a broad range of experience in various relational databases such as Vertica, Neoview, Teradata and Oracle.

A Method for Vertica Workload Classification


Modern analytic databases such as HP Vertica often need to process a myriad of workloads ranging from the simplest primary-key lookup to complex analytical queries that include dozens of large tables and joins between them. Different types of load jobs (such as batch type ETL jobs and near real-time trickle loads) keep the data up-to-date in an enterprise data warehouse (EDW). Therefore, an enterprise class database like HP Vertica must have a robust yet easy-to-use mixed-workload management capability.

The Resource Manager

HP Vertica manages complex workloads using the Resource Manager. With the Resource Manager, you manage resource pools, which are pre-defined subsets of system resources with an associated queue. HP Vertica is preconfigured with a set of built-in resource pools that allocate resources to different request types. The General pool allows for a certain concurrency level based on the RAM and cores in the machines.

HP Vertica provides a sophisticated resource management scheme that allows diverse, concurrent workloads to run efficiently on the database. For basic operations, the built-in general pool is usually sufficient. However, you can customize this pool to handle specific workload requirements.

In more complex situations, you can also define new resource pools that are configured to limit memory usage, concurrency, and query priority. You can optionally restrict each database user to a specific resource pool to control the memory consumption of their requests.
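For example, a custom pool that caps memory and concurrency for a reporting workload, with one user restricted to it, might look like the following (a minimal sketch; the pool name, user name, and parameter values are illustrative, not recommendations):

-- Create a pool with its own memory budget and concurrency limits
CREATE RESOURCE POOL reporting_pool
  MEMORYSIZE '8G'          -- memory reserved for this pool
  MAXMEMORYSIZE '16G'      -- upper bound the pool may borrow up to
  PLANNEDCONCURRENCY 8     -- expected number of concurrent queries
  MAXCONCURRENCY 8         -- hard cap on concurrent queries in this pool
  PRIORITY 50;             -- relative priority versus other pools

-- Restrict a user's requests to this pool
ALTER USER report_user RESOURCE POOL reporting_pool;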

Understanding and Classifying Workloads

Before you start thinking about resource pools and workload optimization in HP Vertica, you must first develop a solid understanding of the customer’s workloads and know how to properly classify them.

What should you use as a classification criterion? You could pick apart a complex query, studying its structure in detail, counting and adding up the number of tables, joins, aggregate functions, derived tables, and analytical functions to come up with some weighted score. However, this kind of approach is extremely tedious and subjective and, as a result, is not a practical option.

What if we use the standalone runtime of a query as the criterion? This method is also problematic, because a query that runs in one minute while using up 80% of a system's resources obviously should not be in the same category as another query that runs in the same amount of time (one minute) but uses less than 0.1% of the resources.

In HP Vertica, the best proxy for query complexity is memory usage. Being a modern MPP columnar database, HP Vertica is rarely, if ever, I/O bound. It is also less likely to hit a CPU bottleneck, given the power and speed of modern multi-core CPUs. Therefore, the most common resource bottleneck in a production HP Vertica cluster running a complex mixed workload is memory. Because of this, the HP Vertica Resource Manager focuses on establishing equitable memory allocation among different workloads or pools, which ensures that no resource pool is starved of memory in the worst-case scenario: full system load.

Determining Memory Requirements for a Query

If you can quickly determine how much memory a query requires per node, you can use that value to classify an HP Vertica query (or any other job). Based on extensive real-world performance tuning experience gained from working with some of HP Vertica's biggest and most demanding customers, I have found the following classification rules to be easy to use and effective:

Small: < 200 MB
Medium: 200-800 MB
Large: > 800 MB

How can you quickly determine a query's memory requirement? It turns out that HP Vertica has a convenient profiling option (similar to EXPLAIN). You can use the PROFILE statement to get the total memory required for the query (among other things). As a best practice, set up a small, dedicated profiling pool for this purpose, as shown in the following example:

CREATE RESOURCE POOL p_pool MEMORYSIZE '1K' PRIORITY 80 PLANNEDCONCURRENCY 4 MAXCONCURRENCY 4 EXECUTIONPARALLELISM 16;

Creating a small, dedicated profiling pool forces your query to borrow from the default general pool for any extra memory that it needs to execute, so the profile reflects what the query actually uses. If you profile against the general pool itself (a common mistake), then depending on the detailed pool settings, the reserved memory may be more than the query actually needs; under certain circumstances HP Vertica can be “fooled” into reporting reserved memory rather than the memory actually allocated and used, which would skew your result.
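A profiling session might then look like the following (a sketch; the fact table and query shown are hypothetical, and the memory figure reported by PROFILE is what you compare against the Small/Medium/Large thresholds above):

-- Route this session's queries through the dedicated profiling pool
SET SESSION RESOURCE_POOL = p_pool;

-- Profile the query instead of (or before) running it normally
PROFILE SELECT customer_id, SUM(sales_amount)
FROM my_fact_table
GROUP BY customer_id;
-- The PROFILE output includes the total memory required by the query per node.

-- Switch the session back to the general pool when done
SET SESSION RESOURCE_POOL = general;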

For more information on the Resource Manager, see Managing Workloads in the HP Vertica documentation set.


How to make ROLLUP fly in Vertica?


ROLLUP is a very common Online Analytic Processing (OLAP) function and is part of ANSI SQL. Many customers use ROLLUP to write reports that automatically perform sub-total aggregations across multiple dimensions at different levels in one SQL query. The following is an example:

Select DEPT_CODE, STORE_CODE, SUM(SALE_TOT_QTY) As TOTAL_SALES
From AGG_CUST_MONTHLY A INNER JOIN DIM_DATE B
ON ( A.DATE_KEY = B.DATE_KEY )
WHERE B.DATE_KEY BETWEEN 20140301 AND 20140331
GROUP BY ROLLUP(DEPT_CODE, STORE_CODE);

Note: ROLLUP syntax is not supported in the current version of Vertica (7.0); it is shown here only as an illustration. See the blog text below for more details.

The business question the above query intends to answer is: For the month of March, 2014, show the total sales across both departments (dept_code) and stores (store_code) to generate sub-total/total sales figures at three levels:

  1. At the detailed level of each pair of (dept_code, store_code)
  2. At the level of only (dept_code) but aggregating across all stores
  3. Finally a grand sales total aggregating across all departments and stores

Assuming performance is not an issue, clearly ROLLUP is a very powerful and useful function that can do a lot of data crunching in the background and deliver a quite complex report.

Unlike the more familiar GROUP BY clause, in which column order does not matter, ROLLUP works by creating subtotals that “roll up” from the most detailed level to a grand total, and it requires an ordered list of grouping expressions as its arguments. ROLLUP first calculates the standard aggregate values specified in the plain GROUP BY (without ROLLUP), and then moves from right to left through the list of grouping columns, starting with the lowest level of aggregation, to create progressively higher-level subtotals. This process continues until all grouping columns have been consumed and the aggregations computed; the operator essentially “rolls up” its grouping operations, culminating in a grand total.

If N is the number of grouping columns, the ROLLUP function creates (N+1) levels of subtotals (including the final grand total). The column order in a ROLLUP operator is critical, since it works by removing the rightmost column at each step. For example:

ROLLUP(A, B, C) creates 4 groups: (A, B, C), (A, B), (A), ().

Note there are NO groups such as (A, C) and (B, C).

Unfortunately, the current version of Vertica (7.0) does not support ROLLUP. So is there an alternative or workaround for existing Vertica customers, and will it perform on a large data set? Fortunately, the answer to both questions is yes. However, it does take some ingenuity in physical design and SQL rewriting to make ROLLUP work, and perform well, in the currently supported Vertica releases (version 5.1 and later).

The key steps are to create and pre-load a local temp table that is sorted on ALL of the ROLLUP columns in exactly the same order, rewrite the query at each level as a GROUP BY sub-query, and finally UNION ALL these sub-queries together. Applying this strategy to the ROLLUP SQL posted at the beginning of this blog, we arrive at the following alternative/workaround:

DROP TABLE IF EXISTS AGG_TEMP CASCADE;

CREATE LOCAL TEMPORARY TABLE AGG_TEMP
ON COMMIT PRESERVE ROWS
AS
/*+ DIRECT */
( Select DEPT_CODE, STORE_CODE, SUM(SALE_TOT_QTY) As TOTAL_SALES
From AGG_CUST_MONTHLY A INNER JOIN DIM_DATE B
ON ( A.DATE_KEY = B.DATE_KEY )
WHERE B.DATE_KEY BETWEEN 20140301 AND 20140331
GROUP BY DEPT_CODE, STORE_CODE )
ORDER BY DEPT_CODE,
STORE_CODE
SEGMENTED BY HASH (DEPT_CODE, STORE_CODE) ALL NODES;

SELECT ANALYZE_STATISTICS('AGG_TEMP');

SELECT DEPT_CODE, STORE_CODE, SUM(TOTAL_SALES)
FROM AGG_TEMP
GROUP BY DEPT_CODE, STORE_CODE
UNION ALL
SELECT DEPT_CODE, NULL As STORE_CODE, SUM(TOTAL_SALES)
FROM AGG_TEMP
GROUP BY DEPT_CODE
UNION ALL
SELECT NULL As DEPT_CODE, NULL As STORE_CODE, SUM(TOTAL_SALES)
FROM AGG_TEMP;

The fact that the temp table AGG_TEMP is sorted on the ROLLUP columns (dept_code, store_code) is important for performance, because it means that all of the per-level GROUP BY sub-queries can exploit Vertica's highly efficient GROUPBY PIPELINED operator (visible in the query's explain plan) to get the job done quickly and accurately.
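As a quick sanity check, you can run EXPLAIN on one of the sub-queries and confirm that the plan uses GROUPBY PIPELINED rather than GROUPBY HASH (a minimal sketch using the tables above):

EXPLAIN SELECT DEPT_CODE, STORE_CODE, SUM(TOTAL_SALES)
FROM AGG_TEMP
GROUP BY DEPT_CODE, STORE_CODE;
-- Because AGG_TEMP is sorted on (DEPT_CODE, STORE_CODE), the plan should show
-- a GROUPBY PIPELINED step for this aggregation.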

A nice property of the GROUPBY PIPELINED operator (in contrast to the more resource-intensive GROUPBY HASH) is that its performance is largely independent of the actual table size. As a result, this workaround can easily scale to tables with 10+ billion or even 100+ billion rows without any noticeable drop in query performance (as measured by first-fetch time).

To conclude, I would like to mention that serious work is under way in Vertica Engineering to implement ROLLUP (and many other OLAP functions) in upcoming Vertica releases. Stay tuned!


Facebook and Vertica: A Case for MPP Databases

I have just come back from a business trip to China, where I visited several large Chinese telecom customers to talk about the recent big Vertica win at Facebook. Two questions these customers kept asking me were: What is the future of MPP databases? Will Hadoop become the one database that rules the whole analytic space?

These seemed to be odd questions, considering that Facebook, one of the juggernauts of the Open Source community in general and the Hadoop world in particular, has recently picked Vertica as the anchoring database for its ever-increasing analytical demands and has since put the biggest Vertica cluster (roughly 300 nodes, with effective data storage of 6+ PB) into production. If a Hadoop powerhouse and the inventor of Hive (the most popular SQL-on-Hadoop database) like Facebook, with its teams of brilliant programmers and boundless resources, still thinks it needs an MPP database like Vertica in its “Big Data” technology stack for the foreseeable future, that sends a clear and strong message. Obviously Facebook thinks the answer to both questions is no, or at least not so fast. In the meantime, Facebook will continue to use Hive/HBase and other Hadoop technologies for the tasks they are good at: ETL, handling unstructured data, and conducting complex data-mining types of deep analysis.

So why does Facebook think it needs an MPP database? Facebook has been running an EDW (Oracle Exadata, ~50 TB) for some time, but feels that its existing EDW is running out of steam because it cannot keep up with rapid data growth, especially as the mobile platform becomes more and more popular. Facebook would like to take advantage of an established commercial MPP database for lower cost, a robust ecosystem, improved data security, and better scalability/performance. Its main reasons for going with an MPP database can be summarized as follows:

  • Rapidly expanding analytical needs at Facebook
  • MapReduce is too slow, and it raises security concerns
  • In-memory databases (IMDB) are too expensive and too immature
  • Current SQL-on-Hadoop databases are not good enough and too immature

Facebook invited four MPP vendors (including Vertica) to participate in two rounds of competitive POCs before declaring Vertica the winner, on the basis of Vertica's low TCO, ease of management, and superior ad-hoc query performance.

Many SQL-on-Hadoop offerings, both open source and proprietary, have appeared in the last couple of years, including but not limited to Hive, Hadapt, Citus, Impala, Stinger, and Apache Drill. Though their effort to make Hadoop more SQL-friendly is welcome, my general impression is that they are still a long way from closing the performance gap with the popular MPP databases in the marketplace (e.g., Vertica). Depending on your perspective, you may argue that this gap is not narrowing at any pace that foretells its closing any time soon.

There is strong reason for me to believe that the SQL-on-Hadoop camp may have over-estimated the effectiveness of bolting or wrapping open source SQL optimizers (e.g., PostgreSQL's) around HDFS, and severely underestimated the effort and time it takes to produce an enterprise-quality MPP database, whose core optimizer and execution-engine technology requires years of intensive real-world use to mature, hundreds (if not thousands) of customers to validate, and millions of cases to test and train against. This is certainly more about practice than theory or concept. Query optimization is fundamentally a software problem, and there is a limit to what any “brute force” hardware-based approach can do. To echo and rephrase what the authors of “MapReduce and Parallel Databases: Friends or Foes?” said, smart software (like an MPP database) is still a good idea in the age of Hadoop and “Big Data”, and there is plenty of room and opportunity for MPP databases to thrive for a long time to come.


Can Vertica Climb a Tree?


The answer is YES, if it is the right kind of tree. Here “tree” refers to the common data structure consisting of parent-child hierarchical relationships, such as an org chart. Traditionally, this kind of hierarchical data structure can be modeled and stored in tables, but it is usually not simple to navigate and use in a relational database (RDBMS). Some other RDBMSs (e.g., Oracle) have a built-in CONNECT BY clause that can be used to find the level of a given node and navigate the tree. However, if you take a close look at its syntax, you will realize that it is quite complicated and not at all easy to understand or use.

For a complex hierarchical tree with 10+ levels and a large number of nodes, any meaningful business question that requires joins to fact tables, plus aggregation and filtering at multiple levels, results in SQL statements that look extremely unwieldy and can perform poorly. The reason is that this kind of procedural logic may internally scan the same tree multiple times, wasting precious machine resources. This approach also flies in the face of the basic SQL principles of being simple, intuitive, and declarative. Another major issue is integration with third-party BI reporting tools, which often do not recognize vendor-specific variants such as CONNECT BY.

Other implementations include ANSI SQL's recursive syntax using WITH and UNION ALL, special graph-based algorithms, and the enumerated-path technique. These solutions tend to follow an algorithmic approach and, as such, can be long on theory but short on practical application.

Since SQL derives its tremendous power and popularity from its declarative nature, specifying clearly WHAT you want to get out of an RDBMS but not HOW to get it, a fair question to ask is: is there a simple and intuitive approach to modeling and navigating this kind of hierarchical (recursive) data structure in an RDBMS? Thankfully, the answer is yes.

In the following example, I will discuss a design that focuses on “flattening” out this kind of hierarchical parent-child relationship in a special way. The output is a wide, sparsely populated table with extra columns that hold the node IDs at the various levels of the tree; the number of these extra columns depends on the depth of the tree. For simplicity, I will use one table with one hierarchy as an example; the same design principles can be applied to tables with multiple hierarchies embedded in them. The following is a detailed outline of how this can be done in a program or script (a SQL sketch of these steps follows the list):

  1. Capture the (parent, child) pairs in a table (table_source).
  2. Identify the root node by following specific business rules and store this info in a new temp_table_1.
    Example: parent_id=id.
  3. Next find the 1st level of nodes and store them in a temp_table_2. Join condition:
    temp_table_1.id=table_source.parent_id.
  4. Continue down the tree, and at the end of each step (N), store the nodes found in temp_table_N.
    Join condition: temp_table_M.parent_id=temp_table_N.id, where M=N+1.
  5. Stop at the maximum level (Mlevel), where no node at that level has a child (the leaf nodes).
  6. Create a flattened table, table_flat, by adding (Mlevel+1) columns in total, named LEVEL,
    LEVEL_1_ID, ..., LEVEL_Mlevel_ID.
  7. A SQL insert statement can be generated to join all these temp tables together to load
    into the final flat table: table_flat.

  8. When there are multiple hierarchies in one table, the above procedures can be repeated for each
    hierarchy to arrive at a flattened table in the end.
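To make the outline concrete, here is a minimal SQL sketch of steps 2 through 7, assuming the (parent, child) pairs live in table_source(parent_id, id), the root satisfies parent_id = id, and the tree is three levels deep (all table and column names are illustrative):

-- Step 2: identify the root node using the business rule parent_id = id
CREATE LOCAL TEMPORARY TABLE temp_table_1 ON COMMIT PRESERVE ROWS AS
SELECT id FROM table_source WHERE parent_id = id;

-- Step 3: find the first level of nodes under the root
CREATE LOCAL TEMPORARY TABLE temp_table_2 ON COMMIT PRESERVE ROWS AS
SELECT s.parent_id, s.id
FROM table_source s JOIN temp_table_1 t ON s.parent_id = t.id
WHERE s.parent_id <> s.id;   -- exclude the root's self-referencing row

-- Step 4: repeat one level at a time; stop when a level returns no rows (step 5)
CREATE LOCAL TEMPORARY TABLE temp_table_3 ON COMMIT PRESERVE ROWS AS
SELECT s.parent_id, s.id
FROM table_source s JOIN temp_table_2 t ON s.parent_id = t.id;

-- Steps 6-7: join the per-level temp tables to populate the flattened table.
-- One such SELECT is needed per level (1..Mlevel), UNION ALLed together, with the
-- deeper LEVEL_*_ID columns left NULL for nodes that sit above the deepest level.
CREATE TABLE table_flat AS
SELECT 3 AS LEVEL, l1.id AS LEVEL_1_ID, l2.id AS LEVEL_2_ID, l3.id AS LEVEL_3_ID
FROM temp_table_3 l3
JOIN temp_table_2 l2 ON l3.parent_id = l2.id
JOIN temp_table_1 l1 ON l2.parent_id = l1.id;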

 

This design is general and is not specific to any particular RDBMS architecture, whether row, column, or hybrid. However, the physical implementation of this design naturally favors columnar databases such as Vertica. Why? The flattened table is usually wide, and its many extra columns tend to be sparsely populated, so they can be stored very efficiently in compressed form in Vertica. Another advantage is that when only a small set of these columns is included in the select clause of a query, the other columns (no matter how many there are) introduce no performance overhead, thanks to Vertica's columnar nature. This is as close to a “free lunch” as you can get in an RDBMS.

Let’s consider the following simple hierarchical tree structure:

[Figure: hierarchical tree diagram]

There are four levels, and the root node has an ID of 1. Each node is assumed to have one and only one parent (except for the root node), and each parent node may have zero or more child nodes. The above structure can be loaded into a table (hier_tab) having two columns, Parent_ID and Node_ID, which represent all the (parent, child) pairs in the hierarchical tree:

[Chart 1: the (parent, child) pairs loaded into hier_tab]

It is possible to develop a script to “flatten” out this table by starting from the root node, going down the tree one level at a time, and stopping when there is no data left (that is, when the max level or depth of the tree is reached). The final output is a new table (hier_tab_flat):

[Chart 2: the flattened table hier_tab_flat]

What's so special about this “flattened” table? First, it has the same key (Node_ID) as the original table. Second, it has several extra columns named LEVEL_N_ID, and the number of these columns equals the maximum number of levels (4 in this case), plus one extra LEVEL column. Third, for each node there is a row that includes the IDs of all of its parents up to the root (LEVEL=1), as well as the node itself; this represents the path from the node all the way up to the root level. The power of this new “flattened” table is that it encodes all of the hierarchical tree information in the original table. Questions such as finding the level of a node, or finding all the nodes below a given node, can be translated into relatively simple SQL statements by applying predicates to the proper columns.

Example 1: Find all the nodes that are at LEVEL=3.

Select Node_ID From hier_tab_flat Where LEVEL=3;

Example 2: Find all the nodes that are below node=88063633.

This requires two logical steps (which can be handled in a front-end application to generate the proper SQL).

Step 2.1. Find the LEVEL of node 88063633 (which is 3).

Select LEVEL From hier_tab_flat Where Node_ID=88063633;

Step 2.2. Apply a predicate to the column LEVEL_3_ID:

Select Node_ID From hier_tab_flat Where LEVEL_3_ID = 88063633;

Complex business conditions, such as finding all the nodes belonging to node=214231509 while excluding the nodes headed by node=88063633, can now be translated into the following SQL:

Select Node_ID
From hier_tab_flat
Where LEVEL_2_ID = 214231509
And LEVEL_3_ID <> 88063633;
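Walking up the tree is just as simple, because each node's row already stores its full ancestor path (a hypothetical Example 3, following the same pattern):

Select LEVEL, LEVEL_1_ID, LEVEL_2_ID, LEVEL_3_ID, LEVEL_4_ID
From hier_tab_flat
Where Node_ID = 88063633;

This returns the node's level together with the IDs of all of its parents up to the root.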

By invoking the script that flattens one hierarchy repeatedly, you can also flatten a table with multiple hierarchies using the same design. With this flattened table in your Vertica toolbox, you can climb up and down any hierarchical tree using nothing but SQL.

