Concurrency and Workload Management

Posted June 26, 2017 by Soniya Shah, Information Developer

This blog post was authored by Soniya Shah.

Vertica workloads range from simple primary key lookups to analytical queries that include several large tables and joins. Different types of load jobs must keep the data updated. Vertica has a mixed-workload management capability that is easy to use.

Vertica can process queries both concurrently and in parallel. Concurrency means having multiple jobs running in an overlapping time interval in a system, regardless of whether those jobs run at the same instant. Parallelism means that two or more jobs are running at the exact same instant.

Queuing Performance

The average response time in a system that does not process concurrently is shorter than the average response time in a time-sharing system with no cost for context-switching. In systems with highly concurrent workloads, queuing improves system performance over time. It is best to have multiple CPU cores on your Vertica nodes because this configuration allows for parallel processing of multiple tasks at the same time.

To better understand these concepts, imagine you have two one hour tasks to complete. You can complete these tasks in a few different ways:

• Complete them one by one (serially). Deliver one task in the first hour and the other in the second hour. The average runtime is 90 minutes.
• Complete them concurrently (multi-tasking). Finish both tasks in two hours, with an average runtime of 120 minutes. This option demonstrates ideal resource sharing, assuming there is no context-switching cost.
• Add another worker and perform the two tasks in parallel. Complete the jobs in an hour, with an average runtime of 60 minutes.

In concurrent mode, progress is made on both tasks. Vertica uses lockless concurrency control on queries. Using locks can introduce contention between shared resources. Vertica is based on a shared-nothing massively parallel processing architecture. This approach provides better workload management to achieve both system performance (throughput) and response time (latency) within a cluster.

Linear Concurrency

Linear concurrency means that increasing the number of outstanding requests increases the total response time proportionally.

There are a few types of linear concurrency:

• Perfectly linear: Each task is processed without wasting resources and there is perfect resource sharing. There is no apparent advantage to running multiple jobs concurrently.
• Worse than linear: Running multiple jobs takes more time than running each job one by one. Processing is obstructed by issues that include lock conflicts on shared resources, context-switching costs, and skews in data distribution.
• Better than linear: Running multiple jobs takes less time than serial execution. This happens because of cooperative caching, the ability to work on encoded data, and spare resources for each task.

Short Query Bias

In a mixed workload environment, short query bias prioritizes short, tactical queries over long-running or more complex queries. In Vertica, you can tune two resource parameters, RUNTIMEPRIORITYTHRESHOLD and RUNTIMEPRIORITY, to allow higher priority access to resources for those short queries. For more information, see Setting Run-Time Priority for the Resource Pool in the Vertica documentation.

Let’s suppose again we have two tasks to complete. One task takes one hour to finish and the other task takes two hours to finish. Consider three ways to accomplish these two tasks:

• Process the two tasks serially by completing the short task first and the longer task second. The average runtime is two hours.
• Process the tasks concurrently (multitasking). If we assume there is no context-switching, the average runtime is two hours at 15 minutes.
• Process the tasks serially, but queue the long task first and the short task second. The average runtime is two hours and 30 minutes.

The average response time in a system depends on how the tasks are ordered. Prioritizing short jobs over long jobs improves the throughput. The task order is also important when tuning a mixed workload.

While queuing improves the overall system performance in a busy system, prioritizing short queries over longer running queries is a better option.

Workload Management

Databases must provide robust mixed workload management capabilities. For effective workload management, you should use your available system resources to meet your business requirements. You manage workloads based on the relative priority or urgency of each request. In a mixed workload environment, you must tune your Vertica database so that it provides both a fast response time for short tactical queries, and has acceptable response times for longer queries.

Resource Manager

Vertica manages complex workloads using the Resource Manager. With this tool, you manage resource pools, which are predefined subsets of system resources with an associated queue. Vertica is preconfigured with a set of built-in resource pools that allocate resources to different request types. The GENERAL pool allows for a certain concurrency level, which is determined by the RAM and CPU cores on the node.

The Vertica resource management scheme allows diverse, concurrent workloads to run efficiently. For basic operations, the default GENERAL pool is usually sufficient. However, you can customize this pool to handle specific workload requirements.

In more complex situations, you can also define resource pools to configure memory usage and query priority. Optionally, to control memory consumption, you can restrict each database user’s request to a specific resource pool and limit the total memory used, the total amount of temporary space, and so on.

A resource pool contains several parameters that you can tune to fit any specific requirement. For more information, see Resource Pool Architecture in the Vertica documentation.

Workload Metric

Throughput represents the number of queries or jobs that a database can perform in a unit of time. This value is the most commonly used metric for measuring a database’s performance.

Response time is the sum of queuing time and runtime. It depends on both concurrency and query performance.

For a given workload, the three metrics – throughput, concurrency, and performance – are related through the following equation:

Throughput = Concurrency * Performance

If you know any of the two metrics, you can derive the third. This relationship is illustrated through the Workload Management Metrics Triangle:



Often, concurrency is not a direct requirement because it depends on query performance and throughput. For example, suppose you must process 1200 queries in 1 minute (20 queries per second). Assume that you have two competing systems, X and Y:

• On System X, executing this workload requires a concurrency level of 40 with an average query runtime of two seconds
• On System Y, executing this workload requires a concurrency level of 2, with an average query runtime of 100 milliseconds.

System Y has superior quality processing capabilities. For a given throughput, the better the query performance, the less concurrency it needs. Less concurrency means more efficient resource usage and a better overall system performance. This performance improvement happens because more system resources are available to process other workloads.

Performance Tuning

Before you start any workload management tuning, consider the following about your database:

• Workload types
• Maximum number of users
• Expected throughput
• Expected response time
• Maximum allowable runtime for each job type

In Vertica, the best way to determine query complexity is by looking at memory usage. The Resource Manager attempts to allocate memory equally among different workloads or resource pools. The goal is to verify that no resource pool is starved out of memory in the worst-case scenario, when under full system load.

If you can determine how much memory a query requires per node, you can use that value to classify a Vertica query. Vertica recommends the following query classification guidelines:

• Small: <200 MB
• Medium: 200 – 800 MB
• Large: >800 MB

You can use the PROFILE statement to get the total memory required for the query. You should set up a small profiling pool as shown in the following: =>CREATE RESOURCE POOL p_pool MEMORYSIZE ‘1K’ PRIORITY 80 PLANNED CONCURRENCY 4 MAXCONCURRENCY 4; Creating this dedicated profiling pool forces a query to borrow from the GENERAL pool for any extra memory it needs to execute. If you use the GENERAL pool, the reserved memory could be more than the query needs. If this is the case, Vertica reports on reserved memory instead of the actual memory used, which could skew results.

Conclusion

Vertica has resource pools that prioritize short running queries over long running queries. These pools enable multiple jobs to run concurrently and use the system resources efficiently, even under a changing mixed workload.