Optimizer, Execution Engine & Workload Management

The Vertica Optimizer and Execution Engine give users maximum performance from their database without requiring them to worry about the details of how it is achieved.

The Vertica Optimizer

The Optimizer is the brain of the analytics platform: it produces execution plans for the queries that users submit. Vertica’s Optimizer was purpose-built to reduce the need for manual tuning as much as possible, choosing the best plan among several candidates even for the most complex analytic queries. This lets end users think about their questions without having to worry about the optimal path to the answers. Vertica supports classic Star and Snowflake schemas as well as arbitrary schemas of any type. It also supports multiple schemas concurrently for multi-tenancy in a variety of configurations, which is especially useful in Software as a Service (SaaS) deployments.

Traditional data warehouse optimizers usually consider only startup and disk I/O costs, but Vertica’s holistic cost model accounts for all of the variables in today’s environments: disk, CPU, memory, network, concurrency, and parallelism. It also takes advantage of the unique details of columnar operators and runtime environments. Vertica continuously evaluates and analyzes data patterns and feeds the results directly into the Optimizer in a self-learning manner, so plans keep improving without user intervention, even as data volumes and patterns change. All Vertica components, especially the Optimizer, are modular so that they can be changed in the future without rewriting significant amounts of code.
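
To make the idea of a multi-factor cost model concrete, here is a minimal Python sketch. It is not Vertica's actual cost model: the class, field names, weights, and cost formula are all hypothetical, chosen only to show how weighing memory and concurrency alongside disk, CPU, and network can change which plan wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEstimate:
    """Estimated resource usage of one candidate plan (hypothetical fields)."""
    name: str
    disk_mb: float     # data read from disk
    cpu_ms: float      # CPU time
    memory_mb: float   # peak memory held while the plan runs
    network_mb: float  # data exchanged between nodes


# Hypothetical per-unit weights, purely illustrative.
DISK_W, CPU_W, MEMORY_W, NETWORK_W = 1.0, 0.125, 0.25, 2.0


def plan_cost(plan, concurrency=1, parallelism=1):
    """Combine all resource estimates into a single comparable number.

    Assumptions made for this sketch only: disk and CPU work divide evenly
    across parallel workers, and memory pressure grows linearly with the
    number of concurrently running queries.
    """
    scan_and_compute = (DISK_W * plan.disk_mb + CPU_W * plan.cpu_ms) / parallelism
    network = NETWORK_W * plan.network_mb
    memory = MEMORY_W * plan.memory_mb * concurrency
    return scan_and_compute + network + memory


def choose_plan(plans, concurrency=1, parallelism=1):
    """Return the candidate with the lowest estimated cost."""
    return min(plans, key=lambda p: plan_cost(p, concurrency, parallelism))


# Two made-up candidates: one is CPU-cheap but memory-hungry, the other
# uses more CPU but very little memory.
CANDIDATES = [
    PlanEstimate("hash_join", disk_mb=100, cpu_ms=400, memory_mb=100, network_mb=10),
    PlanEstimate("merge_join", disk_mb=100, cpu_ms=640, memory_mb=16, network_mb=10),
]

if __name__ == "__main__":
    for users in (1, 4):
        best = choose_plan(CANDIDATES, concurrency=users)
        print(users, best.name)
```

With these made-up numbers the memory-hungry plan is cheapest for a single query, while the low-memory plan wins once four queries compete for memory. A cost model that looked only at disk I/O could not tell the two apart.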

The Vertica Execution Engine

Our Optimizer and Execution Engine have been developed together since inception, so they work in lock-step. This delivers faster predicate evaluation (remember that in Vertica the sorted columnar data is its own “index,” without the burden of an actual index), better compression, and simplified processing. Vertica’s Execution Engine offers advanced CPU and memory pipeline aggregation, storage management, and compressed data operations for superior performance. A key component of our Execution Engine is the Tuple Mover, which efficiently moves data from memory to disk in near real time while also collecting statistics and cleaning up purged data. All of this happens automatically, without user intervention, so that concurrent loading and querying are seamless. You provide the data, and we’ll handle the rest.
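
The point about sorted columnar data acting as its own “index” can be illustrated with a short Python sketch. It assumes a single sorted column stored with run-length encoding. The class and method names are hypothetical and this is not Vertica's storage format; it only shows how a range predicate can be answered by binary search over the compressed runs, without decompressing the column or consulting a separate index.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate, groupby


def rle_encode(sorted_values):
    """Collapse a sorted column into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(sorted_values)]


class SortedRleColumn:
    """A sorted, run-length-encoded column (illustrative only)."""

    def __init__(self, sorted_values):
        runs = rle_encode(sorted_values)
        self.values = [value for value, _ in runs]
        self.lengths = [length for _, length in runs]
        # starts[i] is the row position where run i begins;
        # the final entry is the total row count.
        self.starts = [0] + list(accumulate(self.lengths))

    def row_range(self, low, high):
        """Half-open row range [start, end) where low <= value <= high.

        Because the runs are sorted, two binary searches over the distinct
        values are enough; no row is decompressed or scanned.
        """
        first_run = bisect_left(self.values, low)
        last_run = bisect_right(self.values, high)
        return self.starts[first_run], self.starts[last_run]

    def count_between(self, low, high):
        """Count matching rows directly from the compressed representation."""
        start, end = self.row_range(low, high)
        return end - start


if __name__ == "__main__":
    column = SortedRleColumn([1, 1, 1, 2, 2, 5, 5, 5, 5, 9])
    print(column.row_range(2, 5))
    print(column.count_between(2, 5))
```

The same sort order that makes the binary search possible also produces long runs of equal values, which is why sorting helps compression and predicate evaluation at the same time.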

Vertica’s Cluster-Aware Workload Management

Finally, the Vertica Analytics Platform provides cluster-aware workload and resource management that is integrated throughout the platform, especially with the Optimizer and Execution Engine. This can be used to accommodate varying workloads and SLAs across an organization. In an environment where several thousand concurrent users and queries of varying complexity are expected to run at once, Vertica can manage resources accordingly, supporting extremely high concurrency. There is an inherent tension between giving each query the maximum amount of available resources (and so the fastest run time for that particular query) and serving multiple queries simultaneously with reasonable run times. The Vertica Resource Manager (RM) provides options and controls for resolving this tension while ensuring that every query gets serviced and that true system limits are respected at all times. Best of all, the RM’s resource pools can be managed dynamically and tied to user profiles and roles, which reduces the burden on the DBA.
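
The trade-off described above can be sketched as a toy admission controller in Python. This is not the Vertica Resource Manager or its API: the class, its parameters, and the first-in, first-out queueing policy are assumptions made purely for illustration. It shows one way a pool can cap memory and concurrency while still guaranteeing that every accepted query eventually runs.

```python
from collections import deque


class ResourcePool:
    """Toy resource pool with a memory budget and a concurrency cap (hypothetical)."""

    def __init__(self, memory_mb, max_concurrency):
        self.memory_mb = memory_mb
        self.max_concurrency = max_concurrency
        self.running = {}     # query_id -> memory granted
        self.queue = deque()  # (query_id, memory) pairs waiting their turn

    def _fits(self, memory):
        used = sum(self.running.values())
        return (len(self.running) < self.max_concurrency
                and used + memory <= self.memory_mb)

    def submit(self, query_id, memory):
        """Run the query now if resources allow; otherwise queue it."""
        if memory > self.memory_mb:
            # A request that could never fit is refused up front so that
            # the pool's hard limit is never exceeded.
            raise ValueError("request exceeds the pool's memory limit")
        # Never jump ahead of queries that are already waiting; this keeps
        # large queries from being starved by a stream of small ones.
        if not self.queue and self._fits(memory):
            self.running[query_id] = memory
            return "running"
        self.queue.append((query_id, memory))
        return "queued"

    def finish(self, query_id):
        """Release a query's resources and admit waiting queries in order."""
        del self.running[query_id]
        admitted = []
        while self.queue and self._fits(self.queue[0][1]):
            waiting_id, memory = self.queue.popleft()
            self.running[waiting_id] = memory
            admitted.append(waiting_id)
        return admitted


if __name__ == "__main__":
    pool = ResourcePool(memory_mb=1000, max_concurrency=2)
    print(pool.submit("q1", 600))
    print(pool.submit("q2", 600))
    print(pool.submit("q3", 100))
    print(pool.finish("q1"))
```

A smaller memory grant per query lets more queries run side by side, while a larger grant speeds up individual queries at the cost of more queueing. Tuning those limits per pool is the kind of control the paragraph above refers to.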

For more on our Optimizer, check out Vertica Under the Hood: The Query Optimizer on our blog.