Tell the Optimizer You Have a Better Plan

Posted December 17, 2015 by Michael Kronenberg

The problem

The year is 2017, and you are about to upgrade from Vertica 7.2 to the latest release. Before upgrading, you evaluate the new version on a staging server. Your evaluation includes identifying queries whose plans changed between versions, and comparing execution times. For many queries, the query optimizer has made different assumptions about the cost of broadcast joins. In most cases, these assumptions yield query plans with better cost estimates for join-intensive queries. However, your network topography is unusual. Now, you spot one query that appears to execute less efficiently than in the earlier version.

What can you do? You are committed to the upgrade, but your operation depends on business-critical queries continuing to run as efficiently as before.

Solution: Saving query plans in directed queries

Fortunately, Vertica addressed this concern in Vertica 7.2, when it introduced directed queries. Before then, your options were limited. You could compare query plans produced by different versions and identify differences between them. But realistically, you had no control over how the optimizer created those plans.

With directed queries, you can now prepare for the upcoming upgrade by saving query plans for your most important queries in the current database version. You can then reproduce those plans in the upgraded database. The process contains three steps:

  1. Create directed queries for the desired input queries.
  2. Export directed queries from the current database and import them to the upgraded database.
  3. Activate directed queries as needed to recreate saved query plans.

Creating a directed query

You save query plans with CREATE DIRECTED QUERY statements like this one:
=> CREATE DIRECTED QUERY OPTIMIZER HighestStoreSales
SELECT store.store_name, MAX(sales.sales_dollar_amount) AS highest_sale
FROM (store.store_sales sales JOIN store.store_dimension store
ON ((store.store_key = sales.store_key)))
GROUP BY store.store_name ORDER BY store.store_name;

Vertica executes this the statement as follows:

  1. Generates a query plan for the SELECT statement
  2. Uses that plan to create an annotated version of the original SELECT statement, or annotated query. The annotations capture critical elements of the query plan?for example, which projection to use, and how to join table data.
  3. Saves the annotated query in the database catalog under the name HighestStoreSales.

Directed query annotations

You can view the annotated queries by querying the system table V_MONITOR.DIRECTED_QUERIES. For example, given the previous input query, Vertica creates an annotated query similar to the following:
=> SELECT annotated_query from V_CATALOG.DIRECTED_QUERIES WHERE query_name=?HighestStoreSales?;
annotated_query
--------------------------------------------------------------------------------
annotated_query | SELECT /*+ syntactic_join,verbatim */
store.store_name AS store_name,
MAX (sales.sales_dollar_amount) AS highest_sale
FROM (store.store_sales AS sales/*+projs('store.store_sales')*/
JOIN /*+Distrib(L,L),JType(H)*/ store.store_dimension AS store
/*+projs('store.store_dimension_unseg')*/ ON (store.store_key = sales.store_key))
GROUP BY 1 ORDER BY 1 ASC

The SELECT statement contains several hints that tell the optimizer how to create a query plan. These include:

  • /*+SYNTACTIC_JOIN*/ enforces join order and enables other join hints.
  • /*+PROJS*/ specifies one or more projections to use for a queried table.
  • /*+DISTRIB*/ specifies how to distribute join key data to implement a join.
  • /*+JTYPE*/ specifies the join algorithm as hash or merge.

Exporting directed queries from one version to the next

After you create the desired directed queries, you export them from the current database catalog to an external script. After upgrading, you can run this script on the database to recreate the directed queries in the new catalog.

For example:

=> SELECT EXPORT_CATALOG('../../export_directedqueries', 'DIRECTED_QUERIES');
EXPORT_CATALOG
-------------------------------------
Catalog data exported successfully

EXPORT_CATALOG exports a script for recreating the directed queries. After upgrading to the new Vertica version, you run that script on the database. All directed queries are now available for activation and use in the upgraded database.

You can also bypass the database catalog, and batch-export query plans as directed queries to an external SQL file. For more information on this approach, see Batch Exporting Directed Queries.

Activating and using directed queries

After the Vertica upgrade, you determine which query plans, if any, you need to reproduce from the earlier version. You then activate the corresponding directed queries.

For example:
=> ACTIVATE DIRECTED QUERY HighestStoreSales;
ACTIVATE DIRECTED QUERY;

This directed query remains active across all database sessions until it is explicitly deactivated. Now, when users enter a query that matches the original input query shown earlier:
SELECT store.store_name, MAX(sales.sales_dollar_amount) AS highest_sale
FROM (store.store_sales sales JOIN store.store_dimension store
ON ((store.store_key = sales.store_key)))
GROUP BY store.store_name
ORDER BY store.store_name;

Vertica maps this query to the active directed query HighestStoreSales. It uses the directed query?s annotated query to create a query plan that replicates the plan produced in the earlier version.

In a nutshell

The following schematic depicts saving a query plan to a directed query, and using it to recreate the query plan in an upgraded database:

 

plan

 

For more information

This blog provides only a brief overview of directed queries. To review the full set of options that directed queries provide, see the online documentation.