Updating Statistics
The Vertica query optimizer relies on up-to-date statistics for tables, schemas, and the database. The statistics allow the optimizer to determine the most efficient plan to execute a query.
The following statistics can affect optimizer decisions:
- Eligible projections to answer the query
- The best order in which to perform joins
- Plans involving different algorithms, such as hash join/merge join or group by hash/group by pipelined operations
- Data distribution algorithms, such as broadcast and resegmentation
Without accurate statistics, the optimizer could choose a suboptimal projection or join order for a query.
As you update your data over time, the statistics become out of date. Stale statistics can cause the optimizer to choose a less-than-optimal plan for executing a query. The query plan that the EXPLAIN statement creates contains information about statistics.
How often you update statistics depends on how often your data changes. OpenText recommends that you schedule statistics analysis on a regular basis. Regular updates help the optimizer choose the most efficient plan for executing your queries.
For guidance on updating statistics, see Collecting Database Statistics in the Administrator's Guide.
Vertica provides the following functions related to collecting statistics.
Function | Description |
---|---|
ANALYZE_STATISTICS
|
Collects and aggregates statistics from all nodes that store projections associated with the specified table or column. |
DROP_STATISTICS
|
Removes statistics for the specified table and lets you optionally specify the category of statistics to drop. |
EXPORT_STATISTICS
|
Exports statistics to an XML file. |
IMPORT_STATISTICS
|
Imports statistics from an XML file created by EXPORT_STATISTICS. |