Analyzing Row Counts

Vertica lets you obtain row counts for projections and for external tables, through ANALYZE_ROW_COUNT and ANALYZE_EXTERNAL_ROW_COUNT, respectively.

Projection Row Count

ANALYZE_ROW_COUNT is a lightweight operation that collects a minimal set of statistics and aggregate row counts for a projection, and saves it in the database catalog. In many cases, this data satisifes many optimizer requirements for producing optimal query plans. This operation is invoked on the following occasions:

You can explicitly invoke ANALYZE_ROW_COUNT through calls to DO_TM_TASK. For example:

=> SELECT DO_TM_TASK('analyze_row_count', 'store_orders_fact_b0');
 Task: row count analyze
(Table: public.store_orders_fact) (Projection: public.store_orders_fact_b0)

(1 row)

You can change the intervals when Vertica regularly collects row-level statistics by setting configuration parameter AnalyzeRowCountInterval. For example, you can change the collection interval to 1 hour (3600 seconds):

=> ALTER DATABASE DEFAULT SET AnalyzeRowCountInterval = 3600;

External Table Row Count

ANALYZE_EXTERNAL_ROW_COUNT calculates the exact number of rows in an external table. The optimizer uses this count to optimize for queries that access external tables. This is especially useful when an external table participates in a join. This function enables the optimizer to identify the smaller table to use as the inner input to the join, and facilitate better query performance.

The following query calculates the exact number of rows in the external table loader_rejects: