ANALYZE_STATISTICS_PARTITION

Collects and aggregates data samples and storage information for a range of partitions in the specified table. Vertica writes the collected statistics to the database catalog.

If you specify multiple partitions, they must be continuous. Different collections can overlap—for example, given partition keys of 1 through 99, successive calls to ANALYZE_STATISTICS_PARTITION can collect statistics for partitions 1-60 and 50-99. Similarly, one collection of statistics can be a subset of an earlier collection. In all cases, Vertica stores only the most recent statistics for each partition range. Thus, given the previous example, Vertica uses only statistics from the second collection for partition keys 50 through 60.

When you query a partitioned table, the Vertica optimizer checks whether the partitions required by this query have recently been analyzed. If so, the optimizer retrieves those statistics and uses them to facilitate query planning. Otherwise, it uses table-level statistics, if available; if none, the optimizer assumes uniform distribution of data values and equal storage usage for all projections.

You can cancel statistics collection with CTRL+C or by calling INTERRUPT_STATEMENT.

Syntax

ANALYZE_STATISTICS_PARTITION ('[[database.]schema.]table', 'min‑range‑value','max‑range‑value' [, 'column[,…]'] [, percent ] )

Returns

0: Success

If an error occurs, refer to vertica.log for details.

Parameters

[database.]schema

Specifies a schema, by default public. If schema is any schema other than public, you must supply the schema name. For example:

myschema.thisDbObject

If you specify a database, it must be the current database.

table

The table on which to collect data.

min‑range‑value
max‑range‑value

The minimum and maximum value of partition keys to analyze, where min‑range‑value must be ≤ max‑range‑value. To analyze one partition, min‑range‑value and max‑range‑value must be equal.

column

The name of a column in table, typically a predicate column. You can specify multiple comma-delimited columns. Vertica narrows the scope of the data collection to the specified columns.

percent

A float value between 0 and 100 that specifies what percentage of data to read from disk (not the amount of data to analyze). If you omit this argument, Vertica sets the percentage to 10.

Analyzing more than 10 percent disk space takes proportionally longer to process, but produces a higher level of sampling accuracy.

Privileges

Non-superuser:

  • Schema: USAGE
  • Table: One of INSERT, DELETE, or UPDATE

Requirements and Restrictions

  • The table must be partitioned and must not contain unpartitioned data.
  • The table's partition expression can specify only one column.
  • Vertica collects no statistics from the following projections:
    • Live aggregate and Top-K projections
    • Projections that are defined to include an SQL function within an expression

See Also