ANALYZE_STATISTICS_PARTITION

Collects and aggregates data samples and storage information for a range of partitions in the specified table. Vertica writes the collected statistics to the database catalog.

You can cancel statistics collection with CTRL+C or meta-function INTERRUPT_STATEMENT.

Syntax

ANALYZE_STATISTICS_PARTITION ('[[database.]schema.]table', 'min‑range‑value','max‑range‑value' [, 'column‑list'] [, percent ] )

Returns

0: Success

If an error occurs, refer to vertica.log for details.

Parameters

[database.]schema

Specifies a schema, by default public. If schema is any schema other than public, you must supply the schema name. For example:

myschema.thisDbObject

If you specify a database, it must be the current database.

table

The table on which to collect data.

min‑range‑value
max‑range‑value

The minimum and maximum value of partition keys to analyze, where min‑range‑value must be ≤ max‑range‑value. To analyze one partition, min‑range‑value and max‑range‑value must be equal.

column‑list

A comma-delimited list of columns in table, typically a predicate column. Vertica narrows the scope of the data collection to the specified columns.

percent

A float value between 0 and 100 that specifies what percentage of data to read from disk (not the amount of data to analyze). If you omit this argument, Vertica sets the percentage to 10.

Analyzing more than 10 percent disk space takes proportionally longer to process, but produces a higher level of sampling accuracy.

Privileges

Non-superuser:

  • Schema: USAGE
  • Table: One of INSERT, DELETE, or UPDATE

Requirements and Restrictions

The following requirements and restrictions apply to running ANALYZE_STATISTICS_PARTITION on a table:

  • If you specify multiple partitions, they must be continuous. Different collections can overlap—for example, given partition keys of 1 through 99, successive calls to ANALYZE_STATISTICS_PARTITION can collect statistics for partitions 1-60 and 50-99. Similarly, one collection of statistics can be a subset of an earlier collection. In all cases, Vertica stores only the most recent statistics for each partition range. Thus, given the previous example, Vertica uses only statistics from the second collection for partition keys 50 through 60.
  • The table must be partitioned and cannot contain unpartitioned data.
  • The table partition expression must specify a single column. The following expressions are supported:
    • Expressions that specify only the column—that is, partition on all column values. For example:
      PARTITION BY ship_date GROUP BY CALENDAR_HIERARCHY_DAY(ship_date, 2, 2)
    • If the column is a DATE or TIMESTAMP/TIMESTAMPTZ, the partition expression can specify a supported date/time function that returns that column or any portion of it, such as month or year. For example, the following partition expression specifies to partition on the year portion of column order_date:
      PARTITION BY YEAR(order_date)
    • Expressions that perform addition or subtraction on the column. For example:
      PARTITION BY YEAR(order_date) -1
  • The table partition expression cannot coerce the specified column to another data type.
  • Vertica collects no statistics from the following projections:
    • Live aggregate and Top-K projections
    • Projections that are defined to include an SQL function within an expression

Examples

See Collecting Partition Statistics.