ANALYZE_CORRELATIONS

Analyzes the specified tables for pairs of columns that are strongly correlated. ANALYZE_CORRELATIONS stores the 20 pairs with the strongest correlation. ANALYZE_CORRELATIONS also analyzes statistics.

ANALYZE_CORRELATIONS analyzes only pairwise single-column correlations.

For example, state name and country name columns are strongly correlated because the city name usually, but perhaps not always, identifies the state name. The city of Conshohoken is uniquely associated with Pennsylvania, whilethe city of Boston exists in Georgia, Indiana, Kentucky, New York, Virginia, and Massachusetts. In this case, city name is strongly correlated with state name.

Behavior Type

Immutable

Syntax

ANALYZE_CORRELATIONS ('[[[database.]schema.]table ]' [, 'recalculate'] )

Parameters

[database.]schema

Specifies a schema, by default public. If schema is any schema other than public, you must supply the schema name. For example:

myschema.thisDbObject

If you specify a database, it must be the current database.

table-name

Identifies the table to analyze. If you omit specifying a schema, Vertica uses the current schema search path. If set to an empty string, Vertica analyzes all tables in the current schema.

recalculate

Boolean that specifies whether to analyze correlated columns that were previously analyzed.

Column correlation analysis typically needs to be done only once.

Default: false

Privileges

One of the following:

  • Superuser
  • User with USAGE privilege on the design schema

Analysis Follow-Up

To take advantage of the correlations that ANALYZE_CORRELATIONS discovers, run Database Designer programmatically. Run DESIGNER_SET_ANALYZE_CORRELATIONS_MODE to specify that Database Designer consider existing column correlations. Be sure to specify that Database Designer not analyze statistics so it does not override existing statistics.

Example

In the following example, ANALYZE_CORRELATIONS analyzes column correlations for all tables in the public schema,even if they currently exist. The correlations that ANALYZE_CORRELATIONS finds are saved, so Database Designer can use them the next time it runs on the VMart database:

=> SELECT ANALYZE_CORRELATIONS ('public.*', 'true');
 ANALYZE_CORRELATIONS
----------------------
                    0
(1 row)