CREATE PROJECTION

Creates metadata for a projection in the Vertica catalog. Vertica supports four types of projections:

  • Standard projection: Stores collection of table data in a format that optimizes execution of certain queries on that table.
  • Live aggregate projection: Stores the grouped results of queries that invoke aggregate functions (such as SUM) on table columns.
  • Top-K projection: Stores the top k rows from partitions of selected rows.
  • UDTF projection: Stores newly-loaded data after it is transformed and/or aggregated by user-defined transformation functions (UDTFs).

Complex data types have additional restrictions when used within a projection:

  • Each projection must include at least one column that is a primitive type or native array.
  • An AS SELECT clause can use a complex-type column, but any other expression must be of a scalar type or native array.
  • The ORDER BY, PARTITION BY, and GROUP BY clauses cannot use complex types.
  • If a projection does not include an ORDER BY or segmentation clause, Vertica uses only the primitive columns from the select list to order or segment data.
  • Projection columns cannot be complex types returned from functions such as ARRAY_CAT.
  • TopK and UDTF projections do not support complex types.

Syntax

CREATE PROJECTION [ IF NOT EXISTS ] [[database.]schema.]projection
[ ( 
   { projection-column | grouped-clause
   [ ENCODING encoding-type ] 
   [ ACCESSRANK integer ] }[,...] 
) ]
AS SELECT { * | { MATCH_COLUMNS('pattern') | expression [ [AS] alias ] }[,...] }
   FROM [[database.]schema.]table [ [AS] alias]
   [ ORDER BY column-expr[,...] ]
   [ segmentation-spec ] 
   [ KSAFE [ k-num ]
   [ ON PARTITION RANGE BETWEEN min-val AND max-val ] ]

Parameters

IF NOT EXISTS

If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.

The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object if it does not already exist.

For related information, see ON_ERROR_STOP.

[database.]schema

Specifies the schema for this projection and its anchor table, where schema must be the same for both. If you specify a database, it must be the current database.

projection

Identifies the projection to create, where projection conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.

projection-column

The name of a projection column. The list of projection columns must match the SELECT list columns and expressions in number, type, and sequence.

If projection column names are omitted, Vertica uses the anchor table column names specified in the SELECT list.

grouped-clause

See GROUPED Clause.

ENCODING encoding-type

The column encoding type, by default set to AUTO.

ACCESSRANK integer

Overrides the default access rank for a column. Use this parameter to increase or decrease the speed at which Vertica accesses a column. For more information, see Overriding Default Column Ranking.

AS SELECT

Specifies the columns or column expressions to select from the specified table:

  • * (asterisk)

    Lists all columns in the queried tables.

  • MATCH_COLUMNS('pattern')

    Returns the names of all columns in the queried anchor table that match pattern.

  • expression [[AS] alias]

    Resolves to column data from the queried anchor table.

    You can optionally assign an alias to each column expression and reference that alias elsewhere in the SELECT statement—for example, in the ORDER BY or segmentation clause.

If you specify projection column names, the two lists of projection columns and table columns/expressions must exactly match in number and order.

ORDER BY

Specifies columns from the SELECT list on which to sort the projection. The ORDER BY clause can only be set to ASC (the default). Vertica always stores projection data in ascending sort order.

If you order by a column with a collection data type (ARRAY or SET), queries that use that column in an ORDER BY clause perform the sort again. This is because projections and queries perform the ordering differently.

If you omit the ORDER BY clause, Vertica uses the SELECT list to sort the projection.

segmentation-spec

Specifies how to distribute projection data with one of the following clauses:

  • hash-segmentation-clause: Specifies to segment projection data evenly and distribute across cluster nodes:
    SEGMENTED BY expression ALL NODES [ OFFSET offset ]
  • unsegmented-clause: Specifies to create an unsegmented projection:
    UNSEGMENTED ALL NODES

If the anchor table and projection both omit specifying segmentation, the projection is defined with a hash segmentation clause that includes all columns in the SELECT list , as follows:

SEGMENTED BY HASH(column-expr[,...]) ALL NODES OFFSET 0;

Vertica recommends segmenting large tables.

KSAFE [k-num]

Specifies K-safety for the projection, where k-num must be equal to or greater than database K-safety. Vertica ignores this parameter if set for unsegmented projections. If you omit k-num, Vertica uses database K-safety.

For general information, see K-Safety in an Enterprise Mode Database.

ON PARTITION RANGE

Specifies to limit data of this projection to a range of partition keys, specified as follows:

ON PARTITION RANGE BETWEEN min-range-value AND max-range-value

where the following requirements apply to min‑range‑value and ≤ max‑range‑value:

  • min‑range‑value must be ≤ max‑range‑value
  • They must resolve to a data type that is compatible with the table partition expression.
  • They can be:

    • String literals—for example, 2021-07-31
    • Expressions with stable or immutable functions, for example:
      date_trunc('month', now()::timestamp - interval'1 month')

max-range-value can be set to NULL, to specify that the partition range has no upper bound.

For other requirements and usage details, see Partition Range Projections.

Privileges

Non-superusers:

  • Anchor table owner
  • CREATE privilege on the schema

Examples

See:

CREATE PROJECTION [ IF NOT EXISTS ] [[database.]schema.]projection
[ ( 
   { projection-column | grouped-clause
   [ ENCODING encoding-type ] 
   [ ACCESSRANK integer ] }[,...] 
) ]
AS SELECT { table-column | expr-with-table-columns }[,...] FROM [[database.]schema.]table [ [AS] alias]
   GROUP BY column-expr
   [ KSAFE [ k-num ] ]
					

Parameters

IF NOT EXISTS

If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.

The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object if it does not already exist.

For related information, see ON_ERROR_STOP.

[database.]schema

Specifies the schema for this projection and its anchor table, where schema must be the same for both. If you specify a database, it must be the current database.

projection

Identifies the projection to create, where projection conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.

projection-column

The name of a projection column. The list of projection columns must match the SELECT list columns and expressions in number, type, and sequence.

If projection column names are omitted, Vertica uses the anchor table column names specified in the SELECT list.

grouped-clause

See GROUPED Clause.

ENCODING encoding-type

The column encoding type, by default set to AUTO.

ACCESSRANK integer

Overrides the default access rank for a column. Use this parameter to increase or decrease the speed at which Vertica accesses a column. For more information, see Overriding Default Column Ranking.

AS SELECT

Specifies the table data to query:

{table-column | expr-with-table-columns } [ [AS] alias] }[,...]

You can optionally assign an alias to each column expression and reference that alias elsewhere in the SELECT statement.

If you specify projection column names, the two lists of projection columns and table columns/expressions must exactly match in number and order.

GROUP BY column‑expr[,...]

One or more column expressions from the SELECT list. The first column-expr must be the first column expression in the SELECT list, the second column-expr must be the second column expression in the SELECT list, and so on.

Privileges

Non-superusers:

  • Anchor table owner
  • CREATE privilege on the schema

Requirements and Restrictions

Vertica does not regard live aggregate projections as superprojections, even those that include all table columns. For other requirements and restrictions, see Creating Live Aggregate Projections.

Examples

See Live Aggregate Projection Example.

CREATE PROJECTION [ IF NOT EXISTS ] [[database.]schema.]projection
[ ( 
   { projection-column | grouped-clause
   [ ENCODING encoding-type ] 
   [ ACCESSRANK integer ] }[,...] 
) ]
AS SELECT { table-column | expr-with-table-columns }[,...] FROM [[database.]schema.]table [ [AS] alias]
   LIMIT num-rows OVER ( window-partition-clause [window-order-clause] )
   [ KSAFE [ k-num ] ]

Parameters

IF NOT EXISTS

If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.

The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object if it does not already exist.

For related information, see ON_ERROR_STOP.

[database.]schema

Specifies the schema for this projection and its anchor table, where schema must be the same for both. If you specify a database, it must be the current database.

projection

Identifies the projection to create, where projection conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.

projection-column

The name of a projection column. The list of projection columns must match the SELECT list columns and expressions in number, type, and sequence.

If projection column names are omitted, Vertica uses the anchor table column names specified in the SELECT list.

grouped-clause

See GROUPED Clause.

ENCODING encoding-type

The column encoding type, by default set to AUTO.

ACCESSRANK integer

Overrides the default access rank for a column. Use this parameter to increase or decrease the speed at which Vertica accesses a column. For more information, see Overriding Default Column Ranking.

AS SELECT

Specifies the table data to query:

{table-column | expr-with-table-columns } [ [AS] alias] }[,...]

You can optionally assign an alias to each column expression and reference that alias elsewhere in the SELECT statement.

If you specify projection column names, the two lists of projection columns and table columns/expressions must exactly match in number and order.

AS SELECT

Specifies the table data to query:

{table-column | expr-with-table-columns } [ [AS] alias] }[,...]

You can optionally assign an alias to each column expression and reference that alias elsewhere in the SELECT statement.

If you specify projection column names, the two lists of projection columns and table columns/expressions must exactly match in number and order.

LIMIT num‑rows The number of rows to return from the specified partition.
window-partition-clause Specifies window partitioning by one or more comma-delimited column expressions from the SELECT list. The first partition expression must be the first SELECT list item, the second partition expression the second SELECT list item, and so on.
window-order-clause

Specifies the order in which the top k rows are returned, by default in ascending (ASC) order. All column expressions must be from the SELECT list, where the first window order expression must be the first SELECT list item not specified in the window partition clause.

Top-K projections support ORDER BY NULLS FIRST/LAST.

Privileges

Non-superusers:

  • Anchor table owner
  • CREATE privilege on the schema

Requirements and Restrictions

Vertica does not regard Top-K projections as superprojections, even those that include all table columns. For other requirements and restrictions, see Creating Top-K Projections.

Examples

See Top-K Projection Examples.

Currently, projections can only reference UDTFs developed in C++.

CREATE PROJECTION [ IF NOT EXISTS ] [[database.]schema.]projection
[ ( 
   { projection-column | grouped-clause
   [ ENCODING encoding-type ] 
   [ ACCESSRANK integer ]  }[,...] 
) ]
AS { batch-query FROM { prepass-query sq-ref | table [[AS] alias] }
     | prepass-query }
batch-query
SELECT { table-column | expr-with-table-columns }[,...], batch-udtf(batch-args)
   OVER (PARTITION BATCH BY partition-column-expr[,...])
   [ AS (batch-output-columns) ]
prepass-query
SELECT { table-column | expr-with-table-columns }[,...], prepass-udtf(prepass-args)
   OVER (PARTITION PREPASS BY partition-column-expr[,...])
   [ AS (prepass-output-columns) ] FROM table

Parameters

IF NOT EXISTS

If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.

The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object if it does not already exist.

For related information, see ON_ERROR_STOP.

[database.]schema

Specifies the schema for this projection and its anchor table, where schema must be the same for both. If you specify a database, it must be the current database.

projection

Identifies the projection to create, where projection conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.

projection-column

The name of a projection column. The list of projection columns must match the SELECT list columns and expressions in number, type, and sequence.

If projection column names are omitted, Vertica uses the anchor table column names specified in the SELECT list.

grouped-clause

See GROUPED Clause.

ENCODING encoding-type

The column encoding type, by default set to AUTO.

ACCESSRANK integer

Overrides the default access rank for a column. Use this parameter to increase or decrease the speed at which Vertica accesses a column. For more information, see Overriding Default Column Ranking.

AS SELECT

Specifies the table data to query:

{table-column | expr-with-table-columns } [ [AS] alias] }[,...]

You can optionally assign an alias to each column expression and reference that alias elsewhere in the SELECT statement.

If you specify projection column names, the two lists of projection columns and table columns/expressions must exactly match in number and order.

batch-udtf(batch-args)

The batch UDTF to invoke each time the following events occur:

  • Tuple mover mergeout
  • Queries on the projection
  • If invoked singly, on data load operations

If the projection definition includes a pre-pass subquery, batch-args must exactly match the pre-pass UDTF output columns, in name and order.

prepass-udtf(prepass-args)

The pre-pass UDTF to invoke on each load operation such as COPY or INSERT.

If specified in a subquery, the pre-pass UDTF returns transformed data to the batch query for further processing. Otherwise, the pre-pass query results are added to projection data storage.

OVER (PARTITION
{ BATCH | PREPASS }
BY partition-column-expr[,...]

Specifies the UDTF type and how to partition the data it returns:

In both cases, the OVER clause specifies partitioning with one or more column expressions from the SELECT list. The first partition-column-expr is the first column expression in the SELECT list, the second partition-column-expr is the second column expression in the SELECT list, and so on.

The projection is implicitly segmented and ordered on PARTITION BY columns.

AS (batch-output-columns)
AS (prepass-output-columns)

Optionally names columns that are returned by the UDTF.

If a pre-pass subquery omits this clause, the outer batch query UDTF arguments (batch-args) must reference the column names as they are defined in the pre-pass UDTF.

table [[AS] alias] Specifies the projection's anchor table, optionally qualified by an alias.
sq-results Subquery result set that is returned to the outer batch UDTF.

Privileges

Non-superusers:

  • Anchor table owner
  • CREATE privilege on the schema
  • EXECUTE privileges on all UDTFs that are referenced by the projection

Examples

See Pre-Aggregating UDTF Results.