Hash Segmentation Clause
Specifies how to segment projection data for distribution across some or all cluster nodes. You can specify segmentation for a table and a projection. If a table definition specifies segmentation, Vertica uses it for that table's auto-projections.
It is strongly recommended that you use Vertica's built-in HASH
function, which distributes data evenly across the cluster, and facilitates optimal query execution.
Syntax
SEGMENTED BY expression { ALL NODES [ OFFSET offset ] | NODES node[,…] }
Parameters
SEGMENTED BY expression |
A general SQL expression. Hash segmentation is the preferred method of segmentation. Vertica recommends using its built-in The segmentation expression should specify columns with a large number of unique data values and acceptable skew in their data distribution. In general, primary key columns that meet these criteria are good candidates for hash segmentation. For details, see Expression Requirements below. |
ALL NODES |
Automatically distributes data evenly across all nodes when the projection is created. Node ordering is fixed. |
OFFSET offset |
A zero-based offset that indicates on which node to start segmentation distribution. This option is not valid for If you create a projection for a table with the You can ensure K-safety compliance when you create projections by combining |
NODES node[,…] |
Specifies a subset of the nodes in the cluster over which to distribute projection data. You can specify a node only once. To obtain a list of all cluster nodes, query the system table |
Expression Requirements
A segmentation expression must specify table columns as they are defined in the source table. Projection column names are not supported.
The following restrictions apply to segmentation expressions:
- All leaf expressions must be constants or column references to a column in the
CREATE PROJECTION
'sSELECT
list. - The expression must return the same value over the life of the database.
- Aggregate functions are not allowed.
-
The expression must return non-negative
INTEGER
values in the range0 <= x < 263
, and values are generally distributed uniformly over that range.If the expression produces a value outside the expected range—for example, a negative value—no error occurs, and the row is added to the projection's first segment.
Examples
The following CREATE PROJECTION
statement creates projection public.employee_dimension_super
. It specifies to include all columns in table public.employee_dimension
. The hash segmentation clause invokes the Vertica HASH
function to segment projection data on the column employee_key
; it also includes the ALL NODES
clause, which specifies to distribute projection data evenly across all nodes in the cluster:
=> CREATE PROJECTION public.employee_dimension_super AS SELECT * FROM public.employee_dimension ORDER BY employee_key SEGMENTED BY hash(employee_key) ALL NODES;