Hash Segmentation Clause
Specifies how to segment projection data for distribution across all cluster nodes. You can specify segmentation for a table and a projection. If a table definition specifies segmentation, Vertica uses it for that table's auto-projections.
It is strongly recommended that you use Vertica's built-in
HASH function, which distributes data evenly across the cluster, and facilitates optimal query execution.
SEGMENTED BY expression ALL NODES [ OFFSET offset ]
|SEGMENTED BY expression||
A general SQL expression. Hash segmentation is the preferred method of segmentation. Vertica recommends using its built-in
The segmentation expression should specify columns with a large number of unique data values and acceptable skew in their data distribution. In general, primary key columns that meet these criteria are good candidates for hash segmentation.
For details, see Expression Requirements below.
Automatically distributes data evenly across all nodes when the projection is created. Node ordering is fixed.
A zero-based offset that indicates on which node to start segmentation distribution.
This option is not valid for
If you create a projection for a table with the
You can ensure K-safety compliance when you create projections by combining
A segmentation expression must specify table columns as they are defined in the source table. Projection column names are not supported.
The following restrictions apply to segmentation expressions:
- All leaf expressions must be constants or column references to a column in the
- The expression must return the same value over the life of the database.
- Aggregate functions are not allowed.
The expression must return non-negative
INTEGERvalues in the range
0 <= x < 263, and values are generally distributed uniformly over that range.
If the expression produces a value outside the expected range—for example, a negative value—no error occurs, and the row is added to the projection's first segment.
CREATE PROJECTION statement creates projection
public.employee_dimension_super. It specifies to include all columns in table
public.employee_dimension. The hash segmentation clause invokes the Vertica
HASH function to segment projection data on the column
employee_key; it also includes the
ALL NODES clause, which specifies to distribute projection data evenly across all nodes in the cluster:
=> CREATE PROJECTION public.employee_dimension_super AS SELECT * FROM public.employee_dimension ORDER BY employee_key SEGMENTED BY hash(employee_key) ALL NODES;