Comparing Partitioning and Segmentation

In Vertica, partitioning and segmentation are separate concepts and achieve different goals to localize data:

For example: partitioning data by year makes sense for retaining and dropping annual data. However, segmenting the same data by year would be inefficient, because the node holding data for the current year would likely answer far more queries than the other nodes.

The following diagram illustrates the flow of segmentation and partitioning on a four-node database cluster:

  1. Example table data
  2. Data segmented by HASH(order_id)
  3. Data segmented by hash across four nodes
  4. Data partitioned by year on a single node

While partitioning occurs on all four nodes, the illustration shows partitioned data on one node for simplicity.

data partition vs segmentation