High Availability With Projections

To ensure high availability and recovery for database clusters of three or more nodes, Vertica:

  • Replicates small, unsegmented projections
  • Creates buddy projections for large, segmented projections.

Replication (Unsegmented Projections)

When it creates projections, Database Designer replicates them, creating and storing duplicates of these projections on all nodes in the database.

Replication ensures:

  • Distributed query execution across multiple nodes.
  • High availability and recovery. In a K-safe database, replicated projections serve as buddy projections. This means that you can use a replicated projection on any node for recovery.

We recommend you use Database Designer to create your physical schema. If you choose not to, be sure to segment all large tables across all database nodes, and replicate small, unsegmented table projections on all database nodes.

The following illustration shows two projections, B and C, replicated across a three node cluster.

Buddy Projections (Segmented Projections)

Vertica creates buddy projections which are copies of segmented projections that are distributed across database nodes (see Projection Segmentation.) Vertica distributes segments that contain the same data to different nodes. This ensures that if a node goes down, all the data is available on the remaining nodes. Vertica distributes segments to different nodes by using offsets. For example, segments that comprise the first buddy projection (A_BP1) are offset from projection A by one node, and segments from the second buddy projection (A_BP2) are offset from projection A by two nodes.

The following diagram shows the segmentation for a projection called A and its buddy projections, A_BP1 and A_BP2, for a three node cluster.

The following diagram shows how Vertica uses offsets to ensure that every node has a full set of data for the projection.

How Result Sets Are Stored

Vertica duplicates table columns on all nodes in the cluster to ensure high availability and recovery. Thus, if one node goes down in a K-Safe environment, the database continues to operate using duplicate data on the remaining nodes. Once the failed node resumes its normal operation, it automatically recovers its lost objects and data by querying other nodes.

Vertica compresses and encodes data to greatly reduce the storage space. It also operates on the encoded data whenever possible to avoid the cost of decoding. This combination of compression and encoding optimizes disk space while maximizing query performance.

Vertica stores table columns as projections. This enables you to optimize the stored data for specific queries and query sets. Vertica provides two methods for storing data:

  • Projection segmentation is recommended for large tables (fact and large dimension tables)
  • Replication is recommended for the rest of the tables.