Replacing Nodes

If you have a K-Safe database, you can replace nodes, as necessary, without bringing the system down. For example, you might want to replace an existing node if you:

Note: Vertica does not support replacing a node on a K-safe=0 database. Use the procedures to add and remove nodes instead.

The process you use to replace a node depends on whether you are replacing the node with:

Prerequisites

Best Practice for Restoring Failed Hardware

Following this procedure will prevent Vertica from misdiagnosing missing disk or bad mounts as data corruptions, which would result in a time-consuming, full-node recovery.

If a server fails due to hardware issues, for example a bad disk or a failed controller, upon repairing the hardware:

  1. Reboot the machine into runlevel 1, which is a root and console-only mode.

    Runlevel 1 prevents network connectivity and keeps Vertica from attempting to reconnect to the cluster.

  2. In runlevel 1, validate that the hardware has been repaired, the controllers are online, and any RAID recover is able to proceed.

    Note: You do not need to initialize RAID recover in runlevel 1; simply validate that it can recover.

  3. Once the hardware is confirmed consistent, only then reboot to runlevel 3 or higher.

At this point, the network activates, and Vertica rejoins the cluster and automatically recovers any missing data. Note that, on a single-node database, if any files that were associated with a projection have been deleted or corrupted, Vertica will delete all files associated with that projection, which could result in data loss.