Failure Recovery
Vertica can restore the database to a fully functional state after one or more nodes in the system experience a software- or hardware-related failure. Vertica recovers nodes by querying replicas of the data stored on other nodes. For example, a hardware failure can cause a node to lose database objects or to miss changes made to the database (INSERTs, UPDATEs, and so on) while offline. When the node comes back online, it queries other nodes in the cluster to recover lost objects and catch up with database changes.
K‑safety sets fault tolerance for the database cluster, where K can be set to 0, 1, or 2. The value of K specifies how many copies Vertica creates of segmented projection data. If K‑safety for a database is set to 1 or 2, Vertica creates K+1 instances, or buddies, of each projection segment. Vertica distributes these buddies across the database cluster, such that projection data is protected in the event of node failure. If any node fails, the database can continue to process queries so long as buddies of data on the failed node remain available elsewhere on the cluster.
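If you want to confirm the fault tolerance of a running database, the following is a minimal sketch from vsql. It assumes the V_MONITOR.SYSTEM table exposes the designed_fault_tolerance and current_fault_tolerance columns and that the design was marked K-safe with MARK_DESIGN_KSAFE:

    => SELECT designed_fault_tolerance, current_fault_tolerance FROM system;
    => SELECT MARK_DESIGN_KSAFE(1);  -- confirm and mark the design as able to tolerate one failed node

If current_fault_tolerance drops to 0 while nodes are down, losing one more critical node can force the database to shut down.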
You can monitor the cluster state through the Administration Tools View Database Cluster State menu option.
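As an alternative to the menu option, you can query the node state directly, assuming the V_CATALOG.NODES system table and its node_state column (values such as UP, DOWN, and RECOVERING):

    => SELECT node_name, node_state FROM nodes ORDER BY node_name;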
Recovery Scenarios
Vertica begins the database recovery process when you restart failed nodes or the database. The mode of recovery for a K-safe database depends on the type of failure:
- One or more nodes in the database failed, but the database continued to operate.
- The database shut down cleanly.
- The database shut down uncleanly.
In the first two cases, node recovery is automatic; in the third case (unclean shutdown), recovery requires manual intervention by the database administrator. The following sections discuss these cases in greater detail.
Recovery of failed nodes
One or more nodes failed, but the remaining nodes in the database filled in for them, so database operations continued without interruption. Use the Administration Tools Restart Vertica on Host option to restart the failed nodes. While the restarted nodes recover their data from other nodes, their status is set to RECOVERING. Except for a short period at the end, the recovery phase has no effect on database transaction processing. After recovery is complete, the status of the restarted nodes changes to UP.
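To watch a restarted node progress from RECOVERING to UP, you can poll the monitoring tables. A sketch, assuming the NODES and RECOVERY_STATUS system tables are available in your Vertica version:

    => SELECT node_name, node_state FROM nodes WHERE node_state <> 'UP';
    => SELECT node_name, recover_epoch, recovery_phase, is_running FROM recovery_status;

When the first query returns no rows, all nodes have rejoined the cluster.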
Recovery after clean shutdown
The database was shut down cleanly through the Administration Tools. To restart the database, use the Start Database option. On restart, all nodes that were UP before the shutdown return to a status of UP. If the database contained one or more failed nodes at shutdown and they are now available, they begin the recovery process as described above.
Recovery after unclean shutdown
Reasons for unclean shutdown include:
- A critical node failed, leaving part of the database's data unavailable.
- A site-wide event such as a power failure caused all nodes to reboot.
- Vertica processes on the nodes exited due to a software or hardware failure.
Unclean shutdown can leave the database in an inconsistent state. For example, Vertica might have been in the middle of writing data from WOS to disk at the time of failure, leaving that process incomplete. When you restart the database through the Administration Tools, Vertica determines that normal startup is not possible and uses the Last Good Epoch to determine when data was last consistent on all nodes. Vertica then prompts you to accept recovery with the suggested epoch. If you accept, the database recovers, and all data changes made after the Last Good Epoch are lost. If you do not accept, startup is aborted.
Instead of accepting the recommended epoch, you can recover from a backup. You can also choose an epoch that precedes the Last Good Epoch through the Administration Tools Advanced Menu option Roll Back Database to Last Good Epoch. This is useful in special situations, for example when a failure occurs during a batch of loads and it is easier to restart the entire batch, even though some of the work must be repeated. In most cases, you should accept the recommended epoch.
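Once the database is running again, you can inspect how these epochs relate to one another, for example to see how far the AHM trails the current epoch. A sketch, assuming the GET_CURRENT_EPOCH, GET_LAST_GOOD_EPOCH, and GET_AHM_EPOCH functions and the V_MONITOR.SYSTEM table:

    => SELECT GET_CURRENT_EPOCH(), GET_LAST_GOOD_EPOCH(), GET_AHM_EPOCH();
    => SELECT current_epoch, last_good_epoch, ahm_epoch FROM system;

These queries require a running database, so they are most useful for understanding epoch behavior during normal operation rather than at the recovery prompt itself.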
Epochs and Node Recovery
The checkpoint epoch (CPE) for both the source and target projections is updated as ROS containers are moved. The start and end epochs of all storage containers, such as ROS containers, are modified to the commit epoch. When this occurs, the epochs of all columns without an actual data file rewrite advance the CPE to the commit epoch of move_partitions_to_table. If any nodes are down during the Tuple Mover moveout or during the partition move, they detect that there is storage to recover and, upon rejoining the cluster, recover it from other nodes with the correct epoch.
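For context, the partition move described above is invoked as a SQL function. A hypothetical example, where the sales and sales_archive table names and the date range are illustrative only:

    => SELECT MOVE_PARTITIONS_TO_TABLE('sales', '2017-01-01', '2017-01-31', 'sales_archive');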
Manual Recovery Notes
- You can manually recover a database where up to K nodes are offline—for example, they were physically removed for repair or not reachable at the time of recovery. When the missing nodes are restored, they recover and rejoin the cluster as described earlier in Recovery Scenarios.
- You can manually recover a database if the nodes to be restarted can supply all partition segments, even if more than K nodes remain down at startup. In this case, all data is available from the remaining cluster nodes, so the database can successfully start.
- The default setting for the HistoryRetentionTime configuration parameter is 0, so Vertica keeps historical data only while nodes are down. This setting prevents use of the Administration Tools Roll Back Database to Last Good Epoch option, because the AHM remains close to the current epoch and a rollback is not permitted to an epoch that precedes the AHM. If you rely on the Roll Back option to remove recently loaded data, consider setting a day-wide window for removing loaded data, for example (see also the sketch after this list):
    => ALTER DATABASE mydb SET HistoryRetentionTime = 86400;
See Epoch Management Parameters in the Administrator's Guide.
- When a node is down and manual recovery is required, it can take a full minute or longer for Vertica processes to time out while the system tries to form a cluster. Wait approximately one minute until the system returns the manual recovery prompt. Do not press CTRL-C during database startup.
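Following on from the HistoryRetentionTime note above, a sketch of checking the current setting and the AHM before relying on a rollback, assuming the CONFIGURATION_PARAMETERS system table and the GET_AHM_EPOCH and GET_AHM_TIME functions:

    => SELECT parameter_name, current_value FROM configuration_parameters WHERE parameter_name = 'HistoryRetentionTime';
    => SELECT GET_AHM_EPOCH(), GET_AHM_TIME();

If the AHM is already at or near the current epoch, a rollback to an earlier epoch will not be permitted.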