What Should I Do to Shut Down a Vertica Node for Maintenance?
If you need to shut down a Vertica node for maintenance, follow this checklist.
1. Verify that all cluster nodes are UP.
To avoid a long node recovery time after shutdown, if one or more nodes are DOWN, identify and restart them using the instructions in Restarting Vertica on a Host.
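One way to run this check from the shell is to query the NODES system table. This is a minimal sketch, assuming `vsql` is on the PATH and can connect without prompting (for example, as dbadmin); the function name `all_nodes_up` is hypothetical.

```shell
# Hypothetical helper: succeed only when every node reports UP.
# Assumes vsql is on the PATH and connects without prompting.
all_nodes_up() {
  # -A: unaligned output, -t: rows only, -c: run one command and exit
  down=$(vsql -At -c "SELECT node_name, node_state FROM nodes WHERE node_state <> 'UP' ORDER BY node_name;") || return 2
  if [ -n "$down" ]; then
    echo "Nodes not UP:"
    echo "$down"
    return 1
  fi
  echo "All nodes are UP."
}
```

Calling `all_nodes_up` prints any node that is not UP and returns a non-zero status, so it can gate the rest of a maintenance script.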
2. If you can’t restart a node:
If the AHM has been held for a long time and the delete vector count is high and spread across multiple tables, node recovery could be slow because Vertica must replay the deletes.
In that case, follow the steps in the node recovery checklist.
Suppose the following are true:
Your final option is to force the AHM to advance with this command:
Then, when you recover the down node, Vertica recovers that node from scratch by copying the data from its buddy nodes.
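Forcing the AHM to advance while a node is down is normally done with the `MAKE_AHM_NOW` function, passing `true` to allow the advance despite the missing node. The sketch below assumes `vsql` is on the PATH and connects as dbadmin; the wrapper name `force_ahm_advance` is hypothetical.

```shell
# Hypothetical wrapper: advance the AHM even though a node is DOWN.
# The DOWN node must then recover from scratch from its buddy nodes.
force_ahm_advance() {
  vsql -c "SELECT MAKE_AHM_NOW(true);"
}
```

Treat this as a last resort, because it gives up incremental recovery for the down node.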
3. Check node dependencies:
A clean node dependency list contains (number of nodes + 1) lines.
If the node dependencies are correct, go to Step 4.
If the node dependencies are incorrect, rebalance the data in the cluster:
If, after rebalancing the cluster, node dependencies are still incorrect, contact Vertica Support.
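The dependency check and the rebalance in Step 3 can be sketched with the `GET_NODE_DEPENDENCIES` and `REBALANCE_CLUSTER` functions. This assumes `vsql` is on the PATH and connects as dbadmin; both wrapper names are hypothetical.

```shell
# Hypothetical wrappers around two Vertica meta-functions.

# Print the node dependency list so you can count its lines.
show_node_dependencies() {
  vsql -At -c "SELECT GET_NODE_DEPENDENCIES();"
}

# Rebalance data across the cluster; this can run for a long time.
rebalance_cluster() {
  vsql -c "SELECT REBALANCE_CLUSTER();"
}
```

Run `show_node_dependencies` again after `rebalance_cluster` finishes to confirm that the list is clean.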
4. Back up your database to avoid any loss of data.
For more information, see Backing Up the Database.
Consider a hard-link backup to speed up the process, as described in Creating Hard-Link Local Backups.
If the backup is successful, go to Step 5.
If the backup does not complete, perform a cold (offline) backup while the database is down.
To do so, copy the catalog and data directories to another location.
For Vertica 7.2.x and earlier:
For Vertica 8.0.x and later:
5. Prepare to shut down the database.
a. View the maximum number of sessions:
Note: MaxClientSessions = 0 still allows five dbadmin sessions.
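The current session limit in Step 5a can be read from the CONFIGURATION_PARAMETERS system table. This is a sketch, assuming `vsql` is on the PATH and connects as dbadmin; the function name is hypothetical.

```shell
# Hypothetical helper: print the current MaxClientSessions value.
show_max_client_sessions() {
  vsql -At -c "SELECT current_value FROM configuration_parameters WHERE parameter_name = 'MaxClientSessions';"
}
```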
b. The shutdown process runs the Tuple Mover. However, to control the shutdown process, run the Tuple Mover yourself to move all projection data from the WOS to the ROS:
c. Verify that the Tuple Mover moved everything by querying the RESOURCE_USAGE system table to check for bytes used in the WOS.
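Steps 5b and 5c can be sketched as a moveout followed by a check of WOS usage. This assumes a Vertica release that still has a WOS, and that `vsql` is on the PATH and connects as dbadmin; both function names are hypothetical.

```shell
# Hypothetical helpers for Steps 5b and 5c.

# 5b: ask the Tuple Mover to move WOS data into the ROS.
run_moveout() {
  vsql -c "SELECT DO_TM_TASK('moveout');"
}

# 5c: show WOS usage per node; every value should be 0 after moveout.
wos_usage_by_node() {
  vsql -At -c "SELECT node_name, wos_used_bytes, wos_row_count FROM resource_usage ORDER BY node_name;"
}
```

If `wos_usage_by_node` still reports non-zero bytes, run `run_moveout` again before continuing.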
d. Verify that mergeout is not running. Shutdown cannot occur until mergeout completes:
If mergeout operations are running, wait until they complete, or cancel them by closing their sessions, as described in Step 5f.
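Running mergeout operations can be listed from the TUPLE_MOVER_OPERATIONS system table. This is a sketch, assuming `vsql` is on the PATH and connects as dbadmin; the function name is hypothetical.

```shell
# Hypothetical helper: list mergeout operations that are executing now.
# An empty result means no mergeout is running.
running_mergeouts() {
  vsql -At -c "SELECT node_name, session_id, table_name, projection_name FROM tuple_mover_operations WHERE operation_name ILIKE 'mergeout' AND is_executing ORDER BY node_name;"
}
```

The `session_id` column in the output is the value you need if you decide to close a mergeout session in Step 5f.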
e. Query the SESSIONS system table to see which sessions are still running:
If sessions don’t complete, proceed to Step 5f.
If all sessions are complete, proceed to Step 5g.
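A query along these lines shows the sessions referred to in Step 5e. This is a sketch, assuming `vsql` is on the PATH and connects as dbadmin; the function name and the choice of columns are illustrative.

```shell
# Hypothetical helper: list open sessions and what each one is running.
# The session that runs this query appears in the output as well.
list_sessions() {
  vsql -At -c "SELECT session_id, user_name, node_name, current_statement FROM sessions ORDER BY login_timestamp;"
}
```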
f. Close any open sessions:
To close mergeout sessions, get the session_id from the TUPLE_MOVER_OPERATIONS system table.
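Sessions can be closed all at once with `CLOSE_ALL_SESSIONS`, or one at a time with `CLOSE_SESSION` and a session ID. The sketch below assumes `vsql` is on the PATH and connects as dbadmin; both wrapper names are hypothetical.

```shell
# Hypothetical wrappers for Step 5f.

# Close every session except the one issuing the call.
close_all_sessions() {
  vsql -c "SELECT CLOSE_ALL_SESSIONS();"
}

# Close a single session, for example a mergeout session.
# $1 is a session_id taken from SESSIONS or TUPLE_MOVER_OPERATIONS.
close_session() {
  vsql -c "SELECT CLOSE_SESSION('$1');"
}
```

For example, `close_session 'v_db_node0001-100:0x1'` targets one session, where the ID shown is a made-up value.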
g. Verify that the sessions are closed:
If sessions are still open, return to Step 5b and continue.
Otherwise, proceed to Step 5h.
h. If Steps 5b through 5g took a long time, new data loads or Tuple Mover operations may have started. If so, return to Step 5b.
i. Move the ancient history mark (AHM) to avoid replay deletes.
Note: If a node is down, this step fails.
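With all nodes UP, the AHM in Step 5i can be advanced with `MAKE_AHM_NOW` called without arguments. This is a sketch, assuming `vsql` is on the PATH and connects as dbadmin; the wrapper name is hypothetical.

```shell
# Hypothetical wrapper: advance the AHM to the current epoch.
# Without an argument, the call fails if any node is DOWN.
advance_ahm() {
  vsql -c "SELECT MAKE_AHM_NOW();"
}
```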
6. Shut down the database:
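One way to issue the shutdown is the `SHUTDOWN` function. This is a sketch, assuming `vsql` is on the PATH and connects as dbadmin; the wrapper name is hypothetical.

```shell
# Hypothetical wrapper: request a database shutdown.
# Without arguments, SHUTDOWN refuses to proceed while user sessions are connected.
shutdown_database() {
  vsql -c "SELECT SHUTDOWN();"
}
```

Stopping the database from admintools (`admintools -t stop_db -d <database>`) is an alternative if you prefer the administration tools.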
7. Verify that the Vertica process is properly shut down on each node:
If Vertica is properly shut down on each node, proceed to Step 11.
Otherwise, continue to Step 8.
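A process-list check along these lines can be run on each node for Step 7. It is a sketch that assumes the server was started from a `bin/vertica` binary with a `-D <catalog directory>` argument; the function name is hypothetical.

```shell
# Hypothetical helper: report whether a Vertica server process runs on this host.
vertica_is_running() {
  # The [v] bracket keeps grep from matching its own command line.
  procs=$(ps -eo pid,args | grep 'bin/[v]ertica -D')
  if [ -n "$procs" ]; then
    echo "$procs"
    return 0
  fi
  echo "No Vertica server process found on this host."
  return 1
}
```

A return status of 1 on every node means that the shutdown completed.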
8. Connect to any node that still has the Vertica process running. Run the following command to see what the Vertica process is doing:
If you can’t wait for the process to complete, collect vstack information and send it to Vertica Technical Support, letting them know that the shutdown process did not complete successfully.
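One common way to see what the server is doing is to read the end of that node's `vertica.log`, which normally sits in the node's catalog directory. The sketch below assumes you pass the path to the log yourself; the function name and the default of 50 lines are illustrative.

```shell
# Hypothetical helper: print the last lines of a Vertica log file.
# $1: path to vertica.log on this node
# $2: number of lines to show (defaults to 50)
show_recent_log() {
  log_file=$1
  tail -n "${2:-50}" "$log_file"
}
```

Replacing `tail -n` with `tail -f` follows the log live while the shutdown proceeds.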
9. To stop the shutdown process, kill the Vertica process in admintools > Advanced > Killing a Vertica Process on host.
If you shut down the database on two buddy nodes so that the database becomes unsafe, Vertica automatically shuts down on all the other nodes.
10. Verify that the Vertica process has stopped on all nodes using the command in Step 7.
11. After you have performed the necessary maintenance, restart the database following the instructions in Restart Vertica on a Node.
The checklist is complete.