Restarting a Downed Node: Quick Tip

Posted December 6, 2018 by James Knicely, Vertica Field Chief Technologist

I’m a big fan of scripting with admintools, which provides many database tools. One of those awesome tools, which I just became familiar with, is command_host. Run with -c start, it allows us to start a downed node, with one caveat: you have to run the command on the node that is down.

Example:

Here I’ll use the command_host tool on a local downed node:

    [dbadmin@SE-Sandbox-43-node1 ~]$ admintools -t list_allnodes | grep mydb
     v_mydb_node0001 | 192.168.61.227 | UP | vertica-9.1.1.4 | mydb
     v_mydb_node0002 | 192.168.61.228 | UP | vertica-9.1.1.4 | mydb
     v_mydb_node0003 | 192.168.61.229 | UP | vertica-9.1.1.4 | mydb

    [dbadmin@SE-Sandbox-43-node1 ~]$ admintools -t kill_host -s 192.168.61.227
    *** Terminating vertica and performing host cleanup ***
    Terminating vertica processes on host '192.168.61.227'
    All signals sent successfully.

    [dbadmin@SE-Sandbox-43-node1 ~]$ admintools -t list_allnodes | grep mydb
     v_mydb_node0001 | 192.168.61.227 | DOWN | vertica-9.1.1.4 | mydb
     v_mydb_node0002 | 192.168.61.228 | UP | vertica-9.1.1.4 | mydb
     v_mydb_node0003 | 192.168.61.229 | UP | vertica-9.1.1.4 | mydb

    [dbadmin@SE-Sandbox-43-node1 ~]$ admintools -t command_host -c start
    vertica process is not running
    vertica process is not running
    vertica process is not running
    vertica process is not running
    starting DB: mydb
    Restarting host [192.168.61.227] with catalog [v_mydb_node0001_catalog]
    Issuing multi-node restart
    Starting nodes:
        v_mydb_node0001 (192.168.61.227)
    Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
    Node Status: v_mydb_node0001: (DOWN) v_mydb_node0003: (UP)
    Node Status: v_mydb_node0001: (DOWN) v_mydb_node0003: (UP)
    Node Status: v_mydb_node0001: (DOWN) v_mydb_node0003: (UP)
    Node Status: v_mydb_node0001: (INITIALIZING) v_mydb_node0003: (UP)
    Node Status: v_mydb_node0001: (UP) v_mydb_node0003: (UP)
    vertica process is running (PID 315240)
    vertica process is not running

    [dbadmin@SE-Sandbox-43-node1 ~]$ admintools -t list_allnodes | grep mydb
     v_mydb_node0001 | 192.168.61.227 | UP | vertica-9.1.1.4 | mydb
     v_mydb_node0002 | 192.168.61.228 | UP | vertica-9.1.1.4 | mydb
     v_mydb_node0003 | 192.168.61.229 | UP | vertica-9.1.1.4 | mydb

Perfect! But what about a remote downed node?

    [dbadmin@SE-Sandbox-43-node1 ~]$ admintools -t kill_host -s 192.168.61.228
    *** Terminating vertica and performing host cleanup ***
    Terminating vertica processes on host '192.168.61.228'
    All signals sent successfully.

    [dbadmin@SE-Sandbox-43-node1 ~]$ admintools -t list_allnodes | grep mydb
     v_mydb_node0001 | 192.168.61.227 | UP | vertica-9.1.1.4 | mydb
     v_mydb_node0002 | 192.168.61.228 | DOWN | vertica-9.1.1.4 | mydb
     v_mydb_node0003 | 192.168.61.229 | UP | vertica-9.1.1.4 | mydb

    [dbadmin@SE-Sandbox-43-node1 ~]$ admintools -t command_host -c start
    vertica process is not running
    vertica process is not running
    vertica process is not running
    vertica process is running (PID 315240)
    vertica process is running (PID 315240)
    vertica process is running (PID 315240)
    vertica process is running (PID 315240)
    vertica process is running (PID 315240)
    vertica process is not running

    [dbadmin@SE-Sandbox-43-node1 ~]$ admintools -t list_allnodes | grep mydb
     v_mydb_node0001 | 192.168.61.227 | UP | vertica-9.1.1.4 | mydb
     v_mydb_node0002 | 192.168.61.228 | DOWN | vertica-9.1.1.4 | mydb
     v_mydb_node0003 | 192.168.61.229 | UP | vertica-9.1.1.4 | mydb

Darn, that didn’t work! What if I try the command on the downed node itself?

    [dbadmin@SE-Sandbox-43-node1 ~]$ ssh 192.168.61.228 "admintools -t command_host -c start"
    vertica process is not running
    vertica process is not running
    vertica process is not running
    vertica process is not running
    vertica process is running (PID 89977)
    vertica process is not running
    starting DB: mydb
    Restarting host [192.168.61.228] with catalog [v_mydb_node0002_catalog]
    Issuing multi-node restart
    Starting nodes:
        v_mydb_node0002 (192.168.61.228)
    Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
    Node Status: v_mydb_node0002: (DOWN) v_mydb_node0003: (UP)
    Node Status: v_mydb_node0002: (DOWN) v_mydb_node0003: (UP)
    Node Status: v_mydb_node0002: (DOWN) v_mydb_node0003: (UP)
    Node Status: v_mydb_node0002: (RECOVERING) v_mydb_node0003: (UP)
    Node Status: v_mydb_node0002: (UP) v_mydb_node0003: (UP)

    [dbadmin@SE-Sandbox-43-node1 ~]$ admintools -t list_allnodes | grep mydb
     v_mydb_node0001 | 192.168.61.227 | UP | vertica-9.1.1.4 | mydb
     v_mydb_node0002 | 192.168.61.228 | UP | vertica-9.1.1.4 | mydb
     v_mydb_node0003 | 192.168.61.229 | UP | vertica-9.1.1.4 | mydb

Nice!

Helpful Link: https://www.vertica.com/docs/latest/HTML/Content/Authoring/AdministratorsGuide/AdminTools/WritingAdministrationToolsScripts.htm

Have fun!
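
Since the tip is about scripting, here is a rough sketch of how the steps above might be combined: find the DOWN nodes in the list_allnodes output, then run command_host on each of those nodes over ssh. The function names down_hosts and restart_down_nodes and the DRY_RUN variable are made up for illustration and are not part of admintools. The sketch assumes the five-column list_allnodes layout shown in the example (node, host, state, version, database) and that the current user can ssh to the other nodes without a prompt, as in the example.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are hypothetical, not part of admintools.

# Read `admintools -t list_allnodes` output on stdin and print the host
# address of every node of database $1 whose state is DOWN.
# Assumes the column order: node | host | state | version | db.
down_hosts() {
    awk -F'|' -v db="$1" '
        # Trim leading and trailing blanks from every field.
        { for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i) }
        NF >= 5 && $3 == "DOWN" && $5 == db { print $2 }
    '
}

# Run command_host on each DOWN node itself, because the tool only starts
# the node it is executed on. With DRY_RUN=1 the ssh commands are printed
# instead of executed. The body runs in a subshell so `host` stays local.
restart_down_nodes() (
    # Hosts are collected before the loop; addresses contain no spaces,
    # so plain word splitting is enough here.
    for host in $(admintools -t list_allnodes | down_hosts "$1"); do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh $host admintools -t command_host -c start"
        else
            ssh "$host" "admintools -t command_host -c start"
        fi
    done
)
```

After sourcing the file, something like DRY_RUN=1 restart_down_nodes mydb should print the ssh commands without running them, which is a cheap way to check the parsing against your own cluster before letting it restart anything.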