What Should I Do if My Node Recovery Is Slow?
If you are running Vertica 7.2.x or later, perform recovery by table. For details, see Recovery By Table in the Vertica documentation.
If you are running a Vertica version prior to 7.1.x, stop the ETL jobs and restart node recovery.
Step | Task | Results |
---|---|---|
1 | Monitor the progress of recovery:<br>`=> SELECT node_name, is_running FROM RECOVERY_STATUS;`<br>If `is_running = f`, recovery has completed. | If recovery is not progressing, go to Step 3.<br>If recovery is progressing or complete, go to Step 2. |
2 | Did node recovery complete successfully? | If yes, this is the end of your checklist.<br>If no, recovery completed with errors; go to Step 6. |
3 | Is recovery slower than expected? | If no, go to Step 4.<br>If yes, recovery is slow. Check the resource pools:<br>`=> SELECT node_name, pool_name, max_concurrency, running_query_count FROM RESOURCE_POOL_STATUS;` |
4 | Does recovery seem to be stuck on a particular table? | If no, go to Step 6.<br>If yes, check whether the node is recovering:<br>`=> SELECT node_name, node_state FROM NODES;`<br>If `node_state = RECOVERING`, node recovery is in progress. |
5 | Check whether the node is in the historical phase of recovery:<br>`=> SELECT node_name, recovery_phase, historical_completed, historical_total FROM RECOVERY_STATUS;` | |
6 | Is the transaction stuck in replay deletes? | If no, go to Step 7.<br>If yes, do one of the following depending on the size of your data:<br>Data < 1 TB:<br>Data > 1 TB: |
7 | Check whether recovery is waiting for a lock:<br>`=> SELECT node_name, user_id, transaction_id, object_name, mode FROM DC_LOCK_REQUESTS;` | If no, go to Step 6.<br>If yes, |
8 | Node recovery failed. Check the recovery errors:<br>`$ grep "Recovery Error" vertica.log`<br>Is the error a lock error? | If no, it is not a lock error; contact Vertica Technical Support.<br>If yes:<br>`=> SELECT node_name, user_id, transaction_id, object_name, mode FROM DC_LOCK_REQUESTS;` |
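
The Step 1 check can be scripted if you want to repeat it while recovery runs. The following is a minimal sketch, not an official Vertica tool: `summarize_recovery` is a hypothetical helper name, and it assumes the query output arrives on standard input as unaligned, tuples-only rows of the form `node_name|is_running`, with `t` or `f` in the second field.

```shell
#!/bin/sh
# Hypothetical helper: summarize RECOVERY_STATUS rows read from stdin.
# Assumed input shape: one "node_name|is_running" row per line, e.g. from
#   vsql -At -c "SELECT node_name, is_running FROM RECOVERY_STATUS;"
# (assumes vsql is on PATH and connection settings are already configured).
summarize_recovery() {
    awk -F'|' '
        # Collect the names of nodes whose is_running flag is "t".
        NF >= 2 && $2 == "t" { running = running " " $1; n++ }
        END {
            if (n > 0) print "recovering:" running
            else       print "no recovery running"
        }
    '
}
```

If the helper reports `no recovery running`, continue with Step 2 to confirm whether recovery completed successfully. Otherwise, rerun it periodically to judge whether recovery is progressing.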
Learn More
Learn more about NODE_STATES in the Vertica Documentation.