What Should I Do if My Node Recovery Is Slow?


If you are running Vertica 7.2.x or later, perform recovery by table. For details, see Recovery By Table in the Vertica documentation. If you are running a Vertica version earlier than 7.2.x, stop the ETL jobs and restart node recovery.

Step	Task	Results
1	Monitor the progress of recovery: `=> SELECT node_name, current_completed, is_running FROM RECOVERY_STATUS;` If is_running = f, recovery has completed. Otherwise, repeat this statement to see whether the current_completed value increases, which means that recovery is progressing. Is recovery progressing?	If recovery is not progressing, go to Step 3. If recovery is progressing or complete, go to Step 2.
2	Did node recovery complete successfully?	If yes, this is the end of your checklist. If no, recovery completed with errors; go to Step 8.
3	Is recovery slower than expected?	If no, go to Step 4. If yes, recovery is slow: Use iostat to check disk I/O. If there is a problem, change the disk scheduler to deadline or noop. Then check whether the number of concurrently running queries is at or close to the maximum: `=> SELECT node_name, pool_name, max_concurrency, running_query_count FROM RESOURCE_POOL_STATUS;` If max_concurrency = running_query_count, the query load is too high: a. Increase MAXCONCURRENCY. b. Restart recovery. c. Go to Step 1.
4	Does recovery seem to be stuck on a particular table?	If no, go to Step 6. If yes, check whether the node is recovering: `=> SELECT node_name, node_state FROM NODES;` If node_state = RECOVERING, node recovery is in progress.
5	Check whether the node is in the Historical phase of recovery: `=> SELECT node_name, recovery_phase, historical_completed, historical_total FROM RECOVERY_STATUS;`	If historical_completed is less than historical_total, the node is in the Historical phase and Vertica is moving storage containers. Wait for this phase to complete: repeat the previous statement until historical_completed = historical_total.
6	Is the transaction stuck in replay deletes?	If no, go to Step 7. If yes, do one of the following, depending on the size of your data: Data < 1 TB: Stop the node. Run MAKE_AHM_NOW(true). Restart node recovery. Go to Step 2. Data > 1 TB: Create a new table. Load the data from the existing table. Delete the old table. Restart recovery. Go to Step 2.
7	Check whether recovery is waiting for a lock: `=> SELECT node_name, user_id, transaction_id, object_name, mode FROM DC_LOCK_REQUESTS;` Is recovery waiting for a lock?	If no, go to Step 8. If yes, ask the user to release the table. Resume recovery. Go to Step 1.
8	Node recovery failed. Check the recovery errors: `$ grep "Recovery Error" vertica.log` Is the error a lock error?	If no, contact Vertica Technical Support. If yes, check the transaction that is locking the recovery: `=> SELECT node_name, user_id, transaction_id, object_name, mode FROM DC_LOCK_REQUESTS;` If the transaction is the Tuple Mover, wait for it to complete. If the transaction is a load, stop the load, restart recovery, and go to Step 1. If you are running Vertica 7.2.x or later, perform recovery by table. See Recovery By Table in the Vertica documentation. Otherwise, if you are running a version earlier than Vertica 7.2.x, contact Vertica Technical Support.
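
Some of the manual checks in the checklist above can be scripted. The following is a minimal sketch, not a Vertica-supplied tool. The function names are hypothetical, and it assumes that the disk scheduler is exposed through a Linux sysfs file whose active entry appears in square brackets (for example, `noop deadline [cfq]`). It covers the progress comparison in Step 1, the scheduler check in Step 3, and the log search in Step 8.

```shell
#!/bin/sh
# Hypothetical helpers for the checklist; names and paths are assumptions.

# Step 1: succeed (exit status 0) if current_completed increased between
# two polls of RECOVERY_STATUS.
# $1 = earlier current_completed value, $2 = later value
recovery_progressing() {
    [ "$2" -gt "$1" ]
}

# Step 3: print the active I/O scheduler from a sysfs scheduler file.
# The active scheduler is the entry enclosed in square brackets.
# $1 = path to the scheduler file
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Step 8: count the lines in a log file that contain "Recovery Error".
# $1 = path to vertica.log
count_recovery_errors() {
    grep -c "Recovery Error" "$1"
}
```

For example, `active_scheduler /sys/block/sda/queue/scheduler` prints the scheduler currently in use for the device `sda` (the device name is an assumption; substitute the device that holds your Vertica data), and `count_recovery_errors vertica.log` prints how many recovery errors were logged.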

Learn More

Learn more about NODE_STATES in the Vertica Documentation.

About the Author

Soniya Shah
Information Developer

Currently, a first year law student with a background in science and technology. Experienced technical writer, with specializations in software documentation, big data, blog development, and website development. I build user-centered content to communicate complex and technical information more easily.

I used to work for Vertica full time for about 3 years. I still work at Vertica part time while going to law school.

Update: Soniya is now doing her law internship, and no longer working at Vertica. Good luck, Soniya!
