Troubleshooting Backup and Restore

These tips can help you avoid issues related to backup and restore with Vertica and to troubleshoot any problems that occur.

Check the Logs

The vbr log is separate from the Vertica log. The default location is /tmp/vbr, but you can change this location by setting the vbr configuration parameter tempDir. vbr logs are not included in Scrutinize reports.

If you cannot find an explanation for an error or unexpected results in the log, try increasing the logging level. You can set the level using the --debug option on the vbr command line. Specify an integer value from 0 (the default) to 3 (the most verbose). For example:

$ vbr -t backup -c full_backup.ini --debug 3

As you increase the logging level, the file size of the log increases.

Check Status of Backup Nodes

Backups fail if you run out of disk space on the backup hosts or if vbr cannot reach them all. Check that you have sufficient space on each backup host and that you can reach each host via ssh.
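Both checks can be combined in a small shell helper, as in the following sketch. The function name check_backup_host is hypothetical and is not part of vbr. The sketch assumes key-based ssh access to the backup hosts, so that BatchMode can make ssh fail instead of prompting for a password.

```shell
# Hypothetical helper (not part of vbr): confirm that a backup host is
# reachable over ssh and report free space for its backup directory.
check_backup_host() {
    host=$1
    dir=$2
    # BatchMode=yes fails instead of prompting for a password;
    # ConnectTimeout=5 gives up after five seconds if the host is unreachable.
    # A non-zero exit status means that ssh or df failed.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" df -h "$dir"
}
```

For example, check_backup_host backup01 /home/dbadmin/backups prints the df output for that directory, or an ssh error if the host cannot be reached. The host name and path here are placeholders; use the values from your own vbr configuration file.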

Sometimes vbr leaves rsync processes running on the database or backup nodes. These leftover processes can interfere with the rsync processes that later vbr operations start. If you get an rsync error in the console, look for runaway rsync processes on each node and kill them.
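To look for leftover rsync processes on a node, a sketch like the following might help. The function name list_rsync is hypothetical, and the ps options assume a Linux host.

```shell
# Hypothetical helper: list running rsync processes with their PID and
# elapsed time. Prints nothing and returns non-zero if none are found.
list_rsync() {
    # The [r] bracket keeps grep from matching its own command line.
    ps -e -o pid,etime,args | grep '[r]sync'
}
```

Before you kill a listed process, confirm that it does not belong to a backup or restore that is still running. Then stop it with kill followed by its PID.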

Object Replication Fails

Confirm that you have excluded all DOWN nodes from your object replication.

If you do not exclude a DOWN node, replication fails with the following error:

Error connecting to a destination database node on the host <hostname> : <error>  ...

Restoring an Archive Produces an Error

You might see an error like the following when restoring an archive:

$ vbr --task restore --archive prd_db_20190131_183111 --config-file /home/dbadmin/backup.ini
IOError: [Errno 2] No such file or directory: '/tmp/vbr/vbr_20190131_183111_s0rpYR/prd_db.info'

The problem is that the archive name is not in the correct format. Specify only the date/timestamp suffix of the directory name that identifies the archive to restore, as described in Restoring an Archive. For example:

$ vbr --task restore --archive 20190131_183111 --config-file /home/dbadmin/backup.ini
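If you have the full archive directory name and want only the suffix, a helper like the following sketch can extract it. The function name archive_suffix is hypothetical, and the sketch assumes the suffix always has the form YYYYMMDD_HHMMSS, as in the example above.

```shell
# Hypothetical helper: print the trailing YYYYMMDD_HHMMSS part of an
# archive directory name. Prints nothing if the name has no such suffix.
archive_suffix() {
    # The optional (.*_)? group consumes any prefix, such as "prd_db_";
    # the second group captures 8 digits, an underscore, and 6 digits.
    printf '%s\n' "$1" | sed -n -E 's/^(.*_)?([0-9]{8}_[0-9]{6})$/\2/p'
}
```

For example, archive_suffix prd_db_20190131_183111 prints 20190131_183111, which is the value to pass to --archive.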

Backup or Restore Fails When Using an HDFS Storage Location

When performing a backup of a cluster that includes HDFS storage locations, you might see an error like the following:

ERROR 5127:  Unable to create snapshot No such file /usr/bin/hadoop: 
check the HadoopHome configuration parameter

This error occurs when the backup script cannot back up the HDFS storage locations. To enable the script to back up these locations, you must configure both Vertica and Hadoop. See Requirements for Backing Up and Restoring HDFS Storage Locations.

Object-level backup and restore are not supported with HDFS storage locations. You must use full backup and restore.

Could Not Connect to Endpoint URL (Eon Mode)

When performing a cross-endpoint operation, you might see a connection error if you did not specify the endpoint URL for your communal storage (VBR_COMMUNAL_STORAGE_ENDPOINT_URL). When the endpoint is missing but you specify credentials for communal storage, vbr attempts to use those credentials to access AWS. This access fails because those credentials are for your on-premises storage, not AWS. Before performing cross-endpoint operations, verify that all of the environment variables described in Configuring Backups to and from Cloud Storage are set correctly.
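Before running vbr, you can check for unset variables with a helper like the following sketch. The function name missing_env is hypothetical. It takes the variable names as arguments, so pass the names listed in Configuring Backups to and from Cloud Storage; only VBR_COMMUNAL_STORAGE_ENDPOINT_URL is taken from this topic.

```shell
# Hypothetical helper: print the name of each listed variable that is
# unset or empty in the exported environment. Returns non-zero if any
# variable is missing.
missing_env() {
    status=0
    for name in "$@"; do
        # printenv sees only exported variables, which are the ones that
        # a child process such as vbr inherits.
        if [ -z "$(printenv "$name")" ]; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}
```

For example, missing_env VBR_COMMUNAL_STORAGE_ENDPOINT_URL prints the variable name if it is not exported in the current shell, and prints nothing if it is set.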