Additional Considerations for Cloud Storage

If you are backing up to a supported cloud storage location, you need to do some additional one-time configuration. You must also take additional steps if the cluster you are backing up is running on instances in the cloud. For Amazon Web Services (AWS), you might choose to encrypt your backups, which requires additional steps.

By default, bucket access is restricted to the communal storage bucket. For one-time operations with other buckets like backing up and restoring the database, use the appropriate credentials. See Google Cloud Storage Parameters and S3 Parameters for additional information.

Configuring Cloud Storage for Backups

As with any storage location, you must initialize a cloud storage location with the vbr task init.

Because cloud storage does not support file locking, Vertica uses either your local file system or the cloud storage file system to handle file locks during a backup. You identify this location using the cloud_storage_backup_file_system_path parameter in your vbr configuration file. During a backup, Vertica creates a locked identity file on your local or cloud instance, and a duplicate file in your cloud storage backup location. As long at the files match, Vertica proceeds with the backup, releasing the lock when the backup is complete. As long as the files remain identical, you can use the cloud storage location for backup and restore tasks.

If the files in your locking location become out of sync with the files in your backup location, backup and restore tasks fail with an error message. You can resolve locking inconsistencies by rerunning the init task with the --cloud-force-init parameter:

$ /opt/vertica/bin/vbr --task init --cloud-force-init -c filename.ini 

If a backup fails, confirm that your Vertica cluster has permission to access your cloud storage location.

Configuring Authentication for Google Cloud Storage

If you are backing up to Google Cloud Storage (GCS) from a Google Cloud Platform-based cluster, you must provide authentication to the GCS communal storage location. Set the environment variables as detailed in Configuring Backups to and from Cloud Storage to authenticate to your GCS storage.

See Eon Mode on GCP Prerequisites for additional authentication information, including how to create your hash-based message authentication code (HMAC) key.

Configuring EC2 Authentication for Amazon S3

If you are backing up to S3 from an EC2-based cluster, you must provide authentication to your S3 host. Regardless of the authentication type you choose, your credentials do not leave your EC2 cluster. Vertica supports the following authentication types:

  • AWS credential file
  • Environment variables
  • IAM role

AWS credential file - You can manually create a configuration file on your EC2 initiator host at ~/.aws/credentials.

[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY

For more information on credential files, refer to Amazon Web Services documentation.

Environment variables - Amazon Web Services provides the following environment variables:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY

Use these variables on your initiator to provide authentication to your S3 host. When your session ends, AWS deletes these variables. For more information, refer to the AWS documentation.

IAM role - Create an AWS IAM role and grant that role permission to access your EC2 cluster and S3 resources. This method is recommended for managing long-term access. For more information, refer to Amazon Web Services documentation.

Encrypting Backups on Amazon S3

Backups made to Amazon S3 can be encrypted using native server-side S3 encryption capability. For more information on Amazon S3 encryption, refer to Amazon documentation.

Vertica supports server-side encryption only. Client-side encryption is not supported.

Vertica supports the following forms of S3 encryption:

  • Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3)
    • Encrypts backups with AES-256
    • Amazon manages encryption keys
  • Server-Side Encryption with AWS KMS-Managed Keys (SSE-KMS)
    • Encrypts backups with AES-256
    • Requires an encryption key from Amazon Key Management Service
    • Your S3 bucket must be from the same region as your encryption key
    • Allows auditing of user activity

When you enable encryption of your backups, Vertica encrypts backups as it creates them. If you enable encryption after creating an initial backup, only increments added after you enabled encryption are encrypted. To ensure that your backup is entirely encrypted, create new backups after enabling encryption.

To enable encryption, add the following settings to your configuration file:

  • cloud_storage_encrypt_transport: Encrypts your backups during transmission. You must enable this parameter if you are using SSE-KMS encryption.
  • cloud_storage_encrypt_at_rest: Enables encryption of your backups. If you enable encryption and do not provide a KMS key, Vertica uses SSE-S3 encryption.
  • cloud_storage_sse_kms_key_id: If you are using KMS encryption, use this parameter to provide your key ID.

See [CloudStorage] for more information on these settings.

The following example shows a typical configuration for KMS encryption of backups.

[CloudStorage]
cloud_storage_encrypt_transport = True
cloud_storage_encrypt_at_rest = sse				
cloud_storage_sse_kms_key_id = 6785f412-1234-4321-8888-6a774ba2aaaa