Uploading Scrutinize to an Amazon S3 Bucket for Vertica Advisor Report

The Vertica Advisor report is a database health check tool that helps you follow best practices and identify environmental or configuration issues. The report is based on the information collected by the scrutinize tool, which ships with the Vertica server. To generate a report, customers can open a support case and provide the scrutinize output by following the instructions from the support team. A more direct option is to upload the scrutinize output to a secure Vertica Amazon S3 bucket. Either way, a Vertica Advisor report is generated automatically and sent by email.

This best practice document explains how to upload scrutinize output to the Vertica S3 bucket and provides security and troubleshooting information.

There are two options for uploading scrutinize output to an S3 bucket:

  • Uploading a scrutinize output using scrutinize command (Vertica version 9.3.1 or later)

  • Creating a scrutinize output and uploading manually (all Vertica versions)

Uploading a Scrutinize Output Using Scrutinize Command

If the Vertica version is 9.3.1 or later, the scrutinize tool can upload the output to the Vertica S3 bucket directly if the nodes have an internet connection. If the Vertica version is older than 9.3.1, or the nodes do not have an internet connection, or the nodes cannot reach Amazon S3 bucket due to network security policies, the scrutinize output can be uploaded manually.

Following are the recommended command options for uploading scrutinize output with the scrutinize command. Run the command as dbadmin or as the Vertica administrative OS user:

$ /opt/vertica/bin/scrutinize -d database_name -U db_user -P db_user_password \
  --exclude-tasks VerticaLog \
  --output_dir /directory/with/at_least_10GB_free_space_for_output \
  --tmpdir /directory/with/at_least_10GB_free_space_for_temp \
  --vadvise --email 'email@mycompany.com' \
  --account 'company_name' --caseid SDYYYYMMDD

Parameters:

Parameter Description
-U DB user name. The user must have Superuser or SYSMONITOR privileges.
-P DB user password. A password with special characters must be enclosed in single quotes.
-d Database name. The database must be UP.
--vadvise Upload the created scrutinize output to an S3 bucket. Requires the --email, --account, and --caseid options.
--tmpdir Directory where scrutinize saves its temporary data. The space required depends on variables such as the size of the Vertica log and the extracted system tables. The directory should have at least 10GB of free space on all nodes.
--output_dir Directory where scrutinize saves its output. The space required depends on variables such as the size of the Vertica log and the extracted system tables. The directory should have at least 10GB of free space on the initiator node.
--exclude-tasks VerticaLog Do not omit this parameter or its 'VerticaLog' value. Otherwise, scrutinize collects vertica.log files, which are not necessary for a report and may consume significant system resources.
--email Email address that receives the report. The email address must be enclosed in single quotes.
--caseid Vertica support case ID. Use the actual case ID if you have one; otherwise use SD followed by the date in the format YYYYMMDD, for example, SD20230131.
--account Company name or support account name. It must be enclosed in single quotes and must not contain spaces.
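The 10GB free-space recommendation for --output_dir and --tmpdir can be checked before running scrutinize. The following is a minimal sketch, not part of scrutinize itself: the helper name has_free_space is hypothetical, and it only inspects the node it runs on, so the --tmpdir check would need to be repeated on every node.

```shell
# Hypothetical helper (not shipped with Vertica): succeed if DIR has at
# least MIN_GB gigabytes of free space on the local node.
has_free_space() {
    dir=$1
    min_gb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks; the fourth
    # column of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}
```

For example, `has_free_space /directory/for/output 10 || echo "not enough free space"` prints a warning when less than 10GB is available.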

Note

  • Do not use a Network File System (NFS) or Elastic File System (EFS) share for tmpdir. Because of the NFS file metadata cache, these shares are not suited to rapid, concurrent file creation and deletion.

Creating a Scrutinize Output and Uploading Manually

    Uploading scrutinize output to an S3 bucket manually involves two tasks:

    1. Creating a scrutinize output

    2. Uploading the scrutinize output

    Creating a Scrutinize Output

    Following are the recommended command options for creating a scrutinize output. Run the command as dbadmin or as the Vertica administrative OS user:

    $ /opt/vertica/bin/scrutinize -d database_name -U db_user -P db_user_password \
      --exclude-tasks VerticaLog \
      --output_dir /directory/with/at_least_10GB_free_space_for_output \
      --tmpdir /directory/with/at_least_10GB_free_space_for_temp

    Parameters:

    Parameter Description
    -d Database name. The database must be UP.
    -U DB user name. The user must have Superuser or SYSMONITOR privileges.
    -P DB user password. A password with special characters must be enclosed in single quotes.
    --exclude-tasks VerticaLog Do not omit this parameter or its 'VerticaLog' value. Otherwise, scrutinize collects vertica.log files, which are not necessary for a report and may consume significant system resources.
    --output_dir Directory where scrutinize saves its output. The space required depends on variables such as the size of the Vertica log and the extracted system tables. The directory should have at least 10GB of free space on the initiator node.
    --tmpdir Directory where scrutinize saves its temporary data. The space required depends on variables such as the size of the Vertica log and the extracted system tables. The directory should have at least 10GB of free space on all nodes.

    The scrutinize output is created in the output_dir directory as a .tar file with the following name, where XXXXXXXXXXXXXX is a timestamp in the format YYYYMMDDHHMMSS:

    VerticaScrutinize.XXXXXXXXXXXXXX.tar

    Important Do not rename or modify the .tar file. If the file is renamed, it will not be processed.
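Because a renamed file is not processed, it can help to confirm the file name before uploading. The following is a minimal sketch: the helper name is_valid_scrutinize_name is hypothetical, and the pattern is inferred only from the naming format described above (VerticaScrutinize, a 14-digit timestamp, and the .tar extension), so the processing pipeline itself may apply different rules.

```shell
# Hypothetical helper: succeed if the file name matches the format
# VerticaScrutinize.<14-digit timestamp>.tar described in this document.
is_valid_scrutinize_name() {
    # Ignore any leading directories and look only at the file name.
    name=$(basename "$1")
    printf '%s\n' "$name" | grep -Eq '^VerticaScrutinize\.[0-9]{14}\.tar$'
}
```

For example, `is_valid_scrutinize_name /path/to/VerticaScrutinize.20230131120000.tar` succeeds, while a name such as VerticaScrutinize.something fails the check.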

    Uploading the Scrutinize Output

    There are two ways to upload the scrutinize output:

    • cURL (For scrutinize output less than 5GB)

    • AWS CLI Client (For scrutinize output larger than 5GB)
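The choice between the two methods depends only on the file size, so it can be scripted. The following is a minimal sketch: the function name choose_upload_method is hypothetical, and the limit of 5368709120 bytes is taken from the MaxSizeAllowed value in the EntityTooLarge error shown in the troubleshooting section of this document.

```shell
# Size limit in bytes for a single cURL upload, taken from the
# MaxSizeAllowed value in the EntityTooLarge error in this document.
MAX_CURL_BYTES=5368709120

# Hypothetical helper: print "curl" or "aws" for a file size in bytes.
choose_upload_method() {
    if [ "$1" -gt "$MAX_CURL_BYTES" ]; then
        echo aws
    else
        echo curl
    fi
}
```

The size in bytes can be obtained with a command such as `wc -c < VerticaScrutinize.XXXXXXXXXXXXXX.tar` and passed to the function as its only argument.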

    Using cURL Command (For Scrutinize Output Less than 5GB)

    Note If you are uploading from a Windows machine command shell, use double quotes (") instead of single quotes (') in the following command.

    Following is a simple example of uploading a scrutinize output using the cURL command. It is a single command, split across six lines with trailing backslashes for readability. In a shell that does not treat a trailing backslash as a line continuation, enter it on one line.

    $ curl --request PUT --upload-file VerticaScrutinize.XXXXXXXXXXXXXX.tar \
        -H 'x-amz-acl: bucket-owner-full-control' \
        -H 'x-amz-meta-email: email@mycompany.com' \
        -H 'x-amz-meta-account: company_name' \
        -H 'x-amz-meta-caseid: SDYYYYMMDD' \
        https://vertica-scrutinize-upload.s3.amazonaws.com/incoming/

    Parameters:

    Parameter Description
    --request PUT Send the file with an HTTP PUT request.
    --upload-file Scrutinize output to be uploaded. Do not rename it before uploading.
    x-amz-acl: bucket-owner-full-control Gives the Vertica team full control of the file.
    x-amz-meta-email Email address that receives the report.
    x-amz-meta-account Company name or support account name. It must not contain spaces.
    x-amz-meta-caseid Vertica support case ID. Use the actual case ID if you have one; otherwise use SD followed by the date in the format YYYYMMDD, for example, SD20230131.
    https://vertica-scrutinize-upload.s3.amazonaws.com/incoming/ Location you are uploading to.
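The account and caseid values follow simple rules that can be checked or generated before building the command. The following is a minimal sketch with hypothetical helper names (default_caseid and is_valid_account); it encodes only the rules stated in the parameter descriptions above and does not contact S3.

```shell
# Hypothetical helper: print a placeholder case ID of the form SDYYYYMMDD
# using today's date, for use when no real support case ID exists.
default_caseid() {
    printf 'SD%s\n' "$(date +%Y%m%d)"
}

# Hypothetical helper: succeed if the account name is non-empty and
# contains no space characters.
is_valid_account() {
    case $1 in
        '' | *" "*) return 1 ;;
        *) return 0 ;;
    esac
}
```

For example, `is_valid_account 'company_name'` succeeds and `is_valid_account 'my company'` fails, and the output of `default_caseid` can be used as the x-amz-meta-caseid value.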

    Using AWS CLI Client (For Scrutinize Output Larger than 5GB)

    Amazon Web Services (AWS) does not allow anonymous uploads of files that are larger than 5GB. You need to have an AWS account. You can create an AWS account for free or use the one you already have.

    To upload the scrutinize output using AWS CLI client, you need to install and configure AWS CLI. For instructions on installing AWS CLI, see Installing the AWS CLI.

    To configure AWS CLI, execute the following command:

    $ aws configure
    AWS Access Key ID [********************]:
    AWS Secret Access Key [********************]:
    Default region name [us-east-1]:
    Default output format [None]:

    For more information on configuring AWS CLI, see Configuring the AWS CLI.

    To upload the scrutinize output to an S3 bucket, execute the following command. It is a single command, split across three lines with trailing backslashes for readability.

    $ aws s3 cp VerticaScrutinize.XXXXXXXXXXXXXX.tar s3://vertica-scrutinize-upload/incoming/ \
        --metadata "email=email@mycompany.com,account=company_name,caseid=SDYYYYMMDD" \
        --acl bucket-owner-full-control

    Parameters:

    Parameter Description
    s3 S3 command
    cp Copying a file
    VerticaScrutinize.XXXXXXXXXXXXXX.tar Scrutinize output to be uploaded. Do not rename it before uploading.
    s3://vertica-scrutinize-upload/incoming/ Location you are uploading to.
    email Email address that receives the report.
    account Company name or support account name. It must not contain spaces.
    caseid Vertica support case ID. Use the actual case ID if you have one; otherwise use SD followed by the date in the format YYYYMMDD, for example, SD20230131.
    --acl bucket-owner-full-control Gives the Vertica team full control of the file.

    The AWS user that uploads the scrutinize output must have at least the PutObject permission. Set this permission in the Permissions tab for that user.

    Following is a simple policy example to set this permission.

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject"
                ],
                "Resource": [
                    "arn:aws:s3:::*"
                ]
            }
        ]
    }

    Security

    To secure your uploads, the Vertica S3 bucket is WRITE-ONLY. You cannot read from it or list its contents, and after you upload a file you cannot download it. If an upload fails, you may upload the same file again as long as it has not already been processed. If you try to download the file, the following error is displayed:

    $ curl --request GET https://vertica-scrutinize-upload.s3.amazonaws.com/incoming/VerticaScrutinize.XXXXXXXXXXXXXX.tar
    <?xml version="1.0" encoding="UTF-8"?>
    <Error>
    <Code>AccessDenied</Code>
    <Message>Access Denied</Message>
    <RequestId>YOUTALKINGTOME</RequestId>
    <HostId>BetTyButterBougHtSomeButterButtHeButterWAsBitter+k32BsHA=</HostId>
    </Error>

    Common Errors and Troubleshooting

    Typo in URL / no internet access / DNS name resolution failure

    The following error message is displayed if there is a typo in the URL, no internet access, or a DNS name resolution failure:

    curl: (6) Couldn't resolve host 'vertica-scrutinize-uploads3.amazonaws.com'

    Typo in URL

    The following error message is displayed if the bucket name in the URL is misspelled:

    <?xml version="1.0" encoding="UTF-8"?>
    <Error>
    <Code>NoSuchBucket</Code>
    <Message>The specified bucket does not exist</Message>
    <BucketName>verticascrutinize-upload</BucketName>
    <RequestId>CA3987DE9B583E80</RequestId>
    <HostId>RJUrS1xgTlK+Wl/JRrPFDxMCpKQIHD7657575awtEPK3pn59I8qXoZx0ymUhBFRjaZI=</HostId>
    </Error>

    Scrutinize output is larger than 5GB

    If the scrutinize output is larger than 5GB and you upload it with cURL, the following error message is displayed:

    <?xml version="1.0" encoding="UTF-8"?>
    <Error>
    <Code>EntityTooLarge</Code>
    <Message>Your proposed upload exceeds the maximum allowed size</Message>
    <ProposedSize>6907125760</ProposedSize>
    <MaxSizeAllowed>5368709120</MaxSizeAllowed>
    <RequestId>84C95606B11FE576</RequestId>
    <HostId>8UzIdsdfsdafsdfasdfBn5s8k6THZMwyZYWz5nRLu2/XkUm3/m3dE+Y=</HostId>
    </Error>

    Use AWS CLI instead of cURL.

    Firewall blocking outbound TLS access

    The following error message is displayed if a firewall is blocking outbound TLS access:

    * Trying 52.216.178.99...
    * TCP_NODELAY set
    * Connected to vertica-scrutinize-upload.s3.amazonaws.com (52.216.178.99) port 443 (#0)
    * schannel: SSL/TLS connection with vertica-scrutinize-upload.s3.amazonaws.com port 443 (step 1/3)
    * schannel: checking server certificate revocation
    * schannel: sending initial handshake data: sending 207 bytes...
    * schannel: sent initial handshake data: sent 207 bytes
    * schannel: SSL/TLS connection with vertica-scrutinize-upload.s3.amazonaws.com port 443 (step 2/3)
    * schannel: failed to receive handshake, need more data
    * schannel: SSL/TLS connection with vertica-scrutinize-upload.s3.amazonaws.com port 443 (step 2/3)
    * schannel: encrypted data got 7
    * schannel: encrypted data buffer: offset 7 length 4096
    * schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually
    occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the
    Windows System event log.
    * Closing connection 0
    * schannel: shutting down SSL/TLS connection with vertica-scrutinize-upload.s3.amazonaws.com port 443
    * schannel: clear security context handle
    curl: (35) schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error
    usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed).
    More detail may be available in the Windows System event log.
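The numbers in parentheses in the curl messages above (6 and 35) are curl exit codes, so a wrapper script can map them to the likely causes listed in this section. The following is a minimal sketch; the function name explain_curl_exit and the wording of the messages are illustrative assumptions, and only the two exit codes that appear in this document are handled explicitly.

```shell
# Hypothetical helper: map a curl exit code to the troubleshooting hints
# described in this document.
explain_curl_exit() {
    case $1 in
        # Exit code 0 means only that the transfer completed; S3 can still
        # return an XML <Error> document (for example EntityTooLarge).
        0) echo "transfer completed: check the response for an XML error" ;;
        6) echo "could not resolve host: check the URL, internet access, and DNS" ;;
        35) echo "TLS handshake failed: check whether a firewall blocks outbound TLS" ;;
        *) echo "unrecognized exit code: rerun with -v and contact support" ;;
    esac
}
```

In a script, call `explain_curl_exit "$?"` immediately after the curl command to print the matching hint.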

    Attempt to rename the file during upload

    Adding VerticaScrutinize.something at the end of the URL is equivalent to renaming the file, and the file will not be processed. End the URL with /incoming/ as shown in the upload example.

    curl --request PUT ... https://vertica-scrutinize-upload.s3.amazonaws.com/incoming/VerticaScrutinize.something

    AWS CLI was not configured with credentials

    The following error message is displayed if the AWS CLI was not configured with credentials:

    $ aws s3 cp ./VerticaScrutinize.00000000000000.tar s3://vertica-scrutinize-upload/incoming/ \
      --metadata "email=email@mycompany.com,account=company_name,caseid=SDYYYYMMDD" \
      --acl bucket-owner-full-control

    upload failed: ./VerticaScrutinize.00000000000000.tar to s3://vertica-scrutinize-upload/incoming/
    VerticaScrutinize.00000000000000.tar Unable to locate credentials

    AWS User/Identity does not have PutObject privilege

    The following error is displayed if the AWS user does not have the PutObject permission:

    $ aws s3 cp ./VerticaScrutinize.00000000000000.tar s3://vertica-scrutinize-upload/incoming/ \
      --acl bucket-owner-full-control \
      --metadata "email=email@mycompany.com,account=company_name,caseid=SDYYYYMMDD"

    upload failed: ./VerticaScrutinize.00000000000000.tar to s3://vertica-scrutinize-upload/incoming/
    VerticaScrutinize.00000000000000.tar An error occurred (AccessDenied) when calling the CreateMultipartUpload operation:
    Access Denied

    Still not working

    If the commands do not work as expected, add "-v -k" to the cURL command (-v prints verbose output and -k skips certificate verification) and send the output to the Big Data Platform Customer Care Team <bigdataplatformcustomercareteam@microfocus.com>.

    $ curl -v -k --request PUT --upload-file VerticaScrutinize...... <rest of the command>

    For More Information

    For more information about scrutinize, see the Vertica documentation.