Uploading Scrutinize to an Amazon S3 Bucket

This best practices document explains how to upload a scrutinize tarball to an S3 bucket, and covers security and troubleshooting information.

To upload scrutinize to an S3 bucket, there are two important tasks:

  1. Creating a Scrutinize Tarball
  2. Uploading the VerticaScrutinize Tarball

Creating a Scrutinize Tarball

The following is a simple example of creating a scrutinize tarball:

$ /opt/vertica/bin/scrutinize --exclude-tasks VerticaLog --system-table-timeout 3600 --tmp-dir /directory/with/at_least_10GB_free_space

Note

  • Ensure you set the tmp-dir to a directory with at least 10GB free space and that there is 10 GB available on all nodes at that location.
  • tmp-dir cannot be a Network File System (NFS) or Elastic File System (EFS) share. Because of NFS file metadata caching, NFS and EFS shares are not suited to rapid, concurrent file creation and deletion. Do not use NFS or EFS shares as temporary directories.
  • Alternatively, you can set the tmp-dir permanently in the admintools.conf file. You need to set it on all nodes as shown here:

    $ cat /opt/vertica/config/admintools.conf | grep tmp
    tmp_dir = /directory/with/at_least_10GB_free_space <== THIS LOCATION MUST HAVE MIN 10GB FREE SPACE
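Before running scrutinize, it can help to confirm that the chosen directory really has the required free space. The following is a minimal sketch, assuming a POSIX shell with df and awk available. The function name has_free_gb is hypothetical and is not part of Vertica. Because the space must be available on all nodes, the check would need to be run on each node.

```shell
# Succeeds if the file system holding DIR has at least MIN_GB gigabytes available.
# Usage: has_free_gb DIR MIN_GB
has_free_gb() {
    dir=$1
    min_gb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # the fourth column of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example (not executed here):
#   has_free_gb /directory/with/at_least_10GB_free_space 10 || echo "not enough space"
```

This sketch checks free space only. It does not detect whether the directory is on an NFS or EFS share.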

    The scrutinize tarball is created in the current directory as a .tar, .tar.gz, or .tar.bz2 file, where XXXXXXXXXXXXXX is a timestamp in the format YYYYMMDDHHMMSS:

    VerticaScrutinize.XXXXXXXXXXXXXX.tar
    VerticaScrutinize.XXXXXXXXXXXXXX.tar.gz
    VerticaScrutinize.XXXXXXXXXXXXXX.tar.bz2

    Important: Do not rename or modify the tarball. If you rename the tarball, it will not be processed.
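Because a renamed tarball is not processed, a quick name check before uploading can catch mistakes early. The following is a minimal sketch, assuming the naming pattern described above (VerticaScrutinize, a 14-digit timestamp, and one of the three extensions). The function name is hypothetical.

```shell
# Succeeds only if the file name has the form
# VerticaScrutinize.<14 digits>.tar, .tar.gz, or .tar.bz2.
is_scrutinize_tarball_name() {
    name=$(basename -- "$1")
    printf '%s\n' "$name" |
        grep -Eq '^VerticaScrutinize\.[0-9]{14}\.tar(\.gz|\.bz2)?$'
}

# Example (not executed here):
#   is_scrutinize_tarball_name ./VerticaScrutinize.20191225103000.tar.gz && echo "name looks valid"
```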

    In Vertica 9.3.1 and higher, you can use the following command to create and upload a tarball in one step. It is a single-line command, shown here on four lines for readability; the trailing backslashes continue the line in a Unix shell.

    $ /opt/vertica/bin/scrutinize --exclude-tasks VerticaLog \
          --system-table-timeout 3600 --tmp-dir /directory/with/at_least_10GB_free_space \
          --vadvise --email email@mycompany.com \
          --account company_name --caseid SDYYYYMMDD

    SDYYYYMMDD stands for your Vertica support caseid. Use your actual caseid if you have one, leave it blank, or replace it with the date in the format YYYYMMDD, for example, SD20191225.

    Uploading the VerticaScrutinize Tarball

    You can upload the scrutinize tarball in any of the following ways:

    • cURL (For Scrutinize Less than 5GB)
    • AWS CLI Client (For Scrutinize Tarball Larger than 5GB)
    • Support SFTP Site (Any Size)
    • vadvise Option (For Vertica 9.3.1 and Higher)
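The choice between these methods depends mainly on the tarball size. The following is a minimal sketch of that decision, assuming the limit is the 5368709120 bytes reported as MaxSizeAllowed in the EntityTooLarge error shown later in this document. The variable and function names are hypothetical.

```shell
# Largest size accepted by the anonymous cURL upload (5 GB in bytes).
MAX_CURL_BYTES=5368709120

# Prints the upload method suggested for a tarball of the given size in bytes.
pick_upload_method() {
    size_bytes=$1
    if [ "$size_bytes" -le "$MAX_CURL_BYTES" ]; then
        echo "curl"
    else
        echo "aws-cli-or-sftp"
    fi
}

# Example (not executed here):
#   pick_upload_method "$(wc -c < VerticaScrutinize.XXXXXXXXXXXXXX.tar.gz)"
```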

    Using cURL Command (For Scrutinize Less than 5GB)

    Note: If you are uploading from a Windows command shell, use double quotes (") instead of single quotes (') in the following command.

    The following is a single-line command, shown here on six lines for readability; the trailing backslashes continue the line in a Unix shell.

    $ curl --request PUT --upload-file VerticaScrutinize.XXXXXXXXXXXXXX.tar.gz \
          -H 'x-amz-acl: bucket-owner-full-control' \
          -H 'x-amz-meta-email: email@mycompany.com' \
          -H 'x-amz-meta-account: company_name' \
          -H 'x-amz-meta-caseid: SDYYYYMMDD' \
          https://vertica-scrutinize-upload.s3.amazonaws.com/incoming/

    SDYYYYMMDD stands for your Vertica support caseid. Use your actual caseid if you have one, leave it blank, or replace it with the date in the format YYYYMMDD, for example, SD20191225.
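If you upload regularly, the cURL command above can be wrapped in a small function so that the headers are built the same way every time. The following is a minimal sketch for a POSIX shell. The function name and the DRY_RUN variable are hypothetical, and omitting the caseid header when no caseid is given is an assumption about how a blank caseid can be expressed.

```shell
# Builds the cURL upload command and runs it, or prints it when DRY_RUN=1.
# Usage: scrutinize_curl_upload TARBALL EMAIL ACCOUNT [CASEID]
scrutinize_curl_upload() {
    file=$1
    email=$2
    account=$3
    caseid=${4:-}

    # Reuse the positional parameters as the argument list for curl.
    set -- --request PUT --upload-file "$file" \
        -H 'x-amz-acl: bucket-owner-full-control' \
        -H "x-amz-meta-email: $email" \
        -H "x-amz-meta-account: $account"

    # Assumption: an empty caseid is expressed by leaving the header out.
    if [ -n "$caseid" ]; then
        set -- "$@" -H "x-amz-meta-caseid: $caseid"
    fi

    set -- "$@" https://vertica-scrutinize-upload.s3.amazonaws.com/incoming/

    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print one argument per line so the command can be reviewed first.
        printf '%s\n' curl "$@"
    else
        curl "$@"
    fi
}
```

Running it once with DRY_RUN=1 shows exactly which arguments would be passed to curl before any data leaves the host.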

    Using AWS CLI Client (For Scrutinize Tarball Larger than 5GB)

    Amazon Web Services (AWS) does not allow anonymous uploads of files that are larger than 5GB. You need to have an AWS account. You can create an AWS account for free or use the one you already have.

    To upload using AWS CLI client, you need to install and configure AWS CLI.

    For instructions on installing AWS CLI, see Installing the AWS CLI.

    To configure AWS CLI, run the following command:

    $ aws configure
    AWS Access Key ID [********************]:
    AWS Secret Access Key [********************]:
    Default region name [us-east-1]:
    Default output format [None]:

    For more information on configuring AWS CLI, see Configuring the AWS CLI.

    To upload to S3, run the following command. It is a single-line command, shown here on three lines for readability; the trailing backslashes continue the line in a Unix shell.

    $ aws s3 cp ./VerticaScrutinize.XXXXXXXXXXXXXX.tar s3://vertica-scrutinize-upload/incoming/ \
          --metadata "email=email@mycompany.com,account=company_name,caseid=SDYYYYMMDD" \
          --acl bucket-owner-full-control
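The --metadata value is a comma-separated list of key=value pairs, and the account name must not contain spaces. The following is a minimal sketch that assembles and checks that string before it is passed to aws s3 cp. The function name is hypothetical, and leaving out the caseid pair when no caseid is given is an assumption.

```shell
# Prints the value for --metadata, or fails if a required field is missing
# or the account name contains a space.
# Usage: scrutinize_metadata EMAIL ACCOUNT [CASEID]
scrutinize_metadata() {
    email=$1
    account=$2
    caseid=${3:-}

    if [ -z "$email" ] || [ -z "$account" ]; then
        echo "email and account are required" >&2
        return 1
    fi
    case $account in
        *" "*) echo "account must not contain spaces" >&2; return 1 ;;
    esac

    meta="email=$email,account=$account"
    # Assumption: the caseid pair is simply omitted when no caseid is given.
    if [ -n "$caseid" ]; then
        meta="$meta,caseid=$caseid"
    fi
    printf '%s\n' "$meta"
}

# Example (not executed here):
#   aws s3 cp ./VerticaScrutinize.XXXXXXXXXXXXXX.tar s3://vertica-scrutinize-upload/incoming/ \
#       --metadata "$(scrutinize_metadata email@mycompany.com company_name SD20191225)" \
#       --acl bucket-owner-full-control
```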

    The AWS user used to upload the file must have at least PutObject permissions. You can set this in the Permissions tab as shown in the following image:

    Use a policy such as the following to set the permissions:

    {
    	"Version": "2012-10-17",
    	"Statement": [
    		{
    			"Effect": "Allow", 
    			"Action": [
    				"s3:PutObject"
    			],
    			"Resource": [
    				"arn:aws:s3:::*"
    			]
    		}
    	]
    }

    Using Support SFTP Site (Any Size)

    If you have access to the Vertica Support SFTP site, you can upload the tarball to the SFTP server. However, uploading the tarball does not trigger automatic processing; a support engineer must process the tarball manually.

    $ sftp sftp.vertica.com
    > user: <your_user_name>
    > password: <your_password>
    > put VerticaScrutinize.XXXXXXXXXXXXXX.tar
    > quit

    Using vadvise Option (In Vertica 9.3.1 and Higher)

    You can create and upload the tarball directly using a single option, --vadvise. This option wraps the cURL command, so the upload fails if the tarball is larger than 5 GB. In that case, upload the tarball as described in Using AWS CLI Client (For Scrutinize Tarball Larger than 5GB) or Using Support SFTP Site (Any Size).

    The following is a single-line command, shown here on three lines for readability; the trailing backslashes continue the line in a Unix shell.

    $ /opt/vertica/bin/scrutinize --vadvise --email email@mycompany.com \
          --account company_name --caseid SDYYYYMMDD \
          --exclude-tasks VerticaLog --system-table-timeout 3600

    Security

    To secure your uploads, the S3 bucket is write-only: you cannot read from it or list its contents. After you upload a file, you cannot download it. If an upload fails, you can re-upload the same file, provided that it has not already been processed. If you try to download the file, the following error is displayed:

    $ curl --request GET -k https://vertica-scrutinize-upload.s3.amazonaws.com/incoming/VerticaScrutinize.XXXXXXXXXXXXXX.tar.gz
    <?xml version="1.0" encoding="UTF-8"?>
    <Error>
    <Code>AccessDenied</Code>
    <Message>Access Denied</Message>
    <RequestId>YOUTALKINGTOME</RequestId>
    <HostId>BetTyButterBougHtSomeButterButtHeButterWAsBitter+k32BsHA=</HostId>
    </Error>

    If SSL certificates are not configured correctly on your host, you can add the -k option to the cURL command. This is less secure (see cURL Command Options), but Vertica leaves that decision to the sender.

    Common Errors and Troubleshooting

    Typo in URL / no internet access / DNS name resolution failure

    The following error message is displayed if there is a typo in the URL, no internet access, or a DNS name resolution failure:

    curl: (6) Couldn't resolve host 'vertica-scrutinize-uploads3.amazonaws.com'

    Typo in URL

    The following error message is displayed if there is a typo in the bucket name portion of the URL:

    <?xml version="1.0" encoding="UTF-8"?>
    <Error>
    <Code>NoSuchBucket</Code>
    <Message>The specified bucket does not exist</Message>
    <BucketName>verticascrutinize-upload</BucketName>
    <RequestId>CA3987DE9B583E80</RequestId>
    <HostId>RJUrS1xgTlK+Wl/JRrPFDxMCpKQIHD7657575awtEPK3pn59I8qXoZx0ymUhBFRjaZI=</HostId>
    </Error>

    Tarball is larger than 5GB.

    If the tarball is larger than 5 GB, the following error message is displayed:

    <?xml version="1.0" encoding="UTF-8"?>
    <Error>
    <Code>EntityTooLarge</Code>
    <Message>Your proposed upload exceeds the maximum allowed size</Message>
    <ProposedSize>6907125760</ProposedSize>
    <MaxSizeAllowed>5368709120</MaxSizeAllowed>
    <RequestId>84C95606B11FE576</RequestId>
    <HostId>8UzIdsdfsdafsdfasdfBn5s8k6THZMwyZYWz5nRLu2/XkUm3/m3dE+Y=</HostId>
    </Error>

    To upload a tarball of this size, use the AWS CLI or the SFTP server.

    If the tarball is larger than 5 GB, you cannot upload it anonymously. To upload it through the SFTP server, you must:

    1. Get access to Vertica support.
    2. Create a support case.
    3. Use a Vertica support SFTP access to upload the tarball.
    4. Ask the support engineer to process the scrutinize.
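The S3 error responses shown in this document all carry a Code element, so a script can pick out the code and print the matching advice. The following is a minimal sketch using sed, covering only the three codes that appear in this document. The function names are hypothetical.

```shell
# Reads an S3 XML error response on standard input and prints its <Code> value.
s3_error_code() {
    sed -n 's:.*<Code>\([^<]*\)</Code>.*:\1:p' | head -n 1
}

# Prints a short hint for the error codes discussed in this document.
explain_s3_error() {
    case $1 in
        AccessDenied)   echo "access denied: the bucket is write-only, or the AWS user lacks PutObject" ;;
        NoSuchBucket)   echo "bucket not found: check the bucket name in the URL for typos" ;;
        EntityTooLarge) echo "tarball too large for cURL: use the AWS CLI or the SFTP server" ;;
        *)              echo "unrecognized error code: $1" ;;
    esac
}

# Example (not executed here):
#   explain_s3_error "$(curl ... | s3_error_code)"
```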

    Firewall blocking outbound SSL access

    The following error message is displayed if a firewall is blocking outbound SSL access:

    C:\Users\johndoe\Desktop\company_name>curl -v --request PUT --upload-file .\VerticaScrutinize.XXXXXXXXXXXXXX.tar -H "x-amz-acl: bucket-owner-full-control" -H "x-amz-meta-email: email@mycompany.com" -H "x-amz-meta-account: company_name" https://vertica-scrutinize-upload.s3.amazonaws.com/incoming/
    * Trying 52.216.178.99...
    * TCP_NODELAY set
    * Connected to vertica-scrutinize-upload.s3.amazonaws.com (52.216.178.99) port 443 (#0)
    * schannel: SSL/TLS connection with vertica-scrutinize-upload.s3.amazonaws.com port 443 (step 1/3)
    * schannel: checking server certificate revocation
    * schannel: sending initial handshake data: sending 207 bytes...
    * schannel: sent initial handshake data: sent 207 bytes
    * schannel: SSL/TLS connection with vertica-scrutinize-upload.s3.amazonaws.com port 443 (step 2/3)
    * schannel: failed to receive handshake, need more data
    * schannel: SSL/TLS connection with vertica-scrutinize-upload.s3.amazonaws.com port 443 (step 2/3)
    * schannel: encrypted data got 7
    * schannel: encrypted data buffer: offset 7 length 4096
    * schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.
    * Closing connection 0
    * schannel: shutting down SSL/TLS connection with vertica-scrutinize-upload.s3.amazonaws.com port 443
    * schannel: clear security context handle
    curl: (35) schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.

    Attempt to rename the file during upload

    Adding a name such as VerticaScrutinize.something to the end of the URL is equivalent to renaming the file, and the renamed file will not be processed.

    curl --request PUT .....
    		https://vertica-scrutinize-upload.s3.amazonaws.com/incoming/VerticaScrutinize.something
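When the URL ends with a slash, curl appends the local file name given to --upload-file, so the tarball keeps its original name. A script can guard against an accidental rename by checking that the target URL ends with /incoming/ before uploading. The following is a minimal sketch with a hypothetical function name.

```shell
# Succeeds only if the upload URL ends with "/incoming/", so that curl
# appends the unmodified local file name to it.
upload_url_keeps_name() {
    case $1 in
        */incoming/) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example (not executed here):
#   upload_url_keeps_name "$url" || echo "URL would rename the tarball" >&2
```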

    AWS CLI was not configured with credentials

    The following error message is displayed if AWS CLI was not configured with credentials.

    aws s3 cp ./VerticaScrutinize.00000000000000.tar.bz2 s3://vertica-scrutinize-upload/incoming/ --acl bucket-owner-full-control --metadata "email=email@mycompany.com,account=company_name,caseid=SDYYYYMMDD"
    upload failed: ./VerticaScrutinize.00000000000000.tar.bz2 to s3://vertica-scrutinize-upload/incoming/VerticaScrutinize.00000000000000.tar.bz2 Unable to locate credentials

    Run aws configure as shown in Using AWS CLI Client (For Scrutinize Tarball Larger than 5GB).
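To catch missing credentials before starting a large upload, a script can first ask the AWS CLI who it is authenticated as. The following is a minimal sketch that relies on the aws sts get-caller-identity command, which fails when no usable credentials are configured. The function name and the AWS_BIN override are hypothetical; AWS_BIN exists only so that the check can be exercised without a real AWS CLI.

```shell
# Succeeds if the AWS CLI is installed and can authenticate.
# Returns 2 if the CLI is not found, non-zero if authentication fails.
aws_credentials_ok() {
    aws_bin=${AWS_BIN:-aws}
    if ! command -v "$aws_bin" >/dev/null 2>&1; then
        echo "AWS CLI not found: install it first" >&2
        return 2
    fi
    # get-caller-identity needs valid credentials but no special permissions.
    "$aws_bin" sts get-caller-identity >/dev/null 2>&1
}

# Example (not executed here):
#   aws_credentials_ok || echo "run 'aws configure' before uploading" >&2
```

This check covers credentials only. It does not confirm that the user has the PutObject permission.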

    AWS User/Identity does not have PutObject privilege

    The following error is displayed if the AWS user does not have the PutObject privilege:

    johndoe@node01:/vertica/data$ aws s3 cp ./VerticaScrutinize.00000000000000.tar.bz2 s3://vertica-scrutinize-upload/incoming/ --acl bucket-owner-full-control --metadata "email=email@mycompany.com,account=company_name,caseid=SDYYYYMMDD"
    upload failed: ./VerticaScrutinize.00000000000000.tar.bz2 to s3://vertica-scrutinize-upload/incoming/VerticaScrutinize.00000000000000.tar.bz2 An error occurred (AccessDenied) when calling the CreateMultipartUpload operation: Access Denied

    Still not working

    If the commands do not work as expected, add "-v -k" to the command and send the output to the Big Data Platform Customer Care Team <bigdataplatformcustomercareteam@microfocus.com>:

    $ curl -v -k --request PUT --upload-file VerticaScrutinize...... <rest of the command>
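The number in parentheses in a curl error message, such as (6) or (35) in the examples above, is also curl's exit status, so a script can map it to a likely cause. The following is a minimal sketch covering the two codes shown in this document plus 60, which curl uses when it cannot verify the server certificate. The function name is hypothetical.

```shell
# Prints a likely cause for a curl exit status.
explain_curl_exit() {
    case $1 in
        0)  echo "curl reported success" ;;
        6)  echo "could not resolve host: check the URL spelling, internet access, and DNS" ;;
        35) echo "SSL/TLS handshake failed: a firewall may be blocking outbound SSL access" ;;
        60) echo "server certificate could not be verified: check the SSL certificates on this host" ;;
        *)  echo "unrecognized exit status $1: rerun with -v and send the output to support" ;;
    esac
}

# Example (not executed here):
#   curl --request PUT ...; explain_curl_exit "$?"
```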

    cURL Command Options

    Option                                                        Description
    -v                                                            Verbose
    -k                                                            Insecure (certificate is not validated)
    --request PUT                                                 You are putting the file
    --upload-file VerticaScrutinize.XXXXXXXXXXXXXX.tar.gz         Uploading the tarball
    -H 'x-amz-acl: bucket-owner-full-control'                     You are giving Vertica full control of the file
    -H 'x-amz-meta-email: email@mycompany.com'                    Email address (REQUIRED)
    -H 'x-amz-meta-account: company_name'                         Account name (REQUIRED, NO SPACES)
    -H 'x-amz-meta-caseid: SD02501234'                            Support case id (OPTIONAL)
    https://vertica-scrutinize-upload.s3.amazonaws.com/incoming/  The location you are putting to

    AWS CLI Options

    Option                                    Description
    s3                                        S3 command
    cp                                        Copying a file
    VerticaScrutinize.00000000000000.tar.bz2  The file you are putting
    s3://vertica-scrutinize-upload/incoming/  The location you are putting to
    --acl bucket-owner-full-control           You are giving Vertica full control of the file
    --metadata "email=email@mycompany.com,account=company_name,caseid=SDYYYYMMDD"
      email                                   Email address (REQUIRED)
      account                                 Account name (REQUIRED, NO SPACES)
      caseid                                  Support case id (OPTIONAL)

    Scrutinize Options

    Option                                              Description
    --exclude-tasks VerticaLog                          Do not capture Vertica logs
    --system-table-timeout 3600                         Timeout after 1 hour
    --tmp-dir /directory/with/at_least_10GB_free_space  Temporary directory
    --vadvise                                           Create and upload scrutinize
    --email email@mycompany.com                         Email address (REQUIRED)
    --account company_name                              Account name (REQUIRED, NO SPACES)
    --caseid SDYYYYMMDD                                 Support case id (OPTIONAL)
