S3 Parameters

Use the following parameters to configure reading from S3 file systems and on-premises storage with S3-compatible APIs, such as Pure Storage, using COPY FROM. For more information about reading data from S3, see S3 Object Store.

For the parameters to control the AWS Library (UDSource), see Configure the Vertica Library for Amazon Web Services.

On AWS, changing these parameters with ALTER SESSION also changes the corresponding parameters for the AWS Library (UDSource).

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
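
As a minimal sketch, the following query lists the S3-related parameters and the levels at which each can be set. It assumes the parameters of interest all begin with AWS or S3, which holds for the parameters described in this section:

```sql
-- Show the valid levels (node, session, user, database) for each S3-related parameter.
SELECT parameter_name, allowed_levels, current_value, default_value
FROM configuration_parameters
WHERE parameter_name ILIKE 'AWS%'
   OR parameter_name ILIKE 'S3%'
ORDER BY parameter_name;
```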

Parameter Description
AWSAuth

ID and secret key for authentication. For extra security, do not store credentials in the database; use ALTER SESSION...SET PARAMETER to set this value for the current session only. If you use a shared credential, you can set it in the database with ALTER DATABASE...SET PARAMETER. For example:

=> ALTER SESSION SET AWSAuth='ID:secret';

AWS calls these AccessKeyID and SecretAccessKey.

To use admintools create_db or revive_db for Eon Mode on-premises, create a configuration file called auth_params.conf with these settings:

AWSAuth = key:secret
AWSEndpoint = IP:port

AWSCAFile

File name of the TLS server certificate bundle to use.

If set, this parameter overrides the Vertica default CA bundle path specified in the SystemCABundlePath parameter.

=> ALTER DATABASE DEFAULT SET AWSCAFile = '/etc/ssl/ca-bundle.pem';

Default: system-dependent

AWSCAPath

Path Vertica uses to look up TLS server certificates.

If set, this parameter overrides the Vertica default CA bundle path specified in the SystemCABundlePath parameter.

=> ALTER DATABASE DEFAULT SET AWSCAPath = '/etc/ssl/';

Default: system-dependent

AWSEnableHttps

Boolean, specifies whether to use the HTTPS protocol when connecting to S3. This parameter can be set only at the database level with ALTER DATABASE...SET PARAMETER. If you choose not to use TLS, you must set this parameter to 0.

Default: 1 (enabled)
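
For example, a statement along these lines disables HTTPS for S3 connections. It follows the same ALTER DATABASE form used for the other parameters in this section:

```sql
-- Disable HTTPS for S3 connections; this parameter is valid only at the database level.
ALTER DATABASE DEFAULT SET AWSEnableHttps = 0;
```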

AWSEndpoint

Endpoint to use when interpreting S3 URLs, set as follows.

Do not include http(s):// for AWS endpoints.

  • AWS: hostname_or_ip:port_number.
  • AWS with a FIPS-compliant S3 endpoint: S3_hostname, with virtual addressing enabled. Do not include http(s)://. For example:
    AWSEndpoint = s3-fips.dualstack.us-east-1.amazonaws.com
    S3EnableVirtualAddressing = 1
  • On-premises/Pure: IP address of the Pure Storage server. If using admintools create_db or revive_db, create configuration file auth_params.conf and include these settings:
    awsauth = key:secret
    awsendpoint = IP:port

When AWSEndpoint is not set, the default behavior is to use virtual-hosted request URLs.

Default: s3.amazonaws.com

AWSLogLevel

Log level, one of the following:

  • OFF
  • FATAL
  • ERROR
  • WARN
  • INFO
  • DEBUG
  • TRACE

Default: ERROR

AWSRegion

AWS region containing the S3 bucket from which to read files. This parameter can only be configured with one region at a time. If you need to access buckets in multiple regions, change the parameter each time you change regions.

If you do not set the correct region, you might experience a delay before queries fail because Vertica retries several times before giving up.

Default: us-east-1
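
The following sketch shows one way to read from buckets in two regions within a single session by changing AWSRegion between loads. The table names, bucket names, and file paths are hypothetical:

```sql
-- Read from a bucket in us-west-1 (hypothetical table and bucket).
ALTER SESSION SET AWSRegion = 'us-west-1';
COPY sales_west FROM 's3://example-west-bucket/sales/*.csv' DELIMITER ',';

-- Switch regions before reading from a bucket in eu-central-1.
ALTER SESSION SET AWSRegion = 'eu-central-1';
COPY sales_eu FROM 's3://example-eu-bucket/sales/*.csv' DELIMITER ',';
```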

AWSSessionToken

Temporary security token generated by running the get-session-token command, which generates temporary credentials you can use to configure multi-factor authentication.

If you use session tokens, you must set all parameters at the session level, even if some of them are set at the database level. Use ALTER SESSION to set session parameters.
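
A minimal sketch of a session that uses a temporary token follows. The credential values are placeholders, and the region is set at the session level too, on the assumption that it is otherwise set at the database level:

```sql
-- Set every S3 parameter used by this session at the session level.
ALTER SESSION SET AWSAuth = 'ID:secret';
ALTER SESSION SET AWSSessionToken = 'token';
ALTER SESSION SET AWSRegion = 'us-east-1';
```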

S3BucketConfig

Contains S3 bucket configuration information as a JSON object with the following properties. Each property other than bucket has an equivalent parameter (shown in parentheses). If both the property in S3BucketConfig and the equivalent S3 parameter are set, the S3BucketConfig property takes precedence.

Properties:

  • bucket: The name of the bucket
  • region: The name of the region (AWSRegion)
  • protocol: Specifies whether to secure the connection (AWSEnableHttps), one of the following:
    • http: Unencrypted connection
    • https: Encrypted connection
  • endpoint: The endpoint URL or IP address (AWSEndpoint)
  • enableVirtualAddressing: Whether to rewrite the S3 URL to use a virtual hosted path (S3EnableVirtualAddressing)
  • requesterPays: Boolean, specifies whether the requester (instead of the bucket owner) pays the cost of accessing data on the bucket (S3RequesterPays). This property must be set to access S3 buckets configured as Requester Pays buckets. By setting this property to true, you accept the charges for accessing data. If not specified, the default value is false.

The configuration properties for a given bucket may differ based on its type. For example, the following S3BucketConfig is for an AWS bucket AWSBucket and a Pure Storage bucket PureStorageBucket. AWSBucket doesn't specify an endpoint, so Vertica uses the AWSEndpoint, which defaults to s3.amazonaws.com:

ALTER DATABASE DEFAULT SET S3BucketConfig=
'[
    {
        "bucket": "AWSBucket",
        "region": "us-east-2",
        "protocol": "https",
        "requesterPays": true
    },
    {
        "bucket": "PureStorageBucket",
        "endpoint": "pure.mycorp.net:1234",
        "protocol": "http",
        "enableVirtualAddressing": false
    }
]';

S3BucketCredentials

Contains credentials for accessing an S3 bucket. Each property in S3BucketCredentials has an equivalent parameter (shown in parentheses). When set, S3BucketCredentials takes precedence over both AWSAuth and AWSSessionToken.

Providing credentials for more than one bucket authenticates to them simultaneously, allowing you to perform cross-endpoint joins, export from one bucket to another, etc.

Properties:

  • bucket: The name of the bucket
  • accessKey: The access key for the bucket (the ID in AWSAuth)
  • secretAccessKey: The secret access key for the bucket (the secret in AWSAuth)
  • sessionToken: The session token, only used when S3BucketCredentials is set at the session level (AWSSessionToken)

For example, the following S3BucketCredentials is for an AWS bucket AWSBucket and a Pure Storage bucket PureStorageBucket and sets all possible properties:

ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "AWSBucket",
        "accessKey": "<AK0>",
        "secretAccessKey": "<SAK0>",
        "sessionToken": "1234567890"
    },
    {
        "bucket": "PureStorageBucket",
        "accessKey": "<AK1>",
        "secretAccessKey": "<SAK1>"
    }
]';

This parameter is only visible to the superuser. Users can set this parameter at the session level with ALTER SESSION.
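
With credentials set for both buckets as in the example above, a single session can read from one bucket and write to the other. The following is a sketch only: the sales table and the file paths are hypothetical, and it assumes the endpoint for PureStorageBucket is already defined in S3BucketConfig:

```sql
-- Load from the AWS bucket (hypothetical table and path).
COPY sales FROM 's3://AWSBucket/sales/*.csv' DELIMITER ',';

-- Export the loaded data to the Pure Storage bucket in the same session.
EXPORT TO PARQUET(directory = 's3://PureStorageBucket/exports/sales')
   AS SELECT * FROM sales;
```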

S3EnableVirtualAddressing

Boolean, specifies whether to rewrite S3 URLs to use virtual-hosted paths. For example, if you use AWS, the S3 URLs change to bucketname.s3.amazonaws.com instead of s3.amazonaws.com/bucketname. This configuration setting takes effect only when you have specified a value for AWSEndpoint.

If you set AWSEndpoint to a FIPS-compliant S3 Endpoint, you must enable S3EnableVirtualAddressing in auth_params.conf:

AWSEndpoint = s3-fips.dualstack.us-east-1.amazonaws.com
S3EnableVirtualAddressing = 1

The value of this parameter does not affect how you specify S3 paths.

Default: 0 (disabled)

As of September 30, 2020, AWS requires virtual address paths for newly created buckets.
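
If you set this parameter with SQL rather than in auth_params.conf, a statement like the following is one option. This sketch assumes the parameter can be set at the database level; query CONFIGURATION_PARAMETERS to confirm the valid levels:

```sql
-- Takes effect only when AWSEndpoint is also set.
ALTER DATABASE DEFAULT SET S3EnableVirtualAddressing = 1;
```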

S3RequesterPays

Boolean, specifies whether the requester (instead of the bucket owner) pays the cost of accessing data on the bucket. When true, the bucket owner is responsible only for the cost of storing the data, rather than all costs associated with the bucket. This parameter must be set to access S3 buckets configured as Requester Pays buckets. By setting this parameter to true, you accept the charges for accessing data. If not specified, the default value is false.
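
As a sketch, the following enables the parameter for the current session before reading from a Requester Pays bucket. It assumes the parameter can be set at the session level, and the table and bucket names are hypothetical:

```sql
-- Accept access charges for this session, then read from a Requester Pays bucket.
ALTER SESSION SET S3RequesterPays = 1;
COPY orders FROM 's3://example-requester-pays-bucket/orders/*.csv' DELIMITER ',';
```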

AWSStreamingConnectionPercentage

Controls the number of connections to the communal storage that Vertica uses for streaming reads. In a cloud environment, this setting helps prevent streaming data from communal storage using up all available file handles. It leaves some file handles available for other communal storage operations.

Due to the low latency of on-premises object stores, this option is unnecessary for an Eon Mode database that uses on-premises communal storage, such as Pure Storage and MinIO. In this case, disable the parameter by setting it to 0.
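
For example, a statement along these lines disables the parameter for an on-premises Eon Mode database. It assumes the parameter is set at the database level, which you can confirm in CONFIGURATION_PARAMETERS:

```sql
-- Disable the streaming connection limit for on-premises communal storage.
ALTER DATABASE DEFAULT SET AWSStreamingConnectionPercentage = 0;
```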