Loading Data From Amazon S3
After you configure the Vertica library for Amazon Web Services (AWS), you can copy data from S3 into Vertica. To do so, use the COPY statement with the S3 UDSource, specifying the location of your S3 bucket and object. You can use either a standard HTTPS URL or the S3 URL scheme, as the following examples show.
Use COPY with a standard HTTPS URL:
=> COPY exampleTable SOURCE S3(url='https://s3.amazonaws.com/exampleBucket/object');
Use COPY with the S3 URL scheme:
=> COPY exampleTable SOURCE S3(url='s3://exampleBucket/object');
You can use the S3 UDSource with any UDParser or UDFilter to import any data format that Vertica supports. If Vertica encounters an AWS error during an S3 import operation that it cannot resolve, it aborts the import and includes the AWS error information in the Vertica error report.
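For example, you can pair the S3 source with a parser in the same COPY statement. The following sketch assumes a CSV object in your bucket and uses fcsvparser, one of Vertica's flex parsers; substitute the parser (and any parser options) that matches your data format:
=> COPY exampleTable SOURCE S3(url='s3://exampleBucket/object.csv') PARSER fcsvparser();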
Importing Multiple Specific Files
Import multiple specific files by separating the URLs with a vertical bar (|), the default delimiter:
=> COPY exampleTable SOURCE S3(url='s3://exampleBucket/object1|s3://exampleBucket/object2');
Alternatively, specify your own delimiter with the delimiter parameter:
=> COPY exampleTable SOURCE S3(url='s3://exampleBucket/object1,s3://exampleBucket/object2', delimiter=',');
Importing Multiple Files Using Glob Expansion
In addition to importing files by specifying their exact URLs with the url parameter, you can use glob expansion to import multiple files from a bucket or subdirectory by specifying the location with the bucket parameter.
Import from all files in a bucket using glob expansion:
=> COPY exampleTable SOURCE S3(bucket='s3://exampleBucket/*');
Import from all files in a subdirectory using glob expansion:
=> COPY exampleTable SOURCE S3(bucket='s3://exampleBucket/subdirectory/*');
Import all files with a 'db_' prefix:
=> COPY exampleTable SOURCE S3(bucket='s3://exampleBucket/db_*');
Import all files with a .csv suffix:
=> COPY exampleTable SOURCE S3(bucket='s3://exampleBucket/*.csv');