Export Data to Amazon S3 From Vertica

After you configure the library for Amazon Web Services (AWS), you can export Vertica data to Amazon S3 by calling the S3EXPORT() transform function. S3EXPORT() writes data to files, based on the URL you provide. Vertica performs all communication over HTTPS, regardless of the URL type you use. Vertica does not support virtual host style URLs. If you use HTTPS URL constructions, you must use path style URLs.

If your S3 bucket contains a period in its path, set the prepend_hash parameter to True.

You can control the output of S3EXPORT() in the following ways:

Adjust the Query Provided to S3EXPORT

By adjusting the query given to S3EXPORT(), you can export anything from tables to reporting queries.

This example exports a whole table:

=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object') OVER(PARTITION BEST) FROM exampleTable;
 rows | url
------+------------------------------
  606 | https://exampleBucket/object
(1 row)

This example exports the results of a query:

=> SELECT S3EXPORT(customer_name, annual_income USING PARAMETERS url='s3://exampleBucket/object') OVER()
    FROM public.customer_dimension
      WHERE (customer_gender, annual_income) IN
        (SELECT customer_gender, MAX(annual_income) 
         FROM public.customer_dimension
         GROUP BY customer_gender);

 rows | url
------+------------------------------
  10  | https://exampleBucket/object
(1 row)

Adjust the Partition of Your Result Set with the OVER Clause

Use the OVER clause to control your export partitions. Using the OVER() clause without qualification results in a single partition processed by the initiator for all of the query data. This example shows how to call the function with an unqualified OVER() clause:

=> SELECT S3EXPORT(name, company USING PARAMETERS url='s3://exampleBucket/object',
                                                  delimiter=',') OVER()
     FROM exampleTable WHERE company='Vertica';
 rows | url
------+------------------------------
  10  | https://exampleBucket/object
(1 row)

You can also use window clauses, such as window partition clauses and window order clauses, to manage exported objects.

This example shows how you can use a window partition clause to partition S3 objects based on company values:

=> SELECT S3EXPORT(name, company
                    USING PARAMETERS url='s3://exampleBucket/object',
                                     delimiter=',') OVER(PARTITION BY company) AS MEDIAN
      FROM exampleTable;

Related Topics