ORC (Parser)

Use the ORC clause with the COPY statement to load data in the ORC format. When loading data into Vertica you can read all primitive types, UUIDs, and complex types.

When loading ORC data, you must account for all columns in the data; you cannot select only some columns.

The ORC clause can be used alone or with optional parameters.

Syntax

ORC ( [parameter=value[,...]] )

Parameters

hive_partition_cols Comma-separated list of columns that are partition columns in the data. See Using Partition Columns.

allow_no_match

Whether to accept a path containing a glob with no matching files and report zero rows in query results. If this parameter is not set, Vertica returns an error if the path in the FROM clause does not match at least one file.

Examples

Use the ORC clause without parameters if your data is not partitioned:

=> CREATE EXTERNAL TABLE orders
 (orderkey    INT,
  custkey     INT,
  prodkey     ARRAY[VARCHAR(10)],
  orderprices ARRAY[DECIMAL(12,2)],
  orderdate   DATE
 ) AS COPY FROM 's3://AWS_DataLake/orders.orc' ORC;

In the following example, the "id" and "name" columns are included in the data and the "created" and "region" columns are partition columns. Partition columns, if not also data columns, must be listed last when defining columns:

=> CREATE EXTERNAL TABLE stores 
 (id      INT, 
  name    VARCHAR(50), 
  created DATE, 
  region  VARCHAR(50))
   AS COPY FROM 'webhdfs:///path/*/*/*' 
   ORC(hive_partition_cols='created,region');

You can read a map column as an array of rows, as in the following example:

=> CREATE EXTERNAL TABLE orders
 (orderkey INT,
  custkey INT,
  prods ARRAY[ROW(key VARCHAR(10), value DECIMAL(12,2))],
  orderdate DATE
 ) AS COPY FROM '...' ORC;