ORC (Parser)
Use the ORC clause with the COPY statement to load data in the ORC format.
The ORC parser supports all primitive types and one-dimensional arrays of primitive types. For external tables only, the ORC parser can read some additional complex types. See Reading Complex Types from ORC Files.
When loading ORC data, you must account for all columns in the data; you cannot select only some columns. For some complex types, you can define the column (to account for it) but cannot query it.
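For example, a minimal load into an existing native table might look like the following sketch; the table name, column list, and path are placeholders, and the table must account for every column in the data:
=> CREATE TABLE sales (salekey INT, price DECIMAL(12,2), saledate DATE);
=> COPY sales FROM 's3://AWS_DataLake/sales.orc' ORC;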
The ORC clause can be used alone or with optional parameters.
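As the examples below illustrate, optional parameters are supplied as a parenthesized, comma-separated list of name=value pairs; the brackets here indicate that the list is optional:
ORC [ ( parameter=value[,...] ) ]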
Parameters
| Parameter | Description |
|---|---|
| hive_partition_cols | Comma-separated list of columns that are partition columns in the data. See Using Partition Columns. |
| flatten_complex_type_nulls | Whether to flatten a null struct value to null values for all of its fields (true, default) or reject a row containing a null struct value (false). See Reading Structs. This parameter is deprecated and will be removed in a future release. |
| allow_no_match | Whether to accept a path containing a glob with no matching files and report zero rows in query results. If this parameter is not set, Vertica returns an error if the path in the FROM clause does not match at least one file. |
Examples
Use the ORC clause without parameters if your data is not partitioned.
=> CREATE EXTERNAL TABLE orders (orderkey INT, custkey INT, prodkey ARRAY[VARCHAR(10)], orderprices ARRAY[DECIMAL(12,2)], orderdate DATE ) AS COPY FROM 's3://AWS_DataLake/orders.orc' ORC;
In the following example, the "id" and "name" columns are included in the data and the "created" and "region" columns are partition columns. Partition columns must be listed last when defining columns.
=> CREATE EXTERNAL TABLE stores (id INT, name VARCHAR(50), created DATE, region VARCHAR(50)) AS COPY FROM 'hdfs:///path/*/*/*' ORC(hive_partition_cols='created,region');
Map columns cannot be queried, but you can define them to account for those columns in the data, as in the following example.
=> CREATE EXTERNAL TABLE store (storeID INT, inventory MAP<INT,VARCHAR(100)>) AS COPY FROM '...' ORC;
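If a glob in the path may legitimately match no files, you can set the allow_no_match parameter described above so that queries report zero rows instead of an error. The table definition and path in this sketch are placeholders:
=> CREATE EXTERNAL TABLE orders (orderkey INT, custkey INT) AS COPY FROM 's3://AWS_DataLake/orders/*.orc' ORC(allow_no_match='true');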