ARRAY
Represents array data in external tables in the Parquet format only. Arrays must use only primitive types (and nested arrays, for multi-dimensionality). Arrays may not contain other complex types. For more information, see Reading Structs.
Syntax for Direct Construction
You can use the keyword ARRAY to construct an array type inside of a query. For example, to create an array of INT, you would do the following:
=> SELECT array_max(ARRAY[1,2,3]);
A two-dimensional array of INT would be declared as:
=> SELECT array_dims(ARRAY[[1,3,5],[2,4,6]]);
Array Input Format for Column Definition
Use "ARRAY[data_type]" to declare an array column in a table. The following example defines an external table for customers (abbreviated as cust):
=> CREATE EXTERNAL TABLE cust ( cust_custkey int, cust_custname varchar(50), cust_custstaddress ARRAY[varchar(100)], cust_custaddressln2 ARRAY[varchar(100)], cust_custcity ARRAY[varchar(50)], cust_custstate ARRAY[char(2)], cust_custzip ARRAY[int], cust_email varchar(50), cust_phone varchar(30) ) AS COPY FROM '...' PARQUET;
To declare a multi-dimensional array, use nesting. For example, ARRAY[ARRAY[int]] specifies a two-dimensional array.
Array Output Format
A query reading arrays from external tables results in the following output:
=> SELECT cust_custkey,cust_custstaddress,cust_custcity,cust_custstate from cust; cust_custkey | cust_custstaddress | cust_custcity | cust_custstate -------------+------- ----------------------------------------------+---------------------------------------------+---------------- 342176 | ["668 SW New Lane","518 Main Ave","7040 Campfire Dr"] | ["Winchester","New Hyde Park","Massapequa"] | ["VA","NY","NY"] 342799 | ["2400 Hearst Avenue","3 Cypress Street"] | ["Berkeley","San Antonio"] | ["CA","TX"] 342845 | ["336 Boylston Street","180 Clarkhill Rd"] | ["Boston","Amherst"] | ["MA","MA"] 342321 | ["95 Fawn Drive"] | ["Allen Park"] | ["MI"] 342989 | ["5 Thompson St"] | ["Massillon"] | ["OH"] (5 rows)
Element Access
You can access (dereference) elements from an array by index (zero-based):
=> SELECT (ARRAY['a','b','c','d','e'])[1]; array ------- "b" (1 row)
To specify a range, use the format start:end. The end of the range is non-inclusive.
=> SELECT(ARRAY['a','b','c','d','e','f','g'])[1:4]; array --------- ["b","c","d"] (1 row)
Out-of-bound index references return NULL.
Restrictions
- Arrays support only data of primitive types, for example, int, UUID, and so on.
- Arrays are supported only for external tables using Parquet data.
- Arrays are 0-indexed. The first element's ordinal position is 0, second is 1, and so on.
- Array dimensionality is enforced. A column cannot contain arrays of varying dimensions. For example, a column that contains a three-dimensional array can only contain other three-dimensional arrays; it cannot simultaneously include a one-dimensional array. However, the arrays in a column can vary in size, where one array can contain four elements while another contains ten.
- Out-of-bound indexes into arrays return NULL.
Null Handling
Null semantics for arrays are consistent with normal columns. See NULL Sort Order for more information on null-handling.Functions and Operators
See Array Functions for a comprehensive list of functions that can be used to manipulate arrays.