ARRAY

Represents array data in external tables in the Parquet format only. Arrays must use only primitive types (and nested arrays, for multi-dimensionality). Arrays may not contain other complex types. For more information, see Reading Structs.

Syntax for Direct Construction

You can use the keyword ARRAY to construct an array type inside of a query. For example, to create an array of INT, you would do the following:

=> SELECT array_max(ARRAY[1,2,3]);

A two-dimensional array of INT would be declared as:

=> SELECT array_dims(ARRAY[[1,3,5],[2,4,6]]);

Array Input Format for Column Definition

Use "ARRAY[data_type]" to declare an array column in a table. The following example defines an external table for customers (abbreviated as cust):

=> CREATE EXTERNAL TABLE cust 
(
cust_custkey int, 
cust_custname varchar(50), 
cust_custstaddress ARRAY[varchar(100)], 
cust_custaddressln2 ARRAY[varchar(100)],
cust_custcity ARRAY[varchar(50)], 
cust_custstate ARRAY[char(2)],
cust_custzip ARRAY[int],
cust_email varchar(50), 
cust_phone varchar(30)
) 
AS COPY FROM '...' PARQUET;

To declare a multi-dimensional array, use nesting. For example, ARRAY[ARRAY[int]] specifies a two-dimensional array.

Array Output Format

A query reading arrays from external tables results in the following output:

=> SELECT cust_custkey,cust_custstaddress,cust_custcity,cust_custstate from cust;
cust_custkey |               cust_custstaddress                      |                cust_custcity                | cust_custstate
-------------+-------  ----------------------------------------------+---------------------------------------------+----------------
      342176 | ["668 SW New Lane","518 Main Ave","7040 Campfire Dr"] | ["Winchester","New Hyde Park","Massapequa"] | ["VA","NY","NY"]
      342799 | ["2400 Hearst Avenue","3 Cypress Street"]             | ["Berkeley","San Antonio"]                  | ["CA","TX"]
      342845 | ["336 Boylston Street","180 Clarkhill Rd"]            | ["Boston","Amherst"]                        | ["MA","MA"]
      342321 | ["95 Fawn Drive"]                                     | ["Allen Park"]                              | ["MI"]
      342989 | ["5 Thompson St"]                                     | ["Massillon"]                               | ["OH"]
(5 rows)

Element Access

You can access (dereference) elements from an array by index (zero-based):

=> SELECT (ARRAY['a','b','c','d','e'])[1];
array
-------
    "b"
(1 row)

To specify a range, use the format start:end. The end of the range is non-inclusive.

=> SELECT(ARRAY['a','b','c','d','e','f','g'])[1:4];
array
---------
["b","c","d"]
(1 row)

Out-of-bound index references return NULL.

Restrictions

  • Arrays support only data of primitive types, for example, int, UUID, and so on.
  • Arrays are supported only for external tables using Parquet data.
  • Arrays are 0-indexed. The first element's ordinal position is 0, second is 1, and so on.
  • Array dimensionality is enforced. A column cannot contain arrays of varying dimensions. For example, a column that contains a three-dimensional array can only contain other three-dimensional arrays; it cannot simultaneously include a one-dimensional array. However, the arrays in a column can vary in size, where one array can contain four elements while another contains ten.
  • Out-of-bound indexes into arrays return NULL.

Null Handling

Null semantics for arrays are consistent with normal columns. See NULL Sort Order for more information on null-handling.

Functions and Operators

See Array Functions for a comprehensive list of functions that can be used to manipulate arrays.