Reading Complex Types from Parquet Files

Parquet files can contain complex types, including structs, arrays, and maps. When defining an external table, you can use the ROW, ARRAY, and MAP types to give these columns strong types, just as you would for any other column. You can then query the columns or the fields within them.
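
As a minimal sketch of a strongly typed definition, the following example declares one struct column and one array column and then queries a field and an element. The table name, column names, and file path are hypothetical and chosen only for illustration; the Parquet data is assumed to contain a matching struct and array.

```sql
-- Hypothetical table, columns, and path, for illustration only.
CREATE EXTERNAL TABLE customers (
    name    VARCHAR,
    address ROW(street VARCHAR, city VARCHAR, zipcode INT),  -- Parquet struct
    phones  ARRAY[VARCHAR]                                   -- Parquet array
) AS COPY FROM '/data/customers/*.parquet' PARQUET;

-- Select a whole column, one struct field, and one array element.
-- Array indexes start at 0.
SELECT name, address.city, phones[0] FROM customers;
```

The field names and types in the ROW definition need to match the struct in the data, so this approach works best when the structure is known and stable.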

If a column in the Parquet data contains mixed complex types, such as an array of structs or a struct containing arrays and maps, you cannot fully specify those types in the table definition. You can, however, define a flexible column to read the values into, and then extract particular values at query time. This is the same approach used for flex tables, where all data is initially loaded into a single binary column and materialized from there as needed. See Using Flexible Complex Types for more information about using this approach for complex types.
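
The flexible approach can be sketched as follows. The table name, column names, path, and lookup keys are hypothetical. The example assumes that the Parquet parser accepts the allow_long_varbinary_match_complex_type parameter, and that array elements in the flexible column are keyed by position and struct fields by name; check both against your version before relying on them.

```sql
-- Hypothetical table, columns, and path, for illustration only.
-- The menu column is assumed to hold an array of structs in the Parquet data.
CREATE EXTERNAL TABLE restaurants (
    name VARCHAR,
    menu LONG VARBINARY   -- flexible column: read the complex value as a binary map
) AS COPY FROM '/data/restaurants/*.parquet'
  PARQUET(allow_long_varbinary_match_complex_type='True');

-- Inspect the full contents of the flexible column as a string.
SELECT name, MAPTOSTRING(menu) FROM restaurants;

-- Extract one value at query time. The keys '0' and 'item' are assumptions
-- about the data: the first array element, then a struct field named item.
SELECT name, MAPLOOKUP(MAPLOOKUP(menu, '0'), 'item') AS first_item
FROM restaurants;
```

Running MAPTOSTRING first is a convenient way to discover which keys are available before writing lookups against them.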

Even when you can fully specify a column, there might be cases where you prefer to use a flexible column. If the data contains a struct with hundreds of fields, only a few of which you need, you might prefer to extract just those few at query time instead of defining all of the fields. Similarly, if the data structure is likely to change, you might prefer to defer fully specifying the complex types.

ORC files have more limited support for complex types; see Reading Structs as Expanded Columns.
