GET_METADATA

Returns the metadata of a Parquet file. Metadata includes the number and sizes of row groups, column names, and information about chunks and compression. Metadata is returned as JSON.

This function inspects a single file. Parquet data usually spans many files in a single directory, so you must choose one of them. The function does not accept a directory name as an argument.

Syntax

GET_METADATA( 'filename' )

Arguments

filename

The name of a Parquet file. Any path that is valid for COPY is valid for this function. This function does not operate on files in other formats.

Privileges

Superuser, or non-superuser with READ privileges on the USER-accessible storage location (see GRANT (Storage Location)).

Examples

In the following example, the "orders" directory contains many Parquet files that together hold the data for a single table. The query inspects one of those files.

=> SELECT GET_METADATA('/data/orders/000000_0');
				GET_METADATA
----------------------------------------------------------------------------------------------------
{
    "FileName": "/data/orders/000000_0",
    "FileFormat": "Parquet",
    "Version": "0",
    "CreatedBy": "parquet-mr version 1.8.1 (build 4aba4dae7bb0d4edbcf7923ae1339f28fd3f7fcf)",
    "TotalRows": "417189",
    "NumberOfRowGroups": "1",
    "NumberOfRealColumns": "1",
    "NumberOfColumns": "1",
    "Columns": [
          { "Id": "0", "Name": "o_orderkey", "PhysicalType": "INT32", "LogicalType": "NONE" }

     ],
    "RowGroups": [
         {
               "Id": "0", "TotalBytes": "1668852", "Rows": "417189",
               "ColumnChunks": [
                           {"Id": "0", "Values": "417189", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "453984454", "Min": "414194851" },
                             "Compression": "UNCOMPRESSED", "Encodings": "BIT_PACKED RLE PLAIN ", "UncompressedSize": "1668852", "CompressedSize": "1668852" }
               ]
         }
   ]
}

(1 row)
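
Because the metadata is returned as JSON, a client can parse it to summarize row groups and column chunks. The following Python sketch is illustrative only and is not part of the function. It assumes the JSON text has already been fetched into a string (an abbreviated copy of the output above is used here), that numeric fields arrive as strings as shown in the example, and that a column chunk's "Id" matches the "Id" of its column. The name summarize is hypothetical.

```python
import json

# Abbreviated copy of the JSON from the example above. In practice this
# string would be the single value returned by the GET_METADATA query.
metadata_json = """
{
    "FileName": "/data/orders/000000_0",
    "TotalRows": "417189",
    "NumberOfRowGroups": "1",
    "Columns": [
        {"Id": "0", "Name": "o_orderkey", "PhysicalType": "INT32", "LogicalType": "NONE"}
    ],
    "RowGroups": [
        {
            "Id": "0", "TotalBytes": "1668852", "Rows": "417189",
            "ColumnChunks": [
                {"Id": "0", "Values": "417189", "Compression": "UNCOMPRESSED",
                 "UncompressedSize": "1668852", "CompressedSize": "1668852"}
            ]
        }
    ]
}
"""


def summarize(metadata_text):
    """Return one summary dict per column chunk in the metadata JSON."""
    meta = json.loads(metadata_text)
    # Assumption: a chunk's "Id" refers to the column with the same "Id".
    names = {col["Id"]: col["Name"] for col in meta["Columns"]}
    summary = []
    for group in meta["RowGroups"]:
        for chunk in group["ColumnChunks"]:
            summary.append({
                "row_group": int(group["Id"]),
                # Fall back to the raw Id if no matching column is listed.
                "column": names.get(chunk["Id"], chunk["Id"]),
                "rows": int(group["Rows"]),
                "compression": chunk["Compression"],
                # Sizes are strings in the JSON, so convert them to integers.
                "uncompressed_bytes": int(chunk["UncompressedSize"]),
                "compressed_bytes": int(chunk["CompressedSize"]),
            })
    return summary


for entry in summarize(metadata_json):
    print(entry)
```

Comparing compressed_bytes with uncompressed_bytes for each chunk gives a quick view of how well each column compresses in the inspected file.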