FJSONPARSER (Parser)

Parses and loads a JSON file. This file can contain either repeated JSON data objects (including nested maps), or an outer list of JSON elements.

For a flex table, the parser stores the JSON data in a single-value VMap. For a hybrid or columnar table, the parser loads data directly in any table column with a column name that matches a key in the JSON source data.

For materialized columns in a columnar table, the parser supports all scalar types, one-dimensional arrays of scalar types, and complex types loaded as VMap columns. To load complex types as VMap columns, specify a column type of LONG VARBINARY. To preserve the indexing in complex types, set flatten_maps to false.

Syntax

FJSONPARSER ( [parameter='value'[,...]] )

Parameters

flatten_maps

Boolean, specifies whether to flatten sub-maps within the JSON data, separating map levels with a period (.). This parameter applies only to flex tables or VMap columns.

Default: true

flatten_arrays

Boolean, specifies whether to convert lists to sub-maps with integer keys. When lists are flattened, key names are concatenated as for maps. Lists are not flattened by default.

This parameter applies only to flex tables or VMap columns. It does not apply to 1D arrays of primitive types that are declared as such.

Default: false

reject_on_duplicate

Boolean, specifies whether to ignore duplicate records (false), or to reject duplicates (true). In either case, the load continues.

Default: false

reject_on_empty_key

Boolean, specifies whether to reject any row containing a field key without a value.

Default: false

omit_empty_keys

Boolean, specifies whether to omit any field key from the load data that does not have a value.

Default: false

record_terminator

When set, any invalid JSON records are skipped and parsing continues with the next record. Records must be terminated uniformly. For example, if your input file has JSON records terminated by newline characters, set this parameter to E'\n'). If any invalid JSON records exist, parsing continues after the next record_terminator.

Even if your data does not contain invalid records, specifying an explicit record terminator can improve load performance by allowing cooperative parse and apportioned load to operate more efficiently.

When you omit this parameter, parsing ends at the first invalid JSON record.

reject_on_materialized_type_error

Boolean, specifies whether to reject a data row that contains a materialized column value that cannot be coerced into a compatible data type.

Default: false

start_point

String that specifies the name of a key in the JSON load data at which to begin parsing. The parser ignores all data before the start_point value. The value is loaded for each object in the file. The parser processes data after the first instance, and up to the second, ignoring any remaining data.

start_point_occurrence

Iinteger that indicates the nth occurrence of the value you specify with start_point. Use in conjunction with start_point when load data has multiple start values and you know the occurrence at which to begin parsing.

Default: 1

suppress_nonalphanumeric_key_chars

Boolean, specifies whether to suppress non-alphanumeric characters in JSON key values. The parser replaces these characters with an underscore (_) when this parameter is true.

Default: false

key_separator

Specifies a non-default character for the parser to use when concatenating key names.

Default: period (.)

Examples

The following example loads JSON data from STDIN using the default parameters.

=> CREATE TABLE super(age INT, name VARCHAR);
CREATE TABLE
				
=> COPY super FROM STDIN PARSER FJSONPARSER();
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {"age": 5, "name": "Tim"}
>>  {"age": 3}
>>  {"name": "Fred"}
>>  {"name": "Bob", "age": 10}
>> \.
=> SELECT * FROM super;
 age | name
-----+------
     | Fred
  10 | Bob
   5 | Tim
   3 |
(4 rows)
				

The following example loads array data.

$ cat addrs.json
 {"number": 301, "street": "Grant", "attributes": [1, 2, 3, 4]} 
				
=> CREATE EXTERNAL TABLE customers(number INT, street VARCHAR, attributes ARRAY[INT])
    AS COPY FROM 'addrs.json' PARSER fjsonparser();
				
=> SELECT number, street, attributes FROM customers;
 num | street| attributes
-----+-----------+---------------
301  | Grant | [1,2,3,4]
(1 row)

The following example loads a complex type, rejecting rows that are missing keys within the nested records. Notice that while the data has two restaurants, only one is loaded.

$ cat rest1.json
{
    "name" : "Bob's pizzeria",
    "cuisine" : "Italian",
    "location_city" : ["Cambridge", "Pittsburgh"],
    "menu" : [{"item" : "cheese pizza", "" : "$8.25"},
              {"item" : "spinach pizza", "price" : "$10.50"}]
}
{
    "name" : "Bakersfield Tacos",
    "cuisine" : "Mexican",
    "location_city" : ["Pittsburgh"],
    "menu" : [{"item" : "veggie taco", "price" : "$9.95"},
              {"item" : "steak taco", "price" : "$10.95"}]
}

=> CREATE TABLE rest (name VARCHAR, cuisine VARCHAR, location_city LONG VARBINARY, menu LONG VARBINARY);
				
=> COPY rest FROM '/data/rest1.json' PARSER fjsonparser(flatten_maps=false, reject_on_empty_key=true);
Rows Loaded
------------
	   1
(1 row)
				
=> SELECT maptostring(location_city), maptostring(menu) FROM rest;
        maptostring        |                          maptostring                                                   
---------------------------+-------------------------------------------------------
 {
    "0": "Pittsburgh"
} | {
    "0": {
        "item": "veggie taco",
        "price": "$9.95"
    },
    "1": {
        "item": "steak taco",
        "price": "$10.95"
    }
}
(1 row)

To instead load partial data, use omit_empty_keys to bypass the missing keys while loading everything else:


=> COPY rest FROM '/data/rest1.json' PARSER fjsonparser(flatten_maps=false, omit_empty_keys=true);
 Rows Loaded
-------------
           2
(1 row)

=> SELECT maptostring(location_city), maptostring(menu) from rest;
                   maptostring                   |               maptostring                             
-------------------------------------------------+---------------------------------
 {
    "0": "Pittsburgh"
}                       | {
    "0": {
        "item": "veggie taco",
        "price": "$9.95"
    },
    "1": {
        "item": "steak taco",
        "price": "$10.95"
    }
}
 {
    "0": "Cambridge",
    "1": "Pittsburgh"
} | {
    "0": {
        "item": "cheese pizza"
    },
    "1": {
        "item": "spinach pizza",
        "price": "$10.50"
    }
}
(2 rows)

For other examples, see JSON Data.