Avro Data
Use the FAVROPARSER parser to load Avro data files into flex tables and columnar tables. Before loading, verify that the Avro files use the binary serialization encoding described in the Apache Avro standard. The parser also supports Snappy and deflate compression. You cannot load Avro data directly from STDIN.
The favroparser does not support Avro files with separate schema files. The Avro file that you load must contain its own schema.
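A basic load might look like the following sketch. The table name avro_events and the file path are hypothetical placeholders, and the file is assumed to be a binary-encoded Avro file with its schema embedded:

```sql
-- Hypothetical flex table; a flex table needs no column definitions up front.
CREATE FLEX TABLE avro_events();

-- Load from a file path (not STDIN). The path below is a placeholder.
COPY avro_events FROM '/home/dbadmin/data/events.avro' PARSER favroparser();

-- Inspect a few loaded rows as JSON-style key/value text.
SELECT MAPTOSTRING(__raw__) FROM avro_events LIMIT 5;
```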
The favroparser supports the data types described in the following sections:
- Primitive Data Types for favroparser
- Complex Data Types for favroparser
- Logical Data Types for favroparser
Avro Schemas and Columnar Tables
Avro includes the schema with the data. When you load Avro data into a columnar table, the column names in the schema in the data must match the column names in the table. You do not need to load all of the columns in the data.
For example, the following Avro schema uses the record complex type to represent a user profile:
{
  "type": "record",
  "name": "Profile",
  "fields" : [
    {"name": "UserName", "type": "string"},
    {"name": "Email", "type": "string"},
    {"name": "Address", "type": "string"}
  ]
}
To load the data into a columnar table with this schema, each target column name must match the corresponding "name" value in the schema. In the following query output, the profiles table did not load values for the schema's Email field because the target column is named EmailAddr:
=> SELECT * from profiles;
 UserName | EmailAddr |     Address
----------+-----------+-----------------
 dbadmin  |           | 123 Vertica Way
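One way to load the Email values is to make the target column name match the schema's field name. The following sketch assumes the profiles table shown above. The file path, the second table name (profile_addresses), and the column lengths are hypothetical:

```sql
-- Rename the existing column so that it matches the schema's "Email" field.
ALTER TABLE profiles RENAME COLUMN EmailAddr TO Email;

-- Reload from a placeholder path to an Avro file that uses the Profile schema.
COPY profiles FROM '/home/dbadmin/data/profiles.avro' PARSER favroparser();

-- A table can also define only a subset of the schema's fields.
-- Here the Email field is simply not loaded.
CREATE TABLE profile_addresses (
    UserName VARCHAR(64),
    Address  VARCHAR(256)
);
COPY profile_addresses FROM '/home/dbadmin/data/profiles.avro' PARSER favroparser();
```

Because COPY appends rows, reloading into a table that already holds the earlier load adds the same records a second time. In practice you would typically reload into an empty table.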
Rejecting Data on Materialized Column Type Errors
The favroparser has a Boolean parameter, reject_on_materialized_type_error. If you set this parameter to true, Vertica rejects rows and returns an invalidConversion error when the input data meets both of the following conditions:
- Includes keys matching an existing materialized column
- Has a value that cannot be coerced into the materialized column's data type
Suppose the flex table has a materialized column, Temperature, declared as a FLOAT. If you set reject_on_materialized_type_error to true and try to load a row whose Temperature key has a VARCHAR value that cannot be coerced to FLOAT, favroparser rejects the data row.
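The Temperature scenario can be sketched as follows. The table names weather and weather_rejects and the file path are hypothetical. The sketch also uses the COPY statement's REJECTED DATA AS TABLE clause to capture the rejected rows for review:

```sql
-- Hypothetical flex table with one materialized (real) column.
CREATE FLEX TABLE weather(Temperature FLOAT);

-- Reject rows whose Temperature value cannot be coerced to FLOAT,
-- and store them in a rejections table. The path is a placeholder.
COPY weather FROM '/home/dbadmin/data/weather.avro'
    PARSER favroparser(reject_on_materialized_type_error=true)
    REJECTED DATA AS TABLE weather_rejects;

-- See which rows were rejected and why.
SELECT rejected_data, rejected_reason FROM weather_rejects;
```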