Avro Data

Use the FAVROPARSER parser to load Avro data files into flex tables and columnar tables. Before loading, verify that the Avro files use the binary serialization encoding described in the Apache Avro standard. The parser also supports Snappy and deflate compression. You cannot load Avro data directly from STDIN.

The favroparser does not support Avro files with separate schema files. The Avro file you load must contain its own schema.
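
As a minimal sketch, the following statements load one Avro file into a flex table. The table name and file path are hypothetical. The file is assumed to be a binary-encoded Avro file that embeds its schema, and it is read from a file path rather than from STDIN.

```sql
-- Hypothetical flex table; a flex table needs no column definitions.
CREATE FLEX TABLE weather_flex();

-- Hypothetical path. The Avro file must contain its own schema.
COPY weather_flex FROM '/home/dbadmin/data/weather.avro' PARSER favroparser();
```

The same COPY form applies to a columnar table, provided that its column names match the field names in the file's schema.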

The favroparser supports data types described in the following sections:

Avro Schemas and Columnar Tables

Avro includes the schema with the data. When you load Avro data into a columnar table, the field names in that embedded schema must match the column names in the table. You do not need to load all of the columns in the data.

For example, the following Avro schema uses the record complex type to represent a user profile:

{
  "type": "record",
  "name": "Profile",
  "fields" : [
      {"name": "UserName", "type": "string"},
      {"name": "Email", "type": "string"},
      {"name": "Address", "type": "string"}
   ]
}

To load the data into a columnar table with this schema, each target column name must match the corresponding "name" value in the schema. In the following query result, the profiles table contains no values for the schema's Email field because the target column is named EmailAddr:

=> SELECT * FROM profiles;
     UserName    |     EmailAddr      |       Address
-----------------+--------------------+---------------------
    dbadmin      |                    |    123 Vertica Way
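
One possible fix, sketched here on the assumption that the profiles table can be altered and that the Avro file sits at a hypothetical path, is to rename the target column to match the schema's field name and then load the file again:

```sql
-- Rename the mismatched column so it matches the "Email" field in the schema.
ALTER TABLE profiles RENAME COLUMN EmailAddr TO Email;

-- Hypothetical path to the Avro file that uses the Profile schema.
COPY profiles FROM '/home/dbadmin/data/profiles.avro' PARSER favroparser();
```

COPY appends rows, so rows from the earlier load remain in the table unless you remove them first.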

Rejecting Data on Materialized Column Type Errors

The favroparser has a Boolean parameter, reject_on_materialized_type_error. If you set this parameter to true, Vertica rejects a row and returns an invalidConversion error when the input data meets both of the following conditions:

  • Includes keys matching an existing materialized column
  • Has a value that cannot be coerced into the materialized column's data type

Suppose the flex table has a materialized column, Temperature, declared as a FLOAT. If you try to load a row with a Temperature key that has a VARCHAR value, favroparser rejects the data row.
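
The Temperature scenario can be sketched as follows. The table name and file path are hypothetical, and the file is assumed to contain at least one Temperature value that cannot be coerced to FLOAT.

```sql
-- Hypothetical flex table with one materialized column.
CREATE FLEX TABLE weather_readings(Temperature FLOAT);

-- With the parameter set to true, rows whose Temperature value
-- cannot be coerced to FLOAT are rejected instead of loaded.
COPY weather_readings FROM '/home/dbadmin/data/readings.avro'
    PARSER favroparser(reject_on_materialized_type_error=true);
```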

See Also