Capturing Load Rejections and Exceptions

Loading data with COPY has two main phases, parsing and loading. Rejected data is created whenever COPY cannot parse a row of data. Following are some parser errors that can cause a rejected row: 

Other problems can occur during the load phase, but such problems are not rejected data from parser errors.

Several optional parameters let you determine how strictly COPY handles rejections when parsing data. For example, you can have COPY fail when it rejects a single row, or allow a specific number of parsing rejections before the load fails. This section presents the parameters to determine how COPY handles rejected data.

Save Rejected Rows (REJECTED DATA and EXCEPTIONS)

The COPY statement automatically saves a copy of each rejected row in a rejected-data file. COPY also saves a corresponding explanation of what caused the rejection in an exceptions file. By default, Vertica saves both files in a database catalog subdirectory, called CopyErrorLogs, as listed in this example:

v_mart_node003_catalog\CopyErrorLogs\trans-STDIN-copy-from-rejected-data.1
v_mart_node003_catalog\CopyErrorLogs\trans-STDIN-copy-from-exceptions.1

You can optionally save COPY rejections and exceptions in two other ways: 

Note: Vertica recommends saving rejected data to a table. However, saving to a table excludes saving to a default or specific rejected data file.

If you save COPY rejected data to a table, the table files are stored in the data subdirectory. For example, in a VMart database installation, rejected data table records are stored in the RejectionTableData directory as follows: 

=> cd v_mart_node003_data\RejectionTableData\
=> ls 
TABLE_REJECTED_RECORDS_"bg"_mytest01.example.-25441:0x6361_45035996273805099_1.1
TABLE_REJECTED_RECORDS_"bg"_mytest01.example.-25441:0x6361_45035996273805113_2.2
.
.
.
TABLE_REJECTED_RECORDS_"delimr"_mytest01.example.-5958:0x3d47_45035996273815749_1.1
TABLE_REJECTED_RECORDS_"delimr"_mytest01.example.-5958:0x3d47_45035996273815749_1.2

COPY LOCAL Rejected Data

For COPY LOCAL operations, you must use the REJECTED DATA reject_file and EXCEPTIONS except_file parameters explicitly. The reject_file and except_file paths must reside on the client. If path resolves to a storage location and the user invoking COPY is not a superuser, the following permissions are required:

Enforce Truncating or Rejecting Rows (ENFORCELENGTH)

When parsing data of type CHAR, VARCHAR, BINARY, or VARBINARY, rows may exceed the target table length. By default, COPY truncates such rows without rejecting them.

Use the ENFORCELENGTH parameter to reject rows that exceed the target table.

For example, loading 'abc' into a table column specified as VARCHAR(2) results in COPY truncating the value to 'ab' and loading it. Loading the same row with the ENFORCELENGTH parameter causes COPY to reject the row.

Note: Vertica supports NATIVE and NATIVE VARCHAR values up to 65K. If any value exceeds this limit, COPY rejects the row, even when ENFORCELENGTH is not in use.

Specify a Maximum Number of Rejections (REJECTMAX)

The REJECTMAX parameter specifies the maximum number of logical records that can be rejected before a load fails. A rejected row consists of the data that could not be parsed into the corresponding data type during a bulk load. Rejected data does not indicate referential constraints. For information about using constraints, and the option of enforcing constraints during bulk loading, see About Constraints.

When the number of rejected records becomes equal to the REJECTMAX value, the load fails. If you do not specify a value for REJECTMAX, or if the value is 0, COPY allows an unlimited number of exceptions to occur.

Abort Data Loads for Any Error (ABORT ON ERROR)

Using the ABORT ON ERROR argument is the most restrictive way to load data, because no exceptions or rejections are allowed. A COPY operation stops if any row is rejected. No data is loaded and Vertica rolls back the command.

If you use the ABORT ON ERROR as part of a CREATE EXTERNAL TABLE AS COPY FROM statement, the option is used whenever a query references the external table. The offending error is saved in the COPY exceptions or rejected data file.

Understanding Row Rejections and Roll Back Errors

Depending on the type of error that COPY encounters, Vertica does one of the following:

Note: If you specify ABORT ON ERROR with the COPY statement, the load automatically rolls back if COPY cannot parse any row.

The following table summarizes the difference between a rejected row or rollback.

Rejected Rows Load Roll back

COPY cannot parse rows that contain any of the following:

  • Incompatible data types
  • Missing fields
  • Missing delimiters

COPY rolls back a statement if it encounters any of these conditions:

  • Server-side errors, such as lack of memory
  • Primary key or foreign key constraint violations
  • Loading NULL data into a NOT NULL column

This example illustrates what happens when Vertica cannot coerce a row to the requested data type. For example, in the following COPY statement, "a::INT + b::INT" is a SQL expression in which a and b are derived values:

=> CREATE TABLE t (i INT);
=> COPY t (a FILLER VARCHAR, b FILLER VARCHAR, i AS a::INT + b::INT) 
   FROM STDIN; 
Enter data to be copied followed by a newline. 
End with a backslash and a period on a line by itself.
>> cat|dog
>> \.

Vertica cannot parse the row to the requested data type and rejects the row:

ERROR 2827:  Could not convert "cat" from column "*FILLER*".a to an int8

If a resolved to 'cat' and b to 'dog', the next expression 'cat'::INT + 'dog'::INT would return an expression evaluator error:

=> SELECT 'cat'::INT + 'dog'::INT;
ERROR 3681:  Invalid input syntax for integer: "cat"

The following COPY statement would also roll back because Vertica cannot parse the row to the requested data type:

=> COPY t (a FILLER VARCHAR, i AS a::INT) FROM STDIN; 

In the following COPY statement, Vertica rejects only the offending row without rolling back the statement. Instead of evaluating the 'cat' row as a VARCHAR type, COPY parses 'cat' directly as an INTEGER.

=> COPY t (a FILLER INT, i AS a) FROM STDIN;

See Also