Reload Data from a Rejected Data Table: Quick Tip

Posted November 14, 2018 by Jim Knicely, Big Data Solutions Architect, Vertica

When running a COPY command, using the REJECTED DATA parameter with the AS TABLE clause, will save rejected data into a table. If you realize there is a modification to the COPY command that will allow those rejected records to load successfully, you can re-run the updated COPY command against the rejected data in the rejected data table, rather than against the entire original data source.

Example: dbadmin=> CREATE TABLE test (c INT); CREATE TABLE dbadmin=> \d test List of Fields by Tables Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key -------+-------+--------+------+------+---------+----------+-------------+------ public | test | c | int | 8 | | f | f | (1 row) dbadmin=> \! cat /home/dbadmin/test.txt "" "1" dbadmin=> COPY test dbadmin-> FROM '/home/dbadmin/test.txt' dbadmin-> ENCLOSED BY '"' dbadmin-> REJECTED DATA TABLE test_bad; Rows Loaded ------------- 1 (1 row) dbadmin=> SELECT c dbadmin-> FROM test; c --- 1 (1 row) dbadmin=> SELECT rejected_data, rejected_reason dbadmin-> FROM test_bad; rejected_data | rejected_reason --------------+-------------------------------------------- "" | Invalid integer format '' for column 1 (c) (1 row) dbadmin=> SELECT c dbadmin-> FROM test; c --- 1 (1 row) dbadmin=> COMMIT; COMMIT Vertica cannot implicitly cast the NULL value to an integer. But we can explicitly using the cast ::!.

Instead of first truncating my target table and re-running a modified COPY command, I’ll just load the rejected data this time! dbadmin=> \! vsql -Atc "SELECT rejected_data FROM test_bad;" | vsql -c "COPY test (c_filler FILLER VARCHAR(1), c AS c_filler::!INT) FROM STDIN ENCLOSED BY '\"';" dbadmin=> SELECT c dbadmin-> FROM test; c --- 1 (2 rows) Helpful Links:

Have fun!