Handling Expression Errors in COPY Statements: Quick Tip

Posted December 4, 2018 by Jim Knicely, Vertica Principal Solution Architect

Shot of two businesswomen using a digital tablet together during a collaboration at work
We can transform data as it is loaded via a COPY command using a FILLER column and a COPY expression. In the example below, a COPY expression uses the LOWER function to convert uppercase text to lowercase as it is loaded. Example: dbadmin=> CREATE TABLE test (a INT, b VARCHAR(10)); CREATE TABLE dbadmin=> \! cat /home/dbadmin/test.txt 1|TEST1 2| 3|TEST3 A|TEST4 dbadmin=> COPY test (a, b) dbadmin-> FROM ‘/home/dbadmin/test.txt’ REJECTED DATA TABLE test_bad; Rows Loaded ————- 3 (1 row) dbadmin=> SELECT * FROM test; a | b —+——- 1 | TEST1 2 | 3 | TEST3 (3 rows) dbadmin=> SELECT rejected_data, rejected_reason FROM test_bad; rejected_data | rejected_reason ————–+——————————————— A|TEST4 | Invalid integer format ‘A’ for column 1 (a) (1 row) That worked as expected. But what happens when an error occurs because of the COPY expression? Let’s find out. dbadmin=> TRUNCATE TABLE test; TRUNCATE TABLE dbadmin=> DROP TABLE test_bad; DROP TABLE dbadmin=> ALTER TABLE test ALTER COLUMN b SET NOT NULL; ALTER TABLE dbadmin=> COPY test (a, b_filler FILLER VARCHAR(10), b AS LOWER(b_filler)) dbadmin-> FROM ‘/home/dbadmin/test.txt’ REJECTED DATA TABLE test_bad; ERROR 2501: Cannot set a NOT NULL column (b) to a NULL value in COPY statement dbadmin=> SELECT * FROM test; a | b –+— (0 rows) Yikes! No rows were loaded because the COPY command was aborted! If you’d rather have the COPY command report record rejections during transformations and proceed, set the CopyFaultTolerantExpressions parameter to 1 (TRUE). dbadmin=> ALTER SESSION SET CopyFaultTolerantExpressions = 1; ALTER SESSION dbadmin=> COPY test (a, b_filler FILLER VARCHAR(10), b AS lower(b_filler)) dbadmin-> FROM ‘/home/dbadmin/test.txt’ REJECTED DATA TABLE test_bad; Rows Loaded ————- 2 (1 row) dbadmin=> SELECT * FROM test; a | b —+——- 1 | test1 3 | test3 (2 rows) dbadmin=> SELECT rejected_data, rejected_reason FROM test_bad; rejected_data | rejected_reason —————-+—————————————————– A|TEST4 | Invalid integer format ‘A’ for column 1 (a) Tuple (2,Null) | ERROR 2501: Cannot set a NOT NULL column (b) to a NULL value in COPY statement (2 rows) That’s better! Helpful Links: https://www.vertica.com/docs/latest/HTML/Content/Authoring/AdministratorsGuide/BulkLoadCOPY/TransformingDataDuringLoads.htm https://www.vertica.com/docs/latest/HTML/Content/Authoring/AdministratorsGuide/ConfiguringTheDB/GeneralParameters.htm Have fun!