COPY Parameters

Parameters and their descriptions are divided into the following sections:

Target Options
Column Options
Input Options
Handling Options
Parser-Specific Options

Target Options

The following parameters apply to the target tables and their columns:

[database.]schema	Specifies a schema, by default `public`. If schema is any schema other than `public`, you must supply the schema name. For example: myschema.thisDbObject If you specify a database, it must be the current database. `COPY` ignores schema-name when used in `CREATE EXTERNAL TABLE` or `CREATE FLEX EXTERNAL TABLE` statements.
target‑table	The target columnar or flexible table for loading new data. Vertica loads the data into all projections that include columns from the schema table.
column‑as‑expression	Specifies the expression used to compute values for the target column. For example: COPY t(year AS TO_CHAR(k, 'YYYY')) FROM 'myfile.dat' Use this option to transform data when it is loaded into the target database. For details about: Expressions with `COPY`: See Transforming Data During Loads. Fillers: See Manipulating Source Data Columns.
column	Restricts the load to one or more specified columns in the table. If you omit specifying columns, `COPY` loads all columns by default. Table columns that you omit from the column list are assigned their DEFAULT or SET USING values, if any; otherwise, `COPY` inserts `NULL`. If you leave the `column` parameter blank to load all columns in the table, you can use the optional parameter `COLUMN OPTION` to specify parsing options for specific columns. The data file must contain the same number of columns as the `COPY` command's column list. For example, in a table `T1` with nine columns (`C1` through `C9`), the following statement loads the three columns of data in each record to columns `C1`, `C6`, and `C9`, respectively: => COPY T1 (C1, C6, C9);
COLUMN OPTION	Specifies load metadata for one or more columns declared in the table column list. For example, you can specify that a column has its own `DELIMITER`, `ENCLOSED BY`, `NULL as'NULL'` expression, and so on. You do not have to specify every column name explicitly in the `COLUMN OPTION` list, but each column you specify must correspond to a column in the table column list.

Column Options

Depending on how they are specified, the following COPY options can qualify specific columns or all columns. Some parser-specific options can also apply to either specific columns or all columns. See Global and Column-Specific Options for more information about these two modes.

ENFORCELENGTH	If specified, `COPY` rejects data rows of type `char`, `varchar`, `binary`, and `varbinary`, or elements of those types in collections, if they are larger than the declared size. By default, `COPY` truncates offending rows of these data types and elements of these types in collections, but does not reject the rows. For more details, see Tracking Load Exceptions and Rejections Status in the Administrator's Guide. If a collection does not fit with all of its elements, `COPY` rejects the row without truncating. It does not reduce the number of elements. This can happen if each element is individually within limits but the number of elements causes the collection to exceed the maximum size for the column.
FILLER datatype	Specifies not to copy the data of an input column, one of the following: Ignore data of the input column. Transform input column data before loading it into the target table. For details, see Manipulating Source Data Columns.
FORMAT 'format'	Specifies the input formats to use when loading date/time and binary columns, where format can be one of the following: These are the valid input formats when loading binary columns: `octal` `hex` `bitstream` See Loading Binary (Native) Data to learn more about these formats. When loading date/time columns, using FORMAT significantly improves load performance. `COPY` supports the same formats as the TO_DATE function. See the following topics for additional information: Template Patterns for Date/Time Formatting Template Pattern Modifiers for Date/Time Formatting If you specify invalid format strings, the `COPY` operation returns an error.
NULL [AS]	The string representing a null value. The default is an empty string (`''`). You can specify a null value as any ASCII value in the range `E'\001'` to `E'\177'` inclusive (any ASCII character except `NULL`: `E'\000'`). You cannot use the same character for both the DELIMITER and NULL options. For more information, see Loading Delimited Data.

Input Options

The following options are available for specifying source data:

LOCAL	Specifies to load a data file on a client system, rather than on a cluster host. `LOCAL` can qualify the `STDIN` and path‑to‑data parameters. For details, see COPY LOCAL. Restrictions: Invalid for CREATE EXTERNAL TABLE AS COPY
STDIN	Reads from the client a standard input instead of a file. `STDIN` takes one input source only. To load multiple input sources, use path‑to‑data. User must have INSERT privileges on the table and USAGE privileges on its schema. Restrictions: Invalid for CREATE EXTERNAL TABLE AS COPY
`path‑to‑data`	Specifies the absolute path of the file (or files) containing the data, which can be from multiple input sources. If the file is stored in HDFS, path‑to‑data is a URL in the `hdfs` scheme, typically `hdfs:///path/to/file`. See Using HDFS URLs. If the file is stored in an S3 bucket, path‑to‑data is a URL in the format `s3://bucket/path`. If the file is stored in Google Cloud Storage, path‑to‑data is a URL in the format `gs://bucket/path`. If the file is on the local disk or an NFS mount, path‑to‑data is a local (Linux) file path. path‑to‑data can optionally contain wildcards to match more than one file. The file or files must be accessible to the local client or the host on which the `COPY` statement runs. COPY skips empty files in the file list. A file list that includes directories causes the query to fail. See Specifying Where to Load Data From. The supported patterns for wildcards are specified in the Linux Manual Page for Glob (7), and for ADO.net platforms, through the .NET Directory.getFiles method. You can use variables to construct the pathname as described in Using Load Scripts. If path‑to‑data resolves to a storage location on a local file system, and the user invoking `COPY` is not a superuser, the following requirements apply: The storage location was created with CREATE LOCATION…USAGE USER. The user must already have READ access to the file storage location. Further, if a user has privileges but is not a superuser, and invokes `COPY` from that storage location, Vertica ensures that symbolic links do not result in unauthorized access.
ON nodename	Specifies the node on which the data to copy resides and the node that should parse the load file. If you omit nodename, the location of the input file defaults to the initiator node. Use nodename to copy and parse a load file from a node other than the `COPY` initiator node. nodename is invalid with `STDIN` and `LOCAL`.
ON (nodeset)	Specifies a set of nodes on which to perform the load. The same data must be available for load on all named nodes. nodeset is a comma-separated list of node names in parentheses. For example: => COPY t FROM 'file1.txt' ON (v_vmart_node0001, v_vmart_node0002); Vertica apportions the load among all of the specified nodes. If you also specify `ERROR TOLERANCE` or `REJECTMAX`, Vertica instead chooses a single node on which to perform the load. If the data is available on all nodes, you usually use `ON ANY NODE`. However, you can use `ON nodeset` to do manual load-balancing among concurrent loads.
ON ANY NODE	Specifies that the source file to load is available on all nodes, so `COPY` opens the file and parses it from any node in the cluster. For an Eon Mode database, `COPY` uses nodes within the same subcluster as the initiator. The file must be the same on all nodes. If the file differs on two nodes, an incorrect or incomplete result is returned, with no error or warning. Vertica attempts to apportion the load among several nodes if the file is large enough to benefit from apportioning. It chooses a single node if `ERROR TOLERANCE` or `REJECTMAX` is specified. You can use a wildcard or glob (such as `*.dat`) to load multiple input files, combined with the `ON ANY NODE` clause. If you use a glob, `COPY` distributes the list of files to all cluster nodes and spreads the workload. `ON ANY NODE` is invalid with `STDIN` and `LOCAL`. `STDIN` can only use the client host, and `LOCAL` indicates a client node. `ON ANY NODE` is the default for HDFS and S3 paths and does not need to be specified.
input‑format	Specifies the input format, one of the following: `UNCOMPRESSED` (default) `BZIP` `GZIP` `LZO` `ZSTD` Input files can be of any format. If you use wildcards, all qualifying input files must be in the same format. To load different file formats, specify the format types specifically. The following requirements and restrictions apply: When using concatenated `BZIP` or `GZIP` files, verify that all source files terminate with a record terminator before concatenating them. Concatenated `BZIP` and `GZIP` files are not supported for `NATIVE` (binary) and `NATIVE VARCHAR` formats. `LZO` files are assumed to be compressed with `lzop`. Vertica supports the following lzop arguments: `‑‑no-checksum` / `‑F` `‑‑crc32` `‑‑adler32` `‑‑no-name` / `‑n` `‑‑name` / `‑N` `‑‑no-mode` `‑‑no-time` `‑‑fast` `‑‑best` Numbered compression levels `BZIP`, `GZIP`, `ZSTD`, and `LZO` compression cannot be used with `ORC` format.
VERTICA	See COPY FROM VERTICA.
[WITH] UDL‑clause[…]	Specifies one or more user-defined load functions—one source, and optionally one or more filters and one parser, as follows: SOURCE source( [arg=value[,…] ] ) [ FILTER filter( [arg=value[,…] ] ) ]… [ PARSER parser( [arg=value[,…] ] ) ] To use a flex table parser for column tables, use the `PARSER` parameter followed by a flex table parser argument. For supported flex table parsers, see Bulk Loading Data into Flex Tables.

Handling Options

The following parameters provide various options for controlling how COPY handles different contingencies:

ABORT ON ERROR	Specifies that `COPY` stops if any row is rejected. The statement is rolled back and no data is loaded.
COLSIZES (integer[,…])	Specifies column widths when loading fixed-width data. `COPY` requires that you specify the `COLSIZES` when using the `FIXEDWIDTH` parser. `COLSIZES` and the list of integers must correspond to the columns listed in the table column list. For more information, see Loading Fixed-Width Format Data in the Administrator's Guide.
ERROR TOLERANCE	Specifies that `COPY` treats each source during execution independently when loading data. The statement is not rolled back if a single source is invalid. The invalid source is skipped and the load continues. Using this parameter disables apportioned load. Restrictions: Invalid for ORC or Parquet data
EXCEPTIONS	Specifies the file name or absolute path of the file in which to write exceptions, as follows: EXCEPTIONS 'path' [ ON nodename[,…]] Exceptions describe why each rejected row was rejected. Each exception describes the corresponding record in the file specified by the `REJECTED DATA` option. Files are written on the node or nodes executing the load. If the file already exists, it is overwritten. To collect all exceptions in one place, use the `REJECTED DATA AS TABLE` clause and exceptions are automatically listed in the table's `rejected_reason` column. `EXCEPTIONS` is incompatible with `REJECTED DATA AS TABLE`. The `ON nodename` clause moves existing exceptions files on nodename to the indicated path on the same node. For more information, see Saving Load Exceptions (EXCEPTIONS) in the Administrator's Guide. If you use this parameter with `COPY…ON ANY NODE`, you must still specify the individual nodes for the exception files, as in the following example: EXCEPTIONS '/home/ex01.txt' on v_db_node0001,'/home/ex02.txt' on v_db_node0002,'/home/ex03.txt' on v_db_node0003 If path resolves to a storage location, the following privileges apply to non-superusers: The storage location must be created with the `USER` option (see CREATE LOCATION). The user must have `READ` access to the storage location where the files exist, as described in GRANT (Storage Location).
REJECTED DATA	Specifies where to write each row that failed to load. If this parameter is specified, records that failed due to parsing errors are always written. Records that failed due to an error during a transformation are written only if configuration parameter CopyFaultTolerantExpressions is set. The syntax for this parameter is: REJECTED DATA { 'path' [ ON nodename ] [,…] \| AS TABLE reject-table } Vertica can write rejected data to the specified path or to a table: `'path' [ON nodename]`: Copies the rejected row data to the specified path on the node executing the load. If qualified by `ON` nodename, Vertica moves existing rejected data files on nodename to path on the same node. The value of path can be a directory or a file prefix. If there are multiple load sources, path is always treated as a directory. If there are not multiple load sources but path ends with '`/`', or if a directory of that name already exists, it is also treated as a directory. Otherwise, path is treated as a file prefix. Files are written on the node or nodes executing the load. If the file already exists, it is overwritten. When this parameter is used with `LOCAL`, the output is written to the client. Do not qualify path with `ON ANY NODE`. To collect all rejected data in one place regardless of how the load is distributed, use a table. `AS TABLE reject-table`: Saves rejected rows to reject‑table. `REJECTED DATA AS TABLE` is incompatible with `EXCEPTIONS`. For more information about both options, see Handling Messy Data.
REJECTMAX integer	The maximum number of logical records that can be rejected before a load fails. For details, see Handling Messy Data. `REJECTMAX` disables apportioned load.
SKIP integer	The number of records to skip in a load file. For example, you can use the `SKIP` option to omit table header information. Restrictions: Invalid for ORC or Parquet data
STREAM NAME	Supplies a `COPY` load stream identifier. Using a stream name helps to quickly identify a particular load. The `STREAM NAME` value that you supply in the load statement appears in the STREAM_NAME column of system tables LOAD_STREAMS and LOAD_SOURCES. A valid stream name can contain any combination of alphanumeric or special characters up to 128 bytes in length. By default, Vertica names streams by table and file name. For example, if you are loading two files (`f1`, `f2`) into `TableA`, their default stream names are `TableA-f1`, `TableA-f2`, respectively. For example: => COPY mytable FROM myfile DELIMITER '\|' STREAM NAME 'My stream name';
WITH Parsers for Various Data Formats	Specifies the parser to use when bulk loading columnar tables, one of the following: `NATIVE` `NATIVE VARCHAR` `FIXEDWIDTH` `ORC (Parser)` `PARQUET (Parser)` By default, `COPY` uses the DELIMITER parser for UTF-8 format, delimited text input data. To use a flex table parser for column tables, use the `PARSER` parameter followed by a flex table parser argument. For parser descriptions, see Flex Parsers Reference. For parser support for complex data types, see the documentation of the specific type. You do not specify the DELIMITER parser directly; absence of a specific parser indicates the default. For more information, see Parsers for Various Data Formats in Getting Data into Vertica. The following restrictions apply: These parsers are not applicable when loading flexible tables. The ORC and PARQUET parsers are for use with Hadoop files in those formats. The files do not need to be stored in HDFS. For details on using these formats, see Reading ORC and Parquet Formats. `COPY LOCAL` does not support `NATIVE` and `NATIVE VARCHAR` parsers. To use a flex table parser for column tables, use the `PARSER` parameter followed by a flex table parser argument. For supported flex table parsers, see Bulk Loading Data into Flex Tables.
`NO COMMIT`	Prevents the `COPY` statement from committing its transaction automatically when it finishes copying data. This option must be the last `COPY` statement parameter. For more information, see Using Transactions to Stage a Load. Restrictions: Invalid for ORC or Parquet data , ignored by CREATE EXTERNAL TABLE AS COPY

Parser-Specific Options

The following parameters apply only when using specific parsers. The parsers that support each option follow the option name.