Choosing a Load Method
By default, Vertica loads table data into WOS. After WOS reaches full capacity, Vertica continues to load data to ROS containers. This approach is suitable for small bulk-load operations. If desired, you can specify alternative load methods. You can do so at two levels:
- Table metadata, through
CREATE TABLE
andALTER TABLE
- DML statements:
COPY
/COPY FROM VERTICA
,INSERT
,MERGE
, andUPDATE
If a table specifies a load option, Vertica uses it for all DML operations unless the DML statement specifies otherwise.
Load Options and Hints
CREATE TABLE
and ALTER TABLE
, and copy operations COPY
/COPY FROM VERTICA
support the following load options, described in detail in this topic:
AUTO
(default): Initially loads data into WOS, suitable for smaller bulk loads.DIRECT
: Loads data directly into ROS containers, suitable for large (>100 MB) bulk loads.TRICKLE
: Loads data only into WOS, suitable for frequent incremental loads.
Vertica also supports three load hints: /*+AUTO*/
, /*+DIRECT*/
, and /*+TRICKLE*/
. These hints let you control how individual INSERT
, MERGE
, and UPDATE
operations load table data, overriding the target table's load setting, if any.
AUTO: Loading Into WOS
If no load option is specified for an operation, Vertica uses the AUTO method to load data into WOS. After WOS reaches full capacity, Vertica continues to load data to ROS containers.
DIRECT: Loading Directly to ROS
The DIRECT
option specifies to load data directly into ROS containers, bypassing the WOS. DIRECT
is best suited for large data loads (100 MB or more).
A large initial bulk load can temporarily affect query performance while Vertica organizes the data.
For example:
=> COPY a FROM stdin DIRECT; => COPY b FROM LOCAL STDIN DIRECT;
Avoid using DIRECT
to load many smaller data sets. This approach results in many ROS containers that must be combined later.
TRICKLE Loading
Use the TRICKLE
load option to load data incrementally after the initial bulk load is complete. Trickle loading loads data into the WOS. If the WOS becomes full, an error occurs and the entire data load is rolled back. Use the TRICKLE
option only when the following conditions are true:
- You have a finely tuned load and moveout process at your site.
- You are confident that the WOS has room to hold the data you are loading.
This option is more efficient than AUTO
when you want to load data into partitioned tables.
For more details about trickle loading, see Trickle Loading Data.