Loading Batches Directly into ROS
When loading large batches of data (more than roughly 100 MB), load the data directly into ROS containers. For large loads, inserting directly into ROS is more efficient than AUTO mode because it avoids overflowing the WOS and spilling the remainder of the batch to ROS. When that overflow happens, the Tuple Mover has to perform a moveout on the data in the WOS while subsequent data is written directly into ROS containers, which leaves your data segmented across storage containers.
When you load data using AUTO mode, Vertica inserts the data first into the WOS. If the WOS is full, Vertica inserts the data directly into ROS. For details about load options, see Choosing a Load Method.
To load batches directly into ROS, set the DirectBatchInsert connection property to true. See Setting and Getting Connection Property Values for an explanation of how to set connection properties. When this property is set to true, all batch inserts bypass the WOS and load directly into a ROS container.
If all of the batches inserted through a connection should go into ROS, set the DirectBatchInsert connection property to true in the Properties object you use to create the connection:
    Properties myProp = new Properties();
    myProp.put("user", "ExampleUser");
    myProp.put("password", "password123");
    // Enable directBatchInsert for this connection
    myProp.put("DirectBatchInsert", "true");
    Connection conn;
    try {
        conn = DriverManager.getConnection(
            "jdbc:vertica://VerticaHost:5433/ExampleDB", myProp);
        . . .
If you will use the connection to insert both large and small batches (or you do not know the size of the batches you will be inserting when you create the Connection object), you can set the DirectBatchInsert property after the connection has been established by using the VerticaConnection.setProperty method:
((VerticaConnection)conn).setProperty("DirectBatchInsert", true);
See Setting and Getting Connection Property Values for a full example of setting DirectBatchInsert.
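When batch sizes vary over the life of a connection, one option is to decide per batch whether to enable DirectBatchInsert. The following is a minimal sketch of that idea, not part of the Vertica API: the class DirectBatchInsertToggle, its methods, and the exact 100 MB cutoff are hypothetical names and values chosen for illustration. The property setter is passed in as a callback so that the sketch runs without a database; with a real connection, the callback would wrap the VerticaConnection.setProperty call shown above.

```java
import java.util.function.BiConsumer;

// Hypothetical helper (not part of the Vertica JDBC driver) that switches
// DirectBatchInsert on or off based on an estimated batch size.
public class DirectBatchInsertToggle {

    // Assumed cutoff, based on the "roughly 100 MB" guideline; tune as needed.
    public static final long THRESHOLD_BYTES = 100L * 1024 * 1024;

    private final BiConsumer<String, Boolean> propertySetter;
    private Boolean current = null; // unknown until the first batch

    public DirectBatchInsertToggle(BiConsumer<String, Boolean> propertySetter) {
        this.propertySetter = propertySetter;
    }

    // True when the batch is large enough to load directly into ROS.
    public static boolean shouldLoadDirect(long estimatedBatchBytes) {
        return estimatedBatchBytes > THRESHOLD_BYTES;
    }

    // Sets the property only when the desired value differs from the last
    // value sent, and returns the value now in effect.
    public boolean prepareFor(long estimatedBatchBytes) {
        boolean direct = shouldLoadDirect(estimatedBatchBytes);
        if (current == null || current != direct) {
            propertySetter.accept("DirectBatchInsert", direct);
            current = direct;
        }
        return direct;
    }

    public static void main(String[] args) {
        // Stand-in setter that just prints. With a real connection it would be
        // something like:
        //   (name, value) -> ((VerticaConnection) conn).setProperty(name, value)
        // (handle any checked exception the driver declares).
        DirectBatchInsertToggle toggle = new DirectBatchInsertToggle(
            (name, value) -> System.out.println(name + " = " + value));

        toggle.prepareFor(500L * 1024 * 1024); // large batch: enables the property
        toggle.prepareFor(600L * 1024 * 1024); // still large: no call is made
        toggle.prepareFor(4L * 1024);          // small batch: disables the property
    }
}
```

Call prepareFor with your size estimate before each executeBatch call. How you estimate the batch size (for example, row count multiplied by an average row width) is up to your application.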