Copying Data Using vsql

You can use vsql to copy data between two Vertica databases. This technique is similar to the technique explained in Exporting Data Using vsql, except instead of having vsql save data to a file for export, you pipe one vsql's output to the input of another vsql command that runs a COPY statement from STDIN. This technique can also work for other databases or applications that accept data from an input stream.

The following technique only works for individual tables. To copy an entire database to another cluster, see Copying the Database to Another Cluster in the Administrator's Guide.

The easiest way to copy using vsql is to log in to a node of the target database, then issue a vsql command that connects to the source Vertica database to dump the data you want. For example, the following command copies the store.store_sales_fact table from the vmart database on node testdb01 to the vmart database on the node you are logged into:

vsql -U username -w passwd -h testdb01 -d vmart -At -c "SELECT * from store.store_sales_fact"  \
| vsql -U username -w passwd -d vmart -c "COPY store.store_sales_fact FROM STDIN DELIMITER '|';"

The above example copies the data only, not the table design. The target table for the data copy must already exist in the target database. You can export the design of the table using EXPORT_OBJECTS or EXPORT_CATALOG.

If you are using the Bash shell, you can escape special delimiter characters. For example, DELIMITER E'\t' specifies tab. Shells other than Bash may have other string-literal syntax.

Monitoring Progress (optional)

You may want some way of monitoring progress when copying large amounts of data between Vertica databases. One way of monitoring the progress of the copy operation is to use a utility such as Pipe Viewer that pipes its input directly to its output while displaying the amount and speed of data it passes along. Pipe Viewer can even display a progress bar if you give it the total number of bytes or lines you expect to be processed. You can get the number of lines to be processed by running a separate vsql command that executes a SELECT COUNT query.

Pipe Viewer isn't a standard Linux command, so you will need to download and install it yourself. See the Pipe Viewer page for download packages and instructions. Vertica does not support Pipe Viewer. Install and use it at your own risk.

The following command demonstrates how you can use Pipe Viewer to monitor the progress of the copy shown in the prior example. The command is complicated by the need to get the number of rows that will be copied, which is done using a separate vsql command within a Bash backquote string, which executes the string's contents and inserts the output of the command into the command line. This vsql command just counts the number of rows in the store.store_sales_fact table.

vsql -U username -w passwd -h testdb01 -d vmart -At -c "SELECT * from store.store_sales_fact"  \
| pv -lpetr -s `vsql -U username -w passwd -h testdb01 -d vmart -At -c "SELECT COUNT (*) FROM store.store_sales_fact;"` \
| vsql -U username -w passwd -d vmart -c "COPY store.store_sales_fact FROM STDIN DELIMITER '|';"

While running, the above command displays a progress bar that looks like this:

0:00:39 [12.6M/s] [=============================>                             ] 50% ETA 00:00:40