Deploying the Vertica Connector for Apache Spark

Once you have downloaded the connector and JDBC library JAR files, you can deploy them to your Spark cluster in two ways:

Copying the Connector for Use with Spark Submit or Spark Shell

  1. Log on as the Spark user on any Spark machine.
  2. Copy both the Vertica Spark Connector and Vertica JDBC Driver JAR files from the package to your local Spark directory.
  3. Run the connector with spark-submit or spark-shell, passing both JAR files using the --jars option, as shown in the examples after the note below.

Note: The version numbers in the JAR file names will vary depending on your version of Vertica, Spark, and Scala.
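For example, the following commands launch spark-shell or submit an application with both JARs on the classpath. This is a minimal sketch assuming both JAR files are in the current directory; com.example.SparkExampleApp and sparkExampleApp.jar are hypothetical placeholders for your own application class and JAR:

    spark-shell --jars vertica-8.1.0_spark2.0_scala2.11.jar,vertica-jdbc-8.1.0-0.jar

    spark-submit --class com.example.SparkExampleApp --jars vertica-8.1.0_spark2.0_scala2.11.jar,vertica-jdbc-8.1.0-0.jar sparkExampleApp.jar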

Deploying the Connector to a Spark Cluster

You can optionally deploy the JAR files to the Spark cluster itself. This approach makes the connector and driver available to all Spark applications (such as those launched with spark-shell or spark-submit) without specifying the JAR files on the command line.

To deploy to the Spark cluster:

  1. Copy the connector and JDBC driver JAR files to a common path on all Spark machines.
  2. Add the paths of the connector and JDBC driver JAR files to your conf/spark-defaults.conf file, and restart the Spark master. For example, modify the spark.jars line by adding the connector and JDBC driver JARs as follows (replace the paths and version numbers with your values):

    spark.jars /JAR_file_Path/vertica-8.1.0_spark2.0_scala2.11.jar,/JAR_file_Path/vertica-jdbc-8.1.0-0.jar
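
After restarting the Spark master, you can verify that the driver is visible from any spark-shell session (no --jars option needed). This is a minimal check assuming the Vertica JDBC driver class is named com.vertica.jdbc.Driver; if the call returns without a ClassNotFoundException, the JAR was loaded:

    scala> Class.forName("com.vertica.jdbc.Driver")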