Vertica Integration with Dataiku DSS: Connection Guide

About Vertica Connection Guides

Vertica connection guides provide basic instructions for connecting a third-party partner product to Vertica. Connection guides are based on our testing with specific versions of Vertica and the partner product.

Vertica and Dataiku DSS: Versions Tested

Software                    Version
Partner Product             Dataiku Data Science Studio (DSS) 11.3.2
Partner Product Platform    RHEL 8.8
Vertica Client              Vertica JDBC Driver 23.3.0
Vertica Server              Vertica Analytic Database 23.3.0

Dataiku DSS Overview

Dataiku Data Science Studio (DSS) is a machine learning tool that gives data engineers and data scientists access to notebooks, tools, and code for analyzing data. It provides simple visual recipes for data preparation along with a suite of AutoML capabilities.

Installing Dataiku DSS

Before you install Dataiku DSS, review the requirements for installing on Linux. 

Download the latest version of Dataiku DSS for your Linux distribution and architecture. After the download is complete, follow the instructions for installation.

Installing the Vertica Client Driver

Dataiku Data Science Studio (DSS) uses the Vertica JDBC client driver to connect to Vertica. To install the driver, follow these steps:

  1. Navigate to the Vertica Client Drivers page.
  2. Download the JDBC driver for your version of Vertica.

    Note: For details about client driver and server version compatibility, see the Vertica documentation.

  3. Before installing the driver, you must stop DSS.
    Navigate to the DSS data directory, referred to here as DATA_DIR. Stop the application using the following command:
    $ DATA_DIR/bin/dss stop
  4. Copy the downloaded JDBC jar file into the DATA_DIR/lib/jdbc folder.
  5. Start DSS with the following command:
    $ DATA_DIR/bin/dss start
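
Steps 3 through 5 can be collected into a small shell function. This is a minimal sketch, not an official installer: the function name install_vertica_jdbc and its two arguments are hypothetical, and it assumes that the first argument is the DSS data directory (DATA_DIR), that the jar file has already been downloaded, and that the dss command exits with a non-zero status on failure.

```shell
# Sketch: stop DSS, copy the Vertica JDBC jar into lib/jdbc, restart DSS.
# install_vertica_jdbc is a hypothetical helper name, not part of DSS.
install_vertica_jdbc() (
    data_dir=$1   # DSS data directory (DATA_DIR in the steps above)
    jar=$2        # path to the downloaded Vertica JDBC jar file

    # Fail early if either argument does not point at what we expect.
    [ -x "$data_dir/bin/dss" ] || { echo "dss not found in $data_dir/bin" >&2; exit 1; }
    [ -f "$jar" ] || { echo "jar not found: $jar" >&2; exit 1; }

    # Stop DSS first so the jar is not copied under a running instance
    # (assumes dss returns non-zero when the stop fails).
    "$data_dir/bin/dss" stop || exit 1

    # Copy the driver into the JDBC library folder, creating it if needed.
    mkdir -p "$data_dir/lib/jdbc" || exit 1
    cp "$jar" "$data_dir/lib/jdbc/" || exit 1

    # Restart DSS so that it picks up the new driver.
    "$data_dir/bin/dss" start
)
```

A call would look like install_vertica_jdbc /path/to/DATA_DIR /path/to/vertica-jdbc.jar, where both paths are placeholders for your own locations.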

Connecting Dataiku DSS to Vertica

  1. Open Dataiku DSS (http://<IPAddress>:<Port>) in a web browser, where <Port> is the port that you entered while installing Dataiku DSS.
  2. Click NEW PROJECT and select Blank project.
  3. Enter the name of the project and click CREATE.
    A new project is created.
  4. In the upper right corner of the screen, click the Applications icon and click Administration.
  5. On the Admin screen, click Connections.
  6. Expand the NEW CONNECTION drop-down and click Vertica.
  7. Enter your connection information:
    • New Connection Name: Name for your connection.
    • Host: IP address of the Vertica server.
    • Database: Database name.
    • User: Vertica database user name.
    • Port: Port of the Vertica server. The default port number is 5433.
    • Password: Password for the Vertica database user.

  8. Click Test and then click Create.
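
When troubleshooting a failed connection test outside DSS, it can help to know the JDBC URL that the Host, Port, and Database fields correspond to. The sketch below assembles a URL in the standard Vertica JDBC form, jdbc:vertica://host:port/database. The helper name vertica_jdbc_url is hypothetical, and whether DSS builds exactly this string internally is an assumption.

```shell
# Sketch: build a Vertica JDBC URL from the Host, Database, and Port fields.
# vertica_jdbc_url is a hypothetical helper name used for illustration only.
vertica_jdbc_url() (
    host=$1           # IP address or host name of the Vertica server
    database=$2       # database name
    port=${3:-5433}   # falls back to the default Vertica port, 5433
    printf 'jdbc:vertica://%s:%s/%s\n' "$host" "$port" "$database"
)
```

For example, vertica_jdbc_url 192.0.2.10 mydb prints jdbc:vertica://192.0.2.10:5433/mydb, where the address and database name are placeholders.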

Creating a Dataset

After you have connected to Vertica, follow these steps to create a dataset:

  1. From the DSS menu bar, click the navigation icon and select Datasets.

  2. Click NEW DATASET and select SQL Databases > Vertica.

  3. In the Connection tab, fill in the following required fields:
    • Connection: Your connection to Vertica.
    • Mode: Choose either Read a database table or SQL query.
    • Table: Table name.
    • Schema: Schema name.

  4. Click TEST TABLE to preview the data.
  5. Enter a dataset name in the New dataset name field and click CREATE.

Known Limitations

The following data type limitations apply when connecting Dataiku DSS to Vertica:

In both Read and Write mode:

  • NUMERIC values are displayed with up to 15 digits of precision; beyond that, the value is rounded.
  • For the DATE data type, the minimum value displayed correctly is 1583-01-01; earlier dates are displayed incorrectly.
  • For the TIME data type, fractional seconds are displayed to at most 3 digits (millisecond precision).
  • The TIMETZ data type does not display milliseconds.
  • For the TIMESTAMPTZ data type, the minimum date displayed correctly is 1583-01-01; earlier dates are displayed incorrectly. Fractional seconds are displayed to at most 3 digits (millisecond precision).
  • BINARY, VARBINARY, and LONG VARBINARY data types are not displayed correctly.

In Write mode:

  • The LONG VARCHAR data type is not loaded.
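
Because dates earlier than 1583-01-01 are displayed incorrectly for DATE and TIMESTAMPTZ, it may be worth checking boundary values before relying on what DSS shows. The following is a minimal sketch of such a check, assuming dates are written as zero-padded ISO strings (YYYY-MM-DD) with a four-digit year. The helper name date_below_display_minimum is hypothetical.

```shell
# Sketch: succeed (exit status 0) if an ISO date falls before 1583-01-01,
# the minimum value listed above as displayed correctly.
# date_below_display_minimum is a hypothetical helper name.
date_below_display_minimum() (
    d=$1   # date in zero-padded YYYY-MM-DD form (assumed)
    # Removing the dashes turns the date into an integer (YYYYMMDD)
    # that orders the same way as the dates themselves.
    n=$(printf '%s' "$d" | tr -d '-')
    [ "$n" -lt 15830101 ]
)
```

For example, date_below_display_minimum 1582-12-31 succeeds, while date_below_display_minimum 1583-01-01 does not.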

For More Information