Vertica Integration with Querybook: Connection Guide

About Vertica Connection Guides

Vertica connection guides provide basic instructions for connecting a third-party partner product to Vertica. Connection guides are based on our testing with specific versions of Vertica and the partner product.

Vertica and Querybook: Latest Versions Tested

This document is based on our testing using the following versions:

Software Version
Querybook

2.7.0

Scala 2.11.0
Desktop Platform CentOS Linux 7.9.2009
Vertica Client

Vertica Python Driver 1.0.1

Vertica Server

Vertica 10.1.1

Querybook Overview

Querybook is an open source Big Data IDE that allows you to discover, create, and share data analysis, queries, and tables.

Installing Querybook

Before you install Querybook, see Prerequisites.

To install Querybook,

  1. Clone and download the repository:

    git clone git@github.com:pinterest/querybook.git
    cd querybook
  2. Create a local.txt file in the requirements folder in the project's root directory:

    touch requirements/local.txt
  3. Add the required packages:

    echo -e "sqlalchemy-vertica-python\npsycopg2" > requirements/local.txt
  4. Start the container:

    Make

Installing the Vertica Client Driver

Querybook uses the Vertica Python client to connect to Vertica.

Note Ensure that you have the latest version of Python and pip.

  1. To install vertica-python client driver using the pip command, refer to the vertica-python github page.

  2. To install sqlalchemy-vertica-python using the pip command, refer to sqlalchemy-vertica-python.

Connecting Vertica to Querybook

  1. Open https://localhost:10001/ in a browser and register as a new user.

  2. Click Settings and then click Admin Tools.

  3. Click create a new metastore and enter the following fields:

    • Name: Enter any name for the metastore.
    • Loader: Select SqlAlchemyMetastoreLoader.
    • Connection_string: Enter the string in the format vertica+vertica_python://username:password@<IP>:portno/db_name.

  4. Click Save.
  5. Click Query Engine and then click create a new query engine..

    Enter the following fields:

    • Name: Enter any name for the query engine.
    • Description: Enter a description if you want.
    • Executor: Select sqlalchemy.
    • Connection_string: Enter the string in the format vertica+vertica_python://username:password@<IP>:portno/db_name.

  6. Click Save.
  7. Click Environment and create a new environment

    • Name: Enter any name for the new environment.
    • Description: Add a description if you want.
    • Query Engines: Select the query engine that you created and click Add Query Engine.
    • Access Control: Select a user to provide access control.

  8. Click Save.

Creating Reports in Querybook

In this example, we will be creating a report which reads data from Vertica and processes it.

  1. Click the plus sign on the left pane to create a new datadoc.

  2. Type the title.

  3. Click Query to add your query and click execute.

Known Limitations

  • For LONG VARCHAR 32M data type, the maximum value is not displayed.
  • For INTERVAL and TIMETZ data types, string ‘b' is appended to each value.
  • For BINARY, VARBINARY, LONG VARBINARY data types, string ‘b' is appended to each value.

For More Information