Loading...

Connection#

Various Methods to Connect with VerticaPy

Discover the versatility of connecting to VerticaPy through a multitude of methods, providing users with flexibility tailored to their preferences. With a user-friendly approach, VerticaPy offers seamless connections, allowing users to effortlessly integrate their Python workflows with Vertica’s powerful backend. Whether you’re navigating the parameters of the vertica_python client, exploring auto connections with automatic credential handling, or opting for a more manual setup, VerticaPy caters to different levels of expertise and security preferences.

Important

In the next sections, we will use placeholders such as your_host, your_user, your_database, etc. Please replace them with your actual connection details.

Vertica Python Client Parameters#

Important

This section introduces the vertica_python client connector and covers all possible parameters. It is used to establish the foundation for the final VerticaPy connection. If you prefer to directly learn how to create a VerticaPy connection, you can skip this part.

The new_connection() function in VerticaPy relies on the vertica_python native client, a powerful tool designed for seamless integration with the VerticaPy ecosystem. This native client acts as a bridge between VerticaPy and the underlying Vertica database, facilitating efficient communication and data retrieval.

Note

Visit the official vertica_python GitHub for more information.

The conn_info parameter utilized by the new_connection() function is a dictionary that aligns with the parameters supported by the vertica_python client. These parameters encompass various connection details, such as the host, port, database name, user credentials, and more. Each parameter in the conn_info dictionary corresponds to a specific aspect of the connection setup, ensuring users have fine-grained control over their connectivity preferences.

Hint

The example below shows how to create a conn_info dictionary:

conn_info = {
   'host': 'your_host', # ex: 127.0.0.1
   'port': 5433,
   'user': 'your_user', # ex: dbadmin
   'password': 'some_password',
   'database': 'your_database', # ex: testdb
   # autogenerated session label by default,
   'session_label': 'some_label',
   # default throw error on invalid UTF-8 results
   'unicode_error': 'strict',
   # SSL is disabled by default
   'ssl': False,
   # autocommit is off by default
   'autocommit': True,
   # using server-side prepared statements is disabled by default
   'use_prepared_statements': False,
   # connection timeout is not enabled by default
   # 5 seconds timeout for a socket operation
   # (Establishing a TCP connection or read/write operation)
   'connection_timeout': 5,
}

Discover the details of the majority of supported parameters in the table below.

Connection Option

Description

host

The server host of the connection. This can be a host name or an IP address. Default: “localhost”

port

The port of the connection. Default: 5433

user

The database user name to use to connect to the database. Default: OS login user name

password

The password to use to log into the database. Default: “”

database

The database name. Default: “”

autocommit

Autocommit. Default: False

backup_server_node

Connection Failover. Default: []

binary_transfer

Data Transfer Format. Default: False (use text format transfer)

connection_load_balance

Connection Load Balancing. Default: False (disabled)

connection_timeout

The number of seconds (can be a nonnegative floating point number) the client waits for a socket operation (Establishing a TCP connection or read/write operation). Default: None (no timeout)

disable_copy_local

Whether copying local files into the database is disabled. Default: False

kerberos_host_name

Kerberos Authentication. Default: the value of connection option host

kerberos_service_name

Kerberos Authentication. Default: “vertica”

log_level

Logging Level.

log_path

Logging Path.

oauth_access_token

To authenticate via OAuth, provide an OAuth Access Token that authorizes a user to the database. Default: “”

request_complex_types

SQL Data conversion to Python objects. Default: True

session_label

Sets a label for the connection on the server. This value appears in the client_label column of the v_monitor.sessions system table. Default: an auto-generated label with format of verticapy-{version}-{random_uuid}

ssl

TLS/SSL. Default: False (disabled)

unicode_error

UTF-8 encoding issues. Default: ‘ignore’ (ignore invalid UTF-8 results)

use_prepared_statements

Passing parameters to SQL queries. Default: False

workload

Sets the workload name associated with this session. Valid values are workload names that already exist in a workload routing rule on the server. If a workload name that doesn’t exist is entered, the server will reject it and it will be set to the default. Default: “”

dsn

Set Properties with Connection String.

For additional examples and more in-depth information, please refer to the official vertica_python GitHub

Additionally, the new_connection() function introduces an extra parameter, env which holds significance in enhancing connection security. This parameter, discussed in more detail later, enables users to configure an environment variable within the conn_info dictionary, offering an additional layer of control and customization for a secure and tailored connection experience.

Advanced Authentication Options (Kerberos & OAuth)#

VerticaPy supports authentication via OAuth and Kerberos, offering secure and versatile options for accessing your Vertica database. This section provides a step-by-step guide on setting up authentication using OAuth access tokens or Kerberos authentication.

OAuth Authentication#

To authenticate via OAuth, follow these steps:

  1. Obtain OAuth Access Token: Acquire an OAuth access token from your OAuth provider, ensuring it has the necessary permissions to access your Vertica database.

  2. Use OAuth Access Token in Connection: Incorporate the obtained access token into your connection dictionary when establishing a connection in VerticaPy.

connection_info = {
   "host": "your_host", # ex: 127.0.0.1
   "port": 5433,
   "user": "your_user", # ex: dbadmin
   "oauth_access_token": "your_oauth_access_token",
   "database": "your_database", # ex: testdb
}

Important

You may need to install additional dependencies.

Note

For additional examples and more in-depth information, please refer to the official vertica_python GitHub

Kerberos Authentication#

To authenticate via Kerberos, follow these steps:

  1. Set Up Kerberos Credentials: Ensure you have valid Kerberos credentials configured on your system.

  2. Update Connection Dictionary for Kerberos: Modify your connection dictionary to include the Kerberos-related parameters.

connection_info = {
   "host": "your_host", # ex: 127.0.0.1
   "port": 5433,
   "user": "your_user", # ex: dbadmin
   "kerberos_host_name": "your_kerberos_host_name",
   "kerberos_service_name": "your_kerberos_service_name",
   "database": "your_database", # ex: testdb
}

Important

You may need to install additional dependencies.

Note

For additional examples and more in-depth information, please refer to the official vertica_python GitHub

Creating a Connection Directly from a Connector#

Whether leveraging vertica_python, exploring supported protocols, or interfacing with ODBC/JDBC connectors, VerticaPy allows users to tailor their connection method to specific needs.

Explore the various tabs to learn how to establish a connection using the specific connection.

In certain scenarios, you may need to create a connection directly from a connector in VerticaPy. This section provides a step-by-step guide on establishing a connection using a connector, offering flexibility in managing connections within your Python environment.

To create a connection directly from a cursor, follow these steps:

# Import VerticaPy.
import verticapy as vp

# Import the client.
import vertica_python

# Creating a vertica_python connection directory.
conn_info = {
   "host": "your_host", # ex: 127.0.0.1
   "port": "5433",
   "database": "your_database", # ex: testdb
   "password": "XxX",
   "user": "your_user", # ex: dbadmin
}

# Setting the connection.
conn = vertica_python.connect(** conn_info)

# Linking this connection to VerticaPy.
vp.set_connection(conn)

Now, the connection conn will be utilized throughout the entire API.

Warning

As we are directly using the connector, the ENV parameter is not supported.

Important

This connection is not saved by VerticaPy, and it cannot be automatically reused. In the event of a connection loss, manual recreation and reconfiguration are required.

The same process can be applied to set up an ODBC connection. We create the connector and then set it in VerticaPy.

# Import VerticaPy.
import verticapy as vp

# Import the ODBC Module.
import pyodbc

# Option 1 - credentials.
# Connecting with DSN credentials

# Credentials.
driver = "/Library/Vertica/ODBC/lib/libverticaodbc.dylib"
server = "10.211.55.14"
database = "testdb"
port = "5433"
uid = "dbadmin"
pwd = "XxX"
dsn = (
   "DRIVER={}; SERVER={}; DATABASE={}; PORT={}; UID={}; PWD={};"
).format(
   driver,
   server,
   database,
   port,
   uid,
   pwd,
)
# Connect.
conn = pyodbc.connect(dsn)

# Option 2 - DSN.
# Connecting with the DSN

# DSN.
dsn = ("DSN=VerticaDSN")

# Connect.
conn = pyodbc.connect(dsn, autocommit = True)

# Setting the connection in the VerticaPy API.
vp.set_connection(conn)

Warning

This type of connection is not recommended, as some functionalities might not be supported. With the native client, we strive to accommodate all continuous changes, including the integration of complex data types, vmaps, special data types and more…

The same process can be applied to set up an JDBC connection. We create the connector and then set it in VerticaPy.

# Import VerticaPy.
import verticapy as vp

# Import the JDBC Module.
import jaydebeapi

# Credentials.
database = "testdb"
hostname = "your_host"
port = "5433"
uid = "dbadmin"
pwd = "XxX"

# Vertica JDBC class name.
jdbc_driver_name = "com.vertica.jdbc.Driver"

# Vertica JDBC driver path.
jdbc_driver_loc = "/Library/Vertica/JDBC/vertica-jdbc-9.3.1-0.jar"

# JDBC connection string.
connection_string = 'jdbc:vertica://' + hostname + ':' + port + '/' + database
url = '{}:user={};password={}'.format(connection_string, uid, pwd)
conn = jaydebeapi.connect(
   jdbc_driver_name,
   connection_string,
   {'user': uid, 'password': pwd},
   jars = jdbc_driver_loc,
)

# Setting the connection in the VerticaPy API.
vp.set_connection(conn)

Warning

This type of connection is not recommended, as some functionalities might not be supported. With the native client, we strive to accommodate all continuous changes, including the integration of complex data types, vmaps, special data types and more…

Closing Connections#

Closing connections is a crucial step in managing your database connections with VerticaPy. This section outlines the recommended practices for closing connections once they are no longer needed, ensuring proper resource management and security.

To close a connection in VerticaPy, follow these steps:

# Import VerticaPy.
import verticapy as vp

# Close the connection.
vp.close_connection()

Note

Closing the connection promptly is essential to free up resources and avoid potential issues related to open connections.

In this comprehensive guide, we’ve explored the various facets of managing connections in VerticaPy, from creating connections and leveraging auto connections to advanced functionalities. The flexibility offered by VerticaPy allows you to tailor your approach to meet the unique requirements of your data analytics projects.

As you navigate through the connection options, it’s essential to weigh the advantages and disadvantages of each method. Whether you opt for manual connections, auto connections, or utilize existing connections, understanding the nuances ensures a seamless and secure workflow.

Remember that efficient connection management is the cornerstone of successful data analytics. By making informed choices and leveraging the capabilities of VerticaPy, you empower yourself to harness the full potential of your data.