Share this article:

Vertica Integration with Microsoft SQL Server Integration Services: Connection Guide

About Vertica Connection Guides

Vertica connection guides provide basic instructions for connecting a third-party partner product to Vertica. Connection guides are based on our testing with specific versions of Vertica and the partner product.

SQL Server Business Intelligence Overview

SQL Server Business Intelligence is a suite of tools for creating BI applications on your SQL Server deployment. SQL Server Data Tools (SSDT) for Visual Studio (VS) is a part of the SQL Server Business Intelligence platform that enables the functionality to create SQL Server Analysis Services (SSAS), SQL Server Integration Services (SSIS), or SQL Server Reporting Services (SSRS) projects within Visual Studio. Installing SSDT enables the SSAS, SSIS, and SSRS templates in Visual Studio:

  • SQL Server Analysis Services (SSAS)—Business intelligence using either multidimensional (MOLAP) or tabular, relational (ROLAP) modeling.
  • SQL Server Integration Services (SSIS)—Data migration, workflows, and ETL.
  • SQL Server Reporting Services (SSRS)—Report generation.

SQL Server Integration Services (SSIS) Overview

SQL Server Integration Services (SSIS) is a platform for building enterprise-level applications for data transformation and integration. A Vertica connection in an SSIS application can serve as either a data source or a data target or both. SSIS is available as a part of SQL Server Data Tools (SSDT) for Visual Studio (VS).

Vertica and SQL Server Integration Services: Latest Versions Tested

This document provides guidance using the software listed as follows:

Software Version
Client tools
  • Microsoft SQL Server 2016 Enterprise Edition
  • SQL Server Data Tools (SSDT) for Visual Studio 2015 (If you install SSDT without Visual Studio, then the free edition of Visual Studio is included with SSDT.)

  • Visual Studio 2015 (If not installed by SSDT)

Client Platform

Microsoft Windows Server 2016

Vertica Client

Vertica Client Package 10.0.1

Vertica Server

Vertica Server 10.0.1

Installing SQL Server Integration Services (SSIS)

To install SQL Server Integration Services (SSIS)

  1. Download and install SQL Server 2016.

    Follow the installation instructions in the Microsoft documentation.

  2. Download and install SQL Server Data Tools (SSDT) for Visual Studio 2015.
  3. Download Visual Studio if not installed by SSDT.
    1. Check the Visual Studio 2015 System Requirements.

    2. Follow the instructions for Installing Visual Studio 2015.

Installing the Vertica Client Drivers

SQL Server Integration Services (SSIS) can use ADO.NET or ODBC to connect to Vertica.

To download the drivers:

  1. Navigate to the Client Drivers page on the Vertica website.
  2. Download the driver package for Windows.
  3. Start the installer and select the following options:

    • ADO.NET Provider
    • ODBC Driver
    • Visual Studio Plug-in
    • Microsoft Connectivity Pack
    • vsql Client (Vertica recommends installing vsql client, but it's optional)

Connecting to Vertica Using ADO.NET or ODBC

Follow the instructions in the following topics to create a connection to Vertica from SSIS:

Note You can use either ADO.NET or ODBC to connect to Vertica as a source or target database. You can combine both connection methods in your transformations/ETL applications with this product. Each method has its performance characteristics and limitations, so you can choose a method or combination that best fits your environment.

For more information on performance, see Performance with ODBC and ADO.NET.

For more information on data types, see Known Limitations.

Connecting to Vertica using ADO.NET

  1. Open Visual Studio and create a new SQL Server Integration Services (SSIS) project.

    To create a new project, select File > New > Project.

  2. In the New Project window, select Integration Services Project or Integration Services Import Project Wizard. We selected Integration Services Project in this example.

    When the new project is created, the Solution Explorer displays the Integration Services Project in the right side of the screen.

  3. Right-click Connection Manager and select New Connection Manager.

  4. In the Add SSIS Connection Manager window, select ADO.NET and click Add.

  5. In the Configure ADO.NET Connection Manager window, click New.

    The Connection Manager window displays.

  6. Expand the Provider drop-down list.

    Under .NET Providers, select Vertica Data Provider. and click OK.

  7. Specify values for Database, Host, User, Password, and Port.

  8. Click Test Connection.

  9. Click OK on all open windows to close them.

Connecting to Vertica using ODBC

Note  

The Vertica Connectivity Pack installer provides both the 32- and the 64-bit ODBC drivers. However, only the 32-bit ODBC driver is visible within SSDT, because SSDT is a 32-bit application.

For ODBC connections, we recommend that you create a User DSN (not a System DSN), because a User DSN automatically uses the appropriate bitness for the connection.

If you create a System DSN, then you must create both a 32- and a 64-bit DSN with the same name.

  1. Open the ODBC Data Source Administrator and click Add to add a new DSN.

  2. Select Vertica and click Finish.

    The Vertica ODBC DSN Configuration window displays.

  3. Specify values for DSN Name, Database, Server, Port, User name, and Password.
  4. Click Test Connection.
  5. Click OK.
  6. Switch to SSDT to create a new project or open an existing project.

  7. Right-click Connection Managers and select New Connection Manager to open the Add SSIS Connection Manager window.

  8. Select ODBC and click Add.

  9. Click New to open the Connection Manager window.

  10. In the Connection Manager window, expand the Use user or system data source name drop-down list and select the ODBC DSN that you created.

  11. Click Test Connection.

  12. Click OK on all open windows to close them.

Connection Pooling and Native Connection Load Balancing

We recommend that you enable connection pooling on the client and native connection load balancing on the client and Vertica Server.

  • Connection pooling

    Connection pooling ensures that your SSIS application can reuse connections from a pool of connections. This reduces the overhead involved in opening and closing multiple connections.

  • Native connection load balancing

    Native connection load balancing is a feature built into the Vertica server and client libraries that can prevent unequal distribution of client connections among hosts in the cluster.

Neither connection pooling nor native connection load balancing are enabled by default.

Enabling Connection Pooling for the Vertica ODBC Driver

  1. In the Administrative Tools folder of Control Panel, open the 32-bit ODBC Data Source Administrator.
  2. Click the Connection Pooling tab.
  3. Double-click the Vertica ODBC driver.
  4. Select Pool Connections to this driver.

  5. Click OK.

  6. Repeat the above steps in the 64-bit ODBC Data Source Administrator.

Note  

These steps enable pooling for all applications using the Vertica driver. The default maximum number of connections in a pool is 100 (Max Pool Size).

Enabling Connection Pooling for the Vertica ADO.NET Driver

  1. In the Solution Explorer window on the right-hand side of the SSDT Designer, select Connection Manager.
  2. Right-click your ADO.NET connection.
  3. Click Open.
  4. From the Connection Pooling Settings drop-down list, select True.

  5. Verify the other required details and click OK.

    You may need to re-enter the password.

Note  

These steps enable pooling for the specific connection. The default maximum number of connections in a pool is 100 (Max Pool Size)

Enabling Native Connection Load Balancing in Vertica

You must enable connection load balancing for both the Vertica client and the Vertica server.

Enabling Connection Load Balancing in the Vertica ODBC Client

  1. Open the Data Source Administrator.
  2. Select the DSN you created to connect to Vertica.
  3. Click Configure.
  4. On the Basic Settings tab, check Use connection load balancing.
  5. Click Test connection.
  6. Click OK.

Enabling Connection Load Balancing in the Vertica ADO.NET Client

  1. In the Solution Explorer window on the right-hand side of the SSDT Designer, select Connection Manager.
  2. Right-click your ADO.NET connection.
  3. Select Open.
  4. From the Connection Load Balance drop-down list, select True.

  5. Verify the other required details and click OK.

    You may need to re-enter the password.

Enabling Connection Load Balancing in the Vertica Server

For step-by-step instructions, see Enabling and Disabling Native Connection Load Balancing in the Vertica documentation.

Tip  

For details about native connection load balancing in Vertica, see About Native Connection Load Balancing in the Vertica documentation.

Performance with ODBC and ADO.NET

Our testing has shown that ODBC is the more performant option for connecting SSIS to Vertica. You can tune the BatchSize property of ODBC to increase efficiency. BatchSize controls the loading of data in batches for both Source and Destination components in SSIS. By default, BatchSize is 1000.

We have tested the Vertica ADO.NET and ODBC drivers with a 1.2 million row table where Vertica was both the Source and the Destination. With the ODBC driver, using the default batch size, SSIS loaded the data in 36 seconds. With the ADO.NET driver, SSIS loaded the data in 82 seconds. When we changed the value of BatchSize to 50,000, the time to load the 1.2 million rows decreased from 36 to only 16 seconds.

The same value of BatchSize may yield different performance results on different machines, since resources available for any task vary. A value of BatchSize that allows data to accumulate in memory improves performance. A value of BatchSize that is greater than the available memory for the drive buffer degrades performance.

Known Limitations

  • SSIS connections to Vertica can use ADO.NET or ODBC, but OLEDB is not supported.

Data Type Incompatibilities with ADO.NET

Preview data with ADO.NET Source component:

• NUMERIC data type is supported up to 28 digits of precision beyond which the error “Error displaying this cell” is displayed.

• INTERVALMONTH data type is not supported.

• For TIME, TIMETZ, TIMESTAMP, and TIMESTAMPTZ data types, milliseconds are truncated.

• BINARY, VARBINARY, and LONG VARBINARY data types are not supported.

Writing data with ADO.NET Destination component:

  • NUMERIC data type is supported up to 28 digits of precision. When writing numbers with a higher precision the following error is displayed:

    Specified argument was out of the range of valid values.
  • INTERVAL HOUR TO SECOND, INTERVAL HOUR TO MINUTE, and INTERVAL data types are not supported. When attempting to write, the following error is displayed:

    failed because truncation occurred
  • INTERVAL YEAR TO MONTH data type is not supported. When attempting to write, the following error is displayed:

    Error: Invalid conversion from Interval_Day_To_Second to Interval_Year_To_Month
  • TIME and TIMESTAMP data types are supported up to 3 digits above which the value is truncated.

  • LONG VARBINARY is supported only up to 8 million bytes beyond which the following error is displayed:

    An exception has occurred during data insertion, the message returned from the provider is: [20032] Operation canceled.

Data Type Incompatibilities with ODBC

Preview data with ODBC Source component:

• For TIME and TIMETZ data types, milliseconds are truncated.

• For TIMETZ and TIMESTAMPTZ data types, timezone is not supported.

• For BINARY, VARBINARY, and LONG VARBINARY data types, data is displayed in hexa decimal format.

  • For LONG VARBINARY data type, very large values are not displayed. You may see the following error when displaying LONG VARBINARY values:
Failed: String data is too big for driver's data buffer. Native Error Code: 2052. ERROR 2052:  COPY: Row size 32320356 is too large.

Writing data with ODBC Destination component:

  • Execute SQL Task component fails when it returns single row or full results set. The following error is displayed:

    Disconnected record sets are not available from ODBC connection.

    The workaround for this issue is to use the ADO.NET connector.

  • NUMERIC data type is supported up to 38 digits of precision. When writing numbers with higher precisions, the following error is displayed:
  • Numeric value out of range
  • TIME data type is rounded off to the nearest millisecond.
  • For TIMETZ data type, milliseconds are truncated and timezone is not displayed correctly.

For More Information

Share this article: