Vertica Integration with Informatica PowerCenter: Connection Guide

About Vertica Connection Guides

Vertica connection guides provide basic instructions for connecting a third-party partner product to Vertica. Connection guides are based on our testing with specific versions of Vertica and the partner product.

Vertica and Informatica PowerCenter: Latest Versions Tested

Software Version
Partner Product

Informatica PowerCenter 10.5.3

Informatica PowerCenter Client Platform

Windows Server 2019 Standard

Informatica PowerCenter Server Platform

Windows Server 2019 Standard

Red Hat Enterprise Linux release 8.3

Vertica Client

Vertica ODBC 23.4.0

Vertica Server

Vertica Analytic Database 23.4.0

Informatica PowerCenter Overview

Informatica offers the PowerCenter platform for connecting to Vertica. PowerCenter is a scalable, high-performance enterprise ETL platform that provides the following major capabilities:

  • Data delivery: Performing ETL (extract, transform, load) and ELT (extract, load, transform) operations, moving data among various data sources
  • Data transformation: Basic, intermediate, and complex data transformation
  • Design and development environment: Team-based development capabilities
  • Metadata and modeling: Physical-to-logical data mapping, including graphical attribute-level mapping

PowerCenter10.5.3 includes the PowerExchange (PWX) Connector for Vertica. The PWX Connector for Vertica includes additional capabilities and performance improvements when connected to your Vertica database.

History of Integration Between Informatica and Vertica

Prior to PowerExchange (PWX), Vertica developed and supported two connectors for Informatica and Vertica:

  • In 2009, Vertica developed the Vertica Plug-in for Informatica. This connector used the native method of loading data from Informatica into Vertica.
  • In 2013, Vertica replaced the earlier Vertica Plug-in for Informatica with a Java plug-in that supported all operating system platforms. The plug-in ran on generic JDBC and ODBC connections and included new and improved features compared to the native plug-in.
  • In 2014, Informatica released the PWX Connector for Vertica. This connector includes enhancements to the partitioning and pushdown capabilities of Informatica. We recommend that you use the new PWX Connector for Vertica with Informatica PowerCenter 10.5.3 to connect to your Vertica database.

PowerExchange Features for Vertica

IT organizations can access data sources without having to develop custom data access utilities using a high-performance tool—PowerExchange (PWX) Connector for Vertica. PowerExchange Connector for Vertica provides connectivity between Informatica PowerCenter and Vertica. PWX uses the Vertica ODBC driver to write large volumes of data into your Vertica database.

The major features of PWX for Vertica include:

PWX for Vertica Feature Description

Bulk mode or relational mode for processing Vertica data

Bulk mode: Write large amounts of data to Vertica from multiple data sources. Bulk mode is only available for writing to Vertica.

Relational mode: Read data from a Vertica source and write data to a Vertica target. Relational mode supports pushdown optimization.

Design-time features

  • Preview source and target data before mapping.
  • Generate and execute DDL on Vertica.
  • Import Vertica tables.

Run-time features

  • Set the commit type, interval sources, and interval targets.
  • Update strategy: Data-driven approach to inserting, deleting, and updating data in Vertica.

Partitioning support

  • Key range: One or more ports make up a partition key.
  • Round robin: Data is distributed to one or more partitions.
  • Hash-based: Data is distributed to partitions in groups.

Uses Vertica COPY LOCAL statement for various partitions in the mapping pipeline.

SQL transformation

Flexibility to process logic per input row.

Lookup (cached and uncached)

Supports Vertica lookup in the existing data integration logic.

Connection resiliency

High availability

Compatibility with Vertica Java Plug-in for Informatica

Support for existing features

SQL pushdown

  • Source Side
  • Target Side

Supported pushdown transformations

  • Aggregator, filter, joiner, union, sort, router, and lookup transformation as source.
  • Expression transformation as source, target, and full.
  • Update transformation as full. (When the source and target database are the same, all transformations are pushed to the target.)

Connecting Vertica to Informatica PowerCenter using PWX Connector

PowerCenter includes the PowerExchange (PWX) for Vertica connector. Informatica supports this connector from Informatica PowerCenter 9.6.1 HotFix 2 onwards. 

Before you Begin

Before you install and configure PowerExchange, you must install Vertica using the instructions in the Vertica Installation Guide.

Installing and Configuring PowerExchange for Vertica

To install and configure PowerExchange (PWX) for Vertica, perform the following steps on your Informatica server:

  1. Ask Informatica Support for the Informatica PowerCenter 10.5.3 software and install it.
  2. Install the Vertica ODBC driver.

    1. Download the driver from the Client Drivers page on the Vertica website.
    2. Follow the installations instructions in the Vertica documentation, Installing and configuring client drivers

      For Vertica driver/server compatibility, review the information in Client drivers support in the Vertica documentation.

  3. In the Linux environment, create a configuration file named vertica.ini. This file defines the Vertica specific settings required by the ODBC drivers. Here’s an example vertica.ini file:

    [VerticaDriverName]
     DriverManagerEncoding=UTF-16
     ODBCInstLib=<Informatica install Dir>/ODBC7.1/lib/libodbcinst.so
     ErrorMessagesPath=/opt/vertica/lib64 ## Installed Vertica Client Directory
     LogLevel=4
     LogPath=/tmp

For more information about the content of the vertica.ini file, see Installing the ODBC client driver in the Vertica documentation. 

  1. In the Linux environment, set the VERTICAINI environment variable to the path to the vertica.ini file.
  2. To use the PowerCenter bulk mode, register the PWX for Vertica connector:
  3. Ensure that you have the PWX for Vertica XML file:
    <Informatica_installation_folder>\server\bin\plugin\VerticaConnector.xml
  4. Follow the instructions to register the plug-in on the Informatica server in Registering the Plug-In's Metadata. You must specify VerticaConnector.xml as the plug-in file.
  5. After you have registered the plug-in, in the Repository, on the Plug-Ins tab, ensure that you see the line that reads “PowerExchange for Vertica ”.
  6. Before you continue, on the Repository Properties tab, you must switch the Operating mode back to Normal mode.

For detailed instructions about registering PWX for Vertica , see the Informatica PowerExchange for Vertica —User Guide for PowerCenter (Version 9.6.1 Hotfix 2 and above) document, available in the Informatica Documentation.

Configuring Vertica as Source and Target

Use PowerCenter Designer to import Vertica source and target definitions.

Before you import a source or target definition, you must create an ODBC data source for your Vertica database:

  1. To configure an ODBC data source on Windows, set up a DSN for your Vertica database(s) using the Informatica Control Panel. Under Administrative Tool, click Data Source (ODBC). Ensure that you create a 32-bit DSN to import Vertica source and target table definitions only and test the connection.
  2. To configure an ODBC data source on Linux or other non-Windows platform, add the Vertica database entries to the <Informatica_Installation_Directory>\ODBC7.1\odbc.ini file. Here is an example odbc.ini file:

    [ODBC Data Sources]
    vertica_odbc=libverticaodbc.so
    [ODBC]
    InstallDir=/opt/infa/Informatica/10.4.0/ODBC7.1
    Trace=0
    TraceFile=/opt/infa/Informatica/10.4/odbctrace.out
    TraceDll=/opt/Infa/Informatica/10.4/ODBC7.1/lib/DWtrc27.so
    
    [vertica_odbc]
    Description = Vmart Database
    Driver = /opt/vertica/lib64/libverticaodbc.so
    Database = <database_name>
    Servername = <server_name>
    UID= <user_ID>
    PWD= <password>
    Port = <port_number>
    ConnSettings=
    SSLKeyFile=
    SSLCertFile=
    Locale=UTF-8
  1. Set the ODBCINI environment variable to the ini file path.
  2. Using Informatica Designer, import the Vertica source and target definition. In the Source Analyzer, on the Source menu, select Import from the Database.

  3. Select the tables you want to import and click OK.

  1. In the Target Designer, on the Target menu, select Import from the Database.
  2. Select the tables you want to import and click OK.

Creating a Mapping

After you configure the source and target tables, use PowerCenter Designer to create a mapping and create the desired transformations: 

  1. Open the Mapping Designer.
  2. On the Mapping menu, select Create.
  3. Name the new mapping and click OK.
  4. Drag the source and target tables into the Mapping Designer. Create the transformation between the source and the target.

    mapping_in_designer.png

  5. Save the mapping.

Creating a Workflow

To create a workflow, you can use the Workflow Manager or the Mapping Designer. The following example shows how to use the Mapping Designer to create the workflow and configure the PWX Connector:

  1. In the Mapping Designer, right-click and select Generate Workflow. The Workflow Generation wizard opens.

  2. Select the desired Workflow Generation Option and click Next.
  3. From the drop-down list, select the Informatica Integration Service you want to run the workflow on. If needed, under Connection Object, change the connection setting for your Vertica database. Click Next.
  4. Make any necessary changes to the workflow settings and click Next.
  5. When you see that the workflow generated successfully, click Finish.
  6. Make changes to the connection, memory, properties, and other settings using the Workflow Manager.
  7. After you have saved the changes, right-click the folder for the connector and click Connect.
  8. In the Workflow Manager, reconnecting to the folder allows you to view the newly created workflow.

Configuring the Workflow

  1. Drag the workflow into the Workflow Designer.
  2. Double-click the task. The Edit Tasks window opens.
  3. Select the Mapping tab.


  4. In the left pane, select your source. Under Readers, the value should be Relational Reader. Under Connections > Value, check that the Vertica connector is listed.

    rekational_reader.png

  5. If needed, under Properties, change the attribute values.
  6. In the left pane, select your target. Under Writers, the value should be Vertica Bulk Writer. Under Connections > Value, check that the Vertica connector is listed.

  7. If required, under Properties, change the attribute values.
  8. In the Edit Tasks dialog box, select the Properties tab. Change the Commit Interval from the default (10000) to 100000, 1000000, or 10000000, depending on how many data rows you are loading. When writing large amounts of data to Vertica, larger commit intervals can improve load performance.

  9. Make any other desired changes to the attributes. When you have completed your changes, click Apply and OK.
  10. Save the workflow.

Running the Workflow

  1. To run your workflow, right-click in the Workflow Manager and select Start Workflow.

  2. To check the status of your workflow, open the Workflow Monitor and select the folder for your workflow.

    The Workflow Monitor displays the status of the workflow.

    The following window shows that the workflow has completed successfully.

  3. When the workflow completes, the Status reads either Succeeded or Failed.

  4. To review session logs, right-click session name - s_m_<mapping_name> and select Get Session Log.
  5. Review the session logs. Here is a sample from a successful workflow run:

The session logs list any error or issue that occurred while the workflow was running. The session logs also show the number of rows that the workflow committed to the Vertica target, depending on the value of the commit interval.

Pushdown Optimization

PowerExchange (PWX) for Vertica supports three types of pushdown optimization: 

  • Source Side—PowerCenter pushes certain transformations to the source database only. Use source-side pushdown optimization where you are downloading data into a file from your Vertica database.
  • Target Side—PowerCenter pushes certain transformations to the target database only. Use target-side pushdown optimization when you want to load data from a file or other database into Vertica.
  • Full—PowerCenter pushes all tranformations to the target database when the source and target databases are the same. This situation requires a relational mode connection. Use full pushdown optimization to load data from Vertica, use Informatica to transform the data, and then write it back to the database.

Pushdown Considerations

Consider the following when using pushdown optimization with Informatica and Vertica:

  • Vertica strips the padding spaces from CHAR column values when you push down a function that takes a CHAR column as an argument.
  • Pushdown compatibility for connection properties have to be identical for the source and target databases. The connection properties are:
    • Code Page
    • Connect String
    • Connection environment SQL
    • Transaction environment SQL
  • Qualifying name for a table is <database_name>.<schema_name>.<table_name>

Support for Pushdown Optimization

Expressions

PowerExchange (PWX) for Vertica supports the following in expressions:

Operators

+ - * /

%

||

= > < >= <= <>

!=

^=

NOT   AND   OR 

Variables

SESSSTARTTIME

SYSDATE

Functions

Functions listed in the first column of the following table can be pushed to Vertica along with other functions using source-side pushdown optimization.

CEIL() ABS() FLOOR() MIN() SYSDATE()
LOG() ADD_TO_DATE() GET_DATE_PART() MOD() SYSTIMESTAMP()
ROUND(DATE) ASCII() IIF() POWER() TAN()
ROUND(NUMBER) AVG() INITCAP() RPAD() TANH()
SIGN() CHR() INSTR() RTRIM() TO_BIGINT
TRUNC(DATE) CONCAT() ISNULL() SIN() TO_CHAR(DATE)
TRUNC(NUMBER) COS() LAST_DAY() SINH() TO_CHAR(NUMBER)
  COSH() LENGTH() SOUNDEX() TO_DATE()
  DATE_COMPARE() LN() SQRT() TO_DECIMAL()
  DATE_DIFF() LOOKUP STDDEV() TO_FLOAT()
  DECODE() LOWER() SUBSTR() TO_INTEGER()
  EXP() LPAD() SUM() UPPER()
    LTRIM() MAX() VARIANCE()

Transformations

PowerExchange (PWX) for Vertica supports the following transformations:

Transformation Pushdown type

Aggregator transformation

Source-side, Full

Expression transformation

Source-side, Target-side, Full

Filter transformation

Source-side, Full

Joiner transformation

Source-side, Full

Lookup transformation

Source-side, Full

Router transformation

Source-side, Full

Sorter transformation

Source-side, Full

Source qualifier transformation

Source-side, Full

Target transformation

Target-side, Full

Union transformation

Source-side, Full

Update strategy transformation

Full

For more details, see the Informatica PowerCenter—Advanced Workflow Guide (Version 9.6.1 HotFix 2 onwards).

Known Limitations

  • By default, Integration Service converts NUMERIC data type to DOUBLE data type with a precision of 15 digits beyond which the values are rounded off.

    To load the values as a DECIMAL data type,

    In the Workflow Manager, edit the session task, click the Properties tab and then select Enable high precision. It supports up to 28 digits of precision after enabling high precision.

  • For TIME and TIMETZ data types, milliseconds are truncated.
  • Timezone values are not supported.

  • VARBINARY and LONG VARBINARY data types are truncated when using PowerExchange for Vertica.

For More Information