Share this article:

Vertica Integration with Tableau: Tips and Techniques

For a PDF version of this document, click here.

Introduction

This document describes tips and techniques that can enhance your experience using Tableau Desktop and Tableau Server with Vertica. This document is based on our testing of Tableau with Vertica as described in Vertica and Tableau: Latest Versions Tested.

Audience

This document is intended for customers using Tableau with Vertica. Familiarity with Vertica and Tableau is assumed.

For more information, see:

About Tableau

Tableau is a powerful and flexible business analytics tool that helps you quickly analyze, visualize, and share information.

Tableau Desktop

Tableau Desktop allows you to connect to data and create visualizations and interactive dashboards with an easy-to-use, drag-and-drop interface. Tableau Desktop is available for Windows and Mac OS, and uses ODBC to connect to Vertica.

For more information, see:

Tableau Server

Tableau Server is a web-based application that helps you publish and share the workbooks that you create using Tableau Desktop. Tableau Server is available for Windows and Linux, and uses ODBC to connect to Vertica.

There are several deployment options for Tableau Server:

  • Tableau Server is hosted by users on premises.
  • Tableau Public is free and available to all users on the cloud.
  • Tableau Online is Tableau Server hosted on Amazon Web Services (AWS) and managed by Tableau. Tableau Online ensures that users can only see their own data, not other users' data.

Vertica and Tableau: Latest Versions Tested

This document is based on the results of testing the following versions of Tableau and Vertica:

Software Version
Tableau
  • Tableau Desktop 10.5.2 on Windows and MaacOS
  • Tableau Server 10.5.2 on Linux and Windows
  • Tableau Online 10.5.2 (SaaS)
Vertica Client

Vertica ODBC Driver 9.0.1-0

Vertica Server

Vertica Analytic Database 9.0.1-0

Connecting Vertica and Tableau

Before you connect Vertica and Tableau, download and install the Vertica ODBC driver from the Vertica Client Drivers website.

You connect using Tableau's Vertica connector, also known as the native Vertica connector. Tableau has optimized this connector for excellent performance between Tableau and Vertica.

For more information, see:

Note on Client Driver and Server Compatibility

Verticas’s ODBC driver was made backwards compatible from version 8.1.1. Some new features of the new server version may not be available through the old drivers, so we recommend installing the latest available version of the ODBC driver.

For details about Vertica client and server compatibility, see the Vertica documentation.

Tips for Connecting to Vertica from Tableau

The following sections describe how to configure your Tableau-to-Vertica connection for optimal performance:

For more details about connecting Tableau to Vertica, see:

Parallel Query Processing

Tableau can now open multiple connections to a Vertica database, allowing queries to execute in parallel in multiple concurrent threads. Prior to Tableau 9, queries were executed sequentially.

Tableau automatically creates a new connection for each independent query until the limit for parallel connections is reached. By default, the limit is 16 connections. If a query depends on the results of a previous query, Tableau sends it to Vertica sequentially.

Caution  

Use caution when taking advantage of the parallel query processing capability. Depending on the number of concurrent users, resource pool configuration settings, and dashboard content, you may want to decrease the connection limit to avoid potential resource contention issues.

You can use the following tools to identify the queries that are sent in parallel to your Vertica database:

For information about changing the connection limit, see:

Connection Pools

Tableau creates a separate connection pool for each data source in the workbook that generates a unique connection string. The default maximum number of connections in a connection pool is 16, but you can modify this maximum.

For example, in your Tableau workbook, you can have three data sources that connect to the same Vertica server and database with the exact same connecting string. That would result in a single connection pool shared by all three data sources in the workbook. On the other hand, if you have three data sources pointing to the same Vertica server and database but with different connection strings (for example each one using a different user ID/password), then each of them will use a separate connection pool. You can modify the maximum number of connections for each connection pool independently.

Set the Maximum Number of Connections for a Single Data Source

You can control the maximum number of connections from Tableau to Vertica with connection properties such as class, server, and database.

To set limits for data sources, or for Vertica on specific servers, create a configuration file with the desired settings. The configuration file is an XML file named connection-configs.xml. Tableau parses the contents of the configuration file from top to bottom and chooses the first match for a specific connection.

To create the configuration file:

  1. Create an XML file named connection-configs.xml.
  2. Save the file as follows:

    • For Tableau Server, save the file in the config\vizqlserver folder. The following is an example on Windows:

      C:\ProgramData\Tableau\TableauServer\data\tabsvc\config\vizqlserver

      Note Remember to copy the configuration file to all the vizqlserver folders on all Tableau machines.

    • For Tableau Desktop, save the file in the Tableau Desktop folder. The following is an example on Windows:

      C:\Program Files\Tableau\Tableau <version-number>  

      Note  

      Make sure that the path corresponds to the version of Tableau that you are using.

  3. Copy and paste the following content into the file. Modify the values for server, database, and number of connections to reflect your setup.

    In the following example file:

    • The first connection class sets a limit of 4 connections for any Tableau workbook that uses server name 15.126.227.222 and database name verticanow.
    • The second connection class sets a limit of 10 connections for any Tableau workbook that has a server name of 172.16.116.45.
    • The third connection class sets a limit of 12 connections for all other Tableau connections to Vertica.

    <?xml version='1.0' encoding='ntf-8 ?>
      <connection-list>
         <connection class='vertica' server='15.126.227.222' 
    	 dbname='verticanow'>
    	<limit max='4'>
    	</limit>
         </connection>
         <connection class='vertica' server='172.16.116.45'>
    	<limit max='10'>
    	</limit> 
         </connection>
         <connection class='vertica'
    	<limit max='12'>
    	</limit>
         </connection>
    </connection-list> 

Set the Maximum Number of Connections for All Connections

In Tableau Desktop, to set a connection limit for all connections during a particular session, use the -DConnectionLimit command-line switch.

For example, to launch Tableau Desktop with a connection limit of 2 for all connections, type the following at the Windows command prompt:

"C:\Program Files\Tableau\Tableau <version-number>\bin\tableau.exe" -DConnectionLimit=2
"C:\Program Files\Tableau\Tableau <version-number>\bin\tableau.exe" -DConnectionLimit=2

Note  

Make sure that the path matches the version of Tableau that you are using.

In Tableau Server, you can disable parallel query execution for all connections by using tabadmin and setting the maximum number of connections to 1:

tabadmin set native_api.connection.limit.globallimit 1

To set the connection limit to, for example, 20:

tabadmin set native_api.connection.limit.globallimit 20

Vertica MaxClientSessions Parameter

In Vertica, the maximum number of concurrent connections per cluster is limited by physical RAM of a single node (or number of threads per process). The default limit per node is 50. You can modify the maximum number of connections with the MaxClientSessions parameter.

Before changing the MaxClientSessions setting, refer to the following topics in the Vertica documentation:

Live Connections Compared to Extracts

After you connect to Vertica, you can maintain a live connection or you can create a data extract. A data extract is data from Vertica that you store locally in Tableau.

We recommend that you use a live connection when:

  • You need to leverage the analytic capabilities of Vertica.
  • The volume of data is not conducive to creating an extract.
  • Your workbook uses pass-through RAWSQL functions.
  • You need near real-time analysis.
  • You need robust user-level security.

Use data extracts only when:

  • You need offline access to the data.
  • Many users are accessing a single workbook with data that is not needed in real time.

Build your Tableau Data Source by Dragging and Dropping Tables instead of Writing Custom SQL

We recommend that you create a data source by dragging multiple tables from the same database to the join area/canvas, instead of using the New Custom SQL option in the Data Sources window. This recommendation applies when you construct inner and outer join queries.

When you build your joins by dragging and dropping tables in the UI, Tableau takes advantage of the join culling capability. (See Join Culling in Tableau in this document. )

When to Use Custom SQL

When you use custom SQL, Tableau wraps additional GROUP BY, ORDER BY, and WHERE clauses around your query. As a result, custom SQL queries become subqueries. Therefore, use custom SQL only if Tableau cannot generate the desired SQL (for example, for queries with UNION clauses and self-join queries).

Note  

Custom SQL cannot leverage join culling. (See Join Culling in Tableau in this document.)

When using custom SQL, consider moving your query to Vertica. To do so, create a view with the query that you would have used in the custom SQL connection and then connect to that view from Tableau.

For more information, see the following topics in the Tableau documentation:

Join Culling in Tableau

With join culling, Tableau does not use all joins in your Vertica connection all the time. If you are joining many tables in your connection, using every join would be expensive. Instead, Tableau uses a join only when required. If a join does not impact the result, Tableau discards the join.

Tableau uses join culling automatically on tables that have foreign keys defined. If you do not have foreign keys defined, you can enable join culling for a data source by selecting Assume referential integrity from the Data menu.

Checking for Well-Constructed Joins

When you create a new connection to Vertica, pay attention to the joins that Tableau generates.

By default, Tableau guesses the joins based on the primary and foreign key relationships. These joins may not be correct for certain situations, for example, when a dimension table is used for more than one column in a fact table.

Consider this example: The columns sale_date_key and ship_date_key in the fact table online_sales_fact correspond to the same key from the date dimension table, date_dimension.date_key.

In this case, by default, Tableau creates an incorrect join where the two columns are in the same join:

If the join is not correct, you can alter the join condition. Drag and drop the table date_dimension twice into the canvas (sale_date_dimension and ship_date_dimension in the graphic) and manually join each key to the fact table separately, in two joins:

Cross-Database Joins

Cross-database joins allow you to join tables from multiple databases, such as multiple Vertica databases.

To construct a cross-database join, you need to create a data source with multiple connections, one connection for each database in your join. Follow these steps:

  1. Connect to one of your data sources.
  2. On the Data Source tab, click Add link for each database that your join needs to access.

    After you have added all necessary connections, Tableau lists them on the left-hand side of the screen, each connection identified by a different color.

  3. Drag and drop the tables from each connection onto the canvas to create the join.
  4. Check that the relationships that Tableau has created are correct. Edit them if necessary.

Note  

It is important to understand that Tableau does not compute the join and aggregations in Vertica. Tableau issues separate queries to each database. Then Tableau transfers the data to the client and computes the joins and aggregations within the Tableau data engine.

Exporting the data out of the source database and computing the query in the Tableau data engine is not as efficient as performing the queries in the source database itself. The performance of your dashboards depends on the number of databases that are joined and the amount of data transferred.

For more information, see Join your Data in the Tableau documentation.

Adding or Excluding Columns

  1. From the Data menu of Tableau Desktop, select your data source and click Edit Data Source.

  2. In the data grid or in the list of columns in Manage Metadata, click the arrow of the field you want to exclude and select Hide.

  3. To add a field back that has been excluded, select Show Hidden Fields. Click the field you want to add and select Unhide.

Tableau Best Practices

The following sections describe techniques you can use to improve performance when connecting to Vertica from Tableau:

Tips for Dashboard Design

Consider the following recommendations for efficient dashboard design in Tableau:

  • An efficient dashboard design combines action filters, data source filters, and parameters.

  • Avoid using filter cards, or use them in moderation. If you use filter cards, use a wildcard text box instead of drop-downs with long lists of values. As an alternative to quick filters, use action filters. Action filters allow you to use visualizations to filter other visualizations. When you use action filters, Tableau sends fewer queries to Vertica.

  • Avoid using too many text tables (also called cross-tables or Pivot Tables). If you use text tables, make sure they are small.
  • If a calculation is complicated in Tableau, aggregate it in Vertica, for example, by using live aggregate projections. Push down the execution of these calculations to the database, and let Tableau create the visualization. See Use Live Aggregate Projections.
  • After you have finished designing your dashboards, exclude all unused columns from your data sources. Doing so allows your dashboards to load faster when you connect to Vertica. See Adding or Excluding Columns.

Tips for Using Parameters

Using parameters for conditional calculations in Tableau allows you to dynamically change a calculation.

Note

If the data changes in your Vertica database, Tableau does not dynamically repopulate the parameters. You need to add new parameter values manually. In addition, Tableau supports only single-value selection for parameters.

The following example shows three ways of using Tableau parameters, from slowest to fastest:

Slowest response time:

IF [Parameters].[Date Part Picker] = "Year"
   THEN DATEPART('year',[Order Date])
   ELSEIF [Parameters].[Date Part Picker] = "Quarter"
     THEN DATEPART('quarter',[Date])
…
ELSE NULL END

Faster calculation:

IF [Parameters].[Date Part Picker] = 1
   THEN DATEPART('year',[Order Date])
   ELSEIF [Parameters].[Date Part Picker] = 2    
     THEN DATEPART('quarter',[Date])
…
ELSE NULL END

Fastest calculation:

DATEPART([Parameters].[Date Part Picker],[Order Date])

For more information on parameters, see the Tableau documentation.

Tips for Using Filters

Filters allow you to narrow the data displayed on your dashboard. There are many ways to filter data from your views, for example:

  • You can drag a field directly into the Filters shelf and specify the details of the filter in the Filter dialog box.
  • You can select specific marks on your visualization, and click on "Keep Only" or "Exclude" on the pop-up menu that appears to filter the selected data.
  • You can click on a field's drop-down menu and select "Show Filter". This action displays the filter in the view and users can interact with it.

The following sections provide recommendations for filtering data from Tableau visualizations:

For details about using filters in Tableau, see Filter Data in the Tableau documentation.

Filter Cards

Filter Cards (previously known as Quick Filters) allows you to modify a filter selection directly in the view. To display a filter card, you can click on a field drop-down menu and select "Show Filter". The field is automatically added to the filters shelf and the filter card is displayed so that users can interact with it. You can also click on the drop-down menu of an existing filter and select "Show Filter" to display it.

For more information, see the Tableau documentation on filter cards.

Consider these recommendations when using filter cards:

Filter Cards that Do Not Require Extra Queries

We recommend that you avoid filter cards that require knowledge of the values listed in the filter. Tableau has to query Vertica for all potential values before the filter card object can be rendered.

Filter cards that do not require knowledge of the values in the filter list include:

• Custom value list

• Wildcard match

• Relative date filters

• Browse period date filters

Filter Cards that Require Extra Queries

The following filter cards require knowledge of the values listed in the filter:

  • Multiple value list
  • Single value list
  • Compact list
  • Slider
  • Measure filters
  • Ranged date filters
Customize Filter Cards with an Apply Button

We recommend that you use the Apply button in filter cards that consist of large list of values. Doing so prevents Tableau from issuing a query each time the user selects or un-selects an item from the list.

To display the Apply button on your filter card, click the filter card drop-down menu and select "Customize" > "Show Apply Button".

After you customize your filter with the Apply button, select or deselect multiple values and click Apply. Tableau issues a single query to Vertica that retrieves data based on the selected values.

Filtering Dimensions

  • If you are using a filter card to filter a dimension, we recommend that you use a filter card that does not require an extra query like the wildcard filter. For more information, see Filter Cards that Do Not Require Extra Queries.
  • If you need to use filter cards and they consist of a large list of discrete values such as a drop-down list or a multi-select list, we recommend using the Apply button, as described in Customize Filter Cards with an Apply Button.
  • If your data has numeric dimensions or you define a condition to filter based on numeric values, filter the data using a range of values. Tableau can process a range of values faster than it can process a list of discrete values.
  • Avoid filtering data using the Keep Only or Exclude options. The Keep Only and Exclude options force Tableau to scan all the selected data.

Filtering Measures

  • For best performance, filter measures using a range of values instead of multiple discrete values.
  • If you have a large data source, Tableau recommends that you create a set that contains the measure. You can then apply one or more filters to that set.

Filtering Dates

Date filters are a special kind of dimension filter. The method you use to filter dates impacts the efficiency of the resulting query. There are 3 methods to filter dates:

  • Relative date filters—Show a date range that is relative to a specific day.
  • Range of date filters—Show a defined range of discrete dates.
  • Discrete date filters—Show individual dates that you have selected from a list.

An efficient date filter is one that generates queries with a resulting WHERE clause that uses a ranged date filter. Filtering by date range leverages Vertica partitioning and compression capabilities. We recommend relative or range filters and avoid discrete filters when designing your dashboards.

Context Filters

By default, all filters that you create in Tableau are computed independently and access all rows in your Vertica data source, without regard to other filters.

Context filters behave differently. Tableau implements context filters by writing the filter result set to a temporary table. Any other filters process only the data that the context filter filters and stores in the temporary table.

A context filter must meet the following conditions to ensure optimal performance:

  • The context filter must reduce the size of the data set significantly, by filtering 90% of the data.
  • The context filter should be used against slow-changing dimensions. If you change the filter, the database must recompute and rewrite the temporary table.

For more information, see Improve View Performance with Context Filters in the Tableau documentation.

Filter Actions and Data Source Filters

As a best practice for building efficient dashboards, consider using filter actions and data source filters as described in the Tableau documentation. Filter actions are an efficient method to filter data and prevent Tableau from issuing expensive unnecessary queries associated with filter cards:

  • Filter actions: Using charts to filter related information between Tableau worksheets.

    Note  

    Use filter actions with the Exclude all values option checked. This reduces the number of queries that are pushed down to Vertica.

  • Data Sources filters: Filtering the data on the data source at connection time.

Cross-Data Source Filters

Cross-data source filters allow you to filter data from multiple data sources at once based on a common dimension. By default, any fields that have the same name and data type across the data sources are identified as common or related.

To use cross-data source joins, follow these steps:

  1. Create a dashboard with visualizations from your data sources.
  2. Open one of the worksheets.
  3. Drag and drop the common fields to the Filters shelf to create the filter.
  4. Bring the filter to your dashboard.
  5. Click the arrow in the upper-right corner of the filter.
  6. In the menu that appears, select Apply to Worksheets > All Using Related Data Sources.

    This action applies this filter across all related data sources in your dashboard.

Before filtering the visualizations, Tableau issues separate queries to each data source to identify the values in the filter. If those values do not exist in all related data sources, Tableau excludes them from the list of values in the filter.

For more information, see Filter Data Across Multiple Data Sources in the Tableau documentation.

Tips for Calculations

There are many ways of performing calculations and using Vertica functions in Tableau. The following sections provide some tips:

Using Tableau Built-In Functions

In the Create Calculated Field interface, functions that display in bold always execute locally in Tableau on the aggregated results. Vertica does not execute these functions. Some examples are:

ASCII
CHAR
FIRST
INDEX
ISDATE

Any function that does not display in bold executes in the Vertica database. Some examples are:

ABS
CONTAINS
DATE
DATEADD
FIND
IF
IFNULL
MAX
MIN
POWER
SPACE
STARTSWITH
TRIM
ZN

Regular Expression Functions

Tableau 9.3 adds support for Vertica regular expression functions using built-in functions in the Tableau user interface. The new Tableau built-in functions for Vertica are descrbied as follows:

Tableau built-in function Vertica function called by Tableau function Comments
REGEXP_MATCH(string, pattern) REGEXP_LIKE(string, pattern) These two functions have the same behavior.

REGEXP_REPLACE(string, pattern, replacement)

REGEXP_REPLACE(string, pattern, replacement) These two functions have the same behavior.
REGEXP_EXTRACT(string, pattern) REGEXP_SUBSTR(string, pattern, 1, 1, '', 1)

When calling REGEXP_SUBSTR, Tableau sets the last parameter, captured_subexp, to 1. Doing so causes REGEXP_SUBSTR to return the substring captured by the first set of parentheses in the regular expression.

REGEXP_EXTRACT_NTH(string, pattern, index)

REGEXP_SUBSTR(string, pattern, 1, 1, '', index) You can specify the last parameter of REGEXP_SUBSTR, captured_subexp, using the parameter index in the Tableau function REGEXP_EXTRACT_NTH. To return the entire string that matches the regular expression, set index to 0.

For detailed information about regular expression functions in Tableau and Vertica, see:

How Regular Expression Functions Work

When you call a Tableau built-in function, Tableau pushes the corresponding Vertica function down to Vertica. Vertica executes the function and returns the results to Tableau. Tableau displays the results in the dashboard as you specify.

Let's look at an example. Suppose you want the portion of a string that matches the first set of parentheses in the regular expression. When you use REGEXP_EXTRACT in your dashboard, it calls REGEXP_SUBSTR, with the last parameter, captured_subexp, set to 1. The REGEXP_EXTRACT function looks like this:

REGEXP_EXTRACT('abc 123', '[a-z]+\s+(\d+)') = '123'  -- Tableau function

REGEXP_EXTRACT generates a query like the following and sends it to Vertica for execution. You cannot change this query:

SELECT REGEXP_SUBSTR('abc 123', '[a-z]+\s+(\d+)', 1, 1, '', 1)   

Now suppose you want the full string that matches the regular expression. For this, call the Tableau function REGEX_EXTRACT_NTH with the index parameter set to 0:

REGEXP_EXTRACT_NTH('abc 123', '[a-z]+\s+(\d+)', 0) = 'abc 123'  --function in Tableau

The Tableau function REGEX_EXTRACT_NTH, with the index parameter set to 0, generates a query that calls the Vertica function REGEXP_SUBSTR with the captured_subexp parameter set to 0. This query returns the full string that matches the regular expression:

SELECT REGEXP_SUBSTR('abc 123', '[a-z]+\s+(\d+)', 1,1, '', 0) = 'abc 123'  –-function in Vertica

For more details, see REGEXP_SUBSTR in the Vertica documentation.

Using Table Calculations

Table calculations execute only in Tableau. Some examples are:

RUNNING_SUM
WINDOW

We recommend that, when possible, you consider using Level-of-Detail calculations that can achieve the same results. LOD calculations execute in the underlying database, which allows them to take advantage of Vertica's speed.

Using Pass-Through Functions

Tableau does not support all database functions that Vertica provides. If a function is not available in Tableau, use pass-through functions to push the function to Vertica.

For example, Tableau does not have a MOD or a MEDIAN function. Use RAWSQLAGG_REAL to compute MOD or MEDIAN at the database level. RAWSQLAGG_REAL pushes down Vertica functions that return real values, such as MOD and MEDIAN.

For a complete list of Vertica functions, see the Vertica documentation

For a complete list of Tableau pass-through functions, see the Tableau documentation.

Pass-Through Function Example

To use a Tableau pass-through function to run APPROXIMATE COUNT DISTINCT, create a calculated field:

  1. Right-click a field and select Create Calculated Field.
  2. Select Pass Through from the Functions drop-down list.
  3. Select RAWSQLAGG_REAL.
Enable Non-DBADMIN Users for Pass-Through Functions

To enable Vertica non-DBADMIN users to access pass-through functionality, add the public schema to the user's search path:

=> ALTER USER tableau_user SEARCH_PATH public, tableau_user_s;

Using Level-of-Detail Expressions

If you are working with large data sets, we recommend that you use Level-of-Detail (LOD) expressions instead of table calculations when possible. Tableau does not push table calculations to Vertica.

Level-of-detail calculations are a type of calculation, new in Tableau 9, that allows users to compare data at several different aggregation levels. For example, you might want to compare the number of dropped calls in a certain area of the country to the total number of dropped calls.

LOD expressions generate queries that Vertica computes. These queries include subqueries with inner joins and sometimes with cross joins. Using LOD expressions, you can push calculations and complex queries to Vertica, as opposed to table calculations, which move a large data set to Tableau.

For more information about LOD expressions, see the Tableau documentation:

Tips for Using Sets

Sets in Tableau are useful when you want to compare two or more data sets. However, sets do not push query execution down to Vertica. You should use sets only when working with small data sets.

For more information, see the Tableau documentation.

Vertica Tuning Recommendations

Consider the following techniques for improving Vertica performance with Tableau:

Upgrade Vertica for More Efficient Query Processing

Tableau issues system-related queries to obtain metadata on Vertica tables. The queries use a WHERE 1=0 clause. Vertica 7.1 and later optimize such queries. For more efficient query processing from Tableau to Vertica, upgrade Vertica to Release 7.1.0 or later.

Create a Physical Design with Database Designer

To get the best performance from your Vertica database, create a physical design for your database that optimizes both query performance and data compression.

The Vertica Database Designer automatically optimizes your physical design in the following ways:

  • Analyzes your logical schema, sample data, and your sample queries.
  • Creates a physical schema design (projections) that can be deployed automatically or manually
  • Can be run any time for additional optimization without stopping the database.
  • Uses strategies to provide optimal query performance and data compression

Database Designer minimizes the time you spend on manual database tuning and provides the ability to redesign the database incrementally to optimize for changing workloads over time.

For more information, see Workflow for Running Database Designer.

Identify Sample Queries for Database Designer

You can submit sample queries to Database Designer before it creates or updates a physical design. In particular, if some queries are executing slowly, Database Designer can create a physical design that optimizes those slow queries.

You can use Tableau to identify queries that are good candidates for the Database Designer in a number of ways:

  • Use Tableau Performance Recorder, as described in Using Tableau Performance Recorder.
  • Check the logs, as described in Examining Log Files
  • Run SQL queries against the Vertica Data Collector tables.

    The following query returns the 10 slowest queries from the system:

    => SELECT request FROM v_monitor.query_requests
    	WHERE request_type = 'QUERY'
           ORDER BY request_duration_ms DESC LIMIT 10;	

    The following query returns the 10 most frequently executed queries:

    => SELECT request FROM (SELECT request, COUNT(request) cnt 
    	FROM v_monitor.query_requests 
    	WHERE request_type = 'QUERY' 
    	GROUP BY request) T 
    	ORDER BY cnt DESC LIMIT 10;
  • Monitor queries in Management Console to better understand resource utilization when queries are executing.

For more information, see:

Check Database Designer Projections

After Database Designer creates a database design, examine the log files and review the projections. You can find data about the projections that Database Designer considered and deployed in two Data Collector tables:

DC_DESIGN_PROJECTION_CANDIDATES

DC_DESIGN_QUERY_PROJECTION_CANDIDATES

Verify that the projections are sorted on columns of importance according to the WHERE clauses of your sample queries. Low-cardinality columns should be encoded.

Use Live Aggregate Projections

When you create a live aggregate projection for a table, Vertica automatically aggregates data from that anchor table and loads it into the live aggregate projection. Because the data is already aggregated, retrieving the data directly from the live aggregate projection is faster than retrieving it from the anchor table.

Note

Once you create a projection, run the START_REFRESH or REFRESH function to load the data into the projection.

To access a live aggregate projection from Tableau, create a view and access the view from Tableau:

=> CREATE VIEW <projection_name> AS SELECT * FROM <projection_name>

For information on creating live aggregate projections, see:

Manage Vertica Resource Pools

You can use Vertica resource pools to manage the use of database resources with Tableau. You can create a separate resource pool for Tableau, and you can define secondary (cascading) resource pools to handle queries that do not return within a given time limit.

Create a Separate Resource Pool for Tableau

Vertica recommends that you create a separate resource pool and user for your Tableau application. Doing so reduces the impact of other applications such as ETL jobs on your queries.

The actual impact depends on the amount of memory on the machine and other factors such as how many other resource pools are created for other users.

The following example show how to create and manage access to the TABLEAU_ POOL resource pool:

=> GRANT USAGE ON SCHEMA tableau_s TO tableau;
=> GRANT USAGE ON SCHEMA PUBLIC TO tableau;
=> GRANT USAGE ON SCHEMA online_sales TO tableau;
=> GRANT USAGE ON SCHEMA store TO tableau;
=> GRANT SELECT ON ALL TABLES IN SCHEMA PUBLIC TO tableau;
=> GRANT SELECT ON ALL TABLES IN SCHEMA store TO tableau;
=> GRANT SELECT PN ALL TABLES IN SCHEMA online_sales TO tableau;
=> GRANT ALL PRIVILEGES ON SCHEMA tableau_s TO tableau WITH GRANT OPTION;
=> GRANT CREATE ON SCHEMA tableau_s TO tableau;
=> GRANT USAGE ON RESOURCE POOL TABLEAU_POOL TO tableau;
=> ALTER USER tableau RESOURCE POOL TABLEAU_POOL;

For more information, see Resource Manager in the Vertica documentation.

Create a Cascading Resource Pool

When you create a resource pool, you assign a RUNTIMECAP that specifies the amount of time a query should run in that resource pool before it times out. If a query exceeds the RUNTIMECAP, it errors out. However, if you have defined a secondary pool, the query continues executing in the secondary pool.

To designate a secondary pool, use the CASCADE TO parameter in the ALTER RESOURCE POOL or CREATE RESOURCE POOL statement.

For example, suppose you have a resource pool assigned to the Tableau user:

=> ALTER USER tableau RESOURCE POOL TABLEAU_POOL;

If the TABLEAU_POOL resource pool is not adequate, create a second resource pool (TABLEAU_POOL_BACKUP) for overflow query execution:

=> CREATE RESOURCE POOL TABLEAU_POOL_BACKUP RUNTIMECAP '5 minutes';
=> ALTER RESOURCE POOL TABLEAU_POOL CASCADE TO TABLEAU_POOL_BACKUP;

For more information about cascading resource pools, see Defining Secondary Resource Pools in the Vertica documentation.

Enable Database Isolation Levels in Tableau

By default, Vertica uses the READ COMMITTED isolation level for every session. If your database has a different isolation level than READ COMMITTED, you might experience table locks.

To change the default isolation level for the database, set the TransactionIsolation configuration parameter in either the Tableau Datasource Customization (TDC ) file or in the Tableau Data Source (TDS) file. The following examples show how to set the isolation level to READ COMMITTED:

<!-- METHOD 1 -->
<customization name='CAP_SET_ISOLATION_LEVEL_VIA_ODBC_API' value='yes' />
<customization name='CAP_ISOLATION_LEVEL_READ_COMMITTED' value='yes' />
<!-- METHOD 2 -->
<customization name='odbc-connect-string-extras' value='TransactionIsolation=Read Committed;'/>

To change the default isolation level for a specific session in vsql, use SET SESSION CHARACTERISTICS command:

=> SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED;

For details, see Customize Your Connection to Vertica in this document.

For information about Vertica isolation levels, see Change Transaction Isolation Levels in the Vertica documentation.

Enable Native Connection Load Balancing in Vertica

Native connection load balancing is a Vertica feature that spreads the CPU and memory overhead caused by client connections across the hosts in the database.

Vertica does not use load balancing by default. To use load balancing, you must explicitly enable it. For instructions, see Enabling and Disabling Native Connection Load Balancing in the Vertica documentation.

Once you have enabled load balancing on your Vertica server, make the following change to the Tableau Datasource Customization (TDC) file:

<customization name='odbc-connect-string-extras' value='ConnectionLoadBalance=1' />

For more information about native connection load balancing, see About Native Connection Load Balancing in the Vertica documentation.

Customize Your Connection to Vertica

Follow the recommendation in this topic to customize Tableau for connecting to Vertica:

About TDC and TDS Files

You can use a Tableau Datasource Customization (TDC) file or a Tableau Data Source (TDS) file to investigate issues with your connection to Vertica. A TDC file applies customizations to all data sources; a TDS file allows you to customize individual data sources.

Note  

We recommend using TDC files only. Try to avoid customizing connections for individual data sources using TDS files

Important  

Once you create Tableau customization files, you must maintain them manually, editing them any time you need to change them. Because of this, you should use these files judiciously, relying on them only when you detect connection issues that you want to investigate further.

Tableau saves TDC and TDS files in the following locations.

Tableau Desktop:

C:\Users\myuser\Documents\My Tableau Repository\Datasources

Tableau Server:

C:\ProgramData\Tableau\Tableau Server\data\tabsvc\vserver\Datasources

For more information about customizing and tuning ODBC connections in Tableau, see the Tableau documentation.

Customization Details

Based on your situation, apply the following modifications in a TDC or TDS file. The Tableau customizations are as follows:

Customization Recommended Setting
CAP_CREATE_TEMP_TABLES

Set to 'no' so that Tableau does not create temporary tables. See also: CAP_SELECT_INTO.

CAP_ISOLATION_LEVEL_READ_COMMITTED

Set to 'yes' to force the transaction isolation level to Read Committed. Apply only if you are experiencing lock issues.

Refer to Enable Database Isolation Levels in Tableau.

CAP_ODBC_EXPORT_ALLOW_CHAR_UTF8 Set to 'yes' to allow the single-byte char data type for binding Unicode strings as UTF-8. This setting does not impact performance with Vertica.
CAP_ODBC_METADATA_SUPPRESS_EXECUTED_QUERY

Set to 'yes' for Vertica versions and prior releases.

Refer to Upgrade Vertica for More Efficient Query Processing.

CAP_QUERY_SUBQUERY_QUERY_CONTEXT

Set to 'yes' to force Tableau to use a subquery for context filters instead of a temporary table or locally cached results. We recommend this setting if you are using context filters.

Example:

<customization name='CAP_QUERY_SUBQUERY_QUERY_CONTEXT' 
value='yes' />
CAP_SELECT_INTO

Set to 'no' so that Tableau does not create a table on the fly from the result set of another query. See also: CAP_CREATE_TEMP_TABLES.

Example:

<customization name='CAP_SELECT_INTO ' value='no' />
ODBC-CONNECT-STRING-EXTRAS

Use this customization. See the following examples.

Note This customization is case sensitive. Make sure that odbc-connect-string-extras is always lowercase.

Define a session label:

<customization name='odbc-connect-string-extras'
value='Label=tableau_query_session'/>

Change the buffer size:

<customization name='odbc-connect-string-extras' 
value='ResultBufferSize=500000'/>

Enable load balancing:

<customization name='odbc-connect-string-extras' 
value='ConnectionLoadBalance=1' />

Specify database isolation:

<customization name='odbc-connect-string-extras' 
value='TransactionIsolation=Read Committed' />

If you want to change multiple parameters, you must put them in one line, as in this example:

<connection-customization class='vertica' enabled='true'
   version='10.5'>  
   <vendor name='vertica' />
   <driver name='vertica' />
   <customizations>
	<customization name='odbc-connect-string-extras'
	value= 'Label=tableau_session_label;ConnectionLoadBalance=1' />
   </customizations>
<connection-customization>				

Note  

The preceding example specifies Tableau 10.5. Verify that the version you specify matches the version of Tableau that you are using.

We have tested the customizations in this table with Tableau 10.5.

For a complete list of customizations, see Tableau Capability Customizations Reference.

Global Data Source Customizations

To modify the connection behavior for all data sources connecting to Vertica, create a stand-alone Tableau Datasource Customization (TDC) file. The TDC file contains a <connection-customization> section that Tableau applies to any connection to Vertica.

Tableau Desktop automatically embeds the content of the *.tdc file in the workbook file (*.twb).

When you publish the Tableau workbook to Tableau Server or Tableau Online, the workbook uses the customizations that you specified in your custom TDC file. You no longer have to manually create a *.tdc file in Tableau Server.

To overwrite any customized settings, create a new TDC file. However, if you delete this new TDC file, Tableau Desktop uses the settings in the original custom TDC file, even if that file no longer exists.

Create a TDC File

To create a TDC file, follow these steps:

  1. Create a new file in the Datasources directory of the Tableau repository (My Tableau Repository), and name the file using the .tdc suffix (for example, vertica.tdc)
  2. Copy and paste the following into your new TDC file:

    <connection-customization class='vertica' enabled='true' version='10.5’'> 
       <vendor name='vertica' />  
       <driver name='vertica' />  
       <customizations>
          <customization name='CAP_QUERY_SUBQUERY_QUERY_CONTEXT' value='yes' />
          <customization name='odbc-connect-string-extras' 
           value='Label=tdc_query;ConnectionLoadBalance=1;BackupServerNode=10.0.50.153,10.0.50.151' />
       </customizations>
    </connection-customization>

    This example sets CAP_QUERY_SUBQUERY_QUERY_CONTEXT to yes. This setting forces Tableau to use a subquery for context filters instead of a temporary table or locally cached results. Vertica strongly recommends this setting if you use or plan to use context filters in your workbooks.

    This example also sets a session label for the connection balancing using the customization 'odbc-connect- string-extras' and enables connection load balancing.

    Important  

    The Tableau version mentioned within the file must be the same as the Tableau Desktop version you are using.

  3. Save and close your .tdc file.

Edit the TDC File

To remove all customizations, you must manually edit the *.twb file:

  1. Delete any custom *.tdc files.
  2. Edit the *.twb file (it is an XML file) to reset the customizations to the desired settings.

Check that the TDC File Settings are Applied

To verify that the customizations in the TDC file have been applied to your connection, take these steps:

  1. Delete all log files in: My Tableau Repository\Logs.
  2. Open Tableau Desktop and connect to any table in Vertica. Open an existing workbook or create a new one.
  3. Close Tableau.
  4. Open your Tableau log file.
  5. Search for the name of the customization you applied to your connection. For example, search for CAP_QUERY_SUBQUERY_QUERY_CONTEXT.

Single Data Source Customizations

Only use a single data source customization when it is not necessary to have modifications applied to all Vertica data source connections.

To modify connection behavior for a specific data source connection:

  1. Generate a Tableau Data Source (TDS) file.
  2. Edit the .tds file to add a <connection-customization> section for your specific use case.

Important  

TDS settings override global TDC settings. Use TDS settings only if you want to apply your settings to one specific connection.

Create a TDS File

To create a TDS file, take these steps:

  1. Right-click the data connection and select Add to Saved Data Sources.

    Alternatively, from the Data drop-down menu, select the data connection you want to customize and click Add to Saved Data Sources.

  2. Enter a name and save the .tds file.

    You can save the file in the Datasources directory of the Tableau repository, for example, My Tableau Repository\Datasources

  3. Close the workbook.

Edit the TDS File

To edit the TDS file, take these steps:

  1. Close your workbooks.
  2. Open your saved .tds file in a text editor. The .tds file is an XML document that describes the connection to Vertica.
  3. In the <connection> section, add a <connection-customization> section that contains the customizations for your specific scenario.

    Important

    The Tableau version mentioned within the TDC must be the same as the Tableau Desktop version you are using.

    <connection-customization class='vertica' enabled='true' 
       version='10.5'>
       <vendor name='vertica' />
    <driver name='vertica' />
    <customizations>
    <customization name='odbc-connect-string-extras' value='Label=tableau_session_label;ConnectionLoadBalance=1' />
    </customizations>
    </connection-customization>

    You can copy this example and add it to your .tds file. This modification has been tested with the Tableau Desktop and Vertica versions covered in this manual.

    This example includes a session label for the connection ('odbc-connect-string-extras'). The session label is helpful for identifying the connection to facilitate troubleshooting.

  4. Save and close your .tds file.

Apply the Modified TDS File

To apply your TDS file to a connection, take these steps:

  1. From your Tableau workbook, navigate to your.tds file.

  2. Select the file and click Open.

    Notice that the new data source is displayed under Data.

  3. Right-click the original data source and select Replace Data Source:

  4. From the Replacement list box, choose the name of your replacement data source.
  5. Click OK.
  6. Right-click the original data source, and select Close.

Check the TDS File

To verify that the customizations in the TDS file have been applied to your connection, take these steps:

  1. Delete all log files in your logs directory (My Tableau Repository\Logs).
  2. Open Tableau Desktop and connect to any table in Vertica. Open an existing workbook or create a new one.
  3. Close Tableau.
  4. Open your Tableau log file.
  5. Search for the name of the customization you applied to your connection. For example, search for CAP_QUERY_SUBQUERY_QUERY_CONTEXT.

Troubleshooting Tools

This section includes the following information to help you in troubleshooting:

Using Tableau Performance Recorder

The Tableau Performance Recorder helps you identify what events are affecting query performance.

If the bottleneck is query execution (retrieving the data), follow the recommendation in Vertica Tuning Recommendations in this document.

If the bottleneck is in creating the visualizations and the dashboard layout, follow the recommendation in Tableau Best Practices in this document.

Use the Tableau Performance Recorder to view the performance of the queries that Tableau pushes to Vertica. With this information, you can identify queries that are good candidates to submit to Database Designer. When you run Database Designer, it creates projections that optimize the performance of those sample queries.

Run Performance Recorder first on your Tableau Desktop machine. If your workbook is slow in Tableau Desktop, it will be slow in Tableau Server. Before you run Performance Recorder on Tableau Server, restart Tableau Server to eliminate caching and inaccurate results. Then run Performance Recorder on Tableau Server.

Many factors can affect Tableau Server performance, including network latency. To identify if network latency is the problem:

  1. Install Tableau Desktop on the Tableau server machine.
  2. Run Performance Recorder using Tableau Desktop on the Tableau Server machine.
  3. Compare the performance to the Performance Recorder results on the Tableau Desktop machine.

For more information about Tableau Performance Recorder, see the Tableau documentation.

Using Session Labels

When troubleshooting, you can identify a connection by configuring a Tableau session label.

To configure a Tableau session label, customize the connection attribute odbc- connect-string-extras in the TDC or TDS file. Once you have configured a session label, all sessions that Tableau creates will have this label.

To configure a session label in a TDC file, add this text:

<customization name='odbc-connect-string-extras' value='Label=tableau_query_session'

To configure a session label in a TDS file, specify either of the following:

<connection class='vertica' dbname='VMart' 
  expected-driver-version='7.0'
  odbc-connect-string-extras='Label=tableau_query_session' 
  odbc-native-protocol='yes' one-time-sql='' port='5433'
  schema='online_sales' server='part10' server-oauth=''     
  username='dbadmin' workgroup-auth-mode='as-is'>

or

<customization name='odbc-connect-string-extras' value='Label=tableau_query_session'/>

Examining Log Files

To monitor the performance of SQL queries that Tableau sends to Vertica, review the following log files:

ODBC Log File

The ODBC Driver Manager has a trace facility that records the sequence of function calls that an ODBC application makes. You need to enable ODBC tracing and (optionally) specify the name of the log file in which Tableau records this information.

Follow these steps:

  1. Before you start ODBC tracing, close Tableau.
  2. Open the ODBC Data Source Administrator tool. Find the icon for Data

    Sources (ODBC) in the Control Panel under Administrative Tools.

  3. Click the Tracing tab.
  4. Click Start Tracing Now to enable tracing. The button then changes to Stop Tracing Now.

    If you want, specify the name and full path to the log file in the Log File Pathtext box, for example, C:\Odbclogs\myOdbclog.log.

  5. Click OK.
  6. Open Tableau to interact with your visualizations. The ODBC log starts collecting or capturing all calls between the Tableau workbook and the Driver Manager, and between the Driver Manager and the ODBC driver.
  7. Close Tableau.
  8. To stop tracing, in the ODBC Data Source Administrator tool, click Stop Tracing Now.
  9. Review the content of the generated ODBC log file.

Vertica Log File

Vertica records all database events in a log file. An example of the path to the Vertica log file is:

/home/dbadmin/vmart/v_vmart_node0001_catalog/vertica.log

For more information, see the Monitoring Log Files in the Vertica documentation.

Tableau Desktop Log File

If you have a slow workbook, the Tableau log file might help you identify the problem. To determine if a slow query is causing a slow workbook, test the query generated in Tableau directly in vsql.

An example of the path to the log file is:

C:\Users\ant\Documents\My Tableau Repository\Logs\log.txt

To find the problematic query in the log file:

  1. Close Tableau Desktop.
  2. Navigate to My Documents\My Tableau Repository\Logs and delete all the log files in that folder.
  3. Open Tableau Desktop.
  4. Perform the steps that cause the problematic query to execute.
  5. Close Tableau Desktop.
  6. Navigate to My Documents\My Tableau Repository\Logs and open the log.txt file.
  7. Search for the end-query tag and find the query, as in this example:
    {
      "ts": "2016-01-20T12:58:40.178", 
      "pid": 24676,
      "tid": "5688",				
      "sev": "info",
      "req": "-",	
      "sess": "-",
      "site": "-",
      "user": "-",
      "k": "end-query", 
      "v": 
        {
         "query": 
          "SELECT AVG("inventory_fact"."qty_in_stock") 
             AS "avg_qty_in_stock_ok" 
           FROM "public"."inventory_fact" "inventory_fact"  
    	  INNER JOIN "public"."date_dimension" "date_dimension" 
    	  ON ("inventory_fact"."date_key" = "date_dimension"."date_key")
    	WHERE (("date_dimension"."date" >= (DATE '2012-01-01')) 
    	   AND ("date_dimension"."date" <= (DATE '2016-12-31')))
    	   HAVING (COUNT(1) > 0)"
    	"cols": 1,
    	"protocol-id": 0,
    	"rows": 1,
    	"elapsed": 0.095,
    	"query-hash": 1493363951
       }
    }					
    
  8. Extract the query and run the query using vsql.
  9. If the performance in vsql is similar to the performance in Tableau, you may need to rewrite the query.

Tableau Server Log File

In Tableau Server, the log files related to connecting and querying Vertica are in the following folder:

C:\ProgramData\Tableau\Tableau Server\data\tabsvc\vizqlserver\Logs

The Tableau Server log files are in JSON format.

For details about Tableau Server log files, see Work with Log Files in the Tableau documentation.

Support for Vertica Data Types

Be aware of how Tableau handles Vertica data types.

  • Tableau does not support LONG VARBINARY data types. Tableau displays #Error when trying to display LONG VARBINARY values:

  • Tableau displays up to 65,000 characters for LONG VARCHAR values. When Tableau tries to display a very large text value, it displays only the first 65,000 characters of that string, truncating the remainder of the string.
  • For TIMETZ and TIMESTAMPTZ data types, Tableau 10.0 does not support milliseconds and time zone offsets.
  • Tableau supports the TIME data type but requires custom formatting: HH:mm:SS. Right-click the field and click Default properties > Date format.
  • The maximum number of digits for numeric values is 15. Tableau truncates very large numeric values to 15 digits as shown in these examples:

    • 79228162514264337593543950335

      displays as

      79228162514264300000000000000

    • 7922816251426433759354395.0335

      displays as

      7922816251426430000000000.00

  • Tableau rounds some decimal values. For example,

    –99999999999999.9999 rounds to –100000000000000.

  • Tableau supports the FLOAT data type, except for the following values:

    • NaN
    • +Infinity
    • -Infinity

Known Issues and Workarounds

The following topic describes a known issue when connecting from Tableau to the current version of Vertica. It includes a workaround:

Multiple Active Result Sets Per Session

Issue

The Vertica driver does not support multiple queries sharing the same connection. Vertica can only run one query at time within a connection.

Scenario

If Tableau pushes multiple queries down to the Vertica server at the same time using the same connection, you may see the following errors:

Workaround

To work around this restriction, in the TDC file, change the value of the ResultBufferSize connection parameter.

If you are using large result sets in streaming mode, increase the size of your Vertica buffer. Setting the ResultBufferSize to 0 is equivalent to setting the buffer to an unlimited size. The default buffer size is 131072 bytes (128 KB). To increase the size of the buffer, add the following entry to your .tdc file, adjusting the ResultBufferSize according to the needs of your environment.

<customization name='odbc-connect-string-extras' value=' ResultBufferSize=500000' />

For related information, see:

For More Information

 

Share this article: