Using Looker to Leverage Vertica Capabilities: ORC Reader, Flex Tables, and Place

Posted December 8, 2015 by BD_Partner_Eng

Want some hints about how to use Looker to visualize data in your Vertica database? The Looker user community, known as Looker Discourse, has published three articles that explain how to use Looker with these Vertica features:

  • ORC reader
  • Flex tables
  • Geospatial functions and data

Looker is a BI tool that connects directly to Vertica and provides tools that allow you to explore, visualize, and gain insights about your Vertica data.

Vertica ORC Reader

The Looker and Vertica: ORC Reader article provides an example that explains how to use Looker to visualize Hive data stored in Vertica. This article uses data about retail orders to first show how to bring data stored in Hive into Vertica, and then how to use Looker to retrieve, transform, and visualize that data.

The example first uses Hive to bring the order data from a text file into an external table that’s stored in ORC format. Then, you load the data from Hive into a Vertica external table, which allows Looker to query the new table as if it was any table in Vertica.




The example shows how to create a Looker view file, with dimensions that represent the columns from the ORC file, transforming the column data if needed. Looker uses an explore (starting point for a query that references objects in the Vertica database) to join the order data with other related data. You create a look (a saved report) based on that explore to visualize these relationships.


Vertica Flex Tables

The Looker and Vertica: Flex Table Integration article describes how to bring JSON data into a Vertica flex table so that Looker can query the data and create visualizations. This example uses Twitter data stored in a Vertica flex table.
looker2You’ll need to start by creating a flex table in which to store the data. For Looker to query the table, you must first run the Vertica function COMPUTE_FLEXCTABLE_KEYS_AND_BUILD_VIEW to compute virtual columns (keys) from the VMap data of a flex table and construct a view.



The next step is to create a view file using the flec table and use that view file to create two looks:

  • The first look lists the tweets that contain the string “:)”
  • The second look displays the average, minimum, and maximum number of followers across all users who posted tweets that contain the string “:)”.

Currently, Looker does not support creating a view file from a view in Vertica. That’s why this example connects directly to the flex table and casts the column data types. If you want to create a view file directly from the flex table view, replace all references to the tweets_table flex table with tweets_table_view, and remove the casting from the LookML.

Vertica Place: Geospatial Functions and Data

Vertica Place provides functions that allow you to manipulate complex two-dimensional spatial objects. These functions follow the Open Geospatial Consortium (OGC) standards. Vertica Place also provides data types and SQL functions that allow you to specify and store spatial objects in a database according to OGC standards.

The Looker and Vertica: Geospatial Analytics article describes an example that uses sales data from locations (points) all over the United States. Looker retrieves the sales data from a Vertica database and plots those locations on a map of the United States. Once you’ve see the country-wide data, you can use Looker to drill down into geographic areas of the country.

In this example, you’ll learn how to calculate in which of four regions the points are located: Northeast, Northwest, Southeast, or Southwest. Using Vertica Place functions ST_GeomFromText, ST_Intersects, and STV_DWithin, Looker aggregates the sales data to identify which region has the most sales. You can use Looker to drill down further to see local data, for example, which locations are within 250 miles of Dallas, Texas.

For best results, if you have a large number of polygons and points, leverage the optimized spatial joins using STV_INTERSECT instead of ST_INTERSECTS. Spatial indexes and STV_INTERSECT are the best way to scale when you want to calculate the intersections of millions of polygons and points.

For More Information