Vertica QuickStart for Pentaho BI

To read this document in PDF form, click here.

The Vertica QuickStart for Pentaho BI is a sample application implemented as a set of Pentaho dashboards powered by Vertica Analytic Database. The dashboards present sample retail data for analysis. The QuickStart shows how retail companies could use Vertica and Pentaho to quickly explore, visualize, and gain insight into their data stored in Vertica.

You can download the Vertica QuickStart for Pentaho BI from the following location:

https://www.vertica.com/quickstart/vertica-quickstart-for-pentaho/

About the Vertica QuickStarts

The Vertica QuickStarts are free, sample applications created using front-end products from Vertica technology partners. For an overview, watch this short video:

The QuickStarts are posted for download on the Vertica QuickStart Examples page.

Note The Vertica QuickStarts are freely available for demonstration and educational purposes. They are not governed by any license or support agreements and are not suitable for deployment in production environments.

About Pentaho BI

Pentaho BI is a web-based tool for creating interactive dashboards that perform business analytics. For product details, see the Pentaho Business Analytics website.

Requirements

The Vertica QuickStart for Pentaho BI requires a Vertica database server with a standard installation of the VMart example database, a Vertica client with JDBC, Pentaho Server, and a web browser for running the Pentaho dashboards.

The QuickStart was created using Pentaho 5.4 and the Vertica Analytic Database.

Install the Software

To install the software that is required for running the QuickStart, follow these steps:

Install the Vertica Database Server

If you do not already have Vertica, you can download the Community Edition free of charge:

  1. Navigate to vertica.com.
  2. Log in or click Register Now to create an account
  3. On the Downloads menu, click Community Edition.
  4. Follow the on-screen instructions to download and install the Vertica Community Edition.

Install the VMart Example Database

The Vertica QuickStart for Pentaho BI assumes a default installation of the Vertica VMart example database.

To install VMart, follow the instructions in the Vertica documentation:

For details, see VMart Example Database Schema, Tables, and Scripts.

Install Pentaho Server

Pentaho Server is available for Windows, Mac OS, and Linux. See the Pentaho Components Reference for compatibility information.

To install Pentaho Server 5.4: 

  1. Navigate to the following URL:

       http://sourceforge.net/projects/pentaho/files/Business%20Intelligence%20Server/5.4/

  2. Click Download to download the version of Pentaho for your platform.
  3. Extract the contents of the download file. The files are extracted into the following directory:

    <Pentaho Home>\biserver-ce-5.4.0.1-130\biserver-ce\

To start Pentaho Server, execute the start_pentaho command. For example, on Windows:

C:\Pentaho\biserver-ce-5.4.0.1-130\biserver-ce\start-pentaho

To stop Pentaho Server, execute the stop_pentaho command. For example, on Windows:

C:\Pentaho\biserver-ce-5.4.0.1-130\biserver-ce\stop-pehtaho

Install the JDBC Client Driver

Before you can connect to Vertica using Pentaho you must download and install a Vertica client package. This package includes the Vertica JDBC driver that Pentaho uses to connect to Vertica.

To download and install the JDBC driver: 

  1. Go to http://www.vertica.com/resources/vertica-client-drivers/ for the latest client drivers.
  2. Download the Vertica client package for your platform.
  3. Place the jar file in the following directory:

    <Pentaho Home >\biserver-ce\tomcat\lib

    For example, on Windows:

    C:\Pentaho\biserver-ce\tomcat\lib

Note Vertica drivers are forward compatible, so you can connect to the Vertica server using previous versions of the client. For more information, see Client Driver and Server Version Compatibility in the Vertica documentation.

Connect Pentaho Server with Vertica

The following steps show the process of configuring a connection to Vertica using Pentaho Server on Windows.

  1. From the Pentaho Server File menu, select New, then Data Source.

    Pentaho_Select_Datasource.jpg

  2. Click New Data Source.

    Pentaho_Select_Datasource.jpg

  3. In the Database Connection dialog box:

    • For Database Type, select Vertica 5+.
    • For Access, select Native (JDBC).
    • Fill out the rest of the connection details.
    • Click Test to test the connection.
  4. When the connection succeeds, click OK.

Download the QuickStart Dashboards

  1. Navigate to vertica.com/quickstart
  2. Select Vertica QuickStart for OBIEE.
  3. Log in or create an account.
  4. Click Download.

Deploy the QuickStart Dashboards

To deploy the dashboards: 

  1. Create a folder for deploying the QuickStart dashboards.
  2. In Pentaho Server, select Browse Files from the File menu.
  3. Navigate to the deployment folder that you created.
  4. Under Folder Actions, click Upload.
  5. Navigate to the folder that contains zip, the QuickStart zip file that you downloaded from the Big Data Marketplace
  6. Select the zip file.

The following screenshot shows the Upload action with a deployment folder called VMART.

deploy1.png

About the QuickStart Dashboards

The QuickStart dashboards present sample business and operational data that a large retail chain might track over time. The chain operates brick-and-mortar stores and an online marketplace. It sells a wide variety of products that it purchases from different vendors.

Note The data in your dashboards will not match the data in the screen shots in this document. This is because the VMart data generator generates data randomly.

Executive Dashboard

The Executive Dashboard presents a high-level view of the business data that is shown in greater detail in the other dashboards. The dashboard also allows you to take a closer look at the data for a specific state, in this case Texas. The dashboard shown here is displaying revenue from sales to resellers (companies) in 2005 and 2006. You can see at a glance that:

  • Store sales were more volatile in 2005 than in 2006.
  • The lowest revenue month for store sales was June 2005.
  • There was a sharp drop in online sales in January and February of 2006.

dsh_executive_sales.jpg

By changing the Sale/Return filter from Sale to Return, you can see the the negative revenue resulting from products returned during the same time period.

dsh_executive_returns.jpg

Online Sales Overview

The Online Sales Overview presents an overview of the online business. In this instance, the dashboard is displaying quarterly revenue from sales to resellers in the SouthWest region in 2005 and 2006.

The filters on the bottom of the page allow you to examine the data in detail. For example, by viewing product categories, you can see that food was by far the greatest source of revenue from online sales during this time period.

dsh_onlinesales_top.jpg

Store Sales Overview

The Store Sales Overview presents an overview of the traditional business conducted in the brick-and-mortar stores owned by this retail chain. In this instance, the dashboard is displaying quarterly revenue from store sales to resellers in 2005 and 2006.

The filters on the bottom of the page allow you to examine the data in detail For example, by selecting Has Membership Card for Customer Attribute, you can see that a membership card had no effect on sales to resellers.

dsh_storesales.jpg

Customer Dashboard

This Customer Dashboard presents information about customers, both individuals and resellers. In this instance, the dashboard is showing the individual customers in the SouthWest region who returned items they purchased either online or in stores in 2005 and 2006.

For additional insight into your customer base, you can view customer characteristics such as age, gender, and income by selecting a different Customer Attribute.

dsh_customer.jpg

Call Center Dashboard

The Call Center Dashboard presents an overview of the performance of the chain’s sales personnel, both in stores and in online call centers. In this instance, the dashboard is displaying data for 2005 and 2006. It’s clear that the performance of sales personnel in stores was far more uniform than the performance of call center personnel during this time period.

dsh_callclenter.jpg

Product Dashboard

The Product Dashboard presents an overview of the products sold in stores and online. In this instance, the dashboard is displaying revenue from products sold in the Southwest region in 2005 and 2006. dsh_product.jpg

Vendor Performance

The Vendor Performance Dashboard presents an overview of the performance of the vendors used by this retail chain. In this instance, the dashboard is displaying Average days to deliver during 2005 and 2006 for the vendor Market Wholesale. It’s clear that this vendor’s deliveries to California were seriously delayed during this time period.

dsh_vendor.jpg 

Inventory Overview

The Inventory Dashboard presents an overview of the inventory held by this retail chain. In this instance, the dashboard is showing the inventory broken down by warehouse and product category for 2005 and 2006.

dsh_inventory.jpg

Find More Information