End-to-End Machine Learning Solution with OpenText Vertica and Qwak Using VerticaPy

This video covers the integration of Vertica and Qwak using VerticaPy. Leverage Vertica’s in-database Machine Learning functionality and Qwak’s MLOps capability to optimize your entire machine learning lifecycle.

Qwak is an end-to-end machine learning production platform that reduces friction between the ML research and production phases. You use Qwak to build, deploy, and monitor models in production with less engineering effort. Qwak features include CI/CD, version management, model analytics, and a feature store.

This document provides an end-to-end solution that covers loading your data into Vertica, connecting Vertica and Qwak using VerticaPy, performing data science operations, and deploying an ML model on the Qwak platform.

VerticaPy is a Python library with scikit-learn-like functionality for machine learning and advanced analytics in Vertica.

Vertica and Qwak High-Level Design

The following is a high-level design of how Qwak connects to Vertica using VerticaPy to build, deploy, and monitor your machine learning models. Qwak lets you customize your model structure, and from within that structure you can connect to Vertica using VerticaPy, as shown in the design diagram.

You can then explore and prepare the data, build and train your model, and make predictions using the model class. The sections that follow provide step-by-step instructions for:

  • Using the Sample Dataset

  • Loading Data into Vertica

  • Installing and Setting Up Qwak SDK

  • Creating a Project

  • Creating a Model

  • Building the Model

  • Deploying the Model

  • Automating Model Build and Deploying

  • Predicting the Model

  • Querying Model Predictions

Environment

To begin, you'll need to set up the following environment:

  • Qwak Cloud Instance

  • Vertica Analytical Database 12.0.4

  • VerticaPy library installed on the Qwak platform

  • Jupyter Notebook or any other ETL tool to load data into Vertica

Assumptions and Prerequisites

• Qwak is already set up, either in the cloud or on-premises, and the instance is up and running.

• Git and Python (version 3.7 through 3.9) are installed.

• The Qwak SDK is installed.

• No firewall or connection issues exist between the Qwak instance and the Vertica instance.
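
One way to check the last prerequisite before going further is a plain TCP reachability test, run from the machine that needs to reach Vertica. The following is a minimal sketch using only the Python standard library. It assumes the Vertica instance listens on port 5433, which is the Vertica default client port; your instance may use a different port. The function name can_reach and the host name in the usage comment are hypothetical.

```python
import socket


def can_reach(host, port=5433, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    The default port 5433 is an assumption (the Vertica default client port);
    pass the port your instance actually uses.
    """
    try:
        # create_connection resolves the host and opens a TCP connection;
        # the 'with' block closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (hypothetical host name):
#   can_reach("vertica.example.com")
```

A result of True only shows that the port is open from where the check runs. It does not validate credentials or the database itself.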

Step-by-Step Machine Learning Solution to Predict Customer Churn

The goal of this solution is to build and train a model that predicts which Telco customers are likely to churn, that is, which customers are likely to stop using Telco.

The example explains how to get started with a dataset in Vertica and how to use VerticaPy functionality in Qwak for data exploration, data preparation, and data modeling to identify trends in the dataset. The example also describes converting Boolean values to numeric values and creating dummy variables to help the model understand the categorical variables. Finally, you create a RandomForestClassifier model, train it with the dataset, and deploy it to predict customer churn.
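
To make the two preparation steps concrete, the following is a minimal plain-Python sketch of what converting Boolean values to numeric values and creating dummy variables does to the data. It is not the VerticaPy code used in this solution, where these transformations run in-database. The helper names, the sample rows, and the column names are all hypothetical and chosen only for illustration.

```python
def booleans_to_numeric(rows, columns):
    """Return copies of rows with the given Boolean columns mapped to 0/1."""
    return [
        {key: int(value) if key in columns else value for key, value in row.items()}
        for row in rows
    ]


def one_hot(rows, column, drop_first=True):
    """Replace a categorical column with 0/1 dummy columns.

    With drop_first=True the first category (in sorted order) is dropped,
    because it is implied when all remaining dummies are 0.
    """
    categories = sorted({row[column] for row in rows})
    kept = categories[1:] if drop_first else categories
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in kept:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


# Hypothetical sample rows; the column names are illustrative only.
customers = [
    {"tenure": 1, "partner": False, "contract": "monthly", "churn": True},
    {"tenure": 34, "partner": True, "contract": "yearly", "churn": False},
    {"tenure": 2, "partner": False, "contract": "monthly", "churn": True},
]

prepared = one_hot(booleans_to_numeric(customers, {"partner", "churn"}), "contract")
for row in prepared:
    print(row)
```

After these steps every column is numeric, which is the form a classifier such as a random forest can train on directly.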

For More Information