Vertica and AutoQL by Chata: Technical Exploration
About this Document
The goal of this exploration is to understand the AutoQL technology and how it integrates with Vertica. This exploration describes the deployment and implementation process that takes place before a user can access Vertica from AutoQL. After the implementation process, we describe how to use the solution to ask questions and get insights from your data in Vertica.
AutoQL Overview
AutoQL by Chata is an innovative solution for self-service data analytics. AutoQL is a platform that enables users to access data in a conversational and intuitive manner. Users can ask their questions in their own words and instantly get the answers they need for decision making.
AutoQL securely connects to Vertica and uses Vertica’s JDBC driver to execute the queries directly in your database. The Vertica JDBC driver is pre-configured and bundled with AutoQL and managed by Chata.
Test Environment
Partner Product: AutoQL Integrator Portal
Note Versions of the AutoQL portal as well as the language models are managed and tracked internally by Chata’s team. Changes in the platform are made available and deployed dynamically to the customers environment when needed.
AutoQL URL: URL of your AutoQL deployment. AutoQL SaaS URL: https://autoql.chata.io/
Vertica Server version: Vertica Analytic Database v11.1.1-0
Vertica Server Platform: Red Hat Enterprise Linux 7.3
Vertica Client: Vertica JDBC 11.0.2. The Vertica JDBC driver is bundled and pre-configured by Chata.
AutoQL Deployment Options
Customers can choose one of the two options to deploy AutoQL Engine:
- AutoQL can be deployed in the customer’s own Kubernetes environment in the cloud or on-premises. In this deployment, query processing and customer’s data stays inside the customer’s network. For instructions on how to deploy AutoQL Engine, see section Deploying the AutoQL Integrator Portal in this document.
- AutoQL can be used as a hosted solution running in Google Cloud Platform (GCP) or Microsoft Azure and managed by Chata.
Note In this document, we describe the steps to deploy AutoQL in a customers environment for either a deployment in the cloud or on-premises. See Deploying the AutoQL Integrator Portal later in this document.
Before you Begin
Before accessing and exploring data in Vertica from AutoQL, it is necessary to prepare a language model specific to the customers database and deploy AutoQL Engine on the customers environment:
- AutoQL Language Model Implementation.
- Deploying the AutoQL Integrator Portal.
AutoQL Language Model
An AutoQL language model consists of a few Deep Learning Machine Learning models that translate natural language questions into SQL. The model can cover multiple topics/subjects in the database. For example, a model based on the VMart sample database, would cover online sales, inventory, promotions, and other interrelated topics/subjects.
AutoQL Language Model Implementation
The implementation of an AutoQL language model for a particular customer is performed by Chata’s team on Chata’s cloud environment. This implementation is necessary before using AutoQL with Vertica. The implementation of an AutoQL Language Model consists of the following steps:
- Model creation: Chata’s team securely access the customer’s database and uses the database structure to train the model. A few options are available for customers to share their database structure with Chata. Chata can connect to a staging database (recommended) or access a copy or a backup of the database. Once a secure access has been established, Chata uses a proprietary system that leverages the database structure to build the model.
- Model testing and validation: Once the model has been trained and is ready, the next step is to test and validate it to ensure a high-quality model is delivered to the customer. This validation is performed by Chata’s team on Chata’s cloud environment.
- Model publishing: Once the model has been validated and works as expected, then the model is deployed to the customers Kubernetes environment. This process is performed by Chata integration team.
Note A model can be trained on the customer’s specific terminology, so the interpretation of natural language questions is more precise.
Deploying the AutoQL Integrator Portal
The deployment of the AutoQL Engine is performed by the customer's cloud administrator or DevOps team. The AutoQL containers are deployed into a Kubernetes cluster running in the cloud or on-premises. The following are prerequisites and steps to deploy AutoQL Engine:
Prerequisites
- Ensure that you install Kubernetes and it is running.
- Create a Kubernetes cluster to install the AutoQL engine.
For Kubernetes information, refer to the Kubernetes documentation.
Downloading and Installing AutoQL
To download and install AutoQL integrator portal follow these steps:
- Download the Deployment Script from the Client Portal. The Client Portal is an interface available in the Chata’s cloud environment.
- Execute the Deployment Script via cloud SDK.
Note The deployment script pulls images from Chata’s container registry and instantiates services in the customers Kubernetes cluster. Access to the container registry is managed by IAM and Service Account.
Interacting with AutoQL Engine
Chata provides different options for developers and end-users to interact with the AutoQL engine.
Developers
Chata provides an API to interact with AutoQL. Developers can use the API to access the engine, embed the AutoQL functionality and visualizations into applications, and implement or extend functionality.
For API documentation see:
Main Page: https://chata.readme.io/
Quick Start: https://chata.readme.io/docs/autoql-quick-start
End-users
There are two out-of-the-box interfaces for business users to interact with AutoQL:
- AutoQL TeamsApp: TeamsApp is an application to access AutoQL from Microsoft Teams environment. TeamsApp can be added to a channel which enables database querying via the channel. A video about TeamsApp can be found here, Chata TeamsApp
- AutoQL WebApp: WebApp is a preconfigured user portal that allows users to create dashboards and execute queries against the projects in their Enterprise. The WebApp can be white-labelled, so users have a familiar environment to work in.
Connecting to Vertica from AutoQL
Once an AutoQL Language Model has been generated for your database and you have access to the AutoQL Integrator Portal, the next step is to create a new AutoQL project to connect to your Vertica database:
- Navigate to the AutoQL Integrator Portal on your browser. For example: https://<My_AutoQL_portal>/
- Enter your email and password and click LOG IN.
- On the left-hand side menu, click Projects.
- In the Sandbox or Production environment, click Add new Project icon.
Note You can use the Sandbox environment as a staging area to test the language models against your database. Then you can create the same project in the Production environment to give access to end users.
- In the Add New Project dialog window, in Project Information, enter a name for your project in the Project Company name field.
- Click Next.
- In Connect Database, enter your Vertica database connection information:
- Database Name: Vertica database name.
- Database URL: Vertica server IP address along with the port number in the following format: <ip_address>:<port>
- Type: Select Vertica from the drop-down list.
- Database options: Any JDBC properties you want to customize, for example setting a session label. See Vertica’s JDBC Connection Properties for more information.
- Username: Your database username.
- Password: Your database user password.
- Click Test Connection to test the connection to Vertica.
- Click Next.
- In Connect Model, select the language model you want to connect to from the list. This is the language model that Chata’s team prepared and published for your database. See section AutoQL Language Model Implementation in this document for details.
- Click Next.
- In Review, review the new project information and click Submit.
This will initiate a download of database specific models and instantiate model specific services in Kubernetes.
Creating the project can take several minutes. Once it is created it will appear in the list of projects. When the status of the project appears as Active you can begin querying your database.
Note All communication between the customer’s environment and Chata’s environment happens over a secure connection. The connection uses a service account and IAM that is provided in the download file from the Client Portal.
Using AutoQL Data Messenger
After you have created a project that connects to your Vertica database using an AutoQL language model, you can begin using AutoQL Data Messenger to ask questions about your data in Vertica.
Note During a user query, a call to the API is issued with a JWT token. The JWT token routes calls to the correct project.
To use AutoQL Data Messenger follow these steps:
- On the top right hand-side of the screen, click Demo Sandbox. The demo testing area appears.
- Select your project from the Select Project drop-down list.
- On the right-hand side of the screen, click the Open Data Messenger icon. The Data Messenger popup window displays.
- In the Data Messenger interface, type in the question you want to ask about your data. Data Messenger displays the answer as a chart.
Example 1: In this example we asked What is the average online sales by customer state year over year:
Example 2: In this example we asked What is the average online sales by transaction type in 2005:
Note You can customize the results to display in tabular format or other types of charts.
Using AutoQL Dashboards
To create a new dashboard in AutoQL follow this steps:
- In the home page, click Demo Sandbox at the top right hand-side of the screen. The demo testing area appears.
- Select your project from the Select Project drop-down list.
- Click Create a New Dashboard. Enter a name for your dashboard and click Save.
- Click Add a New Tile to get started.
- In the text boxes, type in the following information:
- Query: Type in a question about your data in your own words.
- Title (Optional): A description you want to display in the visualization.
- Click the play button next to the text boxes to submit the query and see the results. You can customize the visualizations just as you do in Data Messenger.
This is an example of a dashboard with two visualizations about Online Sales data:
AutoQL Generated Queries
AutoQL language models translate natural language questions into SQL, that then is sent to Vertica via the JDBC driver. For technical users, there are several ways to capture and examine the SQL statements generated by AutoQL:
- Using the Vertica log or Vertica system tables.
In a command line terminal in your Vertica server, navigate to the location of the Vertica log. Execute
tail -f vertica.log
to examine the queries that are issued by AutoQL and executed in your database.To query the system tables for the queries generated by AutoQL you can run a query like the following:
SELECT * FROM v_monitor.query_requests WHERE user_name = '<my_db_user>' and request_type = 'QUERY' order by start_timestamp desc
- Using the View generated SQL option in the Data Messenger interface.
In the visualization generated by Data Messenger, click the More options icon and select View generated SQL. Copy the query and examine it in a SQL tool or directly in VSQL.
Note The View generated SQL option is disabled by default. The AutoQL administrator can grant access to users if needed.
Recommendations
The following recommendations can be followed to enhance your experience when using AutoQL with Vertica.
Use the Report a Problem functionality
AutoQL collects feedback from the user to improve the language model. Click the Report a Problem icon on the right side in your visualization. If you ask a question and the data returned is incorrect or incomplete, use this feature to notify about the inconsistency.