Vertica Blog

Vertica Blog

SQL

Unleash the Power of Vertica and Apache Spark Using the Upgraded Spark Connector

This post is authored by Alex Le What is Apache Spark? Apache Spark is a distributed compute engine that provides a robust API for data science, machine learning, or to work with big data. It is fast, scalable, simple, and supports multiple languages, including Python, SQL, Scala, Java, and R. Backed by the Apache 2.0...
data lakehouse example

Going Beyond a Data Lakehouse

The way things get named in this industry can be pretty odd.  A lot of people have been hearing noise about how cool a “Data Lakehouse” is since it combines the power of a data lake with the power of a data warehouse. When I think of a lakehouse, I think of a great place...

Quick Tip: Replacing non-UTF-8 Characters

Vertica database servers expect to receive all data in UTF-8 and Vertica outputs all data in UTF-8. However, you can load or insert data into Vertica that is non UTF-8, but you'll want to clean it up. I used to recommend the REGEXP_REPLACE function for that task, but now there is a better way! Vertica...
Helpful Tips in blue text with magnifying glass

Quick Tip: Determining Database “Cell” Count

The Vertica system tables LICENSE_AUDITS and USER_AUDITS include a mysterious column called CELL_COUNT. What exactly is the Cell Count? It's simply a calculation that looks like this: DATABASE CELL_COUNT = (Table_1 Column Count * Table_1 Row Count) + (Table_2 Column Count * Table_2 Row Count) + … + (Table_n Column Count * Table_n Row Count)...
SQL

Quick Tip: WITH Clause Recursion

Starting with version 10.1.x, Vertica supports recursive queries. These are incredibly useful in modeling and working with self-referential hierarchical data. Let's take a look at a natural instance of this form of data: the manager-subordinate relationship. The following table contains employees (EMP_NAME), their IDs (EMP_ID), and the IDs of their managers (EMP_MGR_ID). For example, the...
Tips and Tricks Orange Button

Quick Tip: Create “Super Wide” Tables

Starting in Vertica 10.1.0, you can go super wide with up to 9,800 column definitions in a table! Prior to Vertica 10.1.x, tables were limited to 1,600 columns. Example Creating a table with 9,800 columns: Hint: Try to avoid running a SELECT * FROM … a table with 9,800 columns and a trillion+ rows 😊...

List the Missing Dates in a Date Series

The TIMESERIES clause in Vertica makes it easy to list all of the gap days (that is, those that are missing) from a series of dates. Say I have this simple date series: As you can see, there are several dates missing from the series (2020-10-30, 2020-11-02, 2020-11-03 and 2020-11-04). I can list these absent...
Compass rose with true north pointing to "Helpful Tips" text

Getting Every Nth Row from a Table (Enhanced)

In a previous Vertica Quick Tip we learned how to query every Nth row from a table when we had a unique ID column having sequential values (that is, from a sequence). See, https://www.vertica.com/blog/vertica-quick-tip-getting-every-n-th-row-table/ What if we don’t have a key like that in our table? No problem thanks to the ROW_NUMBER() Analytic Function! Say...
Electronic medical record on digital smart device

Vertica in Good Health: Ingesting, Securing, and Analyzing Healthcare Data

Electronic Medical Records and new personal healthcare technology, particularly medical devices, generate a huge amount of data. Organizations including Cerner and Philips Healthcare rely on Vertica for data analysis. How can Vertica help you analyze this data? Vertica excels at data ingest and provides hundreds of built-in SQL and analytic functions with broad integration with...

Constellation Research Report Finds Vertica 10 a Leading Choice

A few pages into a new, detailed overview of Vertica 10, produced by Constellation Research, principle analyst Doug Henschen summarizes the top new capabilities of Vertica 10 like this: In short, the Vertica team has steadily evolved the product to keep pace with the demands of organizations that seek to do the following: Unify diverse...
Quick Tip - blue button

Copy a Table with an Identity to a Table with an Identity on the Same Database Keeping the Identities in Sync

An IDENTITY associates a table column with a sequence. This sequence automatically increments the column value as new rows are added. There are some noteable restrictions on IDENTITY columns: You cannot INSERT values manually into an IDENTITY column You cannot UPDATE the values of an IDENTITY column once they has been populated via the sequence...
Visual Storytelling

VSQL, Visual Storytelling through Query Language

This tip expands on my earlier post: “Pure Vertica SQL Graph Flavors” Visual story telling has not just the ability to capture attention, it make things simple to understand. Consider the following to boost your query result set readability. What is the visual correlation between CPU utilization and Canary Queries elapsed time?      ...