Vertica Blog


Viewing Parquet Export Events More Easily

The EXPORT TO PARQUET command exports a table, columns from a table, or query results to files in the Parquet format. When you run EXPORT TO PARQUET, information about the files created during the export is stored in the Vertica log. It's no fun combing through a Vertica log looking for those particular records. Good...

Parallel Processing Using Partitions With Vertica UDx

You can add functionality to Vertica using UDx, but what if you need to process more data than can be efficiently processed in a single thread or on a single node? Vertica can divide data into partitions defined with the OVER() clause and distribute computing across nodes. This partition processing is "shared-nothing", similar to the Map...

Announcing Vertica Version 9.3 – Ride the Winds of Change

The winds of change have been blowing strong. This week, Vertica version 9.3 made its big splash. Vertica has a tendency to pack a lot of features even into minor releases, and this is not a minor release. Rather than dive deep, I’m going to skim the surface of the ocean of new features. Eon...

Extending Vertica with Python functions: Adding NumPy FFT as a UDx

User-Defined Transform Function (UDTF) support for Python UDx was added back in Vertica 9.1, allowing you to add a much greater range of existing libraries and functions to Vertica. In this example, I'll add Fast Fourier Transform (FFT) from the NumPy package. FFT is a way to transform time-domain data into frequency-domain data. My test...

Finding the “K” in K-means Clustering With a UDx

You can apply k-means clustering to partition data points into k different groups. Along with the data, the number of clusters "k" is an input to the algorithm. Common examples like the Iris data set tell you upfront how many different groups exist, so you set k=3. What if you don't know how many clusters...

Find the Number of Days Passed and Remaining in the Relative Year

Jim Knicely authored this post. Although there aren’t any specific functions that will return the number of days that have passed and that are remaining in a given year, you can combine a few of Vertica’s built-in date functions to find these numbers. You can encapsulate the date logic above into several user-defined functions that...

Using Java UDX in Vertica

Michael Flower authored this post. Vertica has a highly extensible UDx framework, which allows external user-defined functions, parsers, and data loaders to be installed onto the Vertica server. This means that a routine written in C++, R, Java, or Python can be run in-database as a Vertica SQL function. This blog is based on...

Introducing the Developing Vertica UDxs in Java Tutorial Series

What is a Java UDx and why would you want to develop one? Check out our new tutorial series in the wiki to find out!