Vertica Blog
Bryan Herger

Vertica Big Data Solution Architect at Micro Focus

My background is analytical chemistry and current interests and work are in finance, adtech, health information, and real-time streaming analytics.

Are Your Columns too Wide?

This tip expands on the earlier post on encoding and compression. When you have millions to billions of rows, data type becomes a bit more important: even an extra 10 bytes per row across a huge data set will impact storage or performance (or both!). When I created the big_fact_table, I included some VARCHAR...

Checking and Improving Column Compression and Encoding

When working with terabytes of data, storage and transfer become major time and cost sinks. Vertica can help minimize storage cost and transfer time with column compression and encoding. How can we identify Vertica tables that might benefit from compression? Information about column size and current compression is stored across the column_storage and projection_columns tables. The...

Parallel Processing Using Partitions With Vertica UDx

You can add functionality to Vertica using UDx, but what if you need to process more data than can be efficiently processed in a single thread or a single node? Vertica can divide data into partitions defined with the OVER() clause and distribute computing across nodes. This partition processing is "shared-nothing" similar to the Map...

Extending Vertica with Python functions: Adding NumPy FFT as a UDx

User-Defined Transform Function (UDTF) support for Python UDx was added back in Vertica 9.1, allowing you to add a much greater range of existing libraries and functions to Vertica. In this example, I'll add Fast Fourier Transform (FFT) from the NumPy package. FFT is a way to transform time-domain data into frequency-domain data. My test...

Finding the “K” in K-means Clustering With a UDx

You can apply k-means clustering to partition data points into k different groups. Along with the data, the number of clusters "k" is an input to the algorithm. Common examples like the Iris data set tell you upfront how many different groups exist, so you set k=3. What if you don't know how many clusters...

Diving into Disk Usage

Would you like to know how much disk space Vertica is using as it runs? This could be useful for capacity planning, monitoring trends, or debugging. Here are some ways to follow disk usage trends and also look at temporary events like Tuple Mover and Join Spills. Have fun!

Find and Fix Issues from Vertica Query Events

Vertica offers tools like the Workload Analyzer in Management Console (MC) to tune up a Vertica Cluster, but there's a simple way to find and fix issues that Vertica observes and records if you aren't using MC. The query_events table captures optimization issues and suggests fixes. Let's take a look at my demo cluster, checking...

Streaming Data in One Line!

Remember that game show, "Name That Tune", where contestants were challenged to name a tune in as few notes as possible? Today's tip converts that for Vertica Big Data, showing how we can ingest streaming data in just one line! A simple way to stream data is to write CSV rows to a network socket....

Watch those Delete Vectors!

Vertica is very good at ingesting data, compressing it, and querying at high speed. The trade-off here is that the data is stored in large block files called ROS containers. These containers can grow to large sizes, sometimes over 10 GB, and this makes it impractical to decompress and edit the files during updates and...

Looking into details of locking

(thanks  for query and advice) If you have a lot of concurrent queries, especially mixing DDL and DML, you might see lock contention. If you'd like to see how locks interact in your system, the following queries generate a temporary table with a Gantt chart to show an ordered list of locks over time: Query (be...

Analyzing JSON Already Loaded into Vertica Regular Tables

Vertica flex tables allow you to query data in JSON format. But what if you've imported JSON objects into VARCHAR already? Here's how you can extract the JSON into flex tables without exporting and importing. Let's create a sample table with a JSON column to convert: Now let's create a flex table: Flex tables use...