Vertica Blog

ETL

Fast Data Loading with Vertica

Curtis Bennett authored this post. Vertica is well known for its blinding query performance at big data scale, but it can also insert data at very high rates. It can even load data non-stop while being queried, enabling real-time analysis of data. Basic Loading Methods: There are many ways of loading data...

What are your Data Loading Preferences?

We’re back with our newest product management survey for this summer! This time we’re asking about how you load your data – everything from the ETL tools you use to how you manage your information. The answers you provide will help us fit Vertica into your infrastructure in a way that is ideal for your...

Rebalance Taking a Long Time

After you add a node to or remove a node from your Vertica cluster, Vertica rebalances the data across all the nodes. If rebalancing is taking a long time, review these steps to find the probable cause. Prerequisites: To ensure a successful rebalance of your cluster, before you start the rebalance, take...

What Should I do if my Node Recovery is Slow?

If you are running Vertica 7.2.x or later, perform recovery by table. For details, see the Vertica documentation. If you are running a Vertica version prior to 7.1.x, stop the ETL jobs and restart node recovery. Step 1. Task: Monitor progress of recovery. Results: If is_running = f, recovery completed. Repeat this statement to...

Sizing Your Vertica Cluster for an Eon Mode Database

This blog post was authored by Shrirang Kamat. Vertica in Eon Mode is a new architecture that separates compute and storage, allowing users to take advantage of cloud economics that enable rapid scaling and shrinking of clusters in response to a variable workload. Eon Mode decouples the cluster size from the data volume and lets...

Handling Duplicate Records in Input Data Streams

This blog post was authored by Ravi Gupta. We have often found that sources or operational systems that provide data for further analysis have duplicate records and these are sent to a downstream application or EDW for processing. This post shows a few scenarios of how to handle these duplicate records using various SQL options,...

Blog Post Series: Using Vertica to Track Commercial Aircraft in near Real-Time – Part 6

Part 6: Extract, Transform and Load ADS-B messages into Kafka. In previous blog posts, I discussed the continuous stream of messages from aircraft transponders, captured and decoded using the DUMP1090 application, which we plan to feed into a series of Kafka topics before loading them into their corresponding tables in a Vertica database....

Introducing the Parallel Streaming Transformation Loader (PSTL) Solution

This blog post was authored by Soniya Shah. At Vertica, we understand how important it is that our customers can make decisions in near real time. Being able to do this requires not only the massively parallel processing that Vertica offers, but also the ability to transform and ingest your data into Vertica as quickly as...

The Kafka Streaming Load Scheduler

This blog post was authored by Tom Wall. Vertica’s streaming load scheduler provides high-performance streaming data load from Kafka into your Vertica database. Whether you already use Kafka or not, it is worth considering it as a solution to your data loading challenges. Kafka complements Vertica very nicely, and the scheduler removes many complexities of...

Jump Start your ETL Application Development with Vertica

Interested in exploring the Vertica Analytic Database in the context of data movement and transformation? To get a feel for it, try our new ETL QuickStart sample apps. You'll find them on the . Our Partner Engineering team develops QuickStart apps using tools from our technology partners. Currently we have ETL QuickStarts for the following...

Workload Management Metrics – A Golden Triangle

Modern databases are often required to process many different kinds of workloads, ranging from short/tactical queries, to medium-complexity ad-hoc queries, to long-running batch ETL jobs, to extremely complex data mining jobs (see my previous blog on workload classification for more information). DBAs must ensure that all concurrent workloads, along with their respective Service Level...

A Method for Vertica Workload Classification

Modern analytic databases such as Vertica often need to process a myriad of workloads ranging from the simplest primary-key lookup to complex analytical queries that include dozens of large tables and joins between them. Different types of load jobs (such as batch type ETL jobs and near real-time trickle loads) keep the data up-to-date in...