Vertica Blog

Integrations

Using Java UDX in Vertica

Michael Flower authored this post. Vertica has a highly extensible UDx framework, which allows external user-defined functions, parsers, and data loaders to be installed onto the Vertica server. This means that a routine written in C++, R, Java, or Python can be run in-database as a Vertica SQL function. This blog is based on...

Introducing the VerticaPy Library for Jupyter Notebooks

One of the coolest things about working at Vertica is our amazing intern program, which often leads to full-time hires. Last year, the VerticaPy library, also known as vpython, was started as an internship project by Badr Ouali. A year later, he works for Vertica full time and has seen his project through into an...

Saving an Apache Spark DataFrame to a Vertica Table

Before you save an Apache Spark DataFrame to a Vertica table, make sure that you have the following set up: a Vertica cluster, a Spark cluster, and an HDFS cluster. The Vertica Spark connector uses HDFS as intermediate storage before it writes the DataFrame to Vertica. This checklist identifies potential problems you might encounter when using...

Why is Vertica not Ingesting Data From Kafka?

Prerequisite: Verify that Vertica is up and running. If you want to troubleshoot why Vertica is not ingesting data from Kafka, follow this checklist. Step 1: Check whether Kafka is up and running. a. Examine the server log files for broker errors. If there are errors, consult the Kafka documentation. b. Examine the...

Important: gcc and Ubuntu 16.04 Incompatibility

Vertica 9.0.1-9 and later 9.0 hotfixes now support Ubuntu 16.04. However, Ubuntu 16.04 ships with a compiler that is incompatible with the Vertica C++ SDK. To compile UDxs on this platform, you must install packages for gcc 4.8 compatibility: $ sudo apt-get install gcc-4.8 $ sudo apt-get install g++-4.8 $ cd /usr/bin $ sudo ln -s...

IMPORTANT: Vertica and Amazon Linux 2.0 gcc Compatibility Problems

This blog post was authored by Monica Cellio, with Serge Bonte. The Vertica binaries are compiled using the default version of g++ installed on the supported Linux platforms. Vertica requires a minimum of gcc version 4.8.4. The default version of g++ on Amazon Linux 2.0 is not compatible with gcc 4.8.4. To be able to...

How Cisco and Vertica empower high performance analytics for the most demanding workloads

This blog post was authored by Steve Sarsfield. Hadoop and HDFS are capable of storing massive volumes of data, but performing analytics on Hadoop can be challenging. Despite the apparent low cost of Hadoop, it is best suited for data lake and data science solutions, where the number of concurrent analytical users is low. In...

Introducing the Parallel Streaming Transformation Loader (PSTL) Solution

This blog post was authored by Soniya Shah. At Vertica, we understand how important it is that our customers can make decisions in near real time. Doing this requires not only the massively parallel processing that Vertica offers, but also the ability to transform and ingest your data into Vertica as quickly as...

The Kafka Streaming Load Scheduler

This blog post was authored by Tom Wall. Vertica’s streaming load scheduler provides high-performance streaming data load from Kafka into your Vertica database. Whether you already use Kafka or not, it is worth considering it as a solution to your data loading challenges. Kafka complements Vertica very nicely, and the scheduler removes many complexities of...

Integrating with Apache Spark

The Vertica Connector for Apache Spark is a fast parallel connector that allows you to use Apache Spark for pre-processing data. Apache Spark is an open-source, general-purpose cluster-computing framework. The Spark framework is based on Resilient Distributed Datasets (RDDs), which are logical collections of data partitioned across machines. For more information, see the Apache...