Michael Flower authored this post.
Introduction
Vertica has a highly extensible UDx framework, which allows external user-defined functions, parsers and data loaders to be installed onto the Vertica server. This means that a routine written in C++, R, Java or Python can be run in-database as a Vertica SQL function.
This blog is based on the existing documentation and examples, which are provided from the Vertica documentation and onGitHub.
• The Java UDX features are documented in the “Extending Vertica” Guide –
https://my.vertica.com/docs/latest/HTML/index.htm#Authoring/ExtendingVertica/ExtendingVertica.htm
• Examples are available at
https://github.com/vertica/UDx-Examples
• More information about those examples is available also in the Extending Vertica Guide –
https://my.vertica.com/docs/latest/HTML/index.htm#Authoring/ExtendingVertica/UDx/UDxExampleCodeGitHub.htm
The purpose of this document is to tie together these available sources.
The document was tested against the latest available version of Vertica (version v9.1.0-1 at the time of writing), running on CentOS 7.4.1708.
Step 1. Download and Install Vertica
If you are unfamiliar with Vertica, or you don’t have the physical hardware on which to run Vertica, test it for free by downloading Community Edition onto a prebuilt Virtual Machine –
https://my.vertica.com/download/vertica/community-edition/

The page contains 2 relevant sections:
–
Download – download the virtual machine of your choice. This guide was written using the “VMWare VMX file format” download.
–
Resources – Follow the instructions contained in the resource links to complete the installation process.
This guide was developed using the Vertica Community Edition 9.1.0 which is preinstalled on the Virtual Machine (vertica_community_edition-9.1.0-0_vmx.zip). In order to use the VM, you will also need to have a suitable VM player (eg
https://www.vmware.com/go/downloadplayer )
Step 2. Start the Virtual Machine
The CE VM contains a preinstalled version of Vertica. It already has a running database, called “Vmart”. If you are prompted for a password, it is
password
Start the Virtual Machine.
You will have 2 available logins:
• root – this user will be used for system administration. Apart from step 3, this account will not be used.
• dbadmin – This user is responsible for the Vertica administration. It is the main user account we will use herein.
Step 3. Install the Java Development Kit
The JDK needs to be installed on the Vertica server machine in order to compile the provided Java functions.
Login as the root user and install the Java Development Kit.
On Centos / Redhat, run
yum install java-devel
Step 4. Download the Java UDX Examples from GitHub
Download the UDX zip from GitHub-
https://github.com/vertica/UDx-Examples

Login as user
dbadmin
Transfer the downloaded files onto your Vertica server.
Copy the zipfile “UDx-Examples-master.zip” to the Vertica server machine.
Open a Terminal command prompt and unzip the file.
unzip UDx-Examples-master.zip
snip

Go to the Java folder
cd /home/dbadmin/UDx-Examples-master/Java-and-C++
Step 5. Link Vertica to the Java executable area
Login as user
dbadmin
Define the Vertica JavaBinaryForUDx config parameter
eg Find where is the java executable
whereis java

Make a note of the result. In this case it is /usr/bin/java
Run the Vertica SQL command to change the setting
vsql -w password -c "ALTER DATABASE VMart SET JavaBinaryForUDx = '/usr/bin/java';"
Step 6. Edit the makefile to link to the Java Executable area
Login as user
dbadmin
Edit the makefile and change the setting for the JAVA_HOME variable.
eg
whereis java

Make a note of the result. In this case it is “/usr/bin/java”. The root directory will be “/usr”
vi makefile
Change the value of JAVA_HOME:
replace
JAVA_HOME ?= $(SOURCE)/../third-party/jdk/jdk1.6.0_45
with
JAVA_HOME ?= /usr
Save the file and exit the file editor.
Step 7. Compile the Java code samples and create associated Vertica User Defined Functions
Login as user
dbadmin
In order to compile the Java functions, run the “make” command with the relevant parameter.
make JavaFunctions
snip
Step 8. Test the Java User Defined Functions
Login as user
dbadmin
Run the Java UDX functions by executing the provided script. The script will:
• Create the necessary tables.
• Populate the tables with sample data.
• Run the example functions.
• Perform necessary housekeeping.
vsql –w password -f JavaFunctions.sql
snip
Step 9. Test the Java User Defined Parser and Filter Functions
Java functions can be embedded into Vertica’s SQL COPY command, which means that user defined transformations can be included during the data load process.
To run the provided examples, login as user
dbadmin
Run the provided script
vsql –w password -f JavaUDLFunctions.sql

Be patient. One of the steps generates a 10m row data set which takes most of the time. The next step loads the dataset into Vertica, which takes only a little time!