Separate Clusters

With separate clusters, a Vertica cluster and a Hadoop cluster share no nodes. You should use a high-bandwidth network connection between the two clusters.

The following figure illustrates the configuration for separate clusters::

Network

The network is a key performance component of any well-configured cluster. When Vertica stores data to HDFS it writes and reads data across the network.

The layout shown in the figure calls for two networks, and there are benefits to adding a third:

Hadoop Configuration Parameters

For best performance, set the following parameters with the specified minimum values:

Parameter Minimum Value
HDFS block size 512MB
Namenode Java Heap 1GB
Datanode Java Heap 2GB