Integrating with Apache Hadoop

Apache™ Hadoop™, like Vertica, uses a cluster of nodes for distributed processing. The primary component of interest is HDFS, the Hadoop Distributed File System.

You can use Vertica with HDFS in several ways. See Hadoop Interfaces for more information about these options.

Hadoop file paths are expressed as URLs in the hdfs or webhdfs URL scheme. For more about the hdfs scheme, see Using HDFS URLs.
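As a rough illustration of the URL form described above, the following Python sketch splits a Hadoop file URL into its parts using only the standard library. The function name describe_hadoop_url, the host name, the port numbers, and the file path are all hypothetical values chosen for the example; they are not taken from Vertica or from any particular cluster.

```python
from urllib.parse import urlsplit

# The two URL schemes named in the text above.
SUPPORTED_SCHEMES = {"hdfs", "webhdfs"}


def describe_hadoop_url(url):
    """Split a Hadoop file URL into scheme, host, port, and path.

    Illustrative helper only; it is not part of Vertica or Hadoop.
    """
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }


# Hypothetical host, ports, and path; substitute the values for your cluster.
print(describe_hadoop_url("hdfs://namenode.example.com:8020/data/sales.parquet"))
print(describe_hadoop_url("webhdfs://namenode.example.com:50070/data/sales.parquet"))
```

Both URLs name the same file path. Only the scheme and the port differ, which is why the sketch treats the two schemes uniformly.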

Hadoop Distributions

Vertica can be used with Hadoop distributions from Hortonworks, Cloudera, and MapR. See Vertica Integrations for Hadoop for the specific versions that are supported.

If you are using Cloudera, you can manage your Vertica cluster using Cloudera Manager. See Integrating With Cloudera Manager.

If you are using MapR, see Integrating Vertica with the MapR Distribution of Hadoop.
