HDFS Connector Requirements

Uninstall Prior Versions of the HDFS Connector

The HDFS Connector is now installed with Vertica; you no longer need to download and install it separately. If you previously downloaded and installed the connector, uninstall it before you upgrade to this release of Vertica so that you get the newest version.

WebHDFS Requirements

The HDFS Connector connects to the Hadoop file system using WebHDFS, a built-in component of HDFS that gives applications outside of Hadoop access to HDFS files. WebHDFS must be enabled on your Hadoop cluster. See your Hadoop distribution's documentation for instructions on configuring and enabling WebHDFS.
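
Before using the connector, it can help to confirm that WebHDFS responds on your cluster. The following Python sketch is not part of the HDFS Connector; it only builds a WebHDFS REST URL and issues a LISTSTATUS request against it. The function names are hypothetical, and the host name and port used in the usage note are assumptions (the NameNode HTTP port differs between Hadoop versions and distributions), so substitute the values for your own cluster. It also assumes a cluster that does not require Kerberos authentication.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op="LISTSTATUS", user=None):
    """Build a WebHDFS REST URL for an absolute HDFS path.

    host, port, and user are placeholders supplied by the caller;
    nothing here is specific to Vertica.
    """
    params = {"op": op}
    if user:
        # user.name is the WebHDFS query parameter for simple (non-Kerberos) auth.
        params["user.name"] = user
    return "http://{}:{}/webhdfs/v1{}?{}".format(
        host, port, quote(path), urlencode(params)
    )


def webhdfs_is_reachable(host, port, path="/", user=None, timeout=5):
    """Return True if a LISTSTATUS request on path returns a directory listing."""
    url = webhdfs_url(host, port, path, user=user)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.load(resp)
    except (OSError, ValueError):
        # OSError covers connection failures and HTTP errors;
        # ValueError covers a response body that is not valid JSON.
        return False
    # A successful LISTSTATUS response wraps its results in "FileStatuses".
    return "FileStatuses" in body
```

For example, calling webhdfs_is_reachable("namenode.example.com", 50070, "/user/dbadmin", user="dbadmin") from a Vertica node returns False if WebHDFS is disabled or the node cannot reach the cluster. The host, port, path, and user in that call are illustrative only.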

Note: HttpFS (also known as Hoop) is another method of accessing files stored in HDFS. It relies on a separate server process that receives requests for files and retrieves them from HDFS. Because its REST API is compatible with WebHDFS, it could theoretically work with the connector. However, the connector has not been tested with HttpFS, and OpenText does not support using the HDFS Connector with it. In addition, because every file retrieved from HDFS must pass through the HttpFS server, it is less efficient than WebHDFS, which lets Vertica nodes connect directly to the Hadoop nodes that store the file blocks.

Kerberos Authentication Requirements

The HDFS Connector can connect to HDFS using Kerberos authentication. To use Kerberos, you must meet these additional requirements: