HDFS Connector Requirements
Uninstall Prior Versions of the HDFS Connector
The HDFS Connector is now installed with Vertica; you no longer need to download and install it separately. If you previously downloaded and installed this connector, uninstall it before upgrading to this release of Vertica so that the upgrade installs the newest version.
WebHDFS Requirements
The HDFS Connector connects to the Hadoop file system using WebHDFS, a built-in component of HDFS that provides access to HDFS files to applications outside of Hadoop. This component must be enabled on your Hadoop cluster. See your Hadoop distribution's documentation for instructions on configuring and enabling WebHDFS.
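To illustrate the interface the connector depends on, the sketch below builds the style of REST URL that WebHDFS serves. The host name, port, and user are assumptions for illustration only; the default WebHDFS port is 50070 on the NameNode in Hadoop 1.x and 2.x (9870 in Hadoop 3.x).

```python
# Hedged sketch: build a WebHDFS REST URL of the form the connector uses.
# Host, port, and user.name values below are illustrative assumptions.

def webhdfs_url(host, port, path, operation, user=None):
    """Build a WebHDFS REST URL for the given HDFS path and operation."""
    url = "http://{0}:{1}/webhdfs/v1{2}?op={3}".format(host, port, path, operation)
    if user:
        url += "&user.name={0}".format(user)
    return url

# One way to check that WebHDFS is enabled is to request the status of
# the root directory and expect an HTTP 200 response:
status_url = webhdfs_url("namenode.example.com", 50070, "/", "GETFILESTATUS",
                         user="dbadmin")
# e.g. fetch status_url with urllib.request.urlopen() from a Vertica node
```

If the request fails or times out, WebHDFS is likely disabled or blocked by a firewall between the Vertica and Hadoop clusters.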
Note: HttpFS (formerly known as Hoop) is another method of accessing files stored in HDFS. It relies on a separate server process that receives requests for files and retrieves them from HDFS. Because it uses a REST API that is compatible with WebHDFS, it could theoretically work with the connector. However, the connector has not been tested with HttpFS, and OpenText does not support using the HDFS Connector with HttpFS. In addition, because every file retrieved from HDFS must pass through the HttpFS server, it is less efficient than WebHDFS, which lets Vertica nodes connect directly to the Hadoop nodes storing the file blocks.
Kerberos Authentication Requirements
The HDFS Connector can connect to HDFS using Kerberos authentication. To use Kerberos, you must meet these additional requirements:
- Your Vertica installation must be Kerberos-enabled.
- Your Hadoop cluster must be configured to use Kerberos authentication.
- The connector must be able to reach the Kerberos-enabled Hadoop cluster from your Vertica nodes.
- The Kerberos server must be running version 5.
- The Kerberos server must be accessible from every node in your Vertica cluster.
- You must have Kerberos principals (users) that map to Hadoop users. You use these principals to authenticate your Vertica users with the Hadoop cluster.
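The last requirement can be pictured with a small sketch of how Hadoop reduces a Kerberos principal to a user name. By default, Hadoop's auth_to_local rules map a principal such as "user/host@REALM" to the short name "user"; the realm and principals below are illustrative assumptions, not values from this document.

```python
# Hedged sketch of Hadoop's default auth_to_local behavior: reduce a
# Kerberos principal to the short Hadoop user name it must map to.

def short_name(principal):
    """Strip the Kerberos realm and instance to get the Hadoop short user name."""
    primary = principal.split("@", 1)[0]   # drop the @REALM suffix
    return primary.split("/", 1)[0]        # drop any /host instance component

# A Vertica user's principal must reduce to an existing Hadoop user:
print(short_name("dbuser@EXAMPLE.COM"))        # -> dbuser
print(short_name("dbuser/node1@EXAMPLE.COM"))  # -> dbuser
```

If a principal does not reduce to a valid Hadoop user under your cluster's auth_to_local rules, requests made with that principal will be rejected even though Kerberos authentication itself succeeds.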