Using the HCatalog Connector with HA NameNode

Newer distributions of Hadoop support the High Availability NameNode (HA NN) for HDFS access. Some additional configuration is required to use this feature with the HCatalog Connector. If you skip this configuration, attempts to retrieve data through the connector fail with an error.

To use HA NN with Vertica, first copy /etc/hadoop/conf from the HDFS cluster to every node in your Vertica cluster. You can put this directory anywhere, but it must be in the same location on every node. (In the example below it is in /opt/hcat/hadoop_conf.)
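
Before continuing, it can help to confirm on each Vertica node that the copy is complete. The following Python script is a minimal sketch of such a check, not part of Vertica or the HCatalog Connector. It assumes the directory was copied to /opt/hcat/hadoop_conf, that it contains the usual core-site.xml and hdfs-site.xml files, and that the HA configuration defines the standard Hadoop property dfs.nameservices in hdfs-site.xml. The function names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET

# Assumed location, taken from the example in this section.
CONF_DIR = "/opt/hcat/hadoop_conf"


def read_hadoop_properties(xml_path):
    """Return the <property> name/value pairs from a Hadoop *-site.xml file."""
    root = ET.parse(xml_path).getroot()
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_conf_dir(conf_dir):
    """Return a list of problems found in a copied Hadoop configuration directory."""
    problems = []
    # Assumption: a usable copy of /etc/hadoop/conf includes both of these files.
    for filename in ("core-site.xml", "hdfs-site.xml"):
        if not os.path.isfile(os.path.join(conf_dir, filename)):
            problems.append("missing " + filename)
    hdfs_site = os.path.join(conf_dir, "hdfs-site.xml")
    if os.path.isfile(hdfs_site):
        # Assumption: an HA NameNode setup names its nameservice here.
        if not read_hadoop_properties(hdfs_site).get("dfs.nameservices"):
            problems.append("hdfs-site.xml does not define dfs.nameservices")
    return problems


if __name__ == "__main__":
    issues = check_conf_dir(CONF_DIR)
    print("OK" if not issues else "\n".join(issues))
```

Run the script on every node in the Vertica cluster. It reports only missing files or a missing nameservice definition; it does not verify that the NameNodes are reachable.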

Then uninstall the HCat library, configure the UDx to use that configuration directory, and reinstall the library:

=> \i /opt/vertica/packages/hcat/ddl/uninstall.sql
    DROP LIBRARY

=> ALTER DATABASE mydb SET JavaClassPathSuffixForUDx = '/opt/hcat/hadoop_conf';
    WARNING 2693: Configuration parameter JavaClassPathSuffixForUDx has been deprecated;
        setting it has no effect

=> \i /opt/vertica/packages/hcat/ddl/install.sql
    CREATE LIBRARY
    CREATE SOURCE FUNCTION
    GRANT PRIVILEGE
    CREATE PARSER FUNCTION
    GRANT PRIVILEGE

Despite the deprecation warning, you must still set JavaClassPathSuffixForUDx. Do not skip the ALTER DATABASE step.

After you complete these steps, HCatalog queries work with the HA NameNode.