Troubleshooting HCatalog Connector Problems

You may encounter the following issues when using the HCatalog Connector.

Connection Errors

When you use CREATE HCATALOG SCHEMA to create a new schema, the HCatalog Connector does not immediately attempt to connect to the HiveServer2 or metastore servers. Instead, when you execute a query using the schema or HCatalog-related system tables, the connector attempts to connect to and retrieve data from your Hadoop cluster.

The types of errors you get depend on which parameters are incorrect. Suppose you have incorrect parameters for the metastore database, but correct parameters for HiveServer2. In this case, HCatalog-related system table queries succeed, while queries on the HCatalog schema fail. The following example demonstrates creating an HCatalog schema with the correct default HiveServer2 information. However, the port number for the metastore database is incorrect.

=> CREATE HCATALOG SCHEMA hcat2 WITH hostname='hcathost' 
-> HCATALOG_SCHEMA='default' HCATALOG_USER='hive' PORT=1234;
CREATE SCHEMA
=> SELECT * FROM HCATALOG_TABLE_LIST;
-[ RECORD 1 ]------+---------------------
table_schema_id    | 45035996273864536
table_schema       | hcat2
hcatalog_schema    | default
table_name         | test
hcatalog_user_name | hive

=> SELECT * FROM hcat2.test;
ERROR 3399:  Failure in UDx RPC call InvokePlanUDL(): Error in User Defined 
 Object [VHCatSource], error code: 0
com.vertica.sdk.UdfException: Error message is [ 
org.apache.hcatalog.common.HCatException : 2004 : HCatOutputFormat not 
initialized, setOutput has to be called. Cause : java.io.IOException: 
MetaException(message:Could not connect to meta store using any of the URIs 
provided. Most recent failure: org.apache.thrift.transport.TTransportException: 
java.net.ConnectException: 
Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(
HiveMetaStoreClient.java:277)
. . .

To resolve these issues, you must drop the schema and recreate it with the correct parameters. If you still have issues, determine whether there are connectivity issues between your Vertica cluster and your Hadoop cluster. Such issues can include a firewall that prevents one or more Vertica hosts from contacting the HiveServer2, metastore, or HDFS hosts.
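
For example, to correct the schema created above, drop it and recreate it. This is a minimal sketch that assumes the metastore actually listens on its default port, 9083 (the port shown in the error hints elsewhere in this document):

=> DROP SCHEMA hcat2 CASCADE;
DROP SCHEMA
=> CREATE HCATALOG SCHEMA hcat2 WITH hostname='hcathost'
-> HCATALOG_SCHEMA='default' HCATALOG_USER='hive' PORT=9083;
CREATE SCHEMA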

You may also see this error if you are using HA NameNode, particularly with larger tables that HDFS splits into multiple blocks. See Using the HCatalog Connector with HA NameNode for more information about correcting this problem.

UDx Failure When Querying Data: Error 3399

You might see an error message when querying data (as opposed to metadata like schema information). This might be accompanied by a ClassNotFoundException in the log. This can happen for the following reasons:

If you did not copy the required libraries to your Vertica cluster or did not configure LZO compression, follow the instructions in Configuring Vertica for HCatalog.

If the Hive jars that you copied from Hadoop are out of date, you might see an error message like the following:

ERROR 3399: Failure in UDx RPC call InvokePlanUDL(): Error in User Defined Object [VHCatSource], 
error code: 0 Error message is [ Found interface org.apache.hadoop.mapreduce.JobContext, but 
class was expected ] 
HINT hive metastore service is thrift://localhost:13433 (check UDxLogs/UDxFencedProcessesJava.log 
in the catalog directory for more information) 

This error usually signals a problem with the hive-hcatalog-core jar. Make sure you have an up-to-date copy of this file. Remember that if you rerun hcatUtil, you also need to re-create the HCatalog schema.

You might also see a different form of this error:

ERROR 3399: Failure in UDx RPC call InvokePlanUDL(): Error in User Defined Object [VHCatSource], 
error code: 0 Error message is [ javax/servlet/Filter ] 

This error can be reported even if hcatUtil reports that your libraries are up to date. The javax.servlet.Filter class is in a library that some versions of Hadoop use but that is not usually part of the Hadoop installation directly. If you see an error mentioning this class, locate servlet-api-*.jar on a Hadoop node and copy it to the hcat/lib directory on all database nodes. If you cannot locate it on a Hadoop node, locate and download it from the Internet. (This case is rare.) The library version must be 2.3 or higher.

After you have copied the jar to the hcat/lib directory, reinstall the HCatalog Connector as explained in Configuring Vertica for HCatalog.
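
As a sketch, reinstallation typically amounts to rerunning the connector's installation script from vsql. The path below assumes the default package location, which may differ on your system:

=> \i /opt/vertica/packages/hcat/ddl/install.sql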

SerDe Errors

Errors can occur if you attempt to query a Hive table that uses a nonstandard SerDe. If you have not installed the SerDe JAR files on your Vertica cluster, you receive an error similar to the one in the following example:

=> SELECT * FROM hcat.jsontable;
ERROR 3399:  Failure in UDx RPC call InvokePlanUDL(): Error in User Defined 
Object [VHCatSource], error code: 0
com.vertica.sdk.UdfException: Error message is [ 
org.apache.hcatalog.common.HCatException : 2004 : HCatOutputFormat not 
initialized, setOutput has to be called. Cause : java.io.IOException: 
java.lang.RuntimeException: 
MetaException(message:org.apache.hadoop.hive.serde2.SerDeException 
SerDe com.cloudera.hive.serde.JSONSerDe does not exist) ] HINT If error 
message is not descriptive or local, may be we cannot read metadata from hive 
metastore service thrift://hcathost:9083 or HDFS namenode (check 
UDxLogs/UDxFencedProcessesJava.log in the catalog directory for more information)
at com.vertica.hcatalogudl.HCatalogSplitsNoOpSourceFactory
.plan(HCatalogSplitsNoOpSourceFactory.java:98)
at com.vertica.udxfence.UDxExecContext.planUDSource(UDxExecContext.java:898)
. . .

In the error message, you can see that the root cause is a missing SerDe class (in this example, com.cloudera.hive.serde.JSONSerDe). To resolve this issue, install the SerDe class on your Vertica cluster. See Using Nonstandard SerDes for more information.

This error may occur intermittently if just one or a few hosts in your cluster do not have the SerDe class.

Differing Results Between Hive and Vertica Queries

Sometimes, running the same query on Hive and on Vertica through the HCatalog Connector can return different results. This discrepancy is often caused by the differences between the data types supported by Hive and Vertica. See Data Type Conversions from Hive to Vertica for more information about supported data types.
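
To inspect how the connector maps each Hive column type to a Vertica type, you can query the HCATALOG_COLUMNS system table. This sketch assumes the hcat2 schema and test table from the earlier example, and that the Hive type and the corresponding Vertica type are reported in the hcatalog_data_type and data_type columns:

=> SELECT column_name, hcatalog_data_type, data_type
-> FROM HCATALOG_COLUMNS
-> WHERE table_schema = 'hcat2' AND table_name = 'test';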

If Hive string values are being truncated in Vertica, multi-byte character encodings in Hive might be the cause. Hive reports string length in characters, while Vertica records it in bytes. For a two-byte encoding such as UTF-16, you need to double the column size in Vertica to avoid truncation.
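
For example (the table and sizes here are hypothetical), if a Hive column is defined to hold up to 100 two-byte characters, size the corresponding column at 200 bytes when copying the data into a native Vertica table:

=> CREATE TABLE local_copy (name VARCHAR(200));
=> INSERT INTO local_copy SELECT name FROM hcat.hivetable;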

Discrepancies can also occur if the Hive table uses partition columns of types other than string.

HCatalog Connector Installation Fails on MapR

If you mount a MapR file system as an NFS mount point and then install the HCatalog Connector, the installation could fail with a message like the following:

ROLLBACK 2929: Couldn't create new UDx side process, 
failed to get UDx side process info from zygote: Broken pipe 

This might be accompanied by an error like the following in dbLog:

java.io.IOException: Couldn't get lock for /home/dbadmin/node02_catalog/UDxLogs/UDxFencedProcessesJava.log
	at java.util.logging.FileHandler.openFiles(FileHandler.java:389) 
	at java.util.logging.FileHandler.<init>(FileHandler.java:287) 
	at com.vertica.udxfence.UDxLogger.setup(UDxLogger.java:78) 
	at com.vertica.udxfence.UDxSideProcess.go(UDxSideProcess.java:75) 
	...

This error occurs if you locked your NFS mount point when creating it. Locking is the default. If you use the HCatalog Connector with MapR mounted as an NFS mount point, you must create the mount point with the -o nolock option. For example:

sudo mount -o nolock -t nfs MaprCLDBserviceHostname:/mapr/ClusterName/vertica/$(hostname -f)/ vertica

You can use the HCatalog Connector with MapR without mounting the MapR file system. If you mount the MapR file system, you must do so without a lock.

Preventing Excessive Query Delays

Network issues or high system loads on the HiveServer2 server can cause long delays while querying a Hive database using the HCatalog Connector. While Vertica cannot resolve these issues, you can set parameters that limit how long Vertica waits before canceling a query on an HCatalog schema. You can set these parameters globally using Vertica configuration parameters. You can also set them for specific HCatalog schemas in the CREATE HCATALOG SCHEMA statement. These specific settings override the settings in the configuration parameters.

The HCatConnectionTimeout configuration parameter and the CREATE HCATALOG SCHEMA statement's HCATALOG_CONNECTION_TIMEOUT parameter control how many seconds the HCatalog Connector waits for a connection to the HiveServer2 server. A value of 0 (the default setting for the configuration parameter) means to wait indefinitely. If the server does not respond by the time this timeout elapses, the HCatalog Connector breaks the connection and cancels the query. If you find that some queries on an HCatalog schema pause excessively, try setting this parameter to a timeout value, so the query does not hang indefinitely.
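
For example, the following sketch sets a 30-second connection timeout, first globally using the SET_CONFIG_PARAMETER function and then for a single schema; the schema name, host name, and timeout value are illustrative only:

=> SELECT SET_CONFIG_PARAMETER('HCatConnectionTimeout', 30);
=> CREATE HCATALOG SCHEMA hcat3 WITH hostname='hcathost'
-> HCATALOG_CONNECTION_TIMEOUT=30;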

The HCatSlowTransferTime configuration parameter and the CREATE HCATALOG SCHEMA statement's HCATALOG_SLOW_TRANSFER_TIME parameter specify how long the HCatalog Connector waits for data after making a successful connection to the server. After the specified time has elapsed, the HCatalog Connector determines whether the data transfer rate from the server is at least the value set in the HCatSlowTransferLimit configuration parameter (or by the CREATE HCATALOG SCHEMA statement's HCATALOG_SLOW_TRANSFER_LIMIT parameter). If it is not, the HCatalog Connector terminates the connection and cancels the query.
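
For example, this sketch cancels a query if, 60 seconds after the connection succeeds, data is arriving at less than about 1 MB per second. Both values are illustrative, and the limit is assumed to be expressed in bytes per second:

=> SELECT SET_CONFIG_PARAMETER('HCatSlowTransferTime', 60);
=> SELECT SET_CONFIG_PARAMETER('HCatSlowTransferLimit', 1048576);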

You can set these parameters to cancel queries that run very slowly but do eventually complete. However, query delays are usually caused by a slow connection rather than by a problem establishing the connection. Therefore, try adjusting the slow transfer rate settings first. If you find that the cause is connections that never complete, you can alternatively adjust the Linux TCP socket timeouts to a suitable value instead of relying solely on the HCatConnectionTimeout parameter.