Apache Hadoop Parameters

The following table describes the general parameters for configuring integration with Apache Hadoop. See Integrating with Apache Hadoop for more information.

Parameter Description
HadoopConfDir

A directory path containing the XML configuration files copied from Hadoop. The same path must be valid on every Vertica node. You can use the VERIFY_HADOOP_CONF_DIR meta-function to test that the value is set correctly. Setting this parameter is required to read data from HDFS.

When you set this parameter, any previously cached configuration information is flushed.

Default Value: obtained from environment if possible

Requires Restart: No

Example:

ALTER DATABASE mydb SET HadoopConfDir = '/hadoop/hcat/conf';
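
To check the setting, you can call the VERIFY_HADOOP_CONF_DIR meta-function mentioned above. A minimal sketch, assuming HadoopConfDir has already been set as in the example (output is omitted because it depends on your cluster):

```sql
-- Test that the HadoopConfDir value is set correctly.
SELECT VERIFY_HADOOP_CONF_DIR();
```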

HadoopImpersonationConfig

A session parameter specifying the delegation token or Hadoop user for HDFS access. See HadoopImpersonationConfig Format for information about the value of this parameter, and Proxy Users and Delegation Tokens for more general context.

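
Example (a sketch only): this assumes the value is a JSON array whose entries use nameservice and doAs keys. The authoritative format is in HadoopImpersonationConfig Format. The nameservice hadoopNS and the Hadoop user stephanie are hypothetical placeholders.

```sql
-- Session-level setting: access HDFS as a specific Hadoop user (placeholder values).
ALTER SESSION SET HadoopImpersonationConfig = '[{"nameservice":"hadoopNS", "doAs":"stephanie"}]';
```
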
HDFSUseWebHDFS

Whether to use the webhdfs scheme instead of hdfs, regardless of the URL. Using webhdfs is slower than using hdfs but supports some additional features. Change the value of this parameter only if you need a feature that the hdfs scheme does not support.

Default Value: 0 (disabled)

Requires Restart: No
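
Example (a sketch that follows the same ALTER DATABASE pattern as the other parameters on this page, with mydb as a placeholder database name):

```sql
-- Use the webhdfs scheme even for hdfs URLs; set the value back to 0 to restore the default.
ALTER DATABASE mydb SET HDFSUseWebHDFS = 1;
```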

The following table describes the parameters for configuring the HCatalog Connector. See Using the HCatalog Connector in Integrating with Apache Hadoop for more information.

Parameter Description
EnableHCatImpersonation

Whether the HCatalog Connector uses (impersonates) the current Vertica user when accessing Hive. If impersonation is enabled, the HCatalog Connector uses the Kerberos credentials of the logged-in Vertica user to access Hive data. Disable impersonation if you are using an authorization service to manage access without also granting users access to the underlying files. For more information, see Configuring Security in Integrating with Apache Hadoop.

Default Value: 1 (enabled)

Requires Restart: No

Example:

ALTER DATABASE mydb SET EnableHCatImpersonation = 0;

HCatalogConnectorUseHiveServer2

When enabled, Vertica internally uses HiveServer2 instead of WebHCat to get metadata from Hive.

Default Value: 1 (enabled)

Requires Restart: No

Example:

ALTER DATABASE mydb SET HCatalogConnectorUseHiveServer2 = 0;

HCatalogConnectorUseLibHDFSPP

Whether the HCatalog Connector should use the hdfs scheme instead of webhdfs to read native formats.

Note: This parameter is deprecated. Vertica uses the hdfs scheme by default. If you need to use webhdfs, use the HDFSUseWebHDFS parameter.

Default Value: 1 (enabled)

HCatConnectionTimeout

The number of seconds the HCatalog Connector waits for a successful connection to the HiveServer2 (or WebHCat) server before returning a timeout error.

Default Value: 0 (wait indefinitely)

Requires Restart: No

Example:

ALTER DATABASE mydb SET HCatConnectionTimeout = 30;

HCatSlowTransferLimit

The lowest transfer speed (in bytes per second) that the HCatalog Connector allows when retrieving data from the HiveServer2 (or WebHCat) server. If the data transfer rate from the server to Vertica is below this threshold after the number of seconds specified in the HCatSlowTransferTime parameter has passed, the HCatalog Connector cancels the query and closes the connection.

Default Value: 65536

Requires Restart: No

Example: 

ALTER DATABASE mydb SET HCatSlowTransferLimit = 32000;

HCatSlowTransferTime

The number of seconds the HCatalog Connector waits before testing whether the data transfer from the server is too slow. See the HCatSlowTransferLimit parameter.

Default Value: 60

Requires Restart: No

Example:

ALTER DATABASE mydb SET HCatSlowTransferTime = 90;
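
HCatSlowTransferLimit and HCatSlowTransferTime work together, so it can help to set them as a pair. The following is an illustrative sketch with arbitrary values, not recommendations. It tolerates a slower link by testing the transfer rate only after 120 seconds and canceling the query only if the rate is below 16384 bytes per second:

```sql
-- Lower the minimum acceptable transfer rate to 16384 bytes per second.
ALTER DATABASE mydb SET HCatSlowTransferLimit = 16384;
-- Wait 120 seconds before testing whether the transfer is too slow.
ALTER DATABASE mydb SET HCatSlowTransferTime = 120;
```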

Note: You can override the HCatalog configuration parameters when creating an HCatalog schema. See CREATE HCATALOG SCHEMA in the SQL Reference Manual for an explanation.
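
As a hedged sketch of such an override, the statement below sets the timeout and slow-transfer values for a single schema. It assumes the schema-level parameter names HCATALOG_CONNECTION_TIMEOUT, HCATALOG_SLOW_TRANSFER_LIMIT, and HCATALOG_SLOW_TRANSFER_TIME. Confirm these against the CREATE HCATALOG SCHEMA reference before relying on them. The schema name hcat and the host names are hypothetical placeholders.

```sql
-- Per-schema overrides of the database-level HCatalog Connector settings (placeholder hosts).
CREATE HCATALOG SCHEMA hcat WITH HOSTNAME='hcathost' PORT=9083
    HCATALOG_SCHEMA='default' HIVESERVER2_HOSTNAME='hs.example.com'
    HCATALOG_CONNECTION_TIMEOUT=30
    HCATALOG_SLOW_TRANSFER_LIMIT=32000
    HCATALOG_SLOW_TRANSFER_TIME=90;
```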