Hadoop Integration
This section describes updates to Hadoop integration in Vertica Analytic Database 8.1.x.
Rack Locality
When database nodes are co-located on Hadoop nodes, Vertica can take advantage of the Hadoop rack configuration to execute queries. Moving query execution closer to the data reduces network latency and can improve performance.
Vertica automatically uses database nodes that are co-located with the HDFS nodes that contain the data. This behavior, called node locality, requires no additional configuration. In this release, if no database node has local data, Vertica uses a database node in the same rack as the data. This behavior, called rack locality, must be configured before Vertica can use it. See Configuring Rack Locality in Integrating with Apache Hadoop.
Rack locality is available for reading ORC and Parquet data on co-located clusters.
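For example, a co-located cluster might read ORC data directly from HDFS through an external table. The following is a minimal sketch; the table name, column definitions, and HDFS path are illustrative only:

=> CREATE EXTERNAL TABLE sales (id INT, amount FLOAT)
   AS COPY FROM 'hdfs:///data/sales/*.orc' ORC;
=> SELECT COUNT(*) FROM sales;

When a query on such a table runs, Vertica prefers database nodes holding the data locally and, with rack locality configured, falls back to nodes in the same rack.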
Function to Verify Kerberos Configuration
You can use the KERBEROS_HDFS_CONFIG_CHECK metafunction to verify that Vertica can use Kerberos authentication with HDFS. You can have the function test all Hadoop integrations that it finds, or you can test specific clusters or services. The function tests each element of Kerberos configuration and reports specific errors if it finds any.
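For example, calling the function with no arguments tests all Hadoop integrations that it finds (a sketch; the argument forms for testing specific clusters or services are described in the documentation):

=> SELECT KERBEROS_HDFS_CONFIG_CHECK();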
More Details
For more information, see Integrating with Apache Hadoop.