Testing Your Hadoop WebHDFS Configuration

To ensure that your Hadoop installation's WebHDFS system is configured and running, follow these steps:

  1. Log into your Hadoop cluster and locate a small text file on the Hadoop filesystem. If you do not have a suitable file, you can create a file named test.txt in the /tmp directory using the following command:

    echo -e "A|1|2|3\nB|4|5|6" | hadoop fs -put - /tmp/test.txt

  2. Log into a host in your Vertica database using the database administrator account.
  3. If you are using Kerberos authentication, authenticate with the Kerberos server using the keytab file for a user who is authorized to access the file. For example, to authenticate as a user named exampleuser@MYCOMPANY.COM, use the command:

    $ kinit exampleuser@MYCOMPANY.COM -k -t /path/exampleuser.keytab

    Where path is the path to the keytab file you copied over to the node. You do not receive any message if you authenticate successfully. You can verify that you are authenticated by using the klist command:

    $ klist
    Ticket cache: FILE:/tmp/krb5cc_500
    Default principal: exampleuser@MYCOMPANY.COM
    Valid starting     Expires            Service principal
    07/24/13 14:30:19  07/25/13 14:30:19  krbtgt/MYCOMPANY.COM@MYCOMPANY.COM
            renew until 07/24/13 14:30:19
    
  4. Test retrieving the file:

    • If you are not using Kerberos authentication, run the following command from the Linux command line:

      curl -i -L "http://hadoopNameNode:50070/webhdfs/v1/tmp/test.txt?op=OPEN&user.name=hadoopUserName"

      Replace hadoopNameNode with the hostname or IP address of the name node in your Hadoop cluster, /tmp/test.txt with the path to the file in the Hadoop filesystem you located in step 1, and hadoopUserName with the user name of a Hadoop user that has read access to the file.

      If successful, the command produces output similar to the following:

      HTTP/1.1 200 OK
      Server: Apache-Coyote/1.1
      Set-Cookie: hadoop.auth="u=hadoopUser&p=password&t=simple&e=1344383263490&s=n8YB/CHFg56qNmRQRTqO0IdRMvE="; Version=1; Path=/
      Content-Type: application/octet-stream
      Content-Length: 16
      Date: Tue, 07 Aug 2012 13:47:44 GMT
      A|1|2|3
      B|4|5|6
      
    • If you are using Kerberos authentication, run the following command from the Linux command line:

      curl --negotiate -i -L -u:anyUser "http://hadoopNameNode:50070/webhdfs/v1/tmp/test.txt?op=OPEN"

      Replace hadoopNameNode with the hostname or IP address of the name node in your Hadoop cluster, and /tmp/test.txt with the path to the file in the Hadoop filesystem you located in step 1.

      If successful, the command produces output similar to the following:

      HTTP/1.1 401 Unauthorized
      Content-Type: text/html; charset=utf-8
      WWW-Authenticate: Negotiate
      Content-Length: 0
      Server: Jetty(6.1.26)
      HTTP/1.1 307 TEMPORARY_REDIRECT
      Content-Type: application/octet-stream
      Expires: Thu, 01-Jan-1970 00:00:00 GMT
      Set-Cookie: hadoop.auth="u=exampleuser&p=exampleuser@MYCOMPANY.COM&t=kerberos&
      e=1375144834763&s=iY52iRvjuuoZ5iYG8G5g12O2Vwo=";Path=/
      Location: http://hadoopnamenode.mycompany.com:1006/webhdfs/v1/user/release/docexample/test.txt?
      op=OPEN&delegation=JAAHcmVsZWFzZQdyZWxlYXNlAIoBQCrfpdGKAUBO7CnRju3TbBSlID_osB658jfGf
      RpEt8-u9WHymRJXRUJIREZTIGRlbGVnYXRpb24SMTAuMjAuMTAwLjkxOjUwMDcw&offset=0
      Content-Length: 0
      Server: Jetty(6.1.26)
      HTTP/1.1 200 OK
      Content-Type: application/octet-stream
      Content-Length: 16
      Server: Jetty(6.1.26)
      A|1|2|3
      B|4|5|6
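
The checks in step 4 can be scripted. The following is a minimal sketch, not part of Hadoop or Vertica: the function names webhdfs_open_url and final_http_status are hypothetical, and the sketch assumes the name node listens on port 50070 as in the examples above. The first function builds the OPEN URL, adding user.name only when a user name is supplied. The second reads curl -i -L output and reports the status code of the last response, so the intermediate 401 and 307 responses in a Kerberos exchange are ignored.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; the names are not part of any product.

# Build the WebHDFS OPEN URL for a file.
# Arguments: name node host, absolute HDFS path, optional Hadoop user name.
# Assumes port 50070, as in the examples in this section.
webhdfs_open_url() {
    local host="$1" path="$2" user="${3:-}"
    local url="http://${host}:50070/webhdfs/v1${path}?op=OPEN"
    # user.name applies only to simple (non-Kerberos) authentication.
    if [ -n "$user" ]; then
        url="${url}&user.name=${user}"
    fi
    printf '%s\n' "$url"
}

# Read `curl -i -L` output on stdin and print the status code of the last
# HTTP status line seen. A file whose content has a line starting with
# "HTTP/" would confuse this simple check.
final_http_status() {
    awk '/^HTTP\// { code = $2 } END { print code }'
}
```

For example, without Kerberos, the following pipeline should print 200 when the file can be read:

    curl -s -i -L "$(webhdfs_open_url hadoopNameNode /tmp/test.txt hadoopUserName)" | final_http_status

With Kerberos, omit the user name argument and add the --negotiate and -u:anyUser options to curl, as shown in step 4.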
      

If the curl command fails, you must review the error messages and resolve any issues before using the Vertica Connector for HDFS with your Hadoop cluster. Some debugging steps include: