CREATE LOCATION
Creates a new storage location where Vertica can store data. After you create the location, you create storage policies that assign the storage location to the database objects that will store data in the location.
Cautions
While no technical issue prevents you from using CREATE LOCATION to add one or more Network File System (NFS) storage locations, Vertica does not support NFS data or catalog storage except for MapR mount points. You will be unable to run queries against any other NFS data. When creating locations on MapR file systems, you must specify ALL NODES SHARED.
If you use any HDFS storage locations, the HDFS data must be available at the time you start Vertica. Your HDFS cluster must be operational, and the ROS files must be present. If you have moved data files, or if they have become corrupted, or if your HDFS cluster is not responsive, Vertica cannot start.
Syntax
CREATE LOCATION 'path' [NODE 'nodename' | ALL NODES] [SHARED] [USAGE 'usetype'] [LABEL 'labelname']
Arguments
path
Where Vertica stores this location's data. The type of filesystem on which the location is based determines the format of this argument: for a filesystem that is local to each node (such as the Linux filesystem), an absolute path on the node; for HDFS, a WebHDFS URI (see WebHDFS URIs below).

[NODE 'nodename' | ALL NODES]
The node or nodes on which the storage location is defined.
Default Value: ALL NODES

[SHARED]
Indicates that the location set by the path argument is shared (used by all of the nodes) rather than local to each node. See Shared vs. Local Storage below for details.

[USAGE 'usetype']
The type of data the storage location can hold.
Valid Values: 'DATA', 'TEMP', 'TEMP,DATA', 'USER'
Default Value: 'TEMP,DATA'

[LABEL 'labelname']
A label for the storage location. You use this name later when assigning the storage location to database objects.
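As noted at the top of this page, a labeled location is typically put to use by assigning it to database objects through a storage policy. A minimal sketch (the path, the label fast_ssd, and the table sales are hypothetical; SET_OBJECT_STORAGE_POLICY is listed under See Also):

```sql
-- Create a labeled DATA location on every node (path is hypothetical).
=> CREATE LOCATION '/ssd/vertica/data' ALL NODES USAGE 'DATA' LABEL 'fast_ssd';

-- Assign the location to a table by its label via a storage policy.
-- (Assumes a table named sales already exists.)
=> SELECT SET_OBJECT_STORAGE_POLICY('sales', 'fast_ssd');
```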
Shared vs. Local Storage
The SHARED keyword indicates that the location set by the path argument is shared by all nodes. Most remote filesystems (such as HDFS) are shared. For these filesystems, the path argument represents a single location where all of the nodes store data. Each node creates its own subdirectory in the shared storage location to hold its own files; these subdirectories prevent the nodes from overwriting each other's files. Even if your cluster has only one node, you must include the SHARED keyword if you are using a remote filesystem. If the location's usage type is USER, Vertica does not create subdirectories for each node; the USER usage type takes precedence over SHARED.
If you do not supply this keyword, the new storage location is local. The path argument specifies a location that is unique for each node in the cluster. This location is usually a path in the node's own filesystem. Storage locations contained in filesystems that are local to each node (such as the Linux filesystem) are always local.
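The local/shared distinction can be sketched as follows (both paths and labels are hypothetical):

```sql
-- Local: each node stores data under this path in its own filesystem,
-- so the directory must exist on every node.
=> CREATE LOCATION '/data/vertica/extra' ALL NODES USAGE 'DATA' LABEL 'local_extra';

-- Shared: a single HDFS location; each node creates its own subdirectory.
-- SHARED is required for remote filesystems, even on a one-node cluster.
=> CREATE LOCATION 'webhdfs://hadoop.example.com:50070/vertica/shared' ALL NODES SHARED USAGE 'DATA' LABEL 'hdfs_shared';
```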
WebHDFS URIs
The URIs you supply in the path argument are similar to the URLs for accessing files through WebHDFS, with a few differences:
- The protocol for the URI is webhdfs:// rather than http://.
- The path portion of the URI does not include the /webhdfs/v1/ portion of the path. Instead, after the hostname and port, specify the HDFS path from the root directory.

For example, suppose you can access the HDFS directory you want to use as a storage location through the URL http://hadoop:50070/webhdfs/v1/user/dbadmin. Then the URI you supply to the CREATE LOCATION statement is webhdfs://hadoop:50070/user/dbadmin.
For more information about WebHDFS URIs, see the WebHDFS REST API page.
Privileges
The user must be a superuser to use CREATE LOCATION.
In addition, the Vertica process must have read and write permissions to the location where data will be stored. Each type of filesystem has its own requirements:
- Linux—The database administrator account (usually named dbadmin) must have full read and write access to the directory in the path argument.
- HDFS without Kerberos—Requires a Hadoop user whose username matches the Vertica database administrator username (usually dbadmin). This Hadoop user must have read and write access to the HDFS directory specified in the path argument.
- HDFS with Kerberos—Requires a Hadoop user whose username matches the principal in the keytab file on each Vertica node. This is not the same as the database administrator username. This Hadoop user must have read and write access to the HDFS directory specified in the path argument.
Examples
The following example shows how to create a storage location in the local Linux filesystem for temporary data storage.
=> CREATE LOCATION '/home/dbadmin/testloc' USAGE 'TEMP' LABEL 'tempfiles';
The following example shows how to create a storage location in the /user/dbadmin directory on the HDFS cluster available from the Hadoop name node hadoop.example.com. The HDFS cluster does not use Kerberos.
=> CREATE LOCATION 'webhdfs://hadoop.example.com:50070/user/dbadmin' ALL NODES SHARED USAGE 'data' LABEL 'coldstorage';
The following example shows how to create the same storage location, but on a Hadoop cluster that uses Kerberos. Note the output that reports the principal being used.
=> CREATE LOCATION 'webhdfs://hadoop.example.com:50070/user/dbadmin' ALL NODES SHARED USAGE 'data' LABEL 'coldstorage';
NOTICE 0: Performing HDFS operations using kerberos principal [vertica/hadoop.example.com]
CREATE LOCATION
The following example shows how to create a location for user data, grant access to it, and use it to create an external table.
=> CREATE LOCATION '/tmp' ALL NODES USAGE 'user';
CREATE LOCATION
=> GRANT ALL ON LOCATION '/tmp' to Bob;
GRANT PRIVILEGE
=> CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/data/ext1.dat' DELIMITER ',';
CREATE TABLE
See Also
- Managing Storage Locations in the Administrator's Guide
- Vertica Storage Location for HDFS in Integrating with Apache Hadoop.
- ALTER_LOCATION_LABEL
- ALTER_LOCATION_USE
- DROP_LOCATION
- SET_OBJECT_STORAGE_POLICY