Creating External Tables
You create an external table using the CREATE EXTERNAL TABLE AS COPY
statement. You cannot create temporary external tables. For the syntax details to create an external table, see the CREATE EXTERNAL TABLE statement in the SQL Reference Manual.
Note: Each table can have a maximum of 1600 columns.
Required Permissions for External Tables
You must be a database superuser to create external tables.
Permission requirements to use (SELECT from) external tables differ from those of other tables. By default, once external tables exist, you must also be a database superuser to access them through a SELECT statement.
To allow users without superuser access to query external tables, an administrator must create a 'user' storage location and grant those users read access to the location. See CREATE LOCATION, and GRANT (Storage Location). This location must be a parent of the path used in the COPY statement when creating the external table.
COPY Statement Definition
When you create an external table, table data is not added to the database, and no projections are created. Instead, Vertica performs a syntactic check of the CREATE EXTERNAL TABLE...
statement, and stores the table name and COPY statement definition in the catalog. When a SELECT query references an external table, Vertica parses and executes the stored COPY statement to obtain the referenced data. Successfully returning data from an external table requires that the COPY definition be correct, and that other dependencies, such as files, nodes, and other resources are accessible and available at query-time.
If the maximum length of a column is smaller than the actual data, such as a VARCHAR that is too short, Vertica truncates the data and logs the event.
When using the COPY parameter on any node
, confirm that the source file definition is identical on all nodes. Specifying different external files can produce inconsistent results.
For more information about checking the validity of the external table COPY definition, see Validating External Tables.
NOT NULL Constraints
Do not specify a NOT NULL column constraint, unless you are certain that the external data does not contain NULL values. Otherwise, you may see unexpected query results. For example, a SELECT statement for an external table with a NOT NULL constraint will reject a column value if it is not NULL.
Canceling the Create Query
Canceling a CREATE EXTERNAL TABLE AS COPY statement can cause unpredictable results. If you enter a query to create an external table, and it is incorrect (for example, you inadvertently specify the wrong external location), wait for the query to complete. When the external table exists, use DROP TABLEs definition.
Developing User-Defined Load (UDL) Functions for External Tables
You can create external tables with your own load functions. For more information about developing user-defined load functions, see User Defined Load (UDL) and the extended COPY syntax in the SQL Reference Manual.
Examples
Examples of external table definitions:
=> CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/ext1.dat' DELIMITER ','; => CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM 'hdfs:///dat/ext1.dat'; => CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/ext1.dat.bz2' BZIP DELIMITER ','; => CREATE EXTERNAL TABLE ext1 (x integer, y integer) AS COPY (x as '5', y) FROM '/tmp/ext1.dat.bz2' BZIP DELIMITER ',';
To allow users without superuser access to use these tables, create a location for 'user' usage and grant access to it. This example shows granting access to a user named Bob to any external table whose data is located under /tmp (including in subdirectories to any depth):
=> CREATE LOCATION '/tmp' ALL NODES USAGE 'user'; => GRANT ALL ON LOCATION '/tmp' to Bob;