Vertica Analytics Platform Version 9.2.x Documentation

Defining Column Values

You can define a column so Vertica automatically sets its value from an expression through one of the following clauses:

DEFAULT expression

Sets the default column values immediately in the following cases:

  • Adding a column with a DEFAULT expression to an existing table. Vertica populates the new column with its default values when it is added to the table.

    Altering an existing table column to specify a DEFAULT expression has no effect on existing values in that column. Vertica applies the DEFAULT expression only on new rows when they are added to the table, through load operations such as INSERT and COPY. To refresh the entire column with its DEFAULT expression, update the column as follows:

    UPDATE table-name SET column-name=DEFAULT;                
  • Executing a table load operation such as INSERT. Vertica populates columns with default values in the new rows. Values in existing rows, including columns with DEFAULT expressions, remain unchanged.
  • Executing UPDATE on a table and setting default column to DEFAULT. See the example below.
SET USING expression

Sets the column value only when the function REFRESH_COLUMNS is invoked. This approach is especially useful for large denormalized (flattened) tables, where multiple columns get their values by querying other tables. For details, see Flattened Tables.

In general, REFRESH_COLUMNS must be explicitly invoked by the user. One exception applies: when you use ALTER TABLE…ALTER COLUMN to apply SET USING to an existing column, or modify an existing SET USING expression. In this case, the DDL operation automatically invokes REFRESH_COLUMNS on the column, using REBUILD mode. After the refresh operation is complete, the DDL operation auto-commits the updates and returns. If the refresh operation fails, Vertica rolls back the entire DDL operation. Execution time can be significant if the refresh operation involves a large data set.

Supported Expressions

DEFAULT and SET USING generally support the same expressions. These include:

Expression Restrictions

The following restrictions apply to DEFAULT and SET USING expressions:

  • The return value data type must match or be cast to the column data type.
  • The expression must return a value that conforms to the column bounds. For example, a column that is defined as a VARCHAR(1) cannot be set to a default string of abc.
  • The expression cannot specify correlated sub-queries.
  • In a temporary table, DEFAULT and SET USING do not support sub-queries. If you try to create a temporary table with DEFAULT or SET USING using subquery expressions, Vertica returns an error.
  • A column's SET USING expression cannot specify another column in the same table that also sets its value with SET USING. Similarly, a column's DEFAULT expression cannot specify another column in the same table that also sets its value with DEFAULT, or whose value is automatically set to a sequence. However, a column's SET USING expression can specify another column that sets its value with DEFAULT.

    You can set a column's DEFAULT expression from another column in the same table that sets its value with SET USING. However, the DEFAULT column is typically set to NULL, as it is only set on load operations that initially set the SET USING column to NULL.

DEFAULT Restrictions

DEFAULT expressions cannot specify volatile functions with ALTER TABLE…ADD COLUMN. To specify volatile functions, use CREATE TABLE or ALTER TABLE…ALTER COLUMN statements.

SET USING Restrictions

The following restrictions apply to SET USING expressions:

  • Volatile functions are not allowed.
  • The expression cannot specify a sequence.
  • You cannot modify SET USING data directly with DML operations such as COPY, INSERT, UPDATE, or MERGE. SET USING columns can only be updated by calling REFRESH_COLUMNS. Attempts to modify SET USING columns return an error.
  • Vertica limits the use of several meta-functions that copy table data: COPY_TABLE, COPY_PARTITIONS_TO_TABLE, MOVE_PARTITIONS_TO_TABLE, and SWAP_PARTITIONS_BETWEEN_TABLES. The following table describes these limitations:

    SET USING columns in...Limitation
    Source and target tableAll functions allowed only if each SET USING column in source table has a corresponding SET USING column in target table.

    Source table only

    SWAP_PARTITIONS_BETWEEN_TABLES disallowed.

    Target table onlyAll functions disallowed.

Several restrictions apply to Vertica's ability to refresh a SET USING column with REFRESH_COLUMNS. For details, see REFRESH_COLUMNS.

DEFAULT USING Columns and Restrictions

A column can specify both DEFAULT and SET USING constraints, as follows:

column-name data-type DEFAULT default-expr SET USING using-expr

Typically, both constraints specify the same expression. In this case, you can define the column as follows:

column-name data-type DEFAULT USING expression

DEFAULT USING columns support the same expressions as SET USING columns, and are subject to the same restrictions.

Examples

Derive a column's default value from another column

  1. Create table t with two columns, date and state, and insert a row of data:

    => CREATE TABLE t (date DATE, state VARCHAR(2));
    CREATE TABLE
    => INSERT INTO t VALUES (CURRENT_DATE, 'MA');
     OUTPUT
    --------
          1
    (1 row)
    
    => COMMIT;
    COMMMIT
    SELECT * FROM t;
        date    | state
    ------------+-------
     2017-12-28 | MA    
    (1 row)
  2. Use ALTER TABLE to add a third column that extracts the integer month value from column date:

    => ALTER TABLE t ADD COLUMN month INTEGER DEFAULT date_part('month', date);
    ALTER TABLE
    
  3. When you query table t, Vertica returns the number of the month in column date:
    => SELECT * FROM t;
        date    | state | month
    ------------+-------+-------
     2017-12-28 | MA    |    12
    (1 row)
    

Update default column values

  1. Update table t by subtracting 30 days from date:
    => UPDATE t SET date = date-30;
     OUTPUT
    --------
          1
    (1 row)
    
    => COMMIT;
    COMMIT
    => SELECT * FROM t;
        date    | state | month
    ------------+-------+-------
     2017-11-28 | MA    |    12
    (1 row)

    The value in month remains unchanged.

  2.  Refresh the default value in month from column date:
    => UPDATE t SET month=DEFAULT;
     OUTPUT
    --------
          1
    (1 row)
    
    => COMMIT;
    COMMIT
    => SELECT * FROM t;
        date    | state | month
    ------------+-------+-------
     2017-11-28 | MA    |    11
    (1 row)

Derive a default column value from user-defined scalar function

This example shows a user-defined scalar function that adds two integer values. The function is called add2ints and takes two arguments.

  1. Develop and deploy the function, as described in Scalar Functions (UDSFs).
  2. Create a sample table, t1, with two integer columns:

    => CREATE TABLE t1 ( x int, y int );
    CREATE TABLE
    
  3. Insert some values into t1:

    => insert into t1 values (1,2); 
    OUTPUT 
    --------
          1
    (1 row)
    => insert into t1 values (3,4);
     OUTPUT 
    --------
          1
    (1 row)
    
  4. Use ALTER TABLE to add a column to t1, with the default column value derived from the UDSF add2ints:

    alter table t1 add column z int default add2ints(x,y);
    ALTER TABLE
    
  5. List the new column:

    select z from t1;
     z 
    ----
      3
      7
    (2 rows)
    

Table with a SET USING column that queries another table for its values

  1. Define tables t1 and t2. Column t2.b is defined to get its data from column t1.b, through the query in its SET USING clause:
    => CREATE TABLE t1 (a INT PRIMARY KEY ENABLED, b INT);
    CREATE TABLE
    
    => CREATE TABLE t2 (a INT, alpha VARCHAR(10), 
          b INT SET USING (SELECT t1.b FROM t1 WHERE t1.a=t2.a)) 
          ORDER BY a SEGMENTED BY HASH(a) ALL NODES;
    CREATE TABLE
    

    The definition for table t2 includes SEGMENTED BY and ORDER BY clauses that exclude SET USING column b. If these clauses are omitted, Vertica creates an auto-projection for this table that specifies column b in its SEGMENTED BY and ORDER BY clauses. Inclusion of a SET USING column in any projection's segmentation or sort order prevents function REFRESH_COLUMNS from populating this column. Instead, it returns with an error.

    For details on this and other restrictions, see REFRESH_COLUMNS.

  2. Populate the tables with data:
    => INSERT INTO t1 VALUES(1,11);
    => INSERT INTO t1 VALUES(2,22);
    => INSERT INTO t1 VALUES(3,33);
    => INSERT INTO t1 VALUES(4,44);
    => INSERT INTO t2 VALUES(1,'aa');
    => INSERT INTO t2 VALUES(2,'bb');
    => COMMIT;
    COMMIT
    
  3. View the data in table t2: Column in SET USING column b is empty, pending invocation of Vertica function REFRESH_COLUMNS:
  4. => SELECT * FROM t2;
     a | alpha | b
    ---+-------+---
     1 | aa    |
     2 | bb    |
    (2 rows)
    
  5. Refresh the column data in table t2 by calling function REFRESH_COLUMNS:
    => SELECT REFRESH_COLUMNS ('t2','b', 'REBUILD');
          REFRESH_COLUMNS
    ---------------------------
     refresh_columns completed
    (1 row)
    

    In this example, REFRESH_COLUMNS is called with the optional argument REBUILD. This argument specifies to replace all data in SET USING column b. It is generally good practice to call REFRESH_COLUMNS with REBUILD on any new SET USING column. For details, see REFRESH_COLUMNS.

  6. View data in refreshed column b, whose data is obtained from table t1 as specified in the column's SET USING query:
    => SELECT * FROM t2 ORDER BY a;
      a | alpha | b
    ---+-------+----
     1 | aa    | 11
     2 | bb    | 22
    (2 rows)