Materializing Flex Tables

Once flex tables exist, you can change the table structure to promote virtual columns to materialized (real) columns. If your table is already a hybrid table, you can change existing real columns and promote other important virtual columns. This section describes some key aspects of promoting columns, adding columns, specifying constraints, and declaring default values. It also presents some differences when loading flex or hybrid tables, compared with columnar tables.

Materializing virtual columns by promoting them to real columns can significantly improve query performance. Vertica recommends that you materialize important virtual columns before running large and complex queries. Promoted columns cause a small decrease in load performance.

Adding Columns to Flex Tables

Add columns to your flex tables to promote virtual columns:

  1. Add a real column with the same name as a virtual column (user.name):

    => ALTER TABLE darkdata1 ADD COLUMN "user.name" VARCHAR;
    ALTER TABLE
    
  2. Load some data into the table.

     => COPY darkdata1 FROM '/vertica/flextable/DATA/tweets_12.json' PARSER fjsonparser();
     Rows Loaded
    -------------
              12
    (1 row)
    
  3. Query the materialized column. Notice that loading data populates the column automatically. Blank rows indicate no values or NULLs:

    => SELECT "user.name" FROM darkdata1;
          user.name
    ---------------------
     I'm Toasterâ¥
     Flu Beach
     seydo shi
     The End
     Uptown gentleman.
    
     ~G A B R I E L A â¿
     Avita Desai
     laughing at clouds.
    (12 rows)
    

Adding Columns with Default Values

The section Bulk Loading Data into Flex Tables describes the use of default values, and how Vertica evaluates them during loading. As with all tables, using COPY to load data ignores any column default values.

Adding a table column default expression to a flex table requires casting the column to an explicit data type.

  1. Create a darkdata1 table with a column definition. The following example uses a column name (talker) that does not correspond to a virtual column name. Assign a default value with a virtual column name. In this example, the default value for the column talker is ("user.lang"). Since user.lang is a virtual column in the LONG VARBINARY __raw__ column, you must cast its value to VARCHAR to match the talker column definition:

    => CREATE FLEX TABLE darkdata1(talker VARCHAR default "user.lang"::VARCHAR);
    CREATE TABLE
    
  2. Load some JSON data, specifying the __raw__ column:

  3. => COPY darkdata1 (__raw__) FROM '/test/vertica/flextable/DATA/tweets_12.json' 
       PARSER fjsonparser();
     Rows Loaded
    -------------
              12
    (1 row)
    
  4. Query the talker column. Notice that Vertica used the default column value ("user.lang"), because you specified __raw__. Blank rows indicate no values or NULLs:

    => SELECT "talker" FROM darkdata1;
     talker
    --------
     it
     en
     es
     en
     en
     es
     tr
     en
    (12 rows)
    
  5. Alter the table to add a column with a known virtual column name (user.name), assigning the key name as the default value (recommended), and casting it to a VARCHAR:

    => ALTER TABLE darkdata1 ADD COLUMN "user.name" VARCHAR default "user.name"::VARCHAR;
    ALTER TABLE
    
  6. Load data again, this time without __raw__:

    => COPY darkdata1 FROM '/test/vertica/flextable/DATA/tweets_12.json' PARSER fjsonparser();
  7. Query the two real columns. Notice that talker has no values, because you did not specifiy the __raw__ column. The user.lang column contains values from the user.name virtual column: 

    => SELECT "talker", "user.name" FROM darkdata1;
     talker |      user.name
    --------+---------------------
            | laughing at clouds.
            | Avita Desai
            | I'm Toasterâ¥
            |
            |
            |
            | Uptown gentleman.
            | ~G A B R I E L A â¿
            | Flu Beach
            |
            | seydo shi
            | The End
    (12 rows)
    
  8. Load data once more, this time specifying a COPY statement with a default value expression for user.name:

    => COPY darkdata1 (__raw__, "user.name" as 'QueenElizabeth'::varchar) FROM 
    '/test/vertica/flextable/DATA/tweets_12.json' PARSER fjsonparser();
     Rows Loaded
    -------------
              12
    (1 row)
    
  9. Query once more. Notice that the real column talker has its default values (you used __raw__). As specified in COPY, the "user.name" as 'QueenElizabeth' expression overrode the user.name default column value:

    => SELECT "talker", "user.name" FROM darkdata1;
     talker |   user.name
    --------+----------------
     it     | QueenElizabeth
     en     | QueenElizabeth
     es     | QueenElizabeth
            | QueenElizabeth
            | QueenElizabeth
            | QueenElizabeth
     en     | QueenElizabeth
     en     | QueenElizabeth
     es     | QueenElizabeth
            | QueenElizabeth
     tr     | QueenElizabeth
     en     | QueenElizabeth
    (12 rows)
    

To summarize, you can set a default column value as part of the ALTER TABLE…ADD COLUMN operation. For materializing columns, the default value should reference the key name of the virtual column (as in "user.lang"). Subsequently loading data with a COPY value expression overrides the default value of the column definition.

Changing the __raw__ Column Size

You can change the default size of the __raw__ column for flex tables you plan to create, the current size of an existing flex table, or both.

To change the default size for the flex table __raw__ column, use the following database configuration parameter (described in Setting Flex Table Configuration Parameters): 

=> ALTER DATABASE DEFAULT SET FlexTableRawSize = 120000;

Changing the configuration parameter affects all flex tables you create after making this change.

To change the size of the _raw_ column in an existing flex table, use the ALTER TABLE statement to change the definition of the __raw__ column: 

=> ALTER TABLE tester ALTER COLUMN __raw__ SET DATA TYPE LONG VARBINARY(120000);
ALTER TABLE

An error occurs if you try to reduce the __raw__ column size to a value smaller than any data the column contains.

Changing Flex Table Real Columns

You can make the following changes to the flex table real columns (__raw__ and __identity__), but not to any virtual columns:

Actions __raw__ __identity__

Change NOT NULL constraints (default)

Yes

Yes

Add primary key and foreign key (PK/FK) constraints

No

Yes

Create projections

No

Yes

Segment

No

Yes

Partition

No

Yes

Specify a user-defined scalar function (UDSF) as a default column expression in ALTER TABLE x ADD COLUMN y statement

No

No

While segmenting and partitioning the __raw__ column is permitted, it is not recommended due to its long data type. By default, if you not define any real columns, flex tables are segmented on the __identity__ column.

Dropping Flex Table Columns

There are two considerations about dropping columns: 

  • You cannot drop the last column in your flex table's sort order.
  • If you have not created a flex table with any real columns, or materialized any columns, you cannot drop the __identity__ column.