Computing Flex Table Keys
After loading data into a flex table, you can determine the set of keys that exist in the __raw__
column (the map data). Two helper functions compute keys from flex table map data:
- COMPUTE_FLEXTABLE_KEYS— Determines which keys exist as virtual columns in the flex map.
- COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW— Performs the same functionality as COMPUTE_FLEXTABLE_KEYS, additionally building a new view. See also Updating Flex Table Views.
Using COMPUTE_FLEXTABLE_KEYS
During execution, this function calculates the following information for the flex keys table columns:
Column | Description |
---|---|
|
The name of the virtual column (key). |
|
The number of times the key occurs in the map. |
|
The type guess that the helper functions determine for each distinct key in the map data. The function determines the type of each non-string value, depending on the length of the key, and whether the key includes nested maps. Changing the default value of the |
Determining Key Data Types
By default, using COMPUTE_FLEXTABLE_KEYS
determines non-string key values from the __raw__
column LONG VARBINARY type. The non-string keys include these data types (and others listed in SQL Data Types):
- BOOLEAN
- INTEGER
- FLOAT
- TIMESTAMP
- DATE
Assigning Flex Key Data Types
Use the sample CSV data in this section to compare the results of using or not using the EnableBetterFlexTypeGuessing
configuration parameter. When the parameter is ON, the function determines key non-string data types in your map data more accurately. The default for the parameter is 1 (ON).
Year,Quarter,Region,Species,Grade,Pond Value,Number of Quotes,Available 2015,1,2 - Northwest Oregon & Willamette,Douglas-fir,1P,$615.12 ,12,No 2015,1,2 - Northwest Oregon & Willamette,Douglas-fir,SM,$610.78 ,12,Yes 2015,1,2 - Northwest Oregon & Willamette,Douglas-fir,2S,$596.00 ,20,Yes 2015,1,2 - Northwest Oregon & Willamette,Hemlock,P,$520.00 ,6,Yes 2015,1,2 - Northwest Oregon & Willamette,Hemlock,SM,$510.00 ,6,No 2015,1,2 - Northwest Oregon & Willamette,Hemlock,2S,$490.00 ,14,No
To compare the data type assignment results, complete the following steps:
- Save the CSV data file (here, as
trees.csv
). - Create a flex table (
trees
) and loadtrees.csv
using thefcsvparser
: - Use
COMPUTE_FLEXTABLE_KEYS
with thetrees
flex table. - Query the
trees_keys
table output.: - Set the
EnableBetterFlexTypeGuessing
parameter to 0 (OFF). - Call
COMPUTE_FLEXTABLE_KEYS
with thetrees
flex table again. - Query the
trees_keys
table to compare thedata_type_guess
values with the previous results. Without the configuration parameter set, all of the non-string data types are VARCHARS of various lengths: - To maintain accurate results for non-string data types, set the
EnableBetterFlexTypeGuessing
parameter back to 1 (ON).
=> CREATE FLEX TABLE trees(); => COPY trees FROM '/home/dbadmin/tempdat/trees.csv' PARSER fcsvparser();
=> SELECT COMPUTE_FLEXTABLE_KEYS('trees'); COMPUTE_FLEXTABLE_KEYS ----------------------------------------------- Please see public.trees_keys for updated keys (1 row)
=> SELECT * FROM trees_keys; key_name | frequency | data_type_guess ------------------+-----------+----------------- Year | 6 | Integer Quarter | 6 | Integer Region | 6 | Varchar(66) Available | 6 | Boolean Number of Quotes | 6 | Integer Grade | 6 | Varchar(20) Species | 6 | Varchar(22) Pond Value | 6 | Numeric(8,3) (8 rows)
=> SELECT * FROM trees_keys; key_name | frequency | data_type_guess ------------------+-----------+----------------- Year | 6 | varchar(20) Quarter | 6 | varchar(20) Region | 6 | varchar(66) Available | 6 | varchar(20) Grade | 6 | varchar(20) Number of Quotes | 6 | varchar(20) Pond Value | 6 | varchar(20) Species | 6 | varchar(22) (8 rows)
For more information about setting the EnableBetterFlexTypeGuessing
configuration parameter, see Setting Flex Table Configuration Parameters.
Calculating Key Value Column Widths
The COMPUTE_FLEXTABLE_KEYS
function determines the column width for keys by determining the length of the largest value for each key, multiplied by the FlexTableDataTypeGuessMultiplier
factor. For more about this configuration parameter, see Setting Flex Table Configuration Parameters.
The next example shows the results of populating the _keys table after creating a flex table (darkdata1
) and loading data. The column widths are shown in parentheses, where applicable, after the value of the data_type_guess
column:
=> SELECT compute_flextable_keys('darkdata1'); compute_flextable_keys -------------------------------------------------- Please see public.darkdata1_keys for updated keys (1 row) => SELECT * from darkdata1_keys; key_name | frequency | data_type_guess ----------------------------------------------------------+-----------+---------------------- created_at | 8 | TimestampTz delete.status.id_str | 4 | Integer delete.status.user_id | 4 | Integer entities.hashtags | 8 | long varbinary(186) favorited | 8 | Boolean id_str | 8 | Integer in_reply_to_screen_name | 8 | Varchar(24) retweeted_status.contributors | 1 | Varchar(20) retweeted_status.coordinates | 1 | Varchar(20) retweeted_status.created_at | 1 | TimestampTz .
.
.
(125 rows)