Prioritizing Column Access Speed
If you measure and set the performance of storage locations within your cluster, Vertica uses this information to determine where to store columns based on their rank. For more information, see Setting Storage Performance.
How Columns are Ranked
Vertica stores columns included in the projection sort order on the fastest available storage locations. Columns not included in the projection sort order are stored on slower disks. Columns for each projection are ranked as follows:
- Columns in the sort order are given the highest priority (numbers > 1000).
- The last column in the sort order is given the rank number 1001.
- The next-to-last column in the sort order is given the rank number 1002, and so on until the first column in the sort order is given 1000 + # of sort columns.
- The remaining columns are given numbers from 1000–1, starting with 1000 and decrementing by one per column.
Vertica then stores columns on disk from the highest ranking to the lowest ranking. It places highest-ranking columns on the fastest disks and the lowest-ranking columns on the slowest disks.
You can modify which columns are stored on fast disks by manually overriding the default ranks for these columns. To accomplish this, set the
ACCESSRANK keyword in the column list. Make sure to use an integer that is not already being used for another column. For example, if you want to give a column the fastest access rank, use a number that is significantly higher than 1000 + the number of sort columns. This allows you to enter more columns over time without bumping into the access rank you set.
The following example sets column
store_key's access rank to 1500:
CREATE PROJECTION retail_sales_fact_p (
store_key ENCODING RLE ACCESSRANK 1500,
pos_transaction_number ENCODING RLE,
ORDER BY store_key
SEGMENTED BY HASH(pos_transaction_number) ALL NODES;