Eliminating Partitions

If the ROS containers of partitioned tables are not needed, Vertica can eliminate the containers from being processed during query execution. To eliminate ROS containers, Vertica compares query predicates to partition-related metadata.

Each ROS partition expression column maintains the minimum and maximum values of data stored in that ROS, and Vertica uses those min/max values to potentially eliminate ROS containers from query planning. Partitions that cannot contain matching values are not scanned. For example, if a ROS does not contain data that satisfies a given query predicate, the optimizer eliminates (prunes) that ROS from the query plan. After non-participating ROS containers have been eliminated, queries that use partitioned tables run more quickly.

Note: Partition pruning occurs at query run time and requires a query predicate on the partitioning column.

Assume a table that is partitioned by year (2007, 2008, 2009) into three ROS containers, one for each year. Given the following series of commands, the two ROS containers that contain data for 2007 and 2008 fall outside the boundaries of the requested year (2009) and get eliminated.

=> CREATE TABLE ... PARTITION BY EXTRACT(year FROM date);
=> SELECT ... WHERE date = '12-2-2009'; 

Partition pruning example

Making Past Partitions Eligible for Elimination

The following procedure lets you make past partitions eligible for elimination. The easiest way to guarantee that all ROS containers are eligible is to:

  1. Create a fact table with the same projections as the existing table.
  2. Use INSERT..SELECT to populate the new table.
  3. Drop the original table and rename the new table.

If the disk lacks enough space for a second copy of the fact table, follow this procedure:

  1. Verify that the Tuple Mover finished all post-upgrade work—for example, when the following command shows no mergeout activity:

    => SELECT * FROM TUPLE_MOVER_OPERATIONS;
  2. Identify which partitions need to be merged to get the ROS minimum/maximum values:

    => SELECT DISTINCT table_schema, projection_name, partition_key
        FROM partitions p LEFT OUTER JOIN vs_ros_min_max_values v 
        ON p.ros_id = v.delid 
        WHERE v.min_value IS null;
    
  3. Insert a record into each partition that has ineligible ROS containers, and commit.
  4. Delete each inserted record and commit again.

At this point, the Tuple Mover automatically merges ROS containers from past partitions.

Verifying the ROS Merge

  1. Query the TUPLE_MOVER_OPERATIONS table:

    => SELECT * FROM TUPLE_MOVER_OPERATIONS;
  2. Check for any partitions that need to be merged:

    => SELECT DISTINCT table_schema, projection_name, partition_key
        FROM partitions p LEFT OUTER JOIN vs_ros_min_max_values v 
        ON p.ros_id = v.rosid 
        WHERE v.min_value IS null;
    

Examples

Assume a table that is partitioned by time and will use queries that restrict data on time.

=> CREATE TABLE time ( tdate DATE NOT NULL, tnum INTEGER)
     PARTITION BY EXTRACT(year FROM tdate);
=> CREATE PROJECTION time_p (tdate, tnum) AS
=> SELECT * FROM time ORDER BY tdate, tnum UNSEGMENTED ALL NODES;

Note: Projection sort order has no effect on partition elimination.

=> INSERT INTO time VALUES ('03/15/04' , 1);
=> INSERT INTO time VALUES ('03/15/05' , 2);
=> INSERT INTO time VALUES ('03/15/06' , 3);
=> INSERT INTO time VALUES ('03/15/06' , 4);

The data inserted in the previous series of commands are loaded into three ROS containers, one per year, as that is how the data is partitioned:

=> SELECT * FROM time ORDER BY tnum;
   tdate    | tnum
------------+------
 2004-03-15 |    1  --ROS1 (min 03/01/04, max 03/15/04)
 2005-03-15 |    2  --ROS2 (min 03/15/05, max 03/15/05)
 2006-03-15 |    3  --ROS3 (min 03/15/06, max 03/15/06)
 2006-03-15 |    4  --ROS3 (min 03/15/06, max 03/15/06)
(4 rows)

Here's what happens when you query the time table: