Active and Inactive Partitions
The Tuple Mover assumes that all loads and updates to a partitioned table are targeted to one or more partitions that it identifies as active. In general, the partitions with the largest partition keys (typically, the most recently created partitions) are regarded as active. As a partition ages, it commonly transitions to a mostly read-only workload and sees much less load and update activity.
Setting Active Partition Count
You can specify how many partitions are active for partitioned tables at two levels:
- Configuration parameter ActivePartitionCount determines how many partitions are active for partitioned tables in the database. By default, ActivePartitionCount is set to 1. The Tuple Mover applies this setting to all tables that do not set their own active partition count.
- Individual tables can supersede ActivePartitionCount by setting their own active partition count with CREATE TABLE and ALTER TABLE.
Partitioned tables in the same database can be subject to different distributions of update and load activity. When these differences are significant, it might make sense for some tables to set their own active partition counts.
For example, table store_orders is partitioned by month and gets its active partition count from configuration parameter ActivePartitionCount. If the parameter is set to 1, the Tuple Mover identifies the latest month (typically, the current one) as the table's active partition. If store_orders is subject to frequent activity on data for the current month and the one before it, you might want the table to supersede the configuration parameter, and set its active partition count to 2:
ALTER TABLE public.store_orders SET ACTIVEPARTITIONCOUNT 2;
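The precedence between the two levels can be sketched in a few lines of Python. This is only an illustrative model of the rule described above, not Vertica code: the function name effective_active_partition_count and its arguments are hypothetical, and the default of 1 mirrors the documented default of ActivePartitionCount.

```python
from typing import Optional


def effective_active_partition_count(table_setting: Optional[int],
                                     database_setting: int = 1) -> int:
    """Model of the precedence rule (hypothetical helper, not a Vertica API).

    table_setting:    the table's own active partition count, or None if the
                      table does not set one.
    database_setting: the value of ActivePartitionCount (default 1).
    """
    if table_setting is not None:
        # A table-level setting supersedes the configuration parameter.
        return table_setting
    return database_setting


# store_orders before it sets its own count: the parameter value applies.
print(effective_active_partition_count(None))
# store_orders after its active partition count is set to 2.
print(effective_active_partition_count(2))
```

In this model, changing the database-level value affects only tables whose own setting is absent, which matches the behavior described for tables that do not set their own active partition count.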
For a table partitioned by non-temporal attributes, set its active partition count to reflect the number of partitions that are subject to a high level of activity, such as frequent loads or queries.
Identifying the Active Partition
The Tuple Mover typically identifies the active partition as the one most recently created. Vertica uses the following algorithm to determine which partitions are older than others:
- If partition X was created before partition Y, partition X is older.
- If partitions X and Y were created at the same time, but partition X was last updated before partition Y, partition X is older.
- If partitions X and Y were created and last updated at the same time, the partition with the smaller key is older.
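The three rules above amount to a lexicographic comparison on creation time, then last update time, then partition key. The following Python sketch models that ordering. It is an illustration under stated assumptions, not Vertica's implementation: the PartitionInfo record and the helper names are hypothetical, and keys are compared as strings, which orders ISO-formatted dates correctly.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class PartitionInfo:
    """Hypothetical record for one partition; not a Vertica structure."""
    key: str                # partition key, for example an ISO date string
    created: datetime       # when the partition was created
    last_updated: datetime  # when the partition was last updated


def age_sort_key(p: PartitionInfo) -> Tuple[datetime, datetime, str]:
    # Rule 1: earlier creation time is older.
    # Rule 2: on a tie, earlier last update is older.
    # Rule 3: on a further tie, the smaller key is older.
    return (p.created, p.last_updated, p.key)


def oldest_to_newest(partitions: List[PartitionInfo]) -> List[PartitionInfo]:
    """Return the partitions ordered from oldest to newest."""
    return sorted(partitions, key=age_sort_key)


def newest_partitions(partitions: List[PartitionInfo],
                      count: int) -> List[PartitionInfo]:
    """Return the `count` newest partitions, oldest first."""
    if count <= 0:
        return []
    return oldest_to_newest(partitions)[-count:]


same_time = datetime(2017, 9, 1)
parts = [
    PartitionInfo("2017-09-01", same_time, same_time),
    PartitionInfo("2017-08-01", same_time, same_time),
    PartitionInfo("2017-07-01", datetime(2017, 7, 1), datetime(2017, 9, 15)),
]
# The July partition was created first, so it is oldest; the other two tie
# on both timestamps, so the smaller key (August) is older.
print([p.key for p in oldest_to_newest(parts)])
# → ['2017-07-01', '2017-08-01', '2017-09-01']
```

Under this model, newest_partitions(parts, 1) returns the September partition, consistent with the most recently created partition typically being the active one.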
You can obtain the active partitions for a table by joining system tables PARTITIONS and STRATA and querying on its projections. For example, the following query gets the active partition for projection store_orders_super:
=> SELECT p.node_name, p.partition_key, p.ros_id, p.ros_size_bytes, p.ros_row_count, ROS_container_count
     FROM partitions p JOIN strata s ON p.partition_key = s.stratum_key AND p.node_name=s.node_name
     WHERE p.projection_name = 'store_orders_super' ORDER BY p.node_name, p.partition_key;
    node_name     | partition_key |      ros_id       | ros_size_bytes | ros_row_count | ROS_container_count
------------------+---------------+-------------------+----------------+---------------+---------------------
 v_vmart_node0001 | 2017-09-01    | 45035996279322851 |           6905 |           960 |                   1
 v_vmart_node0002 | 2017-09-01    | 49539595906590663 |           6905 |           960 |                   1
 v_vmart_node0003 | 2017-09-01    | 54043195533961159 |           6905 |           960 |                   1
(3 rows)
Active Partition Groups
If a table's partition clause includes a GROUP BY expression, Vertica applies the table's active partition count to its largest partition group key, and regards all the partitions in that group as active. If you group partitions with Vertica meta-function CALENDAR_HIERARCHY_DAY, the most recent date partitions are also grouped by day. Thus, the largest partition group key and largest partition key are identical. In effect, this means that only the most recent partitions are active.
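To make the effect of grouping concrete, the following Python sketch models how an active partition count might be applied to group keys rather than to individual partition keys. It is a simplified illustration, not Vertica's implementation: the helper names are hypothetical, applying a count N to the N largest group keys is an assumption made for this sketch, and the day_group_key function only stands in for the day-level grouping that the text attributes to the most recent partitions under CALENDAR_HIERARCHY_DAY.

```python
from collections import defaultdict
from datetime import date
from typing import Callable, Dict, Iterable, List


def month_group_key(partition_key: date) -> date:
    """Hypothetical grouping: all days of a month share one group key."""
    return partition_key.replace(day=1)


def day_group_key(partition_key: date) -> date:
    """Hypothetical grouping: each day is its own group."""
    return partition_key


def active_partitions_by_group(partition_keys: Iterable[date],
                               group_key_fn: Callable[[date], date],
                               active_count: int) -> List[date]:
    """Return the partition keys in the `active_count` largest groups."""
    if active_count <= 0:
        return []
    groups: Dict[date, List[date]] = defaultdict(list)
    for key in partition_keys:
        groups[group_key_fn(key)].append(key)
    # Assumption: the count selects the largest group keys, and every
    # partition in a selected group is regarded as active.
    active_groups = sorted(groups)[-active_count:]
    return sorted(k for g in active_groups for k in groups[g])


keys = [date(2017, 8, 30), date(2017, 8, 31), date(2017, 9, 1), date(2017, 9, 2)]

# Grouped by month with a count of 1: both September partitions are active.
print(active_partitions_by_group(keys, month_group_key, 1))
# Grouped by day with a count of 1: only the latest partition is active.
print(active_partitions_by_group(keys, day_group_key, 1))
```

The second call shows why day-level grouping of recent partitions leaves only the most recent partitions active: when each partition is its own group, the largest group key and the largest partition key coincide.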
For more information about partition grouping, see Partition Grouping and Hierarchical Partitioning.