DROP_PARTITIONS function to drop one or more partition keys for a given table. You can specify a single partition key or a range of partition keys.
For example, the table shown in Partitioning a New Table is partitioned by column
=> CREATE TABLE public.store_orders ( order_no int, order_date timestamp NOT NULL, shipper varchar(20), ship_date date ) PARTITION BY YEAR(order_date);
Given this table definition, Vertica creates a partition key for each unique
order_date year—in this case, 2017, 2016, 2015, and 2014—and divides the data into separate ROS containers accordingly.
DROP_PARTITIONS statement drops from table
store_orders all order records associated with partition key 2014:
=> SELECT DROP_PARTITIONS ('store_orders', 2014, 2014); Partition dropped
If a table partition clause includes a
GROUP BY clause, partitions are consolidated in the ROS by their partition group keys.
DROP_PARTITIONS can then specify a range of partition keys within a given partition group, or across multiple partition groups. In either case, the drop operation requires Vertica to split the ROS containers that store these partitions. To do so, the function's
force_split parameter must be set to true.
For example, the
store_orders table shown above can be repartitioned with a
GROUP BY clause as follows:
=> ALTER TABLE store_orders PARTITION BY order_date::DATE GROUP BY DATE_TRUNC('year', (order_date)::DATE) REORGANIZE;
With all 2014 order records having been dropped earlier,
order_date values now span three years—2017, 2016, and 2015. Accordingly, the Tuple Mover creates three partition group keys for each year, and designates one or more ROS containers for each group. It then merges
store_orders partitions into the appropriate groups.
DROP_PARTITIONS statement specifies to drop order dates that span two years, 2014 and 2015:
=> SELECT DROP_PARTITIONS('store_orders', '2015-05-30', '2016-01-16', 'true'); Partition dropped
The drop operation requires Vertica to drop partitions from two partition groups—2015 and 2016. These groups span at least two ROS containers, which must be split in order to remove the target partitions. Accordingly, the function's
force_split parameter is set to true.
Scheduling Partition Drops
If your hardware has fixed disk space, you might need to configure a regular process to roll out old data by dropping partitions.
For example, if you have only enough space to store data for a fixed number of days, configure Vertica to drop the oldest partition keys. To do so, create a time-based job scheduler such as
cron to schedule dropping the partition keys during low-load periods.
If the ingest rate for data has peaks and valleys, you can use two techniques to manage how you drop partition keys:
- Set up a process to check the disk space on a regular (daily) basis. If the percentage of used disk space exceeds a certain threshold—for example, 80%—drop the oldest partition keys.
- Add an artificial column in a partition that increments based on a metric like row count. For example, that column might increment each time the row count increases by 100 rows. Set up a process that queries this column on a regular (daily) basis. If the value in the new column exceeds a certain threshold—for example, 100—drop the oldest partition keys, and set the column value back to 0.
If the operation cannot obtain an O lock on the target table, Vertica tries to close any internal Tuple Mover sessions that are running on that table. If successful, the operation can proceed. Explicit Tuple Mover operations that are running in user sessions do not close. If an explicit Tuple Mover operation is running on the table, the operation proceeds only when the operation is complete.
DROP_PARTITIONS acquires an exclusive O lock on the target table to block any DML operation (
COPY) that might affect table data. The lock also blocks
SELECT statements that are issued at SERIALIZABLE isolation level.