Defining Partitions
You can specify partitioning for a table when you initially define the table with CREATE TABLE
. Alternatively, you can specify partitioning for an existing table by modifying its definition with ALTER TABLE
. In the first case, Vertica automatically partitions data with each load operation. In the second case, you must explicitly repartition existing data with the Vertica function PARTITION_TABLE
.
A table definition specifies partitioning through a PARTITION BY
clause:
PARTITION BY expression
where expression resolves to a value derived from one or more table columns.
Partitioning a New Table
The following CREATE TABLE
statement creates the trade
table, which partitions data into separate years:
=> CREATE TABLE trade (
tdate DATE NOT NULL,
tsymbol VARCHAR(8) NOT NULL,
ttime TIME)
PARTITION BY EXTRACT (year FROM tdate);
CREATE TABLE
CREATE PROJECTION trade_p (tdate, tsymbol, ttime) AS
SELECT * FROM trade
ORDER BY tdate, tsymbol, ttime UNSEGMENTED ALL NODES;
INSERT INTO trade VALUES ('01/02/08' , 'AAA' , '13:00:00');
INSERT INTO trade VALUES ('02/04/09' , 'BBB' , '14:30:00');
INSERT INTO trade VALUES ('09/18/10' , 'AAA' , '09:55:00');
INSERT INTO trade VALUES ('05/06/09' , 'AAA' , '11:14:30');
INSERT INTO trade VALUES ('12/22/08' , 'BBB' , '15:30:00');
As new data is inserted into the table, Vertica implements its PARTITION BY
clause as follows:
Partitioning an Existing Table
Use the ALTER TABLE
statement to partition or repartition an existing table, as specified by the PARTITION BY
clause. On executing this statement, Vertica immediately drops any existing partition keys.
For example, the following statements create the unpartitioned trade
table, alter the table to add a constraint, and then insert data:
=> CREATE TABLE public.employee_dim( employee_key int NOT NULL, employee_gender varchar(8), courtesy_title varchar(8), employee_first_name varchar(64), employee_middle_initial varchar(8), employee_last_name varchar(64), employee_age int, hire_date date, employee_street_address varchar(256), employee_city varchar(64) NOT NULL, employee_state char(2), employee_region char(32) NOT NULL, job_title varchar(64), reports_to int, salaried_flag int, annual_salary int, hourly_rate float, vacation_days int ); CREATE TABLE => ALTER TABLE public.employee_dim ADD CONSTRAINT C_PRIMARY PRIMARY KEY (employee_key) ENABLED; ALTER TABLE => INSERT INTO employee_dim SELECT * from employee_dimension; OUTPUT -------- 10000 (1 row)
You can view how Vertica distributes the newly loaded data in various ROS containers across the cluster nodes, by querying the system table STORAGE_CONTAINERS
:
=> SELECT node_name, projection_name, storage_type, total_row_count FROM storage_containers WHERE projection_name ilike '%employee_dim_b%' ORDER BY projection_name; node_name | projection_name | storage_type | total_row_count ------------------+-----------------+--------------+----------------- v_vmart_node0001 | employee_dim_b0 | ROS | 3347 v_vmart_node0002 | employee_dim_b0 | ROS | 3321 v_vmart_node0003 | employee_dim_b0 | ROS | 3332 v_vmart_node0001 | employee_dim_b1 | ROS | 3332 v_vmart_node0002 | employee_dim_b1 | ROS | 3347 v_vmart_node0003 | employee_dim_b1 | ROS | 3321 (6 rows)
The following statement partitions the table on column employee_region
:
=> ALTER TABLE employee_dim PARTITION BY employee_region; => NOTICE 4954: The new partitioning scheme will produce 6 partitions WARNING 4493: Queries using table "employee_dim" may not perform optimally since the data may not be repartitioned in accordance with the new partition expression HINT: Use "ALTER TABLE public.employee_dim REORGANIZE;" to repartition the data ALTER TABLE
The partition clause omits the keyword REORGANIZE
, so table data remains unpartitioned. If you query the STORAGE_CONTAINERS
table again, Vertica returns the same or similar results as before. Vertica repartitions table data only when it executes a mergeout operation. You can initiate a mergeout with functions DO_TM_TASK
or PARTITION_TABLE
, or with an ALTER TABLE...REORGANIZE
statement:
=> SELECT PARTITION_TABLE('employee_dim'); PARTITION_TABLE ------------------------------------------------------------------------------ Task: partition operation (Table: public.employee_dim) (Projection: public.employee_dim_b0) (Table: public.employee_dim) (Projection: public.employee_dim_b1) (1 row)
Now when you query the STORAGE_CONTAINERS
table, the results confirm that employee_dim
data is repartitioned into logical divisions among ROS containers. The following query joins STORAGE_CONTAINERS
with system table PARTITIONS
. The truncated output below shows how Vertica redistributes the repartitioned data. The data is now ordered by partition keys (East
, Midwest
, etc.) within each projection buddy (employee_dim_b0
and employee_dim_b1
):
=> SELECT c.projection_name, p.partition_key, c.node_name, c.storage_type, c.total_row_count ROS_rows FROM STORAGE_CONTAINERS c JOIN PARTITIONS p ON c.projection_name=p.projection_name WHERE c.projection_name LIKE '%employee_dim%' ORDER BY c.projection_name, p.partition_key; projection_name | partition_key | node_name | storage_type | ROS_rows -----------------+---------------+------------------+--------------+---------- employee_dim_b0 | East | v_vmart_node0002 | ROS | 745 employee_dim_b0 | East | v_vmart_node0002 | ROS | 557 ... employee_dim_b0 | East | v_vmart_node0003 | ROS | 761 employee_dim_b0 | East | v_vmart_node0003 | ROS | 563 ... employee_dim_b0 | East | v_vmart_node0001 | ROS | 366 employee_dim_b0 | East | v_vmart_node0001 | ROS | 920 employee_dim_b0 | MidWest | v_vmart_node0002 | ROS | 745 employee_dim_b0 | MidWest | v_vmart_node0002 | ROS | 557 ... employee_dim_b0 | MidWest | v_vmart_node0003 | ROS | 761 employee_dim_b0 | MidWest | v_vmart_node0003 | ROS | 563 ... employee_dim_b0 | MidWest | v_vmart_node0001 | ROS | 366 employee_dim_b0 | MidWest | v_vmart_node0001 | ROS | 920 employee_dim_b0 | NorthWest | v_vmart_node0002 | ROS | 745 employee_dim_b0 | NorthWest | v_vmart_node0002 | ROS | 557 ... employee_dim_b0 | NorthWest | v_vmart_node0003 | ROS | 761 employee_dim_b0 | NorthWest | v_vmart_node0003 | ROS | 563 ... employee_dim_b0 | NorthWest | v_vmart_node0001 | ROS | 366 employee_dim_b0 | NorthWest | v_vmart_node0001 | ROS | 920 employee_dim_b0 | South | v_vmart_node0002 | ROS | 745 employee_dim_b0 | South | v_vmart_node0002 | ROS | 557 ... employee_dim_b0 | South | v_vmart_node0002 | ROS | 632 employee_dim_b0 | South | v_vmart_node0002 | ROS | 95 employee_dim_b0 | South | v_vmart_node0003 | ROS | 761 employee_dim_b0 | South | v_vmart_node0003 | ROS | 563 ... employee_dim_b0 | South | v_vmart_node0001 | ROS | 366 employee_dim_b0 | South | v_vmart_node0001 | ROS | 920 employee_dim_b0 | SouthWest | v_vmart_node0002 | ROS | 745 employee_dim_b0 | SouthWest | v_vmart_node0002 | ROS | 557 ... employee_dim_b0 | SouthWest | v_vmart_node0003 | ROS | 761 employee_dim_b0 | SouthWest | v_vmart_node0003 | ROS | 563 ... employee_dim_b0 | SouthWest | v_vmart_node0001 | ROS | 366 employee_dim_b0 | SouthWest | v_vmart_node0001 | ROS | 920 employee_dim_b0 | West | v_vmart_node0002 | ROS | 745 employee_dim_b0 | West | v_vmart_node0002 | ROS | 557 ... employee_dim_b0 | West | v_vmart_node0003 | ROS | 761 employee_dim_b0 | West | v_vmart_node0003 | ROS | 563 ... employee_dim_b0 | West | v_vmart_node0001 | ROS | 366 employee_dim_b0 | West | v_vmart_node0001 | ROS | 920 employee_dim_b1 | East | v_vmart_node0001 | ROS | 350 employee_dim_b1 | East | v_vmart_node0001 | ROS | 350 ... employee_dim_b1 | East | v_vmart_node0003 | ROS | 954 employee_dim_b1 | East | v_vmart_node0001 | ROS | 921 employee_dim_b1 | MidWest | v_vmart_node0001 | ROS | 350 employee_dim_b1 | MidWest | v_vmart_node0001 | ROS | 350 ... employee_dim_b1 | MidWest | v_vmart_node0003 | ROS | 954 employee_dim_b1 | MidWest | v_vmart_node0001 | ROS | 921 employee_dim_b1 | NorthWest | v_vmart_node0001 | ROS | 350 employee_dim_b1 | NorthWest | v_vmart_node0001 | ROS | 350 ... employee_dim_b1 | NorthWest | v_vmart_node0003 | ROS | 954 employee_dim_b1 | NorthWest | v_vmart_node0001 | ROS | 921 employee_dim_b1 | South | v_vmart_node0001 | ROS | 350 employee_dim_b1 | South | v_vmart_node0001 | ROS | 350 ... employee_dim_b1 | South | v_vmart_node0003 | ROS | 954 employee_dim_b1 | South | v_vmart_node0001 | ROS | 921 employee_dim_b1 | SouthWest | v_vmart_node0001 | ROS | 350 employee_dim_b1 | SouthWest | v_vmart_node0001 | ROS | 350 ... employee_dim_b1 | SouthWest | v_vmart_node0003 | ROS | 954 employee_dim_b1 | SouthWest | v_vmart_node0001 | ROS | 921 employee_dim_b1 | West | v_vmart_node0001 | ROS | 350 employee_dim_b1 | West | v_vmart_node0001 | ROS | 350 ... employee_dim_b1 | West | v_vmart_node0003 | ROS | 954 employee_dim_b1 | West | v_vmart_node0001 | ROS | 921 (648 rows)
Caution: If you repartition a table without specifying REORGANIZE
, Vertica stores new data according to the new partition expression, while existing data storage remains unchanged. Until you explicitly partition all table data, using either with ALTER TABLE...REORGANIZE
statement, or with PARTITION_TABLE
, query performance, DROP_PARTITION
, and node recovery can be adversely affected.