MERGE Optimization
You can improve MERGE
performance in several ways:
- Design projections for optimal
MERGE
performance. - Facilitate creation of optimized query plans.
- Use source tables that are smaller than target tables.
Projections for MERGE Operations
The Vertica query optimizer automatically chooses the best projections to implement a merge operation. A good projection design strategy provides projections that help the query optimizer avoid extra sort and data transfer operations, and facilitate MERGE
performance.
You can rely on Database Designer to generate projections that address merge requirements. You can then customize these projections as needed.
For example, the following MERGE
statement fragment joins source and target tables tgt
and src
, respectively, on columns tgt.a
and src.b
:
=> MERGE INTO tgt USING src ON tgt.a = src.b ...
Vertica can use a local merge join if projections for tables tgt
and src
use one of the following projection designs, where inputs are presorted by projection ORDER BY
clauses:
-
Replicated projections are sorted on:
- Column
a
for tabletgt
- Column
b
for tablesrc
- Column
-
Segmented projections are identically segmented on:
- Column
a
for tabletgt
- Column
b
for tablesrc
- Corresponding segmented columns
- Column
Optimizing MERGE Query Plans
Vertica prepares an optimized query plan if the following conditions are all true:
- The
MERGE
statement contains both matching clausesWHEN MATCHED THEN UPDATE SET
andWHEN NOT MATCHED THEN INSERT
. If theMERGE
statement contains only one matching clause, it uses a non-optimized query plan. - The
MERGE
statement excludes update and insert filters. - The target table join column has a unique or primary key constraint. This requirement does not apply to the source table join column.
- Both matching clauses specify all columns in the target table.
- Both matching clauses specify identical source values.
For details on evaluating an EXPLAIN
-generated query plan, see MERGE Path.
The examples that follow use a simple schema to illustrate some of the conditions under which Vertica prepares or does not prepare an optimized query plan for MERGE
:
CREATE TABLE target(a INT PRIMARY KEY, b INT, c INT) ORDER BY b,a; CREATE TABLE source(a INT, b INT, c INT) ORDER BY b,a; INSERT INTO target VALUES(1,2,3); INSERT INTO target VALUES(2,4,7); INSERT INTO source VALUES(3,4,5); INSERT INTO source VALUES(4,6,9); COMMIT;
Optimized MERGE statement
Vertica can prepare an optimized query plan for the following MERGE
statement because:
- The target table's join column
t.a
has a primary key constraint. - All columns in the target table
(a,b,c)
are included in theUPDATE
andINSERT
clauses. - The
UPDATE
andINSERT
clauses specify identical source values:s.a
,s.b
, ands.c
.
MERGE INTO target t USING source s ON t.a = s.a WHEN MATCHED THEN UPDATE SET a=s.a, b=s.b, c=s.c WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a, s.b, s.c); OUTPUT -------- 2 (1 row)
The output value of 2 indicates success and denotes the number of rows updated/inserted from the source into the target.
Non-optimized MERGE statement
In the next example, the MERGE
statement runs without optimization because the source values in the UPDATE/INSERT
clauses are not identical. Specifically, the UPDATE
clause includes constants for columns s.a
and s.c
and the INSERT
clause does not:
MERGE INTO target t USING source s ON t.a = s.a WHEN MATCHED THEN UPDATE SET a=s.a + 1, b=s.b, c=s.c - 1 WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a, s.b, s.c);
To make the previous MERGE
statement eligible for optimization, rewrite the statement so that the source values in the UPDATE
and INSERT
clauses are identical:
MERGE INTO target t USING source s ON t.a = s.a WHEN MATCHED THEN UPDATE SET a=s.a + 1, b=s.b, c=s.c -1 WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a + 1, s.b, s.c - 1);