MERGE Optimization

You can improve MERGE performance in several ways:

Projections for MERGE Operations

The Vertica query optimizer automatically chooses the best projections to implement a merge operation. A good projection design strategy provides projections that help the query optimizer avoid extra sort and data transfer operations, and facilitate MERGE performance.

You can rely on Database Designer to generate projections that address merge requirements. You can then customize these projections as needed.

For example, the following MERGE statement fragment joins source and target tables tgt and src, respectively, on columns tgt.a and src.b:

=> MERGE INTO tgt USING src ON tgt.a = src.b ...

Vertica can use a local merge join if projections for tables tgt and src use one of the following projection designs, where inputs are presorted by projection ORDER BY clauses:

  • Replicated projections are sorted on:

    • Column a for table tgt
    • Column b for table src
  • Segmented projections are identically segmented on:

    • Column a for table tgt
    • Column b for table src
    • Corresponding segmented columns

Optimizing MERGE Query Plans

Vertica prepares an optimized query plan if the following conditions are all true:

  • The MERGE statement contains both matching clauses WHEN MATCHED THEN UPDATE SET and WHEN NOT MATCHED THEN INSERT. If the MERGE statement contains only one matching clause, it uses a non-optimized query plan.
  • The MERGE statement excludes update and insert filters.
  • The target table join column has a unique or primary key constraint. This requirement does not apply to the source table join column.
  • Both matching clauses specify all columns in the target table.
  • Both matching clauses specify identical source values.

For details on evaluating an EXPLAIN-generated query plan, see MERGE Path.

The examples that follow use a simple schema to illustrate some of the conditions under which Vertica prepares or does not prepare an optimized query plan for MERGE:

CREATE TABLE target(a INT PRIMARY KEY, b INT, c INT) ORDER BY b,a;
CREATE TABLE source(a INT, b INT, c INT) ORDER BY b,a;
INSERT INTO target VALUES(1,2,3);
INSERT INTO target VALUES(2,4,7);
INSERT INTO source VALUES(3,4,5);
INSERT INTO source VALUES(4,6,9);
COMMIT;

Optimized MERGE statement

Vertica can prepare an optimized query plan for the following MERGE statement because:

  • The target table's join column t.a has a primary key constraint.
  • All columns in the target table (a,b,c) are included in the UPDATE and INSERT clauses.
  • The UPDATE and INSERT clauses specify identical source values: s.a, s.b, and s.c.
MERGE INTO target t USING source s ON t.a = s.a      
  WHEN MATCHED THEN UPDATE SET a=s.a, b=s.b, c=s.c
  WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a, s.b, s.c);

 OUTPUT
--------
2
(1 row)

The output value of 2 indicates success and denotes the number of rows updated/inserted from the source into the target.

Non-optimized MERGE statement

In the next example, the MERGE statement runs without optimization because the source values in the UPDATE/INSERT clauses are not identical. Specifically, the UPDATE clause includes constants for columns s.a and s.c and the INSERT clause does not:

MERGE INTO target t USING source s ON t.a = s.a      
  WHEN MATCHED THEN UPDATE SET a=s.a + 1, b=s.b, c=s.c - 1
  WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a, s.b, s.c);

To make the previous MERGE statement eligible for optimization, rewrite the statement so that the source values in the UPDATE and INSERT clauses are identical:

MERGE INTO target t USING source s ON t.a = s.a      
  WHEN MATCHED THEN UPDATE SET a=s.a + 1, b=s.b, c=s.c -1
  WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a + 1, s.b, s.c - 1);