Rear view of two partially unrecognizable men sitting in front of a computer. One of them is pointing at a screen where are several lines of computer code.
This blog post was authored by Soniya Shah.
Vertica 8.1 introduced new functionality for the MERGE statement. In this post, we discuss new functionality for MERGE that allows users to filter conditions on INSERT and UPDATE clauses in a MERGE statement.
The MERGE operation allows users to join the target table on another table, a view, or a subquery result set. The new syntax allows users to specify an UPDATE filter and an INSERT filter as follow. The MERGE will INSERT new rows, if an ID does not match, and the INSERT filter conditions are met or UPDATE rows if there is an ID match, the UPDATE filter conditions are met, and at least one column value is different. The syntax is efficient because there is better control of when you can update or insert.
The goal of this new syntax is to limit the number of rows affected by the MERGE statement on a target table. For example, if you are merging source table rows that differ from the matching destination table rows, only the rows with a change are updated. This saves time because identical rows are never updated.
The new syntax is depicted as follows:
MERGE [ /*+ [, ] */ ]
INTO [[.].] [ [AS] ]
You might wonder when to use MERGE with filters and when to use MERGE with subqueries. Often, it depends on what you are looking to merge. To filter rows from a table using just its attributes, use MERGE with subqueries. Both options work in every scenario when you must filter a predicate from the table, such as merging all records after a certain timestamp.
Use this new syntax on wide tables or on tables with many projections. In both cases, you might want to skip identical rows, and use this syntax to save time and resources.
You should NOT use the new MERGE syntax in place of the optimized MERGE statement.
For more information, see Update and Insert Filters in the Vertica documentation and check out our previous post about how to use MERGE to demonstrate Vertica’s performance at scale here.
Currently, a first year law student with a background in science and technology. Experienced technical writer, with specializations in software documentation, big data, blog development, and website development. I build user-centered content to communicate complex and technical information more easily.
I used to work for Vertica full time for about 3 years. I still work at Vertica part time while going to law school.
This blog post was authored by Curtis Bennett. Vertica stores physical data for tables in objects known as projections. Unlike traditional RDBMS's, Vertica does not rely on indexes for performance. Instead, Vertica stores the physical data (either all or some of the columns) in whatever sort order is required for optimal query processing. This can...
This blog post was co-authored by Xiao Ling and Jim Kelley. Introduction When resource pools were first introduced to Vertica, the average computer had a lot less memory than it does today. The default memory size for the Tuple Mover resource pool, 200 MB, reflects the more limited resources of that period. As hardware and...
This blog post was authored by Curtis Bennett. In Vertica, projections can either be replicated (unsegmented), or segmented. A segmented projection divides the data up across all the nodes in your cluster. Segmentation works by hashing a key value, and then using some simple math, figuring out which node that piece of data will live...
This blog post was authored by Eugenia Moreno. It is common to create new projections to improve performance in Vertica. However, you might forget about the old projections. Vertica still loads data in projections that you might not be using. A projection that is loaded but not picked up by the Vertica optimizer consumes storage...