Pre-Aggregating Data in Projections

Queries that use aggregate functions such as SUM and COUNT can perform more efficiently when they use projections that already contain the aggregated data. This improved efficiency is especially true for queries on large quantities of data.

For example, a power grid company reads 30 million smart meters that provide data at five-minute intervals. The company records each reading in a database table. Over a given year, three trillion records are added to this table.

The power grid company can analyze these records with queries that include aggregate functions to perform the following tasks:

  • Establish usage patterns.
  • Detect fraud.
  • Measure correlation to external events such as weather patterns or pricing changes.

To optimize query response time, you can create an aggregate projection, which stores the data is stored after it is aggregated.

Aggregate Projections

Vertica provides three types of projections for storing data that is returned from aggregate functions or expressions:

Recommended Use

  • Aggregate projections are most useful for queries against large sets of data.
  • For optimal query performance, the size of LAP projections should be a small subset of the anchor table—ideally, between 1 and 10 percent of the anchor table, or smaller, if possible.

Restrictions

  • DML operations on anchor tables are faster if the tables' load methods are set to DIRECT.
  • MERGE operations must be optimized if they are performed on target tables that have live aggregate projections.
  • You cannot update or delete data in temporary tables with live aggregate projections.

Requirements

In the event of manual recovery from an unclean database shutdown, live aggregate projections might require some time to refresh.

In This Section