EXPONENTIAL_MOVING_AVERAGE [Analytic]

Calculates the exponential moving average (EMA) of expression E with smoothing factor X. An EMA differs from a simple moving average in that it provides a more stable picture of changes to data over time.

The EMA is calculated by adding the previous EMA value to the current data point scaled by the smoothing factor, as in the following formula:

EMA = EMA0 + (X * (E - EMA0))

where:

  • E is the current data point
  • EMA0 is the previous row's EMA value.
  • X is the smoothing factor.

This function also works at the row level. For example, EMA assumes the data in a given column is sampled at uniform intervals. If the users' data points are sampled at non-uniform intervals, they should run the time series gap filling and interpolation (GFI) operations before EMA()

Behavior Type

Immutable

Syntax

EXPONENTIAL_MOVING_AVERAGE ( E, X ) OVER ( 
... [ window-partition-clause ] 
... window-order-clause  )

Parameters

E

The value whose average is calculated over a set of rows. Can be INTEGER, FLOAT or NUMERIC type and must be a constant.

X A positive FLOAT value between 0 and 1 that is used as the smoothing factor.
OVER() See Analytic Functions.

Examples

The following example uses time series gap filling and interpolation (GFI) first in a subquery, and then performs an EXPONENTIAL_MOVING_AVERAGE operation on the subquery result.

Create a simple four-column table:

=> CREATE TABLE ticker(   
     time TIMESTAMP,
     symbol VARCHAR(8),
     bid1 FLOAT, 
     bid2 FLOAT ); 

Insert some data, including nulls, so GFI can do its interpolation and gap filling:

=> INSERT INTO ticker VALUES ('2009-07-12 03:00:00', 'ABC', 60.45, 60.44);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:01', 'ABC', 60.49, 65.12);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:02', 'ABC', 57.78, 59.25);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:03', 'ABC', null, 65.12);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:04', 'ABC', 67.88, null);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:00', 'XYZ', 47.55, 40.15);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:01', 'XYZ', 44.35, 46.78);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:02', 'XYZ', 71.56, 75.78);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:03', 'XYZ', 85.55, 70.21);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:04', 'XYZ', 45.55, 58.65);
=> COMMIT;

During gap filling and interpolation, Vertica takes the closest non null value on either side of the time slice and uses that value. For example, if you use a linear interpolation scheme and you do not specify IGNORE NULLS, and your data has one real value and one null, the result is null. If the value on either side is null, the result is null. See When Time Series Data Contains Nulls in Analyzing Data for details.

Query the table that you just created to you can see the output:

=> SELECT * FROM ticker;
        time         | symbol | bid1  | bid2
---------------------+--------+-------+-------
 2009-07-12 03:00:00 | ABC    | 60.45 | 60.44
 2009-07-12 03:00:01 | ABC    | 60.49 | 65.12
 2009-07-12 03:00:02 | ABC    | 57.78 | 59.25
 2009-07-12 03:00:03 | ABC    |       | 65.12
 2009-07-12 03:00:04 | ABC    | 67.88 |
 2009-07-12 03:00:00 | XYZ    | 47.55 | 40.15
 2009-07-12 03:00:01 | XYZ    | 44.35 | 46.78
 2009-07-12 03:00:02 | XYZ    | 71.56 | 75.78
 2009-07-12 03:00:03 | XYZ    | 85.55 | 70.21
 2009-07-12 03:00:04 | XYZ    | 45.55 | 58.65
(10 rows)

The following query processes the first and last values that belong to each 2-second time slice in table trades' column a. The query then calculates the exponential moving average of expression fv and lv with a smoothing factor of 50%:

=> SELECT symbol, slice_time, fv, lv,
     EXPONENTIAL_MOVING_AVERAGE(fv, 0.5) 
       OVER (PARTITION BY symbol ORDER BY slice_time) AS ema_first,
   EXPONENTIAL_MOVING_AVERAGE(lv, 0.5) 
       OVER (PARTITION BY symbol ORDER BY slice_time) AS ema_last
   FROM (
     SELECT symbol, slice_time,
        TS_FIRST_VALUE(bid1 IGNORE NULLS) as fv,
        TS_LAST_VALUE(bid2 IGNORE NULLS) AS lv 
      FROM ticker TIMESERIES slice_time AS '2 seconds' 
      OVER (PARTITION BY symbol ORDER BY time) ) AS sq;


 symbol |     slice_time      |  fv   |  lv   | ema_first | ema_last
--------+---------------------+-------+-------+-----------+----------
 ABC    | 2009-07-12 03:00:00 | 60.45 | 65.12 |     60.45 |    65.12
 ABC    | 2009-07-12 03:00:02 | 57.78 | 65.12 |    59.115 |    65.12
 ABC    | 2009-07-12 03:00:04 | 67.88 | 65.12 |   63.4975 |    65.12
 XYZ    | 2009-07-12 03:00:00 | 47.55 | 46.78 |     47.55 |    46.78
 XYZ    | 2009-07-12 03:00:02 | 71.56 | 70.21 |    59.555 |   58.495
 XYZ    | 2009-07-12 03:00:04 | 45.55 | 58.65 |   52.5525 |  58.5725
(6 rows)