Vertica Analytics Platform Version 9.2.x Documentation

TIMESERIES Clause

Provides gap-filling and interpolation (GFI) computation, an important component of time series analytics computation. See Time Series Analytics in Analyzing Data for details and examples.

Syntax

TIMESERIES slice‑time AS 'length‑and‑time‑unit‑expr' OVER ( 
... [ PARTITION BY (column‑expr[,…] ]
... ORDER BY time‑expr ) 
... [ ORDER BY table‑column[,…] ]

Parameters

slice‑time

A time column produced by the TIMESERIES clause, which stores the time slice start times generated from gap filling.

Note: This parameter is an alias, so you can use any name that an alias would take.

length‑and‑time‑unit‑expr

An INTERVAL DAY TO SECOND literal that specifies the length of time unit of time slice computation. For example:

TIMESERIES slice_time AS '3 seconds' ...

OVER() Specifies partitioning and ordering for the function. OVER() also specifies that the time series function operates on a query result set—that is, the rows that are returned after the FROM, WHERE, GROUP BY, and HAVING clauses are evaluated.
PARTITION BY (column‑expr[,…] ) Partitions the data by the specified column expressions. Gap filling and interpolation is performed on each partition separately
ORDER BY time‑expr Sorts the data by the TIMESTAMP expression time‑expr, which computes the time information of the time series data.

The TIMESERIES clause requires an ORDER BY operation on the timestamp column.

Notes

If the window-partition-clause is not specified in TIMESERIES OVER(), for each defined time slice, exactly one output record is produced; otherwise, one output record is produced per partition per time slice. Interpolation is computed there.

Given a query block that contains a TIMESERIES clause, the following are the semantic phases of execution (after evaluating the FROM and the optional WHERE clauses):

  1. Compute time-expression.
  2. Perform the same computation as the TIME_SLICE() function on each input record based on the result of time‑exp and 'length‑and‑time‑unit‑expr'.

    1. Perform gap filling to generate time slices missing from the input.
    2. Name the result of this computation as slice_time, which represents the generated “time series” column (alias) after gap filling.
  3. Partition the data by expression, slice‑time. For each partition, do step 4.
  4. Sort the data by time‑expr. Interpolation is computed here.

There is semantic overlap between the TIMESERIES clause and the TIME_SLICE function with the following key differences:

  • TIMESERIES only supports the interval qualifier DAY TO SECOND; it does not allow YEAR TO MONTH.
  • Unlike TIME_SLICE, the time slice length and time unit expressed in length‑and‑time‑unit‑expr must be constants so gaps in the time slices are well-defined.
  • TIMESERIES performs gap filling; the TIME_SLICE function does not.
  • TIME_SLICE can return the start or end time of a time slice, depending on the value of its fourth input parameter (start‑or‑end). TIMESERIES, on the other hand, always returns the start time of each time slice. To output the end time of each time slice, write a SELECT statement like the following:
    => SELECT slice_time + <slice_length>;

Restrictions

  • When the TIMESERIES clause occurs in a SQL query block, only the following clauses can be used in the same query block:
    • SELECT
    • FROM
    • WHERE
    • ORDER BY
  • GROUP BY and HAVING clauses are not allowed. If a GROUP BY operation is needed before or after gap-filling and interpolation (GFI), use a subquery and place the GROUP BY In the outer query. For example:
    => SELECT symbol, AVG(first_bid) as avg_bid FROM (
            SELECT symbol, slice_time, TS_FIRST_VALUE(bid1) AS first_bid
            FROM Tickstore 
            WHERE symbol IN ('MSFT', 'IBM') 
            TIMESERIES slice_time AS '5 seconds' OVER (PARTITION BY symbol ORDER BY ts)
            ) AS resultOfGFI
    GROUP BY symbol;
    
  • When the TIMESERIES clause is present in the SQL query block, the SELECT list can include only the following:

    For example, the following two queries return a syntax error because bid1 is not a PARTITION BY or GROUP BY column:

    => SELECT bid, symbol, TS_FIRST_VALUE(bid) FROM Tickstore
       TIMESERIES slice_time AS '5 seconds' OVER (PARTITION BY symbol ORDER BY ts);
       ERROR:  column "Tickstore.bid" must appear in the PARTITION BY list of Timeseries clause or be used in a Timeseries Output function
    => SELECT bid, symbol, AVG(bid) FROM Tickstore
       GROUP BY symbol;
       ERROR:  column "Tickstore.bid" must appear in the GROUP BY clause or be used in an aggregate function
    

Examples

For examples, see Gap Filling and Interpolation (GFI) in Analyzing Data.