verticapy.vDataColumn.slice

vDataColumn.slice(length: int, unit: str = 'second', start: bool = True) → vDataFrame

Slices and transforms the vDataColumn using a time series rule.

Parameters

length: int: Slice size.
unit: str, optional: Slice size unit. For example, ‘minute’, ‘hour’…
start: bool, optional: If set to True (the default), each record is mapped to the start (floor) of its time slice; if set to False, it is mapped to the end (ceiling) of the slice.

Returns

vDataFrame: the parent vDataFrame of the vDataColumn (self._parent).

Examples

Let’s begin by importing VerticaPy.

import verticapy as vp

Hint

By assigning an alias to verticapy, we mitigate the risk of code collisions with other libraries. This precaution is necessary because verticapy uses commonly known function names like “average” and “median”, which can potentially lead to naming conflicts. The use of an alias ensures that the functions from verticapy are used as intended without interfering with functions from other libraries.

Let us create a dummy dataset that has timestamp values:

vdf = vp.vDataFrame(
    {
        "time": [
            "1993-11-03 00:00:00",
            "1993-11-03 00:30:01",
            "1993-11-03 00:31:00",
            "1993-11-03 01:05:01",
            "1993-11-03 01:41:02",
            "1993-11-03 01:50:00",
        ],
        "val": [0., 1., 2., 4., 5., 4.],
    }
)

	time (Varchar(19))	val (Numeric(4))
1	1993-11-03 00:00:00	0.0
2	1993-11-03 00:30:01	1.0
3	1993-11-03 00:31:00	2.0
4	1993-11-03 01:05:01	4.0
5	1993-11-03 01:41:02	5.0
6	1993-11-03 01:50:00	4.0

The time column was created as Varchar, so we first convert it to a timestamp type:

vdf["time"].astype("datetime")

Next, we can conveniently slice the data into intervals of 30 minutes using:

vdf["time"].slice(30, "minute")

	time (Timestamp(29))	val (Numeric(4))
1	1993-11-03 00:00:00	0.0
2	1993-11-03 00:30:00	1.0
3	1993-11-03 00:30:00	2.0
4	1993-11-03 01:00:00	4.0
5	1993-11-03 01:30:00	5.0
6	1993-11-03 01:30:00	4.0
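To make the floor and ceiling behaviour of the start parameter concrete, the following pure-Python sketch reproduces the result above without a database. The helper slice_timestamp and the ORIGIN constant are hypothetical and not part of VerticaPy. The sketch assumes that slices are half-open intervals aligned to a fixed origin and that the end of a slice is its start plus the slice length.

```python
from datetime import datetime, timedelta

# Assumed alignment point for the slices (hypothetical, for illustration only).
ORIGIN = datetime(1970, 1, 1)


def slice_timestamp(ts, length, unit="second", start=True):
    """Map ts to the start or the end of the slice that contains it.

    Hypothetical helper, not a VerticaPy function. Supports units that
    timedelta accepts in plural form: 'second', 'minute', 'hour'.
    """
    step = timedelta(**{unit + "s": length})  # e.g. timedelta(minutes=30)
    n_slices = (ts - ORIGIN) // step          # whole slices since ORIGIN
    slice_start = ORIGIN + n_slices * step
    return slice_start if start else slice_start + step


times = [
    "1993-11-03 00:00:00",
    "1993-11-03 00:30:01",
    "1993-11-03 00:31:00",
    "1993-11-03 01:05:01",
    "1993-11-03 01:41:02",
    "1993-11-03 01:50:00",
]

# Floor every timestamp to its 30-minute slice, as in the example above.
sliced = [slice_timestamp(datetime.fromisoformat(t), 30, "minute") for t in times]
for original, result in zip(times, sliced):
    print(original, "->", result)
```

Passing start=False to the sketch returns the end of each slice instead, for example 01:00:00 for the record at 00:30:01.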

Note

While the same task can be accomplished using pure SQL (see below), adopting a Pythonic approach can offer greater convenience and help avoid potential syntax errors.

vdf["time_slice"] = "TIME_SLICE(time, 30, 'MINUTE')"
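As a rough illustration of how the three parameters of slice could map onto the arguments of Vertica's TIME_SLICE function, the sketch below builds the SQL expression as a string. The helper time_slice_expr is hypothetical and is not how VerticaPy is guaranteed to generate its SQL. It assumes that start=True corresponds to the 'START' option of TIME_SLICE and start=False to 'END'.

```python
def time_slice_expr(column, length, unit="second", start=True):
    """Build a TIME_SLICE expression string (hypothetical helper).

    column is inserted as-is, so quote it yourself if the name requires it.
    """
    boundary = "START" if start else "END"  # assumed mapping of `start`
    return f"TIME_SLICE({column}, {length}, '{unit.upper()}', '{boundary}')"


expr = time_slice_expr("time", 30, "minute")
print(expr)  # TIME_SLICE(time, 30, 'MINUTE', 'START')
```

With an active connection, a string like this could be assigned to a new column in the same way as the pure SQL expression above.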