vDataFrame.interpolate

In [ ]:
vDataFrame.interpolate(ts: str,
                  rule: (str, datetime.timedelta),
                  method: dict,
                  by: list = [])

Returns a vDataFrame resampled at regular time intervals, filling in missing values with the interpolation method chosen for each column.

Parameters

Name | Type | Optional | Description
ts | str | no | TS (Time Series) vcolumn used to order the data. Its type must be date-like (date, datetime, timestamp, ...).
rule | str / datetime.timedelta | no | Interval used to create the time slices. The interpolated result contains one record per slice. For example, '5 minutes' creates records spaced 5 minutes apart.
method | dict | no | Dictionary of interpolation methods, in the format {"column1": "interpolation1", ..., "columnk": "interpolationk"}. Each interpolation method must be one of the following:
  • bfill : Interpolates with the final value of the time slice. With gaps, the behavior is similar to ffill.
  • ffill : Interpolates with the first value of the time slice.
  • linear : Linear interpolation.
by | list | yes | vcolumns used in the partition.
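
The ffill and bfill rules described above can be modeled in a few lines of plain Python. This is only a simplified sketch of the descriptions, not VerticaPy's implementation: the helper name fill_slices is hypothetical, slices are assumed to start at the earliest timestamp, and an empty slice is assumed to reuse the most recent observed value. The database may treat slice boundaries differently.

```python
from datetime import datetime, timedelta


def fill_slices(points, step, how):
    """Model 'ffill' / 'bfill' over fixed-width time slices (hypothetical helper).

    points: list of (datetime, value) pairs, sorted by time.
    step:   slice width as a timedelta.
    how:    'ffill' (first value in the slice) or 'bfill' (final value).
    Assumption: an empty slice reuses the most recent observed value.
    """
    if how not in ("ffill", "bfill"):
        raise ValueError(f"unsupported method: {how!r}")
    if not points:
        return []
    end = points[-1][0]
    out, i, last_seen = [], 0, None
    slice_start = points[0][0]  # assumption: slices align to the first timestamp
    while slice_start <= end:
        slice_end = slice_start + step
        inside = []
        # Collect every value whose timestamp falls in [slice_start, slice_end).
        while i < len(points) and points[i][0] < slice_end:
            inside.append(points[i][1])
            i += 1
        if inside:
            value = inside[0] if how == "ffill" else inside[-1]
            last_seen = inside[-1]
        else:
            value = last_seen  # gap: carry the last observed value forward
        out.append((slice_start, value))
        slice_start = slice_end
    return out


# Sample shaped like the one in the Example section: seconds 0-5, then 61-65.
t0 = datetime(1993, 11, 3)
points = [(t0 + timedelta(seconds=s), float(s))
          for s in list(range(6)) + list(range(61, 66))]

ffill_1s = fill_slices(points, timedelta(seconds=1), "ffill")
bfill_2s = fill_slices(points, timedelta(seconds=2), "bfill")
print(len(ffill_1s), ffill_1s[6])
print(len(bfill_2s), bfill_2s[0], bfill_2s[30])
```

On this sample the model reproduces the gap behavior noted for bfill: once a slice is empty, both methods carry the last observed value forward.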

Returns

vDataFrame : object resulting from the interpolation.

Example

In [26]:
from verticapy import tablesample
ts = tablesample({"datetime": ["1993-11-03 00:00:00", 
                               "1993-11-03 00:00:01", 
                               "1993-11-03 00:00:02",
                               "1993-11-03 00:00:03",
                               "1993-11-03 00:00:04",
                               "1993-11-03 00:00:05",
                               "1993-11-03 00:01:01",
                               "1993-11-03 00:01:02",
                               "1993-11-03 00:01:03",
                               "1993-11-03 00:01:04",
                               "1993-11-03 00:01:05",],
                  "val": [0., 1., 2., 3., 4., 5., 61., 62., 63., 64., 65.,]})
ts = ts.to_vdf()
ts["datetime"].astype("datetime")
display(ts)
    datetime             val
    Datetime             Numeric(3,1)
 1  1993-11-03 00:00:00  0.0
 2  1993-11-03 00:00:01  1.0
 3  1993-11-03 00:00:02  2.0
 4  1993-11-03 00:00:03  3.0
 5  1993-11-03 00:00:04  4.0
 6  1993-11-03 00:00:05  5.0
 7  1993-11-03 00:01:01  61.0
 8  1993-11-03 00:01:02  62.0
 9  1993-11-03 00:01:03  63.0
10  1993-11-03 00:01:04  64.0
11  1993-11-03 00:01:05  65.0
Rows: 1-11 | Columns: 2
In [36]:
# Linear interpolation by second
ts.interpolate(ts = "datetime",
          rule = "1 second",
          method = {"val": "linear"},)
Out[36]:
    datetime             val
    Timestamp            Float
 1  1993-11-03 00:00:00  0.0
 2  1993-11-03 00:00:01  1.0
 3  1993-11-03 00:00:02  2.0
 4  1993-11-03 00:00:03  3.0
 5  1993-11-03 00:00:04  4.0
 6  1993-11-03 00:00:05  5.0
 7  1993-11-03 00:00:06  6.0
 8  1993-11-03 00:00:07  7.0
 9  1993-11-03 00:00:08  8.0
10  1993-11-03 00:00:09  9.0
11  1993-11-03 00:00:10  10.0
12  1993-11-03 00:00:11  11.0
13  1993-11-03 00:00:12  12.0
14  1993-11-03 00:00:13  13.0
15  1993-11-03 00:00:14  14.0
16  1993-11-03 00:00:15  15.0
17  1993-11-03 00:00:16  16.0
18  1993-11-03 00:00:17  17.0
19  1993-11-03 00:00:18  18.0
20  1993-11-03 00:00:19  19.0
21  1993-11-03 00:00:20  20.0
22  1993-11-03 00:00:21  21.0
23  1993-11-03 00:00:22  22.0
24  1993-11-03 00:00:23  23.0
25  1993-11-03 00:00:24  24.0
26  1993-11-03 00:00:25  25.0
27  1993-11-03 00:00:26  26.0
28  1993-11-03 00:00:27  27.0
29  1993-11-03 00:00:28  28.0
30  1993-11-03 00:00:29  29.0
31  1993-11-03 00:00:30  30.0
32  1993-11-03 00:00:31  31.0
33  1993-11-03 00:00:32  32.0
34  1993-11-03 00:00:33  33.0
35  1993-11-03 00:00:34  34.0
36  1993-11-03 00:00:35  35.0
37  1993-11-03 00:00:36  36.0
38  1993-11-03 00:00:37  37.0
39  1993-11-03 00:00:38  38.0
40  1993-11-03 00:00:39  39.0
41  1993-11-03 00:00:40  40.0
42  1993-11-03 00:00:41  41.0
43  1993-11-03 00:00:42  42.0
44  1993-11-03 00:00:43  43.0
45  1993-11-03 00:00:44  44.0
46  1993-11-03 00:00:45  45.0
47  1993-11-03 00:00:46  46.0
48  1993-11-03 00:00:47  47.0
49  1993-11-03 00:00:48  48.0
50  1993-11-03 00:00:49  49.0
51  1993-11-03 00:00:50  50.0
52  1993-11-03 00:00:51  51.0
53  1993-11-03 00:00:52  52.0
54  1993-11-03 00:00:53  53.0
55  1993-11-03 00:00:54  54.0
56  1993-11-03 00:00:55  55.0
57  1993-11-03 00:00:56  56.0
58  1993-11-03 00:00:57  57.0
59  1993-11-03 00:00:58  58.0
60  1993-11-03 00:00:59  59.0
61  1993-11-03 00:01:00  60.0
62  1993-11-03 00:01:01  61.0
63  1993-11-03 00:01:02  62.0
64  1993-11-03 00:01:03  63.0
65  1993-11-03 00:01:04  64.0
66  1993-11-03 00:01:05  65.0
Rows: 1-66 | Columns: 2
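
The values filled into the gap between 00:00:05 (5.0) and 00:01:01 (61.0) lie on the straight line through those two observations. The arithmetic can be checked with a small sketch; linear_value is a hypothetical helper written for illustration and is not part of VerticaPy.

```python
from datetime import datetime


def linear_value(t, t0, v0, t1, v1):
    """Value at time t on the straight line through (t0, v0) and (t1, v1)."""
    fraction = (t - t0) / (t1 - t0)  # timedelta / timedelta gives a float
    return v0 + (v1 - v0) * fraction


# The two observations that bound the gap in the sample data.
left = (datetime(1993, 11, 3, 0, 0, 5), 5.0)
right = (datetime(1993, 11, 3, 0, 1, 1), 61.0)

# 25 of the 56 seconds have elapsed at 00:00:30, so the result is
# 5 + 56 * 25/56, i.e. approximately 30.0 (up to floating-point rounding).
mid = linear_value(datetime(1993, 11, 3, 0, 0, 30), *left, *right)
print(mid)
```

Because the sample values equal the number of seconds elapsed, every interpolated value in the output matches its timestamp's offset in seconds.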
In [37]:
# First fill interpolation by second
# (asfreq can be used in place of interpolate)
ts.interpolate(ts = "datetime",
          rule = "1 second",
          method = {"val": "ffill"},)
Out[37]:
    datetime             val
    Timestamp            Numeric(3,1)
 1  1993-11-03 00:00:00  0.0
 2  1993-11-03 00:00:01  1.0
 3  1993-11-03 00:00:02  2.0
 4  1993-11-03 00:00:03  3.0
 5  1993-11-03 00:00:04  4.0
 6  1993-11-03 00:00:05  5.0
 7  1993-11-03 00:00:06  5.0
 8  1993-11-03 00:00:07  5.0
 9  1993-11-03 00:00:08  5.0
10  1993-11-03 00:00:09  5.0
11  1993-11-03 00:00:10  5.0
12  1993-11-03 00:00:11  5.0
13  1993-11-03 00:00:12  5.0
14  1993-11-03 00:00:13  5.0
15  1993-11-03 00:00:14  5.0
16  1993-11-03 00:00:15  5.0
17  1993-11-03 00:00:16  5.0
18  1993-11-03 00:00:17  5.0
19  1993-11-03 00:00:18  5.0
20  1993-11-03 00:00:19  5.0
21  1993-11-03 00:00:20  5.0
22  1993-11-03 00:00:21  5.0
23  1993-11-03 00:00:22  5.0
24  1993-11-03 00:00:23  5.0
25  1993-11-03 00:00:24  5.0
26  1993-11-03 00:00:25  5.0
27  1993-11-03 00:00:26  5.0
28  1993-11-03 00:00:27  5.0
29  1993-11-03 00:00:28  5.0
30  1993-11-03 00:00:29  5.0
31  1993-11-03 00:00:30  5.0
32  1993-11-03 00:00:31  5.0
33  1993-11-03 00:00:32  5.0
34  1993-11-03 00:00:33  5.0
35  1993-11-03 00:00:34  5.0
36  1993-11-03 00:00:35  5.0
37  1993-11-03 00:00:36  5.0
38  1993-11-03 00:00:37  5.0
39  1993-11-03 00:00:38  5.0
40  1993-11-03 00:00:39  5.0
41  1993-11-03 00:00:40  5.0
42  1993-11-03 00:00:41  5.0
43  1993-11-03 00:00:42  5.0
44  1993-11-03 00:00:43  5.0
45  1993-11-03 00:00:44  5.0
46  1993-11-03 00:00:45  5.0
47  1993-11-03 00:00:46  5.0
48  1993-11-03 00:00:47  5.0
49  1993-11-03 00:00:48  5.0
50  1993-11-03 00:00:49  5.0
51  1993-11-03 00:00:50  5.0
52  1993-11-03 00:00:51  5.0
53  1993-11-03 00:00:52  5.0
54  1993-11-03 00:00:53  5.0
55  1993-11-03 00:00:54  5.0
56  1993-11-03 00:00:55  5.0
57  1993-11-03 00:00:56  5.0
58  1993-11-03 00:00:57  5.0
59  1993-11-03 00:00:58  5.0
60  1993-11-03 00:00:59  5.0
61  1993-11-03 00:01:00  5.0
62  1993-11-03 00:01:01  61.0
63  1993-11-03 00:01:02  62.0
64  1993-11-03 00:01:03  63.0
65  1993-11-03 00:01:04  64.0
66  1993-11-03 00:01:05  65.0
Rows: 1-66 | Columns: 2
In [42]:
# Back fill interpolation by 2 seconds
# Back fill uses the final value of each block. With gaps, the behavior
# can be similar to first fill
ts.interpolate(ts = "datetime",
          rule = "2 seconds",
          method = {"val": "bfill"},)
Out[42]:
    datetime             val
    Timestamp            Numeric(3,1)
 1  1993-11-03 00:00:00  1.0
 2  1993-11-03 00:00:02  3.0
 3  1993-11-03 00:00:04  5.0
 4  1993-11-03 00:00:06  5.0
 5  1993-11-03 00:00:08  5.0
 6  1993-11-03 00:00:10  5.0
 7  1993-11-03 00:00:12  5.0
 8  1993-11-03 00:00:14  5.0
 9  1993-11-03 00:00:16  5.0
10  1993-11-03 00:00:18  5.0
11  1993-11-03 00:00:20  5.0
12  1993-11-03 00:00:22  5.0
13  1993-11-03 00:00:24  5.0
14  1993-11-03 00:00:26  5.0
15  1993-11-03 00:00:28  5.0
16  1993-11-03 00:00:30  5.0
17  1993-11-03 00:00:32  5.0
18  1993-11-03 00:00:34  5.0
19  1993-11-03 00:00:36  5.0
20  1993-11-03 00:00:38  5.0
21  1993-11-03 00:00:40  5.0
22  1993-11-03 00:00:42  5.0
23  1993-11-03 00:00:44  5.0
24  1993-11-03 00:00:46  5.0
25  1993-11-03 00:00:48  5.0
26  1993-11-03 00:00:50  5.0
27  1993-11-03 00:00:52  5.0
28  1993-11-03 00:00:54  5.0
29  1993-11-03 00:00:56  5.0
30  1993-11-03 00:00:58  5.0
31  1993-11-03 00:01:00  61.0
32  1993-11-03 00:01:02  63.0
33  1993-11-03 00:01:04  65.0
Rows: 1-33 | Columns: 2
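
The row counts reported in the outputs (66 rows for a 1-second rule, 33 for a 2-second rule) follow from the 65-second span of the sample. A minimal sketch of that arithmetic, assuming the first slice starts exactly at the first timestamp; slice_count is a hypothetical helper, not a VerticaPy function.

```python
from datetime import datetime, timedelta


def slice_count(first, last, step):
    """Number of fixed-width slices needed to cover first..last inclusive,
    assuming the first slice starts exactly at `first`."""
    return (last - first) // step + 1  # timedelta // timedelta gives an int


first = datetime(1993, 11, 3, 0, 0, 0)
last = datetime(1993, 11, 3, 0, 1, 5)

rows_1s = slice_count(first, last, timedelta(seconds=1))
rows_2s = slice_count(first, last, timedelta(seconds=2))
print(rows_1s, rows_2s)
```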

See Also

vDataFrame[].fillna : Fills the vcolumn's missing values.
vDataFrame[].slice : Slices the vcolumn.