vDataFrame.sessionize

In [ ]:
vDataFrame.sessionize(ts: str,
                      by: list = [],
                      session_threshold: str = "30 minutes",
                      name: str = "session_id")

Adds a new vcolumn to the vDataFrame that identifies user sessions (periods of continuous user activity). Rows are ordered by ts within each partition, and a session ends when ts - lag(ts) is greater than the specified threshold.
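The rule above can be sketched in plain Python. This is only an illustration of the logic as described, not VerticaPy's implementation: `assign_sessions` is a hypothetical helper, the threshold is passed as a `timedelta` rather than a string, and the timestamps are illustrative.

```python
from datetime import datetime, timedelta


def assign_sessions(timestamps, threshold):
    """Return one session number per timestamp.

    Hypothetical helper: `timestamps` must already be sorted in
    ascending order, and `threshold` is a datetime.timedelta.
    """
    session_ids = []
    current = 0
    previous = None
    for ts in timestamps:
        # A new session starts only when the gap to the previous event
        # (ts - lag(ts)) is strictly greater than the threshold.
        if previous is not None and ts - previous > threshold:
            current += 1
        session_ids.append(current)
        previous = ts
    return session_ids


events = [
    datetime(2014, 1, 27, 0, 10, 17),
    datetime(2014, 1, 27, 0, 36, 54),  # 26 min 37 s after the previous event
    datetime(2014, 1, 27, 1, 17, 9),   # 40 min 15 s after the previous event
]
print(assign_sessions(events, timedelta(minutes=30)))  # [0, 0, 1]
```

The first event has no predecessor, so it always belongs to session 0 in this sketch.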

Parameters

Name | Type | Optional | Description
ts | str | No | vcolumn used as the timeline; it is used to order the data. It can be a numerical vcolumn or a date-like vcolumn (date, datetime, timestamp...).
by | list | Yes | vcolumns used in the partition.
session_threshold | str | Yes | Threshold that determines the end of a session. For example, if it is set to '10 minutes', the session ends after 10 minutes of inactivity.
name | str | Yes | Name of the new vcolumn that holds the session identifier.
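How the parameters interact can be sketched in plain Python. This is a rough illustration under stated assumptions, not the library's code: `sessionize_rows` is a hypothetical helper that works on a list of dicts, takes the threshold as a `timedelta` instead of a string such as '30 minutes', and assumes that session numbering restarts at 0 in every partition.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sessionize_rows(rows, ts, by, threshold, name="session_id"):
    """Return copies of `rows` with an added session column.

    Hypothetical helper mirroring the parameters above:
    ts        -- key of the timeline value in each row
    by        -- list of keys that define the partition
    threshold -- datetime.timedelta of allowed inactivity
    name      -- key of the new session column
    """
    # Group rows by partition; with by=[] every row shares the key ().
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[col] for col in by)].append(row)

    result = []
    for key in sorted(groups):
        current, previous = 0, None
        # Order each partition by the timeline before comparing gaps.
        for row in sorted(groups[key], key=lambda r: r[ts]):
            if previous is not None and row[ts] - previous > threshold:
                current += 1
            result.append({**row, name: current})
            previous = row[ts]
    return result


rows = [
    {"date_time": datetime(2014, 1, 27, 0, 36, 29), "user_id": 9},
    {"date_time": datetime(2014, 1, 27, 0, 10, 17), "user_id": 5},
    {"date_time": datetime(2014, 1, 26, 23, 45, 7), "user_id": 9},
    {"date_time": datetime(2014, 1, 27, 1, 17, 9), "user_id": 5},
    {"date_time": datetime(2014, 1, 27, 0, 36, 54), "user_id": 5},
]

for row in sessionize_rows(rows, ts="date_time", by=["user_id"],
                           threshold=timedelta(minutes=30)):
    print(row["user_id"], row["date_time"], row["session_id"])
```

In this sketch, lowering the threshold to 10 minutes splits user 5's events into three sessions instead of two, and passing `by=[]` treats all rows as a single timeline.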

Returns

vDataFrame : self

Example

In [72]:
from verticapy import vDataFrame
expedia = vDataFrame("public.expedia").select(["date_time", "user_id"])
display(expedia)
Row | date_time (Timestamp) | user_id (Int)
1 | 2013-01-07 00:00:02 | 461899
2 | 2013-01-07 00:00:06 | 13796
3 | 2013-01-07 00:00:06 | 1128575
4 | 2013-01-07 00:00:17 | 1018895
5 | 2013-01-07 00:00:18 | 783725
6 | 2013-01-07 00:00:28 | 1197968
7 | 2013-01-07 00:00:28 | 593375
8 | 2013-01-07 00:00:29 | 1174819
9 | 2013-01-07 00:00:33 | 233534
10 | 2013-01-07 00:00:39 | 519086
11 | 2013-01-07 00:00:42 | 176709
12 | 2013-01-07 00:00:48 | 1174819
13 | 2013-01-07 00:00:50 | 1173504
14 | 2013-01-07 00:00:51 | 233534
15 | 2013-01-07 00:00:55 | 836947
16 | 2013-01-07 00:00:59 | 1046558
17 | 2013-01-07 00:00:59 | 886436
18 | 2013-01-07 00:01:10 | 207769
19 | 2013-01-07 00:01:18 | 901573
20 | 2013-01-07 00:01:34 | 593375
21 | 2013-01-07 00:01:36 | 1103572
22 | 2013-01-07 00:01:45 | 1197968
23 | 2013-01-07 00:01:53 | 1174819
24 | 2013-01-07 00:02:02 | 614322
25 | 2013-01-07 00:02:02 | 725753
26 | 2013-01-07 00:02:20 | 901573
27 | 2013-01-07 00:02:21 | 799663
28 | 2013-01-07 00:02:24 | 1128575
29 | 2013-01-07 00:02:28 | 970396
30 | 2013-01-07 00:02:33 | 756870
31 | 2013-01-07 00:02:36 | 856762
32 | 2013-01-07 00:02:43 | 614322
33 | 2013-01-07 00:02:43 | 1162059
34 | 2013-01-07 00:02:45 | 1150496
35 | 2013-01-07 00:02:46 | 5061
36 | 2013-01-07 00:02:59 | 1162059
37 | 2013-01-07 00:03:09 | 80737
38 | 2013-01-07 00:03:19 | 176709
39 | 2013-01-07 00:03:23 | 486357
40 | 2013-01-07 00:03:27 | 228687
41 | 2013-01-07 00:03:33 | 901573
42 | 2013-01-07 00:03:35 | 598076
43 | 2013-01-07 00:03:37 | 1162059
44 | 2013-01-07 00:03:41 | 13796
45 | 2013-01-07 00:03:52 | 756870
46 | 2013-01-07 00:03:59 | 1128575
47 | 2013-01-07 00:04:02 | 593375
48 | 2013-01-07 00:04:14 | 368084
49 | 2013-01-07 00:04:17 | 486357
50 | 2013-01-07 00:04:24 | 1002737
51 | 2013-01-07 00:04:25 | 970396
52 | 2013-01-07 00:04:29 | 80737
53 | 2013-01-07 00:04:31 | 368084
54 | 2013-01-07 00:04:32 | 564664
55 | 2013-01-07 00:04:33 | 192004
56 | 2013-01-07 00:04:33 | 753394
57 | 2013-01-07 00:04:37 | 103449
58 | 2013-01-07 00:04:40 | 1162059
59 | 2013-01-07 00:04:42 | 368084
60 | 2013-01-07 00:04:44 | 332853
61 | 2013-01-07 00:04:51 | 783725
62 | 2013-01-07 00:04:55 | 976118
63 | 2013-01-07 00:04:58 | 368084
64 | 2013-01-07 00:05:04 | 192004
65 | 2013-01-07 00:05:31 | 124561
66 | 2013-01-07 00:05:32 | 425340
67 | 2013-01-07 00:05:34 | 487265
68 | 2013-01-07 00:05:39 | 1018895
69 | 2013-01-07 00:05:42 | 368084
70 | 2013-01-07 00:05:47 | 212900
71 | 2013-01-07 00:05:56 | 756870
72 | 2013-01-07 00:05:57 | 246177
73 | 2013-01-07 00:06:01 | 58429
74 | 2013-01-07 00:06:07 | 13796
75 | 2013-01-07 00:06:11 | 1115804
76 | 2013-01-07 00:06:11 | 901573
77 | 2013-01-07 00:06:23 | 871663
78 | 2013-01-07 00:06:33 | 487265
79 | 2013-01-07 00:06:35 | 1162059
80 | 2013-01-07 00:06:40 | 762423
81 | 2013-01-07 00:06:42 | 399892
82 | 2013-01-07 00:06:51 | 58429
83 | 2013-01-07 00:06:51 | 18250
84 | 2013-01-07 00:07:01 | 58429
85 | 2013-01-07 00:07:02 | 204066
86 | 2013-01-07 00:07:03 | 1128575
87 | 2013-01-07 00:07:08 | 1084815
88 | 2013-01-07 00:07:12 | 1181711
89 | 2013-01-07 00:07:17 | 332853
90 | 2013-01-07 00:07:23 | 13796
91 | 2013-01-07 00:07:25 | 58429
92 | 2013-01-07 00:07:29 | 228687
93 | 2013-01-07 00:07:30 | 1194880
94 | 2013-01-07 00:07:30 | 486357
95 | 2013-01-07 00:07:36 | 487265
96 | 2013-01-07 00:07:45 | 1018895
97 | 2013-01-07 00:08:04 | 191759
98 | 2013-01-07 00:08:14 | 255311
99 | 2013-01-07 00:08:16 | 80737
100 | 2013-01-07 00:08:20 | 486729
Rows: 1-100 of 37670293 | Columns: 2
In [73]:
# Create user sessions: an incremental label that increases each time
# the user has been inactive (no click) for more than 30 minutes.
expedia.sessionize(ts = "date_time",
                   by = ["user_id"],
                   session_threshold = "30 minutes")
Row | date_time (Timestamp) | user_id (Int) | session_id (Integer)
1 | 2014-04-25 16:14:28 | 1 | 0
2 | 2014-04-25 16:14:46 | 1 | 0
3 | 2014-04-25 16:15:11 | 1 | 0
4 | 2014-04-25 16:15:50 | 1 | 0
5 | 2014-07-31 07:10:39 | 3 | 0
6 | 2014-09-08 09:56:21 | 3 | 1
7 | 2014-09-08 09:58:50 | 3 | 1
8 | 2014-09-09 10:49:48 | 3 | 2
9 | 2014-09-22 13:19:02 | 3 | 3
10 | 2014-09-22 13:22:25 | 3 | 3
11 | 2014-09-30 12:56:51 | 3 | 4
12 | 2014-10-20 11:19:21 | 3 | 5
13 | 2014-01-26 23:56:46 | 5 | 0
14 | 2014-01-27 00:00:31 | 5 | 0
15 | 2014-01-27 00:03:44 | 5 | 0
16 | 2014-01-27 00:05:12 | 5 | 0
17 | 2014-01-27 00:06:18 | 5 | 0
18 | 2014-01-27 00:08:48 | 5 | 0
19 | 2014-01-27 00:10:17 | 5 | 0
20 | 2014-01-27 00:36:54 | 5 | 0
21 | 2014-01-27 01:17:09 | 5 | 1
22 | 2014-08-27 09:45:42 | 7 | 0
23 | 2014-08-27 09:51:32 | 7 | 0
24 | 2014-08-27 09:58:02 | 7 | 0
25 | 2014-08-27 09:58:55 | 7 | 0
26 | 2014-08-27 10:01:09 | 7 | 0
27 | 2014-08-27 10:03:53 | 7 | 0
28 | 2014-09-10 12:20:39 | 7 | 1
29 | 2014-09-24 11:23:59 | 7 | 2
30 | 2014-09-24 11:36:08 | 7 | 2
31 | 2014-09-24 11:57:49 | 7 | 2
32 | 2014-10-06 18:28:52 | 7 | 3
33 | 2014-10-06 18:32:36 | 7 | 3
34 | 2014-10-29 05:57:44 | 7 | 4
35 | 2014-10-29 05:58:44 | 7 | 4
36 | 2014-10-31 07:14:03 | 7 | 5
37 | 2014-11-06 10:14:59 | 7 | 6
38 | 2014-11-06 10:44:50 | 7 | 6
39 | 2014-11-06 10:45:59 | 7 | 6
40 | 2014-11-06 10:47:25 | 7 | 6
41 | 2014-11-06 11:26:14 | 7 | 7
42 | 2014-11-06 11:27:14 | 7 | 7
43 | 2014-11-06 11:27:47 | 7 | 7
44 | 2014-11-07 11:01:05 | 7 | 8
45 | 2014-11-07 11:03:11 | 7 | 8
46 | 2014-11-10 16:27:13 | 7 | 9
47 | 2014-11-12 11:39:45 | 7 | 10
48 | 2014-11-12 11:40:28 | 7 | 10
49 | 2014-11-20 07:54:42 | 7 | 11
50 | 2014-11-20 07:56:48 | 7 | 11
51 | 2014-11-20 09:02:53 | 7 | 12
52 | 2014-12-01 15:07:49 | 7 | 13
53 | 2014-12-18 10:37:47 | 7 | 14
54 | 2014-12-18 11:04:14 | 7 | 14
55 | 2014-12-18 12:16:29 | 7 | 15
56 | 2014-12-18 13:19:09 | 7 | 16
57 | 2014-01-26 23:01:51 | 9 | 0
58 | 2014-01-26 23:06:55 | 9 | 0
59 | 2014-01-26 23:07:18 | 9 | 0
60 | 2014-01-26 23:28:16 | 9 | 0
61 | 2014-01-26 23:30:30 | 9 | 0
62 | 2014-01-26 23:31:44 | 9 | 0
63 | 2014-01-26 23:35:50 | 9 | 0
64 | 2014-01-26 23:36:22 | 9 | 0
65 | 2014-01-26 23:45:07 | 9 | 0
66 | 2014-01-27 00:36:29 | 9 | 1
67 | 2014-01-27 00:24:08 | 11 | 0
68 | 2014-01-27 00:36:46 | 11 | 0
69 | 2014-01-27 00:55:26 | 11 | 0
70 | 2014-02-15 00:27:13 | 11 | 1
71 | 2014-02-15 00:28:37 | 11 | 1
72 | 2014-02-15 00:29:26 | 11 | 1
73 | 2014-02-15 00:30:22 | 11 | 1
74 | 2014-02-15 00:30:57 | 11 | 1
75 | 2014-02-15 00:31:14 | 11 | 1
76 | 2014-02-15 00:31:32 | 11 | 1
77 | 2014-02-15 00:33:32 | 11 | 1
78 | 2014-02-15 00:38:49 | 11 | 1
79 | 2014-02-15 00:39:10 | 11 | 1
80 | 2014-02-15 00:39:30 | 11 | 1
81 | 2014-02-15 00:39:49 | 11 | 1
82 | 2014-02-15 00:40:12 | 11 | 1
83 | 2014-02-15 00:40:29 | 11 | 1
84 | 2014-02-15 00:40:47 | 11 | 1
85 | 2014-02-15 00:41:04 | 11 | 1
86 | 2014-02-15 00:41:23 | 11 | 1
87 | 2014-02-15 00:42:18 | 11 | 1
88 | 2014-02-15 00:45:20 | 11 | 1
89 | 2014-02-15 00:53:08 | 11 | 1
90 | 2014-02-15 00:59:58 | 11 | 1
91 | 2014-02-15 01:06:41 | 11 | 1
92 | 2014-02-15 01:15:30 | 11 | 1
93 | 2014-02-15 02:40:52 | 11 | 2
94 | 2014-02-15 03:43:10 | 11 | 3
95 | 2014-02-15 03:44:52 | 11 | 3
96 | 2014-02-15 03:50:16 | 11 | 3
97 | 2014-02-15 03:55:25 | 11 | 3
98 | 2014-02-15 04:05:18 | 11 | 3
99 | 2014-02-15 04:30:13 | 11 | 3
100 | 2014-02-15 05:27:52 | 11 | 4
Out[73]:
Rows: 1-100 of 37670293 | Columns: 3

See Also

vDataFrame.analytic : Adds a new vcolumn to the vDataFrame by using an advanced analytical function on a specific vcolumn.