tablesample¶
In [ ]:
tablesample(values: dict = {},
dtype: dict = {},
count: int = 0,
offset: int = 0,
percent: dict = {})
A tablesample stores aggregated results in memory and can be transformed into a pandas.DataFrame or vDataFrame.
Parameters¶
| Name | Type | Optional | Description |
|---|---|---|---|
| values | dict | ✓ | Dictionary of columns (keys) and their values in the following format: {"column1": [val1, ..., valm], ... "columnk": [val1, ..., valm]} |
| dtype | dict | ✓ | Data types of the columns. |
| count | int | ✓ | Number of elements to render when loading the entire dataset. This is used only for rendering purposes. |
| offset | int | ✓ | Number of elements to skip when loading the entire dataset. This is used only for rendering purposes. |
| percent | dict | ✓ | Dictionary of missing values (used to display the percent bars). |
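The layout expected by `values` can be illustrated without a database connection. The following is a minimal pure-Python sketch, not VerticaPy code: `check_values` and `shape_of` are hypothetical helpers that only show the assumed layout, in which every column name maps to a list of the same length. The (rows, columns) order of the returned tuple is a choice made for this sketch and may not match the library's `shape` method.

```python
def check_values(values: dict) -> None:
    """Raise ValueError if the columns do not all have the same length."""
    lengths = {name: len(column) for name, column in values.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Columns have different lengths: {lengths}")


def shape_of(values: dict) -> tuple:
    """Return (number of rows, number of columns) for a dict of columns."""
    check_values(values)
    n_rows = len(next(iter(values.values()))) if values else 0
    return (n_rows, len(values))


# Same column-oriented layout as the `values` parameter described above.
values = {
    "index": [0, 1, 2],
    "name": ["Bernard", "Fred", "Cassandra"],
    "first_name": ["Wall", "Teban", "Polo"],
}
print(shape_of(values))
```

A dictionary whose lists differ in length, such as `{"a": [1, 2], "b": [1]}`, is rejected by `check_values` in this sketch.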
Attributes¶
The attributes of the tablesample are the same as its parameters.
Methods¶
| Name | Description |
|---|---|
| append | Appends the input tablesample to a target tablesample. |
| decimal_to_float | Converts all the tablesample's decimals to floats. |
| merge | Merges the input tablesample to a target tablesample. |
| shape | Computes the tablesample shape. |
| sort | Sorts the tablesample using the input column. |
| transpose | Transposes the tablesample. |
| to_list | Converts the tablesample to a list. |
| to_numpy | Converts the tablesample to a numpy array. |
| to_pandas | Converts the tablesample to a pandas DataFrame. |
| to_sql | Generates the SQL query associated with the tablesample. |
| to_vdf | Converts the tablesample to a vDataFrame. |
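Several of these methods convert the column-oriented `values` dictionary into another structure. As a rough illustration of that idea only, the sketch below turns a dictionary of columns into a list of rows. The helper `columns_to_rows` is hypothetical and is not how VerticaPy implements `to_list`; the library's exact output layout may differ.

```python
def columns_to_rows(values: dict) -> list:
    """Turn {"col": [v1, v2, ...], ...} into a list of rows.

    Assumes every column has the same length; zip() would otherwise
    silently truncate to the shortest column.
    """
    return [list(row) for row in zip(*values.values())]


values = {
    "index": [0, 1, 2],
    "name": ["Bernard", "Fred", "Cassandra"],
    "first_name": ["Wall", "Teban", "Polo"],
}

rows = columns_to_rows(values)
for row in rows:
    print(row)
```

Each printed row holds one value per column, in the insertion order of the dictionary keys.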
Example¶
In [24]:
from verticapy.utilities import *
dataset = tablesample(values = {"index": [0, 1, 2],
                                "name": ["Bernard", "Fred", "Cassandra"],
                                "first_name": ["Wall", "Teban", "Polo"]}).to_vdf()
display(dataset)
In [25]:
# Using complex data types (nested dictionaries and lists)
dataset = tablesample(values = {"index": [0, 1, 2],
                                "name": ["Bernard", "Fred", "Cassandra"],
                                "first_name": ["Wall", "Teban", "Polo"],
                                "information": [{"number": 33, "fav": "Real"},
                                                {"number": 2, "fav": "Barcelone"},
                                                {"number": 1, "fav": "Boston"},],
                                "fav_artists": [["Inna", "Connect R"],
                                                ["Majda Roumi"],
                                                ["Beyonce", "Alicia Keys", "Dr Dre"],]}).to_vdf()
display(dataset)
