vDataFrame
In [ ]:
vDataFrame(input_relation: str,
           usecols: list = [],
           schema: str = "",
           sql: str = "",
           empty: bool = False)
The vDataFrame is a Python object that lets you prepare and explore your data without modifying it. When you make "changes" to your data, the vDataFrame records your modifications as SQL queries and sends them to your Vertica database, which aggregates the data and returns the final result. For each column of the relation, the vDataFrame creates a virtual column (vColumn) that stores the column's alias and all user modifications to that column.
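
To make the idea of recording changes as SQL concrete, here is a minimal toy sketch. It is not VerticaPy's implementation: the `LazyFrame` class and its `select`, `filter`, and `to_sql` methods are hypothetical names invented for this illustration, and nothing here connects to a database. It only shows how an object can accumulate modifications and turn them into a single query while leaving the source relation untouched.

```python
class LazyFrame:
    """Toy stand-in for a lazy, SQL-backed frame (hypothetical, not VerticaPy)."""

    def __init__(self, relation, columns=None, conditions=None):
        self.relation = relation
        self.columns = list(columns) if columns else ["*"]
        self.conditions = list(conditions) if conditions else []

    def select(self, *columns):
        # Return a new object; the source relation is never modified.
        return LazyFrame(self.relation, columns, self.conditions)

    def filter(self, condition):
        # Record the condition instead of deleting any rows.
        return LazyFrame(self.relation, self.columns, self.conditions + [condition])

    def to_sql(self):
        # Assemble the recorded modifications into one query string.
        query = f"SELECT {', '.join(self.columns)} FROM {self.relation}"
        if self.conditions:
            query += " WHERE " + " AND ".join(f"({c})" for c in self.conditions)
        return query


frame = LazyFrame("public.titanic").filter("age > 30").select("age", "survived")
print(frame.to_sql())
# SELECT age, survived FROM public.titanic WHERE (age > 30)
```

In this sketch every method returns a new object, so earlier states remain usable. The real vDataFrame may track its modifications differently.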

Parameters
Name | Type | Optional | Description |
---|---|---|---|
input_relation | str | ❌ | Relation (view, table, or temporary table) used to create the object. To reference a relation in a specific schema, this value must specify both the schema and the relation: 'schema.relation' or '"schema"."relation"'. Alternatively, you can use the 'schema' parameter, in which case input_relation must exclude the schema name. |
usecols | list | ✓ | List of columns to use to create the object. Specifying fewer columns speeds up object creation. |
schema | str | ✓ | The schema of the relation. Specifying a schema allows you to specify a table within a particular schema, or a schema and relation name that contain period '.' characters. If specified, the input_relation parameter must exclude the schema. |
sql | str | ✓ | A SQL query from which to create the vDataFrame. If specified, the parameter 'input_relation' must be empty. |
empty | bool | ✓ | If True, this function creates an empty vDataFrame. You can use this to create a custom vDataFrame, bypassing initialization checks. |
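
The rules in the table above for combining 'input_relation', 'schema', and 'sql' can be sketched in plain Python. The helpers `quote_ident` and `resolve_relation` below are hypothetical and not part of VerticaPy. Wrapping the query as a subquery and stripping its trailing semicolon are assumptions of this sketch, not documented library behavior.

```python
def quote_ident(name):
    # Wrap an identifier in double quotes, doubling any embedded quote.
    return '"' + name.replace('"', '""') + '"'


def resolve_relation(input_relation="", schema="", sql=""):
    """Hypothetical helper mirroring the parameter rules in the table above."""
    if sql:
        if input_relation:
            raise ValueError("'input_relation' must be empty when 'sql' is given")
        # Assumption of this sketch: expose the query as a subquery.
        return "(" + sql.strip().rstrip(";") + ") AS subquery"
    if not input_relation:
        raise ValueError("either 'input_relation' or 'sql' is required")
    if schema:
        # With 'schema' set, both names are taken literally, periods included.
        return quote_ident(schema) + "." + quote_ident(input_relation)
    # Otherwise the caller passes 'schema.relation' or '"schema"."relation"'.
    return input_relation


print(resolve_relation("titanic", schema="public"))
# "public"."titanic"
print(resolve_relation("my.table", schema="my.schema"))
# "my.schema"."my.table"
```

The second call shows why a separate 'schema' parameter is useful: when a name contains a period, a single 'schema.relation' string would be ambiguous.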
Attributes
Name | Type | Description |
---|---|---|
_VERTICAPY_VARIABLES_ | dict | Dictionary containing all the vDataFrame attributes. |
vcolumns | vcolumn | Each vColumn of the vDataFrame is accessible by specifying its name between brackets. For example, to access "myVC", you can write vDataFrame["myVC"]. |
Example
This example demonstrates how to create a vDataFrame from a sample dataset. Start by loading the 'titanic' dataset:
In [9]:
from verticapy.datasets import load_titanic
titanic = load_titanic(name = "titanic", schema = "public")
We can then create a vDataFrame from the dataset in the following ways:
In [5]:
from verticapy import vDataFrame
# Creating a vDataFrame by passing both the schema and the relation name
# in the 'input_relation' parameter
vDataFrame(input_relation = '"public"."titanic"')
Out[5]:
In [6]:
# Creating a vDataFrame by passing the relation name and the schema as separate parameters
vDataFrame(input_relation = 'titanic', schema = 'public')
Out[6]:
In [7]:
# Creating a vDataFrame using only the columns listed in 'usecols'
vDataFrame(input_relation = 'titanic', schema = 'public', usecols = ["age", "survived"])
Out[7]:
In [8]:
# Creating a vDataFrame using a SQL query
vDataFrame(sql = "SELECT age, fare FROM public.titanic;")
Out[8]: