Loading...

verticapy.sql.compute_vmap_keys#

verticapy.sql.compute_vmap_keys(expr: str | StringSQL, vmap_col: str, limit: int = 100) list[tuple]#

Computes the most frequent keys in the input VMap.

Parameters#

expr: SQLRelation

Input expression. You can also specify a vDataFrame or a customized relation, but you must enclose it with an alias. For example, (SELECT 1) x is allowed, whereas (SELECT 1) and “SELECT 1” are not.

vmap_col: str

VMap column.

limit: int, optional

Maximum number of keys to consider.

Returns#

List of tuples

List of virtual column names and their respective frequencies.

Examples#

Create a JSON file:

import json

data = {
    "column1": {
        "subcolumn1A": "value1A",
        "subcolumn1B": "value1B",
    },
    "column2": {
        "subcolumn2A": "value2A",
        "subcolumn2B": "value2B",
    }
}


json_string = json.dumps(data, indent=4)

with open("nested_columns.json", "w") as json_file:
    json_file.write(str(json_string))

We import verticapy:

import verticapy as vp

Hint

By assigning an alias to verticapy, we mitigate the risk of code collisions with other libraries. This precaution is necessary because verticapy uses commonly known function names like “average” and “median”, which can potentially lead to naming conflicts. The use of an alias ensures that the functions from verticapy are used as intended without interfering with functions from other libraries.

We create a temporary schema:

vp.create_schema("temp")
Out[6]: False

We injest the JSON file:

vdf = vp.read_json(
    "nested_columns.json",
    schema = "temp",
    table_name = "test",
    flatten_maps = False,
)

Then compute the most frequent keys in the input VMap’s column1:

from verticapy.sql import compute_vmap_keys

compute_vmap_keys(expr = "temp.test", vmap_col = "column1")
Out[9]: [['subcolumn1A', 1], ['subcolumn1B', 1]]

We drop the temporary table.

vp.drop("temp.test")
Out[10]: True

Hint

Flex tables can be used to identify all the data types needed to ingest the file and can also be employed to flatten a nested JSON file. Explore all the flex functions to understand the possibilities.

See also

compute_flextable_keys() : Computes the flex table keys.
isflextable() : Checks if the input relation is a flextable.
isvmap() : Checks if the input column is a VMap.