verticapy.sql.functions.regexp_substr

Returns the substring that matches a regular expression within a string.

Parameters

expr: SQLExpression: The string expression to search.
pattern: SQLExpression: The regular expression that the extracted substring must match.
position: int, optional: The character position, counted from the start of the string, at which the function starts searching for matches.
occurrence: int, optional: Controls which occurrence of a pattern match in the string to return.
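The interplay of position and occurrence can be sketched in plain Python with the standard re module. This is only an approximation of the semantics, not VerticaPy code: the helper name regexp_substr_sketch is hypothetical, the 1-based counting and the defaults of 1 are assumptions, and Python's regex dialect may differ in details from the engine used by the database.

```python
import re
from typing import Optional


def regexp_substr_sketch(
    text: str, pattern: str, position: int = 1, occurrence: int = 1
) -> Optional[str]:
    """Hypothetical pure-Python approximation; not part of VerticaPy."""
    # Assumption: position is 1-based, so skip the first (position - 1) characters.
    start = position - 1
    # Walk the non-overlapping matches left to right and count them from 1.
    for index, match in enumerate(re.finditer(pattern, text[start:]), start=1):
        if index == occurrence:
            return match.group(0)
    # Fewer than `occurrence` matches were found.
    return None


sample = "Mr. and Mrs. Smith"
first = regexp_substr_sketch(sample, r"[A-Za-z]+\.")                 # "Mr."
second = regexp_substr_sketch(sample, r"[A-Za-z]+\.", occurrence=2)  # "Mrs."
skipped = regexp_substr_sketch(sample, r"[A-Za-z]+\.", position=4)   # "Mrs."
```

In this sketch, starting at position 4 skips "Mr." entirely, so the first match found from there is "Mrs.".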

Returns

StringSQL: SQL string.

Examples

For this example, we will use the Titanic dataset.

from verticapy.datasets import load_titanic

titanic = load_titanic()

Note

VerticaPy offers a wide range of sample datasets that are well suited to training and testing. The full list is available in the Datasets section, which describes each dataset and how to use it.

Now, let’s import the VerticaPy SQL functions.

import verticapy.sql.functions as vpf

Now, let’s go ahead and apply the function.

titanic["title"] = vpf.regexp_substr(
    titanic["name"],
    r'([A-Za-z])+\.',  # raw string: one or more letters followed by a literal dot
)
display(titanic[["name", "title"]])

	name Varchar(164)	title Varchar(164)
1		Miss.
2		Mr.
3		Mrs.
4		Mr.
5		Mr.
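To see why this pattern picks out the title, the same regular expression can be tried locally with Python's re module. This is a sketch under assumptions: the sample strings are made up and assume a "Surname, Title. Given names" layout, and Python's regex engine stands in for the one the database uses.

```python
import re

# Same pattern as in the example: one or more letters followed by a literal dot.
TITLE_PATTERN = re.compile(r"([A-Za-z])+\.")

# Made-up sample strings (assumed layout: "Surname, Title. Given names").
names = [
    "Smith, Miss. Jane",
    "Smith, Mr. John Edward",
    "No title here",
]


def extract_title(name):
    """Return the first 'letters + dot' run in name, or None if there is none."""
    match = TITLE_PATTERN.search(name)
    return match.group(0) if match else None


titles = [extract_title(name) for name in names]
print(titles)  # ['Miss.', 'Mr.', None]
```

The surname is skipped because it is followed by a comma rather than a dot. Note that the parentheses wrap a single letter, so the capture group itself holds only the last letter matched; the title comes from the full match.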

Note

Prefer VerticaPy SQL functions over hand-written SQL in your code. Although SQL functions are generally stable, their syntax can vary across platforms and versions. VerticaPy functions can be updated to follow such changes, whereas pure SQL embedded in code cannot adapt on its own.