verticapy.sql.functions.regexp_substr
- verticapy.sql.functions.regexp_substr(expr: str | list[str] | StringSQL | list[StringSQL], pattern: str | list[str] | StringSQL | list[StringSQL], position: int = 1, occurrence: int = 1) → StringSQL
Returns the substring that matches a regular expression within a string.
Parameters
- expr: SQLExpression
The string expression to search.
- pattern: SQLExpression
The regular expression that the extracted substring must match.
- position: int, optional
The number of characters from the start of the string where the function should start searching for matches. Defaults to 1, the first character.
- occurrence: int, optional
Controls which occurrence of a pattern match in the string to return. Defaults to 1, the first match.
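To build intuition for how position and occurrence interact, here is a minimal pure-Python sketch that approximates the behaviour described above using the standard re module. The helper name regexp_substr_sketch is hypothetical and is not part of VerticaPy; the real function generates SQL that runs in the database, whose regex engine may differ from Python's in edge cases. Returning None for a missing match is also an assumption that stands in for SQL NULL.

```python
import re
from typing import Optional


def regexp_substr_sketch(
    text: str, pattern: str, position: int = 1, occurrence: int = 1
) -> Optional[str]:
    """Approximate the position/occurrence semantics in plain Python.

    Hypothetical helper for illustration only; not VerticaPy's implementation.
    """
    if position < 1 or occurrence < 1:
        raise ValueError("position and occurrence must be >= 1")
    # position is 1-based, so position=1 starts at the first character.
    matches = re.compile(pattern).finditer(text, position - 1)
    for index, match in enumerate(matches, start=1):
        if index == occurrence:
            return match.group(0)
    # Assumption: stands in for the NULL a SQL engine would return.
    return None


text = "one two three four"
print(regexp_substr_sketch(text, r"[a-z]+"))                # first match from the start
print(regexp_substr_sketch(text, r"[a-z]+", occurrence=3))  # third match
print(regexp_substr_sketch(text, r"[a-z]+", position=5))    # search starts at the 5th character
```

With the defaults the first match is returned; raising occurrence skips earlier matches, and raising position moves the point where the search begins.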
Returns
- StringSQL
The resulting SQL expression, as a StringSQL object.
Examples
For this example, we will use the Titanic dataset.
from verticapy.datasets import load_titanic
titanic = load_titanic()
Note
VerticaPy offers a wide range of sample datasets that are ideal for training and testing purposes. You can explore the full list of available datasets in the Datasets section, which provides detailed information on each dataset and how to use it. These datasets are useful resources for practicing data analysis and machine learning within the VerticaPy environment.
Now, let’s import the VerticaPy SQL functions.
import verticapy.sql.functions as vpf
Now, let’s go ahead and apply the function.
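# Extract the first run of letters that is followed by a period (the title).
# A raw string keeps the backslash in the pattern intact.
titanic["title"] = vpf.regexp_substr(
    titanic["name"],
    r'([A-Za-z])+\.',
)
display(titanic[["name", "title"]])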
Output (columns: name Varchar(164), title Varchar(164); only the title values are shown):
1 Miss.
2 Mr.
3 Mrs.
4 Mr.
5 Mr.
Note
We recommend using VerticaPy SQL functions in your code rather than hand-written SQL strings. SQL functions are typically stable, but their syntax can vary across platforms or versions. VerticaPy accounts for these changes, which pure SQL cannot do on its own.
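To see why the pattern in the example above picks out the title, the following sketch applies the same regular expression locally with Python's re module. The helper extract_title and the sample names are assumptions made for illustration (the names only imitate the "Surname, Title. Given names" style of the dataset), and Python's regex engine is used here as a stand-in for the one in the database.

```python
import re
from typing import Optional

# Same pattern as in the example above: one or more letters followed by a period.
TITLE_PATTERN = re.compile(r"([A-Za-z])+\.")


def extract_title(name: str) -> Optional[str]:
    """Return the first letters-plus-period token in name, or None (hypothetical helper)."""
    match = TITLE_PATTERN.search(name)
    return match.group(0) if match else None


# Illustrative names, assumed for this sketch.
sample_names = [
    "Allison, Miss. Helen Loraine",
    "Smith, Mr. John",
    "Brown, Mrs. Mary Anne",
]

for name in sample_names:
    print(f"{name!r} -> {extract_title(name)!r}")
```

The surname is skipped because it is followed by a comma rather than a period, so the first token that satisfies the pattern is the title.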
See also
vDataFrame.eval(): Evaluates the expression.