REGEXP_COUNT

Returns the number times a regular expression matches a string.

Syntax

REGEXP_COUNT( string, pattern [, position [, regexp_modifier ] ] )

Parameters

string

The VARCHAR or LONG VARCHAR string to search for a regular expression pattern match. If string exists in a __raw__ column of a flex or columnar table, cast string to a LONG VARCHAR before searching for pattern.

pattern

The regular expression to search for within string. The syntax of the regular expression is compatible with the Perl 5 regular expression syntax. See the Perl Regular Expressions Documentation for details.

position

[Optional] The number of characters from the start of the string where the function should start searching for matches. By default, the function begins searching for a match at the first (leftmost) character. Setting this parameter to a value greater than 1 begins searching for a match at the nth character you specify.

Default value: 1

regexp_modifier

[Optional] One or more single-character flags that modify how the regular expression finds matches in string:

b

Treat strings as binary octets, rather than UTF-8 characters.

c

Force the match to be case sensitive (the default).

i

Force the match to be case insensitive.

m

Treat the string to match as multiple lines. Using this modifier, the start of line (^) and end of line ($) regular expression operators match line breaks (\n) within the string. Without the m modifier, the start and end of line operators match only the start and end of the string.

n

Allow the single character regular expression operator (.) to match a newline (\n). Without the n modifier, the . operator matches any character except a newline.

x

Add comments to your regular expressions. Using the x modifier causes the function to ignore all unescaped space characters and comments in the regular expression. Comments start with a hash (#) character and end with a newline (\n). All spaces in the regular expression that you want to be matched in strings must be escaped with a backslash (\) character.

Notes

This function operates on UTF-8 strings using the default locale, even if the locale has been set to something else.

If you are porting a regular expression query from an Oracle database, remember that Oracle considers a zero-length string to be equivalent to NULL, while Vertica does not.

Examples

Count the number of occurrences of the substring an in the string "a man, a plan, a canal, Panama."

=> SELECT REGEXP_COUNT('a man, a plan, a canal: Panama', 'an');
 REGEXP_COUNT 
--------------
            4
(1 row)

Find the number of occurrences of the substring an in the string "a man, a plan, a canal: Panama" starting with the fifth character.

=> SELECT REGEXP_COUNT('a man, a plan, a canal: Panama', 'an',5);
 REGEXP_COUNT 
--------------
            3
(1 row)

Find the number of occurrences of a substring containing a lower-case character followed by an. In the first example, do not use a modifier. In the second example, use the i modifier to force the regular expression to ignore case.

=> SELECT REGEXP_COUNT('a man, a plan, a canal: Panama', '[a-z]an');
 REGEXP_COUNT 
--------------
            3
(1 row)
=> SELECT REGEXP_COUNT('a man, a plan, a canal: Panama', '[a-z]an', 1, 'i');

 REGEXP_COUNT 
--------------
            4