REGEXP_REPLACE
Replace all occurrences of a substring that match a regular expression with another substring. It is similar to the REPLACE function, except it uses a regular expression to select the substring to be replaced.
Syntax
REGEXP_REPLACE( string, target [, replacement [, position [, occurrence ... [, regexp_modifiers ] ] ] ] )
Parameters
string |
The |
||||||||||||
target
|
The regular expression to search for within the string. The syntax of the regular expression is compatible with the Perl 5 regular expression syntax. See the Perl Regular Expressions Documentation for details. |
||||||||||||
replacement |
The string to replace matched substrings. If you do not supply a replacement, the function deletes matched substrings. The replacement string can contain backreferences for substrings captured by the regular expression. The first captured substring is inserted into the replacement string using |
||||||||||||
position
|
[Optional] The number of characters from the start of the string where the function should start searching for matches. By default, the function begins searching for a match at the first (leftmost) character. Setting this parameter to a value greater than 1 begins searching for a match at the nth character you specify. Default value: 1 |
||||||||||||
occurrence
|
[Optional] Controls which occurrence of a pattern match in the string to return. By default, the function returns the position of the first matching substring. Use this parameter to find the position of subsequent matching substrings. For example, setting this parameter to 3 returns the position of the third substring that matches the pattern. Default value: 1 |
||||||||||||
regexp_modifier
|
[Optional] One or more single-character flags that modify how the regular expression finds matches in string:
|
Notes
This function operates on UTF-8 strings using the default locale, even if the locale has been set to something else.
If you are porting a regular expression query from an Oracle database, remember that Oracle considers a zero-length string to be equivalent to NULL, while Vertica does not.
Another key difference between Oracle and Vertica is that Vertica can handle an unlimited number of captured subexpressions, while Oracle is limited to nine.
In Vertica, you can use \10
in the replacement pattern to access the substring captured by the tenth set of parentheses in the regular expression. In Oracle, \10
is treated as the substring captured by the first set of parentheses, followed by a zero. To force this Oracle behavior in Vertica, use the \g
back reference and enclose the number of the captured subexpression in curly braces. For example, \g{1}0
is the substring captured by the first set of parentheses followed by a zero.
You can also name captured subexpressions to make your regular expressions less ambiguous. See the PCRE documentation for details.
Examples
Find groups of "word characters" (letters, numbers and underscore) ending with "thy" in the string "healthy, wealthy, and wise" and replace them with nothing.
=> SELECT REGEXP_REPLACE('healthy, wealthy, and wise','\w+thy'); REGEXP_REPLACE ---------------- , , and wise (1 row)
Find groups of word characters ending with "thy" and replace with the string "something."
=> SELECT REGEXP_REPLACE('healthy, wealthy, and wise','\w+thy', 'something'); REGEXP_REPLACE -------------------------------- something, something, and wise (1 row)
Find groups of word characters ending with "thy" and replace with the string "something" starting at the third character in the string.
=> SELECT REGEXP_REPLACE('healthy, wealthy, and wise','\w+thy', 'something', 3); REGEXP_REPLACE ---------------------------------- hesomething, something, and wise (1 row)
Replace the second group of word characters ending with "thy" with the string "something."
=> SELECT REGEXP_REPLACE('healthy, wealthy, and wise','\w+thy', 'something', 1, 2); REGEXP_REPLACE ------------------------------ healthy, something, and wise (1 row)
Find groups of word characters ending with "thy" capturing the letters before the "thy", and replace with the captured letters plus the letters "ish."
=> SELECT REGEXP_REPLACE('healthy, wealthy, and wise','(\w+)thy', '\1ish'); REGEXP_REPLACE ---------------------------- healish, wealish, and wise (1 row)
Create a table to demonstrate replacing strings in a query.
=> CREATE TABLE customers (name varchar(50), phone varchar(11)); CREATE TABLE => CREATE PROJECTION customers1 AS SELECT * FROM customers; CREATE PROJECTION => COPY customers FROM stdin; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself. >> Able, Adam|17815551234 >> Baker,Bob|18005551111 >> Chu,Cindy|16175559876 >> Dodd,Dinara|15083452121 >> \.
Query the customers, using REGEXP_REPLACE to format the phone numbers.
=> SELECT name, REGEXP_REPLACE(phone, '(\d)(\d{3})(\d{3})(\d{4})', '\1-(\2) \3-\4') as phone FROM customers; name | phone -------------+------------------ Able, Adam | 1-(781) 555-1234 Baker,Bob | 1-(800) 555-1111 Chu,Cindy | 1-(617) 555-9876 Dodd,Dinara | 1-(508) 345-2121 (4 rows)