MAPREGEXEXTRACTOR

Extracts data from a regular expression and returns the results as a VMap. Use the USING PARAMETERS pattern= phrase, followed by the regular expression.

Parameters

pattern=
VARCHAR

The regular expression as a string.

Default value: An empty string ("").

use_jit
BOOLEAN

[Optional] Uses just-in-time compiling when parsing the regular expression.

Default value: false.

record_terminator
VARCHAR

[Optional] The character used to separate input records.

Default value: \n.

logline_column
VARCHAR

[Optional] The destination column containing the full string that the regular expression matched.

Default value: An empty string ("").

Examples

These examples use the following regular expression, which searches for information that includes the timestamp, date, thread_name, and thread_id strings. 

Caution: For display purposes, this sample regular expression adds new line characters to split long lines of text. To use this expression in a query, first copy and edit the example to remove any new line characters.

This example expression loads any thread_id hex value, regardless of whether it has a 0x prefix, (<thread_id>(?:0x)?[0-9a-f]+).

'^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+)
 (?<thread_name>[A-Za-z ]+):(?<thread_id>(?:0x)?[0-9a-f]+)
-?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)] 
\<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: )
?(?<text>.*)'

The following examples may include newline characters for display purposes.

  1. Create a flex table, flogs
  2. => CREATE flex table flogs();
    CREATE TABLE
    
  3. Use COPY to load a sample log file (vertica.log), using the flex table fregexparser. Note that this example includes added line characters for displaying long text lines.
  4. => COPY flogs FROM '/home/dbadmin/tempdat/vertica.log' PARSER FREGEXPARSER(pattern='
    ^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+) (?<thread_name>[A-Za-z ]+):
    (?<thread_id>(?:0x)?[0-9a-f])-?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)] 
    \<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: )?(?<text>.*)');
    Rows Loaded
    -------------
    81399
    (1 row)
  5. Use MapToString to return the results from calling MapRegexExtractor with a regular expression. The output returns the results of the function in string format.
    => SELECT maptostring(MapregexExtractor(E'2014-04-02 04:02:51.011 
    TM Moveout:0x2aab9000f860-a0000000002067 [Txn] <INFO> 
    Begin Txn: a0000000002067 \'Moveout: Tuple Mover\'' using PARAMETERS 
    pattern='^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+)
     (?<thread_name>[A-Za-z ]+):(?<thread_id>(?:0x)?[0-9a-f]+)
    -?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)] 
    \<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: )
    ?(?<text>.*)')) FROM flogs where __identity__=13;
    
    maptostring
    --------------------------------------------------------------------------------------------------
    {
    "component" : "Txn",
    "level" : "INFO",
    "text" : "Begin Txn: a0000000002067 'Moveout: Tuple Mover'",
    "thread_id" : "0x2aab9000f860",
    "thread_name" : "TM Moveout",
    "time" : "2014-04-02 04:02:51.011",
    "transaction_id" : "a0000000002067"
    }
    (1 row)

See Also