FREGEXPARSER

Parses a regular expression, matching columns to the contents of the named regular expression groups.

Syntax

FREGEXPARSER ( pattern=[parameter‑name='value'[,…]] )

Parameters

pattern		
VARCHAR

Specifies the regular expression of data to match.

Default value: Empty string ("")

use_jit
BOOLEAN

[Optional] Indicates whether to use just-in-time compiling when parsing the regular expression.

Default value: false

record_terminator
VARCHAR

[Optional] Specifies the character used to separate input records.

Default value: \n

logline_column
VARCHAR

[Optional] Captures the destination column containing the full string that the regular expression matched.

Default value: Empty string ("")

Example

These examples use the following regular expression, which searches for information that includes the timestamp, date, thread_name, and thread_id strings. 

For display purposes, this sample regular expression adds new line characters to split long lines of text. To use this expression in a query, first copy and edit the example to remove any new line characters.

This example expression loads any thread_id hex value, regardless of whether it has a 0x prefix, (<thread_id>(?:0x)?[0-9a-f]+).

'^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+)
 (?<thread_name>[A-Za-z ]+):(?<thread_id>(?:0x)?[0-9a-f]+)
-?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)] 
\<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: )
?(?<text>.*)'
  1. Create a flex table (vlog) to contain the results of a Vertica log file. For this example, we made a copy of a log file in the directory /home/dbadmin/data/vertica.log

    => create flex table vlog1();
    CREATE TABLE
    
  2. Use the fregexparser with the sample regular expression to load data from the log file. Be sure to remove any line characters before using this expression shown here: 

    =>  copy vlog1 from '/home/dbadmin/tempdat/KMvertica.log' 
    PARSER FREGEXPARSER(pattern='^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+)
     (?<thread_name>[A-Za-z ]+):(?<thread_id>(?:0x)?[0-9a-f]+)
    -?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)] 
    \<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: )
    ?(?<text>.*)');
    
     Rows Loaded 
    -------------
           31049
    (1 row)
    
  3. After successfully loading data, use the MAPTOSTRING() function with the table's __raw__ column. The four rows (limt 4) that the query returns are regular expression results of the KMvertica.log file, parsed with fregexparser. The output shows thread_id values with a preceding 0x or without: 
  4. => select maptostring(__raw__) from vlog1 limit 4;
                            maptostring                                                                                                                                        
    -------------------------------------------------------------------------------------
     {
       "text" : " [Init] <INFO> Log /home/dbadmin/VMart/v_vmart_node0001_catalog/vertica.log 
    opened; #2",
       "thread_id" : "0x7f2157e287c0",
       "thread_name" : "Main",
       "time" : "2017-03-21 23:30:01.704"
    }
    
     {
       "text" : " [Init] <INFO> Processing command line: /opt/vertica/bin/vertica -D 
    /home/dbadmin/VMart/v_vmart_node0001_catalog -C VMart -n v_vmart_node0001 -h 
    10.20.100.247 -p 5433 -P 4803 -Y ipv4",
       "thread_id" : "0x7f2157e287c0",
       "thread_name" : "Main",
       "time" : "2017-03-21 23:30:01.704"
    }
    
     {
       "text" : " [Init] <INFO> Starting up Vertica Analytic Database v8.1.1-20170321",
       "thread_id" : "7f2157e287c0",
       "thread_name" : "Main",
       "time" : "2017-03-21 23:30:01.704"
    }
    
     {
       "text" : " [Init] <INFO> Compiler Version: 4.8.2 20140120 (Red Hat 4.8.2-15)",
       "thread_id" : "7f2157e287c0",
       "thread_name" : "Main",
       "time" : "2017-03-21 23:30:01.704"
    }
     
    (4 rows)
    

See Also