Loading Common Event Format (CEF) Data

Use the flex parser fcefparser to load OpenText ArcSight or other Common Event Format (CEF) log file data into columnar and flexible tables. For more information, see the ArcSight Common Event Format (CEF) Guide.

When you use the parser to load arbitrary CEF-format files, it interprets key names in the data as virtual columns in your flex table. After loading, you can query your CEF data directly, regardless of which set of keys exist in each row. You can also use the associated flex table data and map functions to manage CEF data access.

Create a Flex Table and Load CEF Data

This section uses a sample set of CEF data. All IP addresses have been purposely changed to be inaccurate, and Return characters added for illustration.

To use this sample data, copy the following text and remove all Return characters. Save the file as CEF_sample.cef, which is the name used throughout these examples.

CEF:0|ArcSight|ArcSight|6.0.3.6664.0|agent:030|Agent [test] type [testalertng] started|Low| 
eventId=1 mrt=1396328238973 categorySignificance=/Normal categoryBehavior=/Execute/Start 
categoryDeviceGroup=/Application catdt=Security Mangement categoryOutcome=/Success 
categoryObject=/Host/Application/Service art=1396328241038 cat=/Agent/Started 
deviceSeverity=Warning rt=1396328238937 fileType=Agent 
cs2=<Resource ID\="3DxKlG0UBABCAA0cXXAZIwA\=\="/> c6a4=fe80:0:0:0:495d:cc3c:db1a:de71 
cs2Label=Configuration Resource c6a4Label=Agent 
IPv6 Address ahost=SKEELES10 agt=888.99.100.1 agentZoneURI=/All Zones/ArcSight 
System/Private Address Space 
Zones/RFC1918: 888.99.0.0-888.200.255.255 av=6.0.3.6664.0 atz=Australia/Sydney 
aid=3DxKlG0UBABCAA0cXXAZIwA\=\= at=testalertng dvchost=SKEELES10 dvc=888.99.100.1 
deviceZoneURI=/All Zones/ArcSight System/Private Address Space Zones/RFC1918: 
888.99.0.0-888.200.255.255 dtz=Australia/Sydney _cefVer=0.1
  1. Create a flex table logs:

    => CREATE FLEX TABLE logs();
    CREATE TABLE
    
  2. Load the sample CEF file, using the flex parser fcefparser:

    => COPY logs FROM '/home/dbadmin/data/CEF_sample.cef' PARSER fcefparser();
     Rows Loaded
    -------------
               1
    (1 row)
    
  3. Use the maptostring() function to see the contents of the logs flex table: 

    => SELECT maptostring(__raw__) FROM logs;
                                      maptostring                                                           
    -------------------------------------------------------------------------------------
      {
       "_cefver" : "0.1",
       "agentzoneuri" : "/All Zones/ArcSight System/Private Address 
    	Space Zones/RFC1918: 888.99.0.0-888.200.255.255",
       "agt" : "888.99.100.1",
       "ahost" : "SKEELES10",
       "aid" : "3DxKlG0UBABCAA0cXXAZIwA==",
       "art" : "1396328241038",
       "at" : "testalertng",
       "atz" : "Australia/Sydney",
       "av" : "6.0.3.6664.0",
       "c6a4" : "fe80:0:0:0:495d:cc3c:db1a:de71",
       "c6a4label" : "Agent IPv6 Address",
       "cat" : "/Agent/Started",
       "catdt" : "Security Mangement",
       "categorybehavior" : "/Execute/Start",
       "categorydevicegroup" : "/Application",
       "categoryobject" : "/Host/Application/Service",
       "categoryoutcome" : "/Success",
       "categorysignificance" : "/Normal",
       "cs2" : "<Resource ID=\"3DxKlG0UBABCAA0cXXAZIwA==\"/>",
       "cs2label" : "Configuration Resource",
       "deviceproduct" : "ArcSight",
       "deviceseverity" : "Warning",
       "devicevendor" : "ArcSight",
       "deviceversion" : "6.0.3.6664.0",
       "devicezoneuri" : "/All Zones/ArcSight System/Private Address Space 
    	Zones/RFC1918: 888.99.0.0-888.200.255.255",
       "dtz" : "Australia/Sydney",
       "dvc" : "888.99.100.1",
       "dvchost" : "SKEELES10",
       "eventid" : "1",
       "filetype" : "Agent",
       "mrt" : "1396328238973",
       "name" : "Agent [test] type [testalertng] started",
       "rt" : "1396328238937",
       "severity" : "Low",
       "signatureid" : "agent:030",
       "version" : "0"
    }
    
    (1 row)
    

Create a Columnar Table and Load CEF Data

This example lets you compare the flex table for CEF data with a columnar table. You do so by creating a new table and load the same CEF_sample.cef file used in the preceding flex table example.

  1. Create a columnar table, col_logs, defining the prefix names that are hard coded in fcefparser

    => CREATE TABLE col_logs(version INT, 
      devicevendor VARCHAR, 
      deviceproduct VARCHAR, 
      deviceversion VARCHAR, 
      signatureid VARCHAR, 
      name VARCHAR, 
      severity VARCHAR);
    CREATE TABLE
    
  2. Load the sample file into col_logs, as you did for the flex table:

    => COPY col_logs FROM '/home/dbadmin/data/CEF_sample.cef' PARSER fcefparser();
     Rows Loaded
    -------------
               1
    (1 row)
    
  3. Query the table. You can find the identical information in the flex table output.

    => \x
    Expanded display is on.
    VMart=> SELECT * FROM col_logs;
    -[ RECORD 1 ]-+----------------------------------------
    version       | 0
    devicevendor  | ArcSight
    deviceproduct | ArcSight
    deviceversion | 6.0.3.6664.0
    signatureid   | agent:030
    name          | Agent [test] type [testalertng] started
    severity      | Low
    
    

Compute Keys and Build a Flex Table View

In this example, you use a flex helper function to compute keys and build a view for the logs flex table.

  1. Use the compute_flextable_keys_and_build_view function to compute keys and populate a view generated from the logs flex table:

    => SELECT compute_flextable_keys_and_build_view('logs');
                                compute_flextable_keys_and_build_view
    -------------------------------------------------------------------------------------
     Please see public.logs_keys for updated keys
    The view public.logs_view is ready for querying
    (1 row)
  2. Query the logs_keys table to see what the function computed from the sample CEF data: 

    => SELECT * FROM logs_keys;
           key_name       | frequency | data_type_guess
    ----------------------+-----------+-----------------
     c6a4                 |         1 | varchar(60)
     c6a4label            |         1 | varchar(36)
     categoryobject       |         1 | varchar(50)
     categoryoutcome      |         1 | varchar(20)
     categorysignificance |         1 | varchar(20)
     cs2                  |         1 | varchar(84)
     cs2label             |         1 | varchar(44)
     deviceproduct        |         1 | varchar(20)
     deviceversion        |         1 | varchar(24)
     devicezoneuri        |         1 | varchar(180)
     dvchost              |         1 | varchar(20)
     version              |         1 | varchar(20)
     ahost                |         1 | varchar(20)
     art                  |         1 | varchar(26)
     at                   |         1 | varchar(22)
     cat                  |         1 | varchar(28)
     catdt                |         1 | varchar(36)
     devicevendor         |         1 | varchar(20)
     dtz                  |         1 | varchar(32)
     dvc                  |         1 | varchar(24)
     filetype             |         1 | varchar(20)
     mrt                  |         1 | varchar(26)
     _cefver              |         1 | varchar(20)
     agentzoneuri         |         1 | varchar(180)
     agt                  |         1 | varchar(24)
     aid                  |         1 | varchar(50)
     atz                  |         1 | varchar(32)
     av                   |         1 | varchar(24)
     categorybehavior     |         1 | varchar(28)
     categorydevicegroup  |         1 | varchar(24)
     deviceseverity       |         1 | varchar(20)
     eventid              |         1 | varchar(20)
     name                 |         1 | varchar(78)
     rt                   |         1 | varchar(26)
     severity             |         1 | varchar(20)
     signatureid          |         1 | varchar(20)
    (36 rows)
    
  3. Query several columns from the logs_view

    => \x
    Expanded display is on.
    VMart=> select version, devicevendor, deviceversion, name, severity, signatureid 
      from logs_view;
    -[ RECORD 1 ]-+----------------------------------------
    version       | 0
    devicevendor  | ArcSight
    deviceversion | 6.0.3.6664.0
    name          | Agent [test] type [testalertng] started
    severity      | Low
    signatureid   | agent:030
    

Use the fcefparser Delimiter Parameter

In this example, you use the fcefparser delimiter parameter to query events located in California, New Mexico, and Arizona.

  1. Create a new columnar table, CEFData3

    => CREATE TABLE CEFData3(eventId INT, location VARCHAR(20));
    CREATE TABLE
  2. Using the delimiter=',' parameter, load some CEF data into the table:  

    => COPY CEFData3 FROM stdin PARSER fcefparser(delimiter=',');
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> eventId=1,location=California
    >> eventId=2,location=New Mexico
    >> eventId=3,location=Arizona
    >> \.
    
  3. Query the table: 

    => SELECT eventId, location FROM CEFData3;
     eventId |  location
    ---------+------------
           1 | California
           2 | New Mexico
           3 | Arizona
    (3 rows)