Vertica Analytics Platform Version 9.2.x Documentation

FCSVPARSER

Parses CSV format (comma-separated values) data. Use this parser to load CSV data into columnar, flex, and hybrid tables. All data must be encoded in Unicode UTF-8 format. The parser fcsvparser supports the RFC 4180 de facto standard for CSV data, and other options, to accommodate variations in CSV file format definitions. Invalid records will be rejected.

The fcsvparser does not support multibyte data. For more information about data formats, see Checking Data Format Before or After Loading.

Parameters

type= {'rfc4180' | 'traditional' }
CHAR

[Optional] Specifies the default parameter values for the parser. You do not have to use the type parameter when loading data that conforms to the RFC 4180 standard (such as MS Excel files). See Loading CSV Data for the RFC4180 default parameters, and other options you can specify for traditional CSV files.

Default value: RFC4180

delimiter
CHAR

[Optional] Indicates the single-character value used to separate fields in the CSV data.

Default value: , (for rfc4180 and traditional)

escape
CHAR

[Optional] Specifies a single-character value. Use an escape character to interpret the next character in the data literally.

Default value: " (for rfc4180)
Default value: \ (for traditional)

enclosed_by
CHAR

[Optional] Specifies a single-character value. Use and enclosed_by value to include a value that is identical to the delimiter, but should be interpreted literally. For example, if the data delimiter is a comma (,), and you want to use a comma within the data ("my name is jane, and his is jim").

Default value: " (for rfc4180 and traditional)

record_terminator
CHAR

[Optional] Indicates the single-character value used to specify the end of a record.

Default value:  \n or \r\n (for rfc4180 and traditional)

header
BOOLEAN

[Optional] Specifies whether to use the first row of data as a header column. When header=true (default), and no header exists, fcsvparser uses a default column heading. The default header consists of ucoln, where n is the column offset number, starting with 0 for the first column. You can specify custom column heading names using the header_names parameter, described next.

If you specify header=false, the fcsvparser parses the first row of input as data, rather than as column headers.

Default value: true

header_names
CHAR

[Optional] Specifies a list of column header names, delimited by the character defined by the parser's delimiter parameter. Use this parameter to specify header names in a CSV file without a header row, or to override the column names present in the CSV source. To override one or more existing column names, specify the header names to use. This parameter overrides any header row in the data.

trim
BOOLEAN

[Optional] Indicates whether to trim white space from header names and key values.

Default value: true

omit_empty_keys
BOOLEAN

[Optional] Indicates how the parser handles header keys without values. If omit_empty_keys=true, keys with an empty value in the header row are not loaded.

Default value: false

reject_on_duplicate
BOOLEAN

[Optional] Specifies whether to ignore duplicate records (false), or to reject duplicates (true). In either case, the load continues.

Default value: false

reject_on_empty_key
BOOLEAN

[Optional] Specifies whether to reject any row containing a key without a value.

Default value: false

reject_on_materialized_type_error
BOOLEAN

[Optional] Indicates whether to reject any materialized column value that the parser cannot coerce into a compatible data type. See Loading CSV Data.

Default value: false

Examples

This example shows how you can use fcsvparser to load a flex table, build a view, and then query that view.

  1. Create a flex table for CSV data:

    => CREATE FLEX TABLE rfc();
    CREATE TABLE
    
    
  2. Use fcsvparser to load the data from STDIN. Specify that no header exists, and enter some data as shown: 

    => COPY rfc FROM stdin PARSER fcsvparser(header='false');
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> 10,10,20
    >> 10,"10",30
    >> 10,"20""5",90
    >> \.
    
  3. Run the compute_flextable_keys_and_build_view function, and query the rfc_view. Notice that the default enclosed_by character permits an escape character (") within a field ("20""5"). Thus, the resulting value was parsed correctly. Since no header existed in the input data, the function added ucoln for each column: 

    => SELECT compute_flextable_keys_and_build_view('rfc');
                               compute_flextable_keys_and_build_view                         
    --------------------------------------------------------------------------------------------
     Please see public.rfc_keys for updated keys
    The view public.rfc_view is ready for querying
    (1 row)
    
    => SELECT * FROM rfc_view;
     ucol0 | ucol1 | ucol2
    -------+-------+-------
     10    | 10    | 20
     10    | 10    | 30
     10    | 20"5  | 90
    (3 rows)
    
    
    					
  4. For more information and examples of using other parameters of this parser, see Loading CSV Data.

See Also