FCSVPARSER

Parses CSV format (comma-separated values) data. Use this parser to load CSV data into columnar, flex, and hybrid tables. All data must be encoded in Unicode UTF-8 format. The fcsvparser parser supports the RFC 4180 standard for CSV data, and other options, to accommodate variations in CSV file format definitions. Invalid records are rejected. For more information about data formats, see Handling Non-UTF-8 Input.

Syntax

FCSVPARSER ( [parameter='value'[,...]] )

Parameters

type

The default parameter values for the parser, one of the following strings:

  • rfc4180
  • traditional

You do not have to use the type parameter when loading data that conforms to the RFC 4180 standard (such as MS Excel files). See Loading CSV Data for the RFC4180 default parameters, and other options you can specify for traditional CSV files.

Default: RFC4180

delimiter

A single-character value used to separate fields in the CSV data.

Default: , (for rfc4180 and traditional)

escape

A single-character value used as an escape character to interpret the next character in the data literally.

Default:

  • rfc4180: "
  • traditional: \
enclosed_by

A single-character value. Use enclosed_by to include a value that is identical to the delimiter, but should be interpreted literally. For example, if the data delimiter is a comma (,), and you want to use a comma within the data ("my name is jane, and his is jim").

Default: "

record_terminator

A single-character value used to specify the end of a record.

Default:

  • rfc4180: \n
  • traditional: \r\n
header

Boolean, specifies whether to use the first row of data as a header column. When header=true (default), and no header exists, fcsvparser uses a default column heading. The default header consists of ucoln, where n is the column offset number, starting with 0 for the first column. You can specify custom column heading names using the header_names parameter, described next.

If you specify header=false, the fcsvparser parses the first row of input as data, rather than as column headers.

Default: true

header_names

A list of column header names, delimited by the character defined by the parser's delimiter parameter. Use this parameter to specify header names in a CSV file without a header row, or to override the column names present in the CSV source. To override one or more existing column names, specify the header names to use. This parameter overrides any header row in the data.

trim

Boolean, specifies whether to trim white space from header names and key values.

Default: true

omit_empty_keys

Boolean, specifies how the parser handles header keys without values. If true, keys with an empty value in the header row are not loaded.

Default: false

reject_on_duplicate

Boolean, specifies whether to ignore duplicate records (false), or to reject duplicates (true). In either case, the load continues.

Default: false

reject_on_empty_key

Boolean, specifies whether to reject any row containing a key without a value.

Default: false

reject_on_materialized_type_error

Boolean, specifies whether to reject any materialized column value that the parser cannot coerce into a compatible data type. See Loading CSV Data.

Default: false

Examples

This example shows how you can use fcsvparser to load a flex table, build a view, and then query that view.

  1. Create a flex table for CSV data:

    => CREATE FLEX TABLE rfc();
    CREATE TABLE
    
    
  2. Use fcsvparser to load the data from STDIN. Specify that no header exists, and enter some data as shown:

    => COPY rfc FROM stdin PARSER fcsvparser(header='false');
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> 10,10,20
    >> 10,"10",30
    >> 10,"20""5",90
    >> \.
    
  3. Run the compute_flextable_keys_and_build_view function, and query the rfc_view. Notice that the default enclosed_by character permits an escape character (") within a field ("20""5"). Thus, the resulting value was parsed correctly. Since no header existed in the input data, the function added ucoln for each column:

    => SELECT compute_flextable_keys_and_build_view('rfc');
                               compute_flextable_keys_and_build_view                         
    --------------------------------------------------------------------------------------------
     Please see public.rfc_keys for updated keys
    The view public.rfc_view is ready for querying
    (1 row)
    
    => SELECT * FROM rfc_view;
     ucol0 | ucol1 | ucol2
    -------+-------+-------
     10    | 10    | 20
     10    | 10    | 30
     10    | 20"5  | 90
    (3 rows)				
  4. For more information and examples using other parameters of this parser, see Loading CSV Data.

See Also