Vertica Analytics Platform Version 10.1.x Documentation

Character Data Types

Stores strings of letters, numbers, and symbols. Data types CHARACTER (CHAR) and CHARACTER VARYING (VARCHAR) are collectively referred to as character string types, and the values of character string types are known as character strings.

Character data can be stored as fixed-length or variable-length strings. Fixed-length strings are right-extended with spaces on output; variable-length strings are not extended.

String literals in SQL statements must be enclosed in single quotes.

Syntax

Parameters

octet‑length

Specifies the length of the string or column width, declared in bytes (octets).

This parameter is optional.

CHAR Versus VARCHAR Data Types

The following differences apply to CHAR and VARCHAR data:

  • CHAR is conceptually a fixed-length, blank-padded string. Trailing blanks (spaces) are removed on input, and are restored on output. The default length is 1, and the maximum length is 65000 octets (bytes).
  • VARCHAR is a variable-length character data type. The default length is 80, and the maximum length is 65000 octets. For string values longer than 65000, use Long Data Types. Values can include trailing spaces.

Normally, you use VARCHAR for all of string data. Use CHAR when you need fixed-width string output. For example, you can use CHAR columns for data to be transferred to a legacy system that requires fixed-width strings.

Setting Maximum Length

When you define character columns, specify the maximum size of any string to be stored in a column. For example, to store strings up to 24 octets in length, use one of the following definitions:

CHAR(24)    /* fixed-length */
VARCHAR(24) /* variable-length */			

The maximum length parameter for VARCHAR and CHAR data types refers to the number of octets that can be stored in that field, not the number of characters (Unicode code points). When using multibyte UTF-8 characters, the fields must be sized to accommodate from 1 to 4 octets per character, depending on the data. If the data loaded into a VARCHAR/CHAR column exceeds the specified maximum size for that column, data is truncated on UTF-8 character boundaries to fit within the specified size. See COPY.

Remember to include the extra octets required for multibyte characters in the column-width declaration, keeping in mind the 65000 octet column-width limit.

Due to compression in Vertica, the cost of overestimating the length of these fields is incurred primarily at load time and during sorts.

NULL Versus NUL

NULL and NUL differ as follows:

  • NUL represents a character whose ASCII/Unicode code is 0, sometimes qualified "ASCII NUL".
  • NULL means no value, and is true of a field (column) or constant, not of a character.

CHAR, LONG VARCHAR, and VARCHAR string data types accept ASCII NUL values.

NULL appears last (largest) in ascending order.

For additional information about NULL ordering, see NULL Sort Order.

The following example casts the input string containing NUL values to VARCHAR:

=> SELECT 'vert\0ica'::CHARACTER VARYING AS VARCHAR;
 VARCHAR 
---------
 vert\0ica
(1 row)

The result contains 9 characters:

=> SELECT LENGTH('vert\0ica'::CHARACTER VARYING);
 length 
--------
      9
(1 row)

If you use an extended string literal, the length is 8 characters:

=> SELECT E'vert\0ica'::CHARACTER VARYING AS VARCHAR;
 VARCHAR 
---------
 vertica
(1 row)
=> SELECT LENGTH(E'vert\0ica'::CHARACTER VARYING);
 LENGTH 
--------
      8
(1 row)