Server versus Client Locale Settings
Vertica differentiates database server locale settings from client application locale settings:
- Server locale settings only impact collation behavior for server-side query processing.
- Client applications must verify that the locale is set appropriately so that characters are displayed correctly.
The following sections describe best practices to ensure predictable results.
The server session locale should be set as described in Specify the Default Locale for the Database. If locales vary across sessions, set the server locale at the start of each session from your client.
- If the database does not have a default session locale, set the server locale for the session to the desired locale.
- The locale setting in the terminal emulator where the vsql client runs should be equivalent to the session locale setting on the server side (ICU locale). This ensures that the data is collated correctly on the server and displayed correctly on the client.
- All input data for vsql should be in UTF-8, and all output data is encoded in UTF-8.
- Vertica does not support non-UTF-8 encodings and their associated locale values.
- For instructions on setting locale and encoding, refer to your terminal emulator documentation.
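The UTF-8 requirement for vsql input can be checked before data is loaded. The following is a minimal sketch, not part of Vertica or vsql: the helper name `is_valid_utf8` and the sample strings are hypothetical, and the check relies only on Python's built-in UTF-8 codec.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same text encoded two ways (sample values are illustrative only).
utf8_sample = "Müller".encode("utf-8")      # acceptable as vsql input
latin1_sample = "Müller".encode("latin-1")  # single-byte encoding, not UTF-8

print(is_valid_utf8(utf8_sample))    # well-formed UTF-8
print(is_valid_utf8(latin1_sample))  # 0xFC is not a valid UTF-8 start byte
```

A check like this only confirms that the bytes are well-formed UTF-8. It cannot tell whether text in another encoding happens to be valid UTF-8 by coincidence.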
- ODBC applications can run in either ANSI or Unicode mode. If the user application is Unicode, the encoding used by ODBC is UCS-2. If the user application is ANSI, the data must be in single-byte ASCII, which is compatible with the UTF-8 encoding used on the database server. The ODBC driver converts UCS-2 to UTF-8 when passing data to the Vertica server and converts data sent by the Vertica server from UTF-8 to UCS-2.
If the user application is not already in UCS-2, the application must convert the input data to UCS-2, or unexpected results could occur. For example:
- Non-UCS-2 data passed to ODBC APIs is interpreted as UCS-2. This can result in an invalid UCS-2 symbol being passed to the APIs, which causes errors.
- The symbol provided in the alternate encoding could happen to be a valid UCS-2 symbol. If this occurs, incorrect data is inserted into the database.
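Both failure modes can be reproduced without a database. The sketch below is illustrative only and does not call any ODBC API. It uses Python's `utf-16-le` codec as a stand-in for UCS-2 (UCS-2 is the fixed-width subset of UTF-16), and the helper `interpret_as_ucs2` and the sample strings are hypothetical.

```python
def interpret_as_ucs2(raw: bytes) -> str:
    """Decode bytes as little-endian UTF-16, used here as a stand-in for UCS-2."""
    return raw.decode("utf-16-le")


# Case 1: single-byte data that happens to form valid 16-bit code units.
# Four one-byte characters are read as two unrelated characters.
single_byte = "data".encode("latin-1")
garbled = interpret_as_ucs2(single_byte)

# Case 2: single-byte data that cannot form whole 16-bit code units.
# Three bytes leave a dangling half code unit, so decoding fails.
try:
    interpret_as_ucs2("dat".encode("latin-1"))
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Correct path: the application converts its text to 16-bit code units first.
correct = interpret_as_ucs2("data".encode("utf-16-le"))

print(len(garbled), decode_failed, correct)
```

The first case is the more dangerous one, because nothing raises an error and the wrong characters would be stored silently.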
- If the database does not have a default session locale, ODBC applications should set the desired server session locale using SQLSetConnectAttr (if different from the database-wide setting). By doing so, you get the expected collation and string function behavior on the server.
JDBC and ADO.NET Clients
- JDBC and ADO.NET applications use a UTF-16 character set encoding and are responsible for converting any non-UTF-16 encoded data to UTF-16. The same cautions apply as for ODBC if this encoding is violated.
- The JDBC and ADO.NET drivers convert UTF-16 data to UTF-8 when passing data to the Vertica server and convert data sent by the Vertica server from UTF-8 to UTF-16.
- If there is no default session locale at the database level, JDBC and ADO.NET applications should set the correct server session locale by executing the SET LOCALE TO command to get the expected collation and string function behavior on the server. For more information, see SET LOCALE.
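Setting the session locale at the start of a session can be sketched as follows. This example is written in Python rather than JDBC or ADO.NET, and it assumes a cursor object that follows the Python DB-API 2.0 convention of an `execute(sql)` method. The helper name `set_session_locale`, the restriction to short-form locale names, and the `en_GB` value are assumptions made for illustration.

```python
import re

# Assumption: only short-form locale names such as en_GB are accepted here.
# Long-form ICU identifiers with keywords would need a broader pattern.
_LOCALE_NAME = re.compile(r"[A-Za-z0-9_]+")


def set_session_locale(cursor, locale_name: str) -> str:
    """Issue SET LOCALE TO on the given cursor and return the statement sent."""
    if not _LOCALE_NAME.fullmatch(locale_name):
        # Reject anything that could change the shape of the statement.
        raise ValueError(f"unsupported locale name: {locale_name!r}")
    statement = f"SET LOCALE TO {locale_name}"
    cursor.execute(statement)
    return statement
```

Calling `set_session_locale(cursor, "en_GB")` immediately after connecting, before any query that depends on collation, keeps the session locale explicit even when the database has no default.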