ISUTF8
Tests whether a string is a valid UTF-8 string. Returns true if the string conforms to UTF-8 standards, and false otherwise. This function is useful to test strings for UTF-8 compliance before passing them to one of the regular expression functions, such as REGEXP_LIKE, which expect UTF-8 characters by default.
ISUTF8 checks for invalid UTF8 byte sequences, according to UTF-8 rules:
- invalid bytes
- an unexpected continuation byte
- a start byte not followed by enough continuation bytes
- an Overload Encoding
The presence of an invalid UTF8 byte sequence results in a return value of false.
Syntax
ISUTF8( string );
Parameters
string |
The string to test for UTF-8 compliance. |
Examples
=> SELECT ISUTF8(E'\xC2\xBF'); -- UTF-8 INVERTED QUESTION MARK ISUTF8 -------- t (1 row)
=> SELECT ISUTF8(E'\xC2\xC0'); -- UNDEFINED UTF-8 CHARACTER ISUTF8 -------- f (1 row)