A character string literal, hexadecimal literal, or bit-value
literal may have an optional character set introducer and
COLLATE clause, to designate it as a string
that uses a particular character set and collation:
[_charset_name]literal[COLLATEcollation_name]
Character set introducers and the COLLATE
clause are implemented according to standard SQL
specifications.
Examples:
SELECT 'abc'; SELECT _latin1'abc'; SELECT _binary'abc'; SELECT _utf8'abc' COLLATE utf8_danish_ci; SELECT _latin1 X'4D7953514C'; SELECT _utf8 0x4D7953514C COLLATE utf8_danish_ci; SELECT _latin1 b'1000001'; SELECT _utf8 0b1000001 COLLATE utf8_danish_ci;
The
_
expression is formally called an
introducer. It tells the parser,
“the string that follows uses character set
charset_namecharset_name.” An introducer
does not change the string to the introducer character set
like CONVERT() would do. It
does not change the string value, although padding may occur.
The introducer is just a signal.
For character string literals, space between the introducer and the string is permitted but optional.
Character string literals can be designated as binary strings
by using the _binary introducer.
Hexadecimal literals and bit-value literals are binary strings
by default, so _binary is permitted, but
unnecessary.
MySQL determines the character set and collation of a character string literal, hexadecimal literal, or bit-value literal in the following manner:
If both _charset_name and
COLLATE
are
specified, character set
collation_namecharset_name and collation
collation_name are used.
collation_name must be a
permitted collation for
charset_name.
If _charset_name is specified
but COLLATE is not specified, character
set charset_name and its
default collation are used. To see the default collation
for each character set, use the SHOW
CHARACTER SET statement.
If _charset_name is not
specified but COLLATE
is
specified:
collation_name
For a character string literal, the connection default
character set given by the
character_set_connection
system variable and collation
collation_name are used.
collation_name must be a
permitted collation for the connection default
character set.
For a hexadecimal literal or bit-value literal, the
only permitted collation is binary
because these types of literals are binary strings by
default.
Otherwise (neither
_charset_name nor
COLLATE
is
specified):
collation_name
For a character string literal, the connection default
character set and collation given by the
character_set_connection
and
collation_connection
system variables are used.
For a hexadecimal literal or bit-value literal, the
character set and collation are
binary.
Examples:
Nonbinary strings with latin1 character
set and latin1_german1_ci collation:
SELECT _latin1'Müller' COLLATE latin1_german1_ci; SELECT _latin1 X'0A0D' COLLATE latin1_german1_ci; SELECT _latin1 b'0110' COLLATE latin1_german1_ci;
Nonbinary strings with utf8 character
set and its default collation (that is,
utf8_general_ci):
SELECT _utf8'Müller'; SELECT _utf8 X'0A0D'; SELECT _utf8 b'0110';
Binary strings with binary character
set and its default collation (that is,
binary):
SELECT _binary'Müller'; SELECT X'0A0D'; SELECT b'0110';
The hexadecimal literal and bit-value literal need no introducer because they are binary strings by default.
A nonbinary string with the connection default character
set and utf8_general_ci collation
(fails if the connection character set is not
utf8):
SELECT 'Müller' COLLATE utf8_general_ci;
This construction (COLLATE only) does
not work for hexadecimal literals or bit literals because
their character set is binary no matter
the connection character set, and
binary is not compatible with the
utf8_general_ci collation. The only
permitted COLLATE clause in the absence
of an introducer is COLLATE binary.
A string with the connection default character set and collation:
SELECT 'Müller';
For character set literals, an introducer indicates the
character set for the following string, but does not change
how the parser performs escape processing within the string.
Escapes are always interpreted by the parser according to the
character set given by
character_set_connection. For
additional discussion and examples, see
Section 10.1.3.6, “Character String Literal Character Set and Collation”.