This section describes how the binary
collation for binary strings compares to the
_bin collations for nonbinary strings.
Binary strings (as stored using the
BINARY,
VARBINARY, and
BLOB data types) have a
character set and collation named binary.
Binary strings are sequences of bytes and the numeric values
of those bytes determine comparison and sort order.
Nonbinary strings (as stored using the
CHAR,
VARCHAR, and
TEXT data types) have a
character set and collation other than
binary. A given nonbinary character set can
have several collations, each of which defines a particular
comparison and sort order for the characters in the set. One
of these is the binary collation for the character set,
indicated by a _bin suffix in the collation
name. For example, the binary collations for
latin1 and utf8 are
named latin1_bin and
utf8_bin, respectively.
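To check which collations a server provides for a character set, including its _bin collation, the SHOW COLLATION statement can be used. The following is a minimal sketch; the rows returned depend on the server version and build:

```sql
-- List the collations available for latin1; latin1_bin should be among them.
SHOW COLLATION WHERE Charset = 'latin1';

-- The binary character set has a single collation, also named binary.
SHOW COLLATION WHERE Charset = 'binary';
```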
The binary collation differs from the
_bin collations in several respects.
The unit for comparison and sorting.
Binary strings are sequences of bytes. For the
binary collation, comparison and sorting
are based on numeric byte values. Nonbinary strings are
sequences of characters, which might be multibyte.
Collations for nonbinary strings define an ordering of the
character values for comparison and sorting. For the
_bin collation, this ordering is based on
numeric character code values, which is similar to ordering
for binary strings except that character code values might
be multibyte.
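As an illustrative sketch of the difference in units, the same two bytes can be measured once as a utf8 string and once as a binary string. The hexadecimal literal and the column aliases are chosen here for illustration only:

```sql
-- X'C3A9' is the two-byte UTF-8 encoding of U+00E9 (e with acute accent).
-- As a utf8 string it is one character; as a binary string it is two bytes.
SELECT CHAR_LENGTH(_utf8 X'C3A9' COLLATE utf8_bin) AS nonbinary_units,
       CHAR_LENGTH(_binary X'C3A9') AS binary_units;
```

The first column should report 1 and the second 2, because CHAR_LENGTH() counts characters for a nonbinary string and bytes for a binary string.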
Character set conversion.
A nonbinary string has a character set and is automatically
converted to another character set in many cases, even when
the string has a _bin collation:
When assigning column values from another column that has a different character set:
UPDATE t1 SET utf8_bin_column=latin1_column;
INSERT INTO t1 (latin1_column) SELECT utf8_bin_column FROM t2;
When assigning column values for
INSERT or
UPDATE using a string
literal:
SET NAMES latin1;
INSERT INTO t1 (utf8_bin_column) VALUES ('string-in-latin1');
When sending results from the server to a client:
SET NAMES latin1;
SELECT utf8_bin_column FROM t2;
For binary string columns, no conversion occurs. For the preceding cases, the string value is copied byte-wise.
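The difference between conversion and byte-wise copying can be observed with HEX(). The following sketch assumes a hypothetical table named conv_demo, which is not part of the examples above:

```sql
-- conv_demo is a hypothetical table used only for this illustration.
CREATE TABLE conv_demo (
  l VARCHAR(10) CHARACTER SET latin1,
  u VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_bin,
  b VARBINARY(10)
);

-- Store the single latin1 byte 0xE9 (e with acute accent).
INSERT INTO conv_demo (l) VALUES (_latin1 X'E9');

-- Assign the latin1 value to a utf8_bin column and to a binary string column.
UPDATE conv_demo SET u = l, b = l;

-- Compare the bytes actually stored in each column.
SELECT HEX(l), HEX(u), HEX(b) FROM conv_demo;
```

HEX(u) should show C3A9, the UTF-8 encoding of the converted character, whereas HEX(b) should show E9, the original byte copied unchanged.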
Lettercase conversion.
Collations for nonbinary character sets provide information
about lettercase of characters, so characters in a nonbinary
string can be converted from one lettercase to another, even
for _bin collations, which do not use
lettercase information for ordering:
mysql> SET NAMES latin1 COLLATE latin1_bin;
mysql> SELECT LOWER('aA'), UPPER('zZ');
+-------------+-------------+
| LOWER('aA') | UPPER('zZ') |
+-------------+-------------+
| aa          | ZZ          |
+-------------+-------------+
The concept of lettercase does not apply to bytes in a binary string. To perform lettercase conversion, the string must be converted to a nonbinary string:
mysql> SET NAMES binary;
mysql> SELECT LOWER('aA'), LOWER(CONVERT('aA' USING latin1));
+-------------+-----------------------------------+
| LOWER('aA') | LOWER(CONVERT('aA' USING latin1)) |
+-------------+-----------------------------------+
| aA          | aa                                |
+-------------+-----------------------------------+
Trailing space handling in comparisons.
Nonbinary strings have PADSPACE behavior
for all collations, including _bin
collations. Trailing spaces are insignificant in
comparisons:
mysql> SET NAMES utf8 COLLATE utf8_bin;
mysql> SELECT 'a ' = 'a';
+------------+
| 'a ' = 'a' |
+------------+
|          1 |
+------------+
For binary strings, all bytes are significant in comparisons, including trailing spaces:
mysql> SET NAMES binary;
mysql> SELECT 'a ' = 'a';
+------------+
| 'a ' = 'a' |
+------------+
|          0 |
+------------+
Trailing space handling for inserts and retrievals.
CHAR(N) columns store nonbinary strings. Values shorter than
N characters are extended with
spaces on insertion. For retrieval, trailing spaces are
removed.
BINARY(N) columns store binary strings. Values shorter than
N bytes are extended with
0x00 bytes on insertion. For retrieval,
nothing is removed; a value of the declared length is always
returned.
mysql> CREATE TABLE t1 (
         a CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin,
         b BINARY(10)
       );
mysql> INSERT INTO t1 VALUES ('a','a');
mysql> SELECT HEX(a), HEX(b) FROM t1;
+--------+----------------------+
| HEX(a) | HEX(b)               |
+--------+----------------------+
| 61     | 61000000000000000000 |
+--------+----------------------+
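One consequence of the 0x00 padding is that a BINARY(N) value compares equal only to a value that includes the pad bytes. The following sketch assumes the t1 table and row from the preceding example are still present; the column aliases are illustrative:

```sql
-- a is CHAR(10) with utf8_bin: pad spaces are removed on retrieval.
-- b is BINARY(10): the nine 0x00 pad bytes are part of the stored value.
SELECT a = 'a'                     AS a_matches_short,
       b = 'a'                     AS b_matches_short,
       b = X'61000000000000000000' AS b_matches_padded
FROM t1;
```

The comparison with a should yield 1, the comparison of b with 'a' should yield 0, and the comparison of b with the fully padded 10-byte value should yield 1.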