MySQL :: MySQL 8.0 Reference Manual :: 11.4.2 The BINARY and VARBINARY Types

MySQL 8.0 Reference Manual / ... / The BINARY and VARBINARY Types

Pre-General Availability Draft: 2017-07-17

11.4.2 The BINARY and VARBINARY Types

The BINARY and VARBINARY types are similar to CHAR and VARCHAR, except that they contain binary strings rather than nonbinary strings. That is, they contain byte strings rather than character strings. This means they have the binary character set and collation, and comparison and sorting are based on the numeric values of the bytes in the values.

The permissible maximum length is the same for BINARY and VARBINARY as it is for CHAR and VARCHAR, except that the length for BINARY and VARBINARY is a length in bytes rather than in characters.

The BINARY and VARBINARY data types are distinct from the CHAR BINARY and VARCHAR BINARY data types. For the latter types, the BINARY attribute does not cause the column to be treated as a binary string column. Instead, it causes the binary (_bin) collation for the column character set to be used, and the column itself contains nonbinary character strings rather than binary byte strings. For example, CHAR(5) BINARY is treated as CHAR(5) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin, assuming that the default character set is utf8mb4. This differs from BINARY(5), which stores 5-bytes binary strings that have the binary character set and collation. For information about differences between binary strings and binary collations for nonbinary strings, see Section 10.1.8.5, “The binary Collation Compared to _bin Collations”.

If strict SQL mode is not enabled and you assign a value to a BINARY or VARBINARY column that exceeds the column's maximum length, the value is truncated to fit and a warning is generated. For cases of truncation, you can cause an error to occur (rather than a warning) and suppress insertion of the value by using strict SQL mode. See Section 5.1.8, “Server SQL Modes”.

When BINARY values are stored, they are right-padded with the pad value to the specified length. The pad value is 0x00 (the zero byte). Values are right-padded with 0x00 on insert, and no trailing bytes are removed on select. All bytes are significant in comparisons, including ORDER BY and DISTINCT operations. 0x00 bytes and spaces are different in comparisons, with 0x00 < space.

Example: For a BINARY(3) column, 'a ' becomes 'a \0' when inserted. 'a\0' becomes 'a\0\0' when inserted. Both inserted values remain unchanged when selected.

For VARBINARY, there is no padding on insert and no bytes are stripped on select. All bytes are significant in comparisons, including ORDER BY and DISTINCT operations. 0x00 bytes and spaces are different in comparisons, with 0x00 < space.

For those cases where trailing pad bytes are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad bytes will result in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a\0' causes a duplicate-key error.

You should consider the preceding padding and stripping characteristics carefully if you plan to use the BINARY data type for storing binary data and you require that the value retrieved be exactly the same as the value stored. The following example illustrates how 0x00-padding of BINARY values affects column value comparisons:

mysql> CREATE TABLE t (c BINARY(3));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO t SET c = 'a';
Query OK, 1 row affected (0.01 sec)

mysql> SELECT HEX(c), c = 'a', c = 'a\0\0' from t;
+--------+---------+-------------+
| HEX(c) | c = 'a' | c = 'a\0\0' |
+--------+---------+-------------+
| 610000 |       0 |           1 |
+--------+---------+-------------+
1 row in set (0.09 sec)

If the value retrieved must be the same as the value specified for storage with no padding, it might be preferable to use VARBINARY or one of the BLOB data types instead.

PREV HOME UP NEXT

User Comments

Posted by Andreas Krueger on April 4, 2006

When BINARY or VARBINARY values are stored, e.g. from literal strings like 'abc' or 'Hello', there is of course a character set involved. It' s the standard character set of the operating system that is used to translate each character 'a','b','c' or 'H','e','l','o' to its byte value. (Or byte values for multi-byte character sets.)
Thus, the operating system, with its standard character set, defines how characters are converted into binary values. Only there is no MySQL character set definition involved.

Posted by Rustam Abdullaev on May 19, 2011

To escape binary values, MySQL supports the following formats: X'val', x'val', or 0xval. The former is SQL standard, the latter is ODBC standard.
Examples:
mysql> SELECT x'4D7953514C';
-> 'MySQL'
mysql> SELECT 0x4D7953514C;
-> 'MySQL'

Posted by Xavier P on August 25, 2012

@Andreas, what it means that a value "has no character set" is that what you describe breaks down if the operating system has a different charset when inserting and when querying. If one user inserts text as binary and he's using for example ASCII, then another user reads the binary as text but he's using utf-16, he'll read a different text.

When you use char types the server takes care of things so that anyone will read the same text that was inserted, regardless of everyone's local encoding settings.

Posted by Ben Griffin on March 25, 2013

I'm not really sure that the documentation explains whether the binary column is indexed as a number or as a binarystring. This becomes relevant regarding where the LSB is in UUID fields. Typically with sequences (eg auto-increment) it's the least significant value which changes most frequently, whereas with UUID() I notice that it is the most significant value which changes most frequently. Which generates the most efficient indices?

In the end, I guess I am wondering if it would be more efficient to index the reverse of the UUID, or leave it as it is?

So the options I'm interested in are as follows (assumption is that a primary field of BINARY(16) is populated with the function):

CREATE FUNCTION id() RETURNS binary(16) RETURN unhex(REPLACE(UUID(),'-',''));

or

CREATE FUNCTION id() RETURNS binary(16) RETURN unhex(reverse(REPLACE(UUID(),'-','')));