The MySQL server can support multiple character sets. To list
the available character sets, use the SHOW
CHARACTER SET statement. A partial listing follows.
For more complete information, see
Section 10.1.15, “Character Sets and Collations Supported by MySQL”.
mysql> SHOW CHARACTER SET;
+----------+---------------------------------+---------------------+--------+
| Charset  | Description                     | Default collation   | Maxlen |
+----------+---------------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese        | big5_chinese_ci     |      2 |
...
| latin1   | cp1252 West European            | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European     | latin2_general_ci   |      1 |
...
| utf8     | UTF-8 Unicode                   | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode                   | ucs2_general_ci     |      2 |
...
| utf8mb4  | UTF-8 Unicode                   | utf8mb4_general_ci  |      4 |
...
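When only a few character sets are of interest, the listing can be narrowed. The following is a minimal sketch, assuming a server that exposes the INFORMATION_SCHEMA.CHARACTER_SETS table; the 'utf8%' pattern is only an illustration.

```sql
-- Restrict the listing to character sets whose names begin with "utf8".
SHOW CHARACTER SET LIKE 'utf8%';

-- Roughly the same information, selected from INFORMATION_SCHEMA.
SELECT CHARACTER_SET_NAME, DEFAULT_COLLATE_NAME, MAXLEN
FROM INFORMATION_SCHEMA.CHARACTER_SETS
WHERE CHARACTER_SET_NAME LIKE 'utf8%';
```

The Maxlen column gives the maximum number of bytes needed to store one character, which is why utf8 shows 3 and utf8mb4 shows 4 in the listing above.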
Any given character set always has at least one collation, and may
have several. To list the collations for a character set, use the
SHOW COLLATION statement. For
example, to see the collations for the latin1
(cp1252 West European) character set, use this statement to find
those collation names that begin with latin1:
mysql> SHOW COLLATION LIKE 'latin1%';
+---------------------+---------+----+---------+----------+---------+
| Collation           | Charset | Id | Default | Compiled | Sortlen |
+---------------------+---------+----+---------+----------+---------+
| latin1_german1_ci   | latin1  |  5 |         |          |       0 |
| latin1_swedish_ci   | latin1  |  8 | Yes     | Yes      |       1 |
| latin1_danish_ci    | latin1  | 15 |         |          |       0 |
| latin1_german2_ci   | latin1  | 31 |         | Yes      |       2 |
| latin1_bin          | latin1  | 47 |         | Yes      |       1 |
| latin1_general_ci   | latin1  | 48 |         |          |       0 |
| latin1_general_cs   | latin1  | 49 |         |          |       0 |
| latin1_spanish_ci   | latin1  | 94 |         |          |       0 |
+---------------------+---------+----+---------+----------+---------+
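A name pattern is not the only way to select collations. As a sketch, SHOW COLLATION also accepts a WHERE clause on its output columns, and the same data should be available from INFORMATION_SCHEMA.COLLATIONS on servers that provide it.

```sql
-- Filter on the Charset column rather than on the collation name.
SHOW COLLATION WHERE Charset = 'latin1';

-- Equivalent query against INFORMATION_SCHEMA, ordered by collation ID.
SELECT COLLATION_NAME, ID, IS_DEFAULT, IS_COMPILED, SORTLEN
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE CHARACTER_SET_NAME = 'latin1'
ORDER BY ID;
```

Filtering on Charset avoids accidental matches if another character set's name happens to begin with the same prefix.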
The latin1 collations have the following
meanings.
| Collation | Meaning |
|---|---|
| latin1_german1_ci | German DIN-1 |
| latin1_swedish_ci | Swedish/Finnish |
| latin1_danish_ci | Danish/Norwegian |
| latin1_german2_ci | German DIN-2 |
| latin1_bin | Binary according to latin1 encoding |
| latin1_general_ci | Multilingual (Western European) |
| latin1_general_cs | Multilingual (ISO Western European), case sensitive |
| latin1_spanish_ci | Modern Spanish |
Collations have these general characteristics:
Two different character sets cannot have the same collation.
Each character set has one collation that is the default collation. For example, the default collations for latin1 and utf8 are latin1_swedish_ci and utf8_general_ci, respectively. The output for SHOW CHARACTER SET indicates which collation is the default for each displayed character set.
Collation names start with the name of the character set with which they are associated, followed by one or more suffixes indicating other collation characteristics. For additional information about naming conventions, see Section 10.1.3, “Collation Naming Conventions”.
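The characteristics above can be checked against a running server. The following sketch assumes the INFORMATION_SCHEMA.COLLATIONS table is available and that the server lists the three-byte Unicode character set under the name utf8, as in the listing earlier in this section. The table name charset_mismatch_demo is hypothetical.

```sql
-- Show the default collation for each of the two character sets.
SELECT CHARACTER_SET_NAME, COLLATION_NAME
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE IS_DEFAULT = 'Yes'
  AND CHARACTER_SET_NAME IN ('latin1', 'utf8');

-- A collation belongs to exactly one character set, so this statement
-- is expected to be rejected: latin1_swedish_ci is not a utf8 collation.
CREATE TABLE charset_mismatch_demo (
    c CHAR(10) CHARACTER SET utf8 COLLATE latin1_swedish_ci
);
```

Because the character set name is the prefix of the collation name, a mismatch such as the one in the second statement is usually visible by inspection.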
In cases where a character set has multiple collations, it might not be clear which collation is most suitable for a given application. To avoid choosing the wrong collation, perform some comparisons with representative data values to make sure that a given collation sorts values the way you expect.
Collation-Charts.Org is a useful site for information that shows how one collation compares to another.
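One way to carry out such a comparison is to load a handful of representative values into a scratch table and sort them under each candidate collation. This is only a sketch: the table name collation_test and the sample words are illustrative, and it assumes the client connection character set is configured so that the non-ASCII character reaches the server intact.

```sql
-- Scratch table holding sample values in the character set under test.
CREATE TEMPORARY TABLE collation_test (
    word VARCHAR(20) CHARACTER SET latin1
);

INSERT INTO collation_test (word)
VALUES ('Muffler'), ('Müller'), ('MX Systems'), ('MySQL');

-- Sort the same data under three candidate collations and compare the order.
SELECT word FROM collation_test ORDER BY word COLLATE latin1_swedish_ci;
SELECT word FROM collation_test ORDER BY word COLLATE latin1_german1_ci;
SELECT word FROM collation_test ORDER BY word COLLATE latin1_german2_ci;

DROP TEMPORARY TABLE collation_test;
```

If the three result sets order the values differently, the collation whose order matches the application's expectations is the one to use for that column.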