Table of Contents
This section discusses the types of partitioning which are available in MySQL 5.5. These include the types listed here:
RANGE partitioning.
This type of partitioning assigns rows to partitions based
on column values falling within a given range. See
Section 3.1, “RANGE Partitioning”. MySQL 5.5 adds an
extension, RANGE COLUMNS, to this type.
See Section 3.3.1, “RANGE COLUMNS partitioning”.
LIST partitioning.
Similar to partitioning by RANGE, except
that the partition is selected based on columns matching one
of a set of discrete values. See
Section 3.2, “LIST Partitioning”. MySQL 5.5 adds an
extension, LIST COLUMNS, to this type.
See Section 3.3.2, “LIST COLUMNS partitioning”.
HASH partitioning.
With this type of partitioning, a partition is selected
based on the value returned by a user-defined expression
that operates on column values in rows to be inserted into
the table. The function may consist of any expression valid
in MySQL that yields a nonnegative integer value. An
extension to this type, LINEAR HASH, is
also available. See Section 3.4, “HASH Partitioning”.
KEY partitioning.
This type of partitioning is similar to partitioning by
HASH, except that only one or more
columns to be evaluated are supplied, and the MySQL server
provides its own hashing function. These columns can contain
other than integer values, since the hashing function
supplied by MySQL guarantees an integer result regardless of
the column data type. An extension to this type,
LINEAR KEY, is also available. See
Section 3.5, “KEY Partitioning”.
A very common use of database partitioning is to segregate data by
date. Some database systems support explicit date partitioning,
which MySQL does not implement in 5.5. However, it is
not difficult in MySQL to create partitioning schemes based on
DATE,
TIME, or
DATETIME columns, or based on
expressions making use of such columns.
When partitioning by KEY or LINEAR
KEY, you can use a DATE,
TIME, or
DATETIME column as the partitioning
column without performing any modification of the column value.
For example, this table creation statement is perfectly valid in
MySQL:
CREATE TABLE members (
firstname VARCHAR(25) NOT NULL,
lastname VARCHAR(25) NOT NULL,
username VARCHAR(16) NOT NULL,
email VARCHAR(35),
joined DATE NOT NULL
)
PARTITION BY KEY(joined)
PARTITIONS 6;
In MySQL 5.5, it is also possible to use a
DATE or
DATETIME column as the partitioning
column using RANGE COLUMNS and LIST
COLUMNS partitioning.
MySQL's other partitioning types, however, require a partitioning
expression that yields an integer value or
NULL. If you wish to use date-based
partitioning by RANGE, LIST,
HASH, or LINEAR HASH, you
can simply employ a function that operates on a
DATE,
TIME, or
DATETIME column and returns such a
value, as shown here:
CREATE TABLE members (
firstname VARCHAR(25) NOT NULL,
lastname VARCHAR(25) NOT NULL,
username VARCHAR(16) NOT NULL,
email VARCHAR(35),
joined DATE NOT NULL
)
PARTITION BY RANGE( YEAR(joined) ) (
PARTITION p0 VALUES LESS THAN (1960),
PARTITION p1 VALUES LESS THAN (1970),
PARTITION p2 VALUES LESS THAN (1980),
PARTITION p3 VALUES LESS THAN (1990),
PARTITION p4 VALUES LESS THAN MAXVALUE
);
Additional examples of partitioning using dates may be found in the following sections of this chapter:
For more complex examples of date-based partitioning, see the following sections:
MySQL partitioning is optimized for use with the
TO_DAYS(),
YEAR(), and
TO_SECONDS() functions. However,
you can use other date and time functions that return an integer
or NULL, such as
WEEKDAY(),
DAYOFYEAR(), or
MONTH(). See
Date and Time Functions, for more information
about such functions.
It is important to remember—regardless of the type of
partitioning that you use—that partitions are always
numbered automatically and in sequence when created, starting with
0. When a new row is inserted into a
partitioned table, it is these partition numbers that are used in
identifying the correct partition. For example, if your table uses
4 partitions, these partitions are numbered 0,
1, 2, and
3. For the RANGE and
LIST partitioning types, it is necessary to
ensure that there is a partition defined for each partition
number. For HASH partitioning, the user
function employed must return an integer value greater than
0. For KEY partitioning,
this issue is taken care of automatically by the hashing function
which the MySQL server employs internally.
Names of partitions generally follow the rules governing other
MySQL identifiers, such as those for tables and databases.
However, you should note that partition names are not
case-sensitive. For example, the following
CREATE TABLE statement fails as
shown:
mysql>CREATE TABLE t2 (val INT)->PARTITION BY LIST(val)(->PARTITION mypart VALUES IN (1,3,5),->PARTITION MyPart VALUES IN (2,4,6)->);ERROR 1488 (HY000): Duplicate partition name mypart
Failure occurs because MySQL sees no difference between the
partition names mypart and
MyPart.
When you specify the number of partitions for the table, this must
be expressed as a positive, nonzero integer literal with no
leading zeros, and may not be an expression such as
0.8E+01 or 6-2, even if it
evaluates to an integer value. Decimal fractions are not
permitted.
In the sections that follow, we do not necessarily provide all possible forms for the syntax that can be used for creating each partition type; this information may be found in CREATE TABLE Syntax.
A table that is partitioned by range is partitioned in such a
way that each partition contains rows for which the partitioning
expression value lies within a given range. Ranges should be
contiguous but not overlapping, and are defined using the
VALUES LESS THAN operator. For the next few
examples, suppose that you are creating a table such as the
following to hold personnel records for a chain of 20 video
stores, numbered 1 through 20:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL
);
The employees table used here has no
primary or unique keys. While the examples work as shown for
purposes of the present discussion, you should keep in mind
that tables are extremely likely in practice to have primary
keys, unique keys, or both, and that allowable choices for
partitioning columns depend on the columns used for these
keys, if any are present. For a discussion of these issues,
see
Section 6.1, “Partitioning Keys, Primary Keys, and Unique Keys”.
This table can be partitioned by range in a number of ways,
depending on your needs. One way would be to use the
store_id column. For instance, you might
decide to partition the table 4 ways by adding a
PARTITION BY RANGE clause as shown here:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL
)
PARTITION BY RANGE (store_id) (
PARTITION p0 VALUES LESS THAN (6),
PARTITION p1 VALUES LESS THAN (11),
PARTITION p2 VALUES LESS THAN (16),
PARTITION p3 VALUES LESS THAN (21)
);
In this partitioning scheme, all rows corresponding to employees
working at stores 1 through 5 are stored in partition
p0, to those employed at stores 6 through 10
are stored in partition p1, and so on. Note
that each partition is defined in order, from lowest to highest.
This is a requirement of the PARTITION BY
RANGE syntax; you can think of it as being analogous
to a series of if ... elseif ... statements
in C or Java in this regard.
It is easy to determine that a new row containing the data
(72, 'Mitchell', 'Wilson', '1998-06-25', NULL,
13) is inserted into partition p2,
but what happens when your chain adds a
21st store? Under this scheme, there
is no rule that covers a row whose store_id
is greater than 20, so an error results because the server does
not know where to place it. You can keep this from occurring by
using a “catchall” VALUES LESS
THAN clause in the CREATE
TABLE statement that provides for all values greater
than the highest value explicitly named:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL
)
PARTITION BY RANGE (store_id) (
PARTITION p0 VALUES LESS THAN (6),
PARTITION p1 VALUES LESS THAN (11),
PARTITION p2 VALUES LESS THAN (16),
PARTITION p3 VALUES LESS THAN MAXVALUE
);
Another way to avoid an error when no matching value is found
is to use the IGNORE keyword as part of the
INSERT statement. For an
example, see Section 3.2, “LIST Partitioning”. Also see
INSERT Syntax, for general information about
IGNORE.
MAXVALUE represents an integer value that is
always greater than the largest possible integer value (in
mathematical language, it serves as a
least upper bound). Now,
any rows whose store_id column value is
greater than or equal to 16 (the highest value defined) are
stored in partition p3. At some point in the
future—when the number of stores has increased to 25, 30,
or more—you can use an
ALTER
TABLE statement to add new partitions for stores
21-25, 26-30, and so on (see
Chapter 4, Partition Management, for details of how to
do this).
In much the same fashion, you could partition the table based on
employee job codes—that is, based on ranges of
job_code column values. For
example—assuming that two-digit job codes are used for
regular (in-store) workers, three-digit codes are used for
office and support personnel, and four-digit codes are used for
management positions—you could create the partitioned
table using the following statement:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL
)
PARTITION BY RANGE (job_code) (
PARTITION p0 VALUES LESS THAN (100),
PARTITION p1 VALUES LESS THAN (1000),
PARTITION p2 VALUES LESS THAN (10000)
);
In this instance, all rows relating to in-store workers would be
stored in partition p0, those relating to
office and support staff in p1, and those
relating to managers in partition p2.
It is also possible to use an expression in VALUES LESS
THAN clauses. However, MySQL must be able to evaluate
the expression's return value as part of a LESS
THAN (<) comparison.
Rather than splitting up the table data according to store
number, you can use an expression based on one of the two
DATE columns instead. For
example, let us suppose that you wish to partition based on the
year that each employee left the company; that is, the value of
YEAR(separated). An example of a
CREATE TABLE statement that
implements such a partitioning scheme is shown here:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id INT
)
PARTITION BY RANGE ( YEAR(separated) ) (
PARTITION p0 VALUES LESS THAN (1991),
PARTITION p1 VALUES LESS THAN (1996),
PARTITION p2 VALUES LESS THAN (2001),
PARTITION p3 VALUES LESS THAN MAXVALUE
);
In this scheme, for all employees who left before 1991, the rows
are stored in partition p0; for those who
left in the years 1991 through 1995, in p1;
for those who left in the years 1996 through 2000, in
p2; and for any workers who left after the
year 2000, in p3.
It is also possible to partition a table by
RANGE, based on the value of a
TIMESTAMP column, using the
UNIX_TIMESTAMP() function, as
shown in this example:
CREATE TABLE quarterly_report_status (
report_id INT NOT NULL,
report_status VARCHAR(20) NOT NULL,
report_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
)
PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated) ) (
PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-01-01 00:00:00') ),
PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-04-01 00:00:00') ),
PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-07-01 00:00:00') ),
PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-10-01 00:00:00') ),
PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-01-01 00:00:00') ),
PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-04-01 00:00:00') ),
PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-07-01 00:00:00') ),
PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-10-01 00:00:00') ),
PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2010-01-01 00:00:00') ),
PARTITION p9 VALUES LESS THAN (MAXVALUE)
);
Any other expressions involving
TIMESTAMP values are not
permitted. (See Bug #42849.)
Range partitioning is particularly useful when one or more of the following conditions is true:
You want or need to delete “old” data. If you
are using the partitioning scheme shown previously for the
employees table, you can simply use
ALTER TABLE employees DROP PARTITION p0;
to delete all rows relating to employees who stopped working
for the firm prior to 1991. (See
ALTER TABLE Syntax, and
Chapter 4, Partition Management, for more
information.) For a table with a great many rows, this can
be much more efficient than running a
DELETE query such as
DELETE FROM employees WHERE YEAR(separated) <=
1990;.
You want to use a column containing date or time values, or containing values arising from some other series.
You frequently run queries that depend directly on the
column used for partitioning the table. For example, when
executing a query such as
EXPLAIN PARTITIONS
SELECT COUNT(*) FROM employees WHERE separated BETWEEN
'2000-01-01' AND '2000-12-31' GROUP BY store_id;,
MySQL can quickly determine that only partition
p2 needs to be scanned because the
remaining partitions cannot contain any records satisfying
the WHERE clause. See
Chapter 5, Partition Pruning, for more information
about how this is accomplished.
A variant on this type of partitioning, RANGE
COLUMNS partitioning, was introduced in MySQL 5.5.0.
Partitioning by RANGE COLUMNS makes it
possible to employ multiple columns for defining partitioning
ranges that apply both to placement of rows in partitions and
for determining the inclusion or exclusion of specific
partitions when performing partition pruning. See
Section 3.3.1, “RANGE COLUMNS partitioning”, for more
information.
Partitioning schemes based on time intervals. If you wish to implement a partitioning scheme based on ranges or intervals of time in MySQL 5.5, you have two options:
Partition the table by RANGE, and for the
partitioning expression, employ a function operating on a
DATE,
TIME, or
DATETIME column and returning
an integer value, as shown here:
CREATE TABLE members (
firstname VARCHAR(25) NOT NULL,
lastname VARCHAR(25) NOT NULL,
username VARCHAR(16) NOT NULL,
email VARCHAR(35),
joined DATE NOT NULL
)
PARTITION BY RANGE( YEAR(joined) ) (
PARTITION p0 VALUES LESS THAN (1960),
PARTITION p1 VALUES LESS THAN (1970),
PARTITION p2 VALUES LESS THAN (1980),
PARTITION p3 VALUES LESS THAN (1990),
PARTITION p4 VALUES LESS THAN MAXVALUE
);
Beginning with MySQL 5.5.1, it is also possible to partition
a table by RANGE based on the value of a
TIMESTAMP column, using the
UNIX_TIMESTAMP() function, as
shown in this example:
CREATE TABLE quarterly_report_status (
report_id INT NOT NULL,
report_status VARCHAR(20) NOT NULL,
report_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
)
PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated) ) (
PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-01-01 00:00:00') ),
PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-04-01 00:00:00') ),
PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-07-01 00:00:00') ),
PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-10-01 00:00:00') ),
PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-01-01 00:00:00') ),
PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-04-01 00:00:00') ),
PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-07-01 00:00:00') ),
PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-10-01 00:00:00') ),
PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2010-01-01 00:00:00') ),
PARTITION p9 VALUES LESS THAN (MAXVALUE)
);
Also beginning with MySQL 5.5.1, any other expressions
involving TIMESTAMP values
are not permitted. (See Bug #42849.)
It is also possible in MySQL 5.5.1 and later to use
UNIX_TIMESTAMP(timestamp_column)
as a partitioning expression for tables that are
partitioned by LIST. However, it is
usually not practical to do so.
Partition the table by RANGE COLUMNS,
using a DATE or
DATETIME column as the
partitioning column. For example, the
members table could be defined using the
joined column directly, as shown here:
CREATE TABLE members (
firstname VARCHAR(25) NOT NULL,
lastname VARCHAR(25) NOT NULL,
username VARCHAR(16) NOT NULL,
email VARCHAR(35),
joined DATE NOT NULL
)
PARTITION BY RANGE COLUMNS(joined) (
PARTITION p0 VALUES LESS THAN ('1960-01-01'),
PARTITION p1 VALUES LESS THAN ('1970-01-01'),
PARTITION p2 VALUES LESS THAN ('1980-01-01'),
PARTITION p3 VALUES LESS THAN ('1990-01-01'),
PARTITION p4 VALUES LESS THAN MAXVALUE
);
List partitioning in MySQL is similar to range partitioning in
many ways. As in partitioning by RANGE, each
partition must be explicitly defined. The chief difference
between the two types of partitioning is that, in list
partitioning, each partition is defined and selected based on
the membership of a column value in one of a set of value lists,
rather than in one of a set of contiguous ranges of values. This
is done by using PARTITION BY
LIST( where
expr)expr is a column value or an
expression based on a column value and returning an integer
value, and then defining each partition by means of a
VALUES IN
(, where
value_list)value_list is a comma-separated list
of integers.
In MySQL 5.5, it is possible to match against
only a list of integers (and possibly
NULL—see
Section 3.7, “How MySQL Partitioning Handles NULL”) when
partitioning by LIST.
However, beginning with MySQL 5.5.0, other column types may be
used in value lists when employing LIST
COLUMN partitioning, which is described later in
this section.
Unlike the case with partitions defined by range, list partitions do not need to be declared in any particular order. For more detailed syntactical information, see CREATE TABLE Syntax.
For the examples that follow, we assume that the basic
definition of the table to be partitioned is provided by the
CREATE TABLE statement shown
here:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id INT
);
(This is the same table used as a basis for the examples in Section 3.1, “RANGE Partitioning”.)
Suppose that there are 20 video stores distributed among 4 franchises as shown in the following table.
| Region | Store ID Numbers |
|---|---|
| North | 3, 5, 6, 9, 17 |
| East | 1, 2, 10, 11, 19, 20 |
| West | 4, 12, 13, 14, 18 |
| Central | 7, 8, 15, 16 |
To partition this table in such a way that rows for stores
belonging to the same region are stored in the same partition,
you could use the CREATE TABLE
statement shown here:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id INT
)
PARTITION BY LIST(store_id) (
PARTITION pNorth VALUES IN (3,5,6,9,17),
PARTITION pEast VALUES IN (1,2,10,11,19,20),
PARTITION pWest VALUES IN (4,12,13,14,18),
PARTITION pCentral VALUES IN (7,8,15,16)
);
This makes it easy to add or drop employee records relating to
specific regions to or from the table. For instance, suppose
that all stores in the West region are sold to another company.
Beginning with MySQL 5.5.0, all rows relating to employees
working at stores in that region can be deleted with the query
ALTER TABLE employees TRUNCATE PARTITION
pWest, which can be executed much more efficiently
than the equivalent DELETE
statement DELETE FROM employees WHERE store_id IN
(4,12,13,14,18);. (Using ALTER TABLE
employees DROP PARTITION pWest would also delete all
of these rows, but would also remove the partition
pWest from the definition of the table; you
would need to use an ALTER TABLE ... ADD
PARTITION statement to restore the table's
original partitioning scheme.)
As with RANGE partitioning, it is possible to
combine LIST partitioning with partitioning
by hash or key to produce a composite partitioning
(subpartitioning). See
Section 3.6, “Subpartitioning”.
Unlike the case with RANGE partitioning,
there is no “catch-all” such as
MAXVALUE; all expected values for the
partitioning expression should be covered in PARTITION
... VALUES IN (...) clauses. An
INSERT statement containing an
unmatched partitioning column value fails with an error, as
shown in this example:
mysql>CREATE TABLE h2 (->c1 INT,->c2 INT->)->PARTITION BY LIST(c1) (->PARTITION p0 VALUES IN (1, 4, 7),->PARTITION p1 VALUES IN (2, 5, 8)->);Query OK, 0 rows affected (0.11 sec) mysql>INSERT INTO h2 VALUES (3, 5);ERROR 1525 (HY000): Table has no partition for value 3
When inserting multiple rows using a single
INSERT statement the behavior
depends on whether the table uses a transactional storage
engine. For an InnoDB table, the
statement is considered a single transaction, so the presence of
any unmatched values causes the statement to fail completely,
and no rows are inserted. For a table using a nontransactional
storage engine such as MyISAM, any
rows coming before the row containing the unmatched value are
inserted, but any coming after it are not.
You can cause this type of error to be ignored by using the
IGNORE keyword. If you do so, rows containing
unmatched partitioning column values are not inserted, but any
rows with matching values are inserted, and
no errors are reported:
mysql>TRUNCATE h2;Query OK, 1 row affected (0.00 sec) mysql>SELECT * FROM h2;Empty set (0.00 sec) mysql>INSERT IGNORE INTO h2 VALUES (2, 5), (6, 10), (7, 5), (3, 1), (1, 9);Query OK, 3 rows affected (0.00 sec) Records: 5 Duplicates: 2 Warnings: 0 mysql>SELECT * FROM h2;+------+------+ | c1 | c2 | +------+------+ | 7 | 5 | | 1 | 9 | | 2 | 5 | +------+------+ 3 rows in set (0.00 sec)
MySQL 5.5 provides support for LIST
COLUMNS partitioning. This is a variant of
LIST partitioning that enables you to use
columns of types other than integer types for partitioning
columns, as well as to use multiple columns as partitioning
keys. For more information, see
Section 3.3.2, “LIST COLUMNS partitioning”.
The next two sections discuss
COLUMNS
partitioning, which are variants on
RANGE and LIST
partitioning that were introduced in MySQL 5.5.0.
COLUMNS partitioning enables the use of
multiple columns in partitioning keys. All of these columns are
taken into account both for the purpose of placing rows in
partitions and for the determination of which partitions are to
be checked for matching rows in partition pruning.
In addition, both RANGE COLUMNS partitioning
and LIST COLUMNS partitioning support the use
of non-integer columns for defining value ranges or list
members. The permitted data types are shown in the following
list:
All integer types: TINYINT,
SMALLINT,
MEDIUMINT,
INT
(INTEGER), and
BIGINT. (This is the same as
with partitioning by RANGE and
LIST.)
Other numeric data types (such as
DECIMAL or
FLOAT) are not supported as
partitioning columns.
Columns using other data types relating to dates or times are not supported as partitioning columns.
The following string types:
CHAR,
VARCHAR,
BINARY, and
VARBINARY.
TEXT and
BLOB columns are not
supported as partitioning columns.
The discussions of RANGE COLUMNS and
LIST COLUMNS partitioning in the next two
sections assume that you are already familiar with partitioning
based on ranges and lists as supported in MySQL 5.1 and later;
for more information about these, see
Section 3.1, “RANGE Partitioning”, and
Section 3.2, “LIST Partitioning”, respectively.
Range columns partitioning is similar to range partitioning, but enables you to define partitions using ranges based on multiple column values. In addition, you can define the ranges using columns of types other than integer types.
RANGE COLUMNS partitioning differs
significantly from RANGE partitioning in
the following ways:
RANGE COLUMNS does not accept
expressions, only names of columns.
RANGE COLUMNS accepts a list of one or
more columns.
RANGE COLUMNS partitions are based on
comparisons between
tuples (lists of
column values) rather than comparisons between scalar
values. Placement of rows in RANGE
COLUMNS partitions is also based on comparisons
between tuples; this is discussed further later in this
section.
RANGE COLUMNS partitioning columns are
not restricted to integer columns; string,
DATE and
DATETIME columns can also
be used as partitioning columns. (See
Section 3.3, “COLUMNS Partitioning”, for details.)
The basic syntax for creating a table partitioned by
RANGE COLUMNS is shown here:
CREATE TABLEtable_namePARTITIONED BY RANGE COLUMNS(column_list) ( PARTITIONpartition_nameVALUES LESS THAN (value_list)[, PARTITIONpartition_nameVALUES LESS THAN (value_list)][, ...] )column_list:column_name[,column_name][, ...]value_list:value[,value][, ...]
Not all CREATE TABLE options
that can be used when creating partitioned tables are shown
here. For complete information, see
CREATE TABLE Syntax.
In the syntax just shown,
column_list is a list of one or
more columns (sometimes called a
partitioning column
list), and value_list is
a list of values (that is, it is a
partition definition value
list). A value_list must
be supplied for each partition definition, and each
value_list must have the same
number of values as the column_list
has columns. Generally speaking, if you use
N columns in the
COLUMNS clause, then each VALUES
LESS THAN clause must also be supplied with a list
of N values.
The elements in the partitioning column list and in the value
list defining each partition must occur in the same order. In
addition, each element in the value list must be of the same
data type as the corresponding element in the column list.
However, the order of the column names in the partitioning
column list and the value lists does not have to be the same
as the order of the table column definitions in the main part
of the CREATE TABLE statement.
As with table partitioned by RANGE, you can
use MAXVALUE to represent a value such that
any legal value inserted into a given column is always less
than this value. Here is an example of a
CREATE TABLE statement that
helps to illustrate all of these points:
mysql>CREATE TABLE rcx (->a INT,->b INT,->c CHAR(3),->d INT->)->PARTITION BY RANGE COLUMNS(a,d,c) (->PARTITION p0 VALUES LESS THAN (5,10,'ggg'),->PARTITION p1 VALUES LESS THAN (10,20,'mmmm'),->PARTITION p2 VALUES LESS THAN (15,30,'sss'),->PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)->);Query OK, 0 rows affected (0.15 sec)
Table rcx contains the columns
a, b,
c, d. The partitioning
column list supplied to the COLUMNS clause
uses 3 of these columns, in the order a,
d, c. Each value list
used to define a partition contains 3 values in the same
order; that is, each value list tuple has the form
(INT, INT,
CHAR(3)), which corresponds to the data
types used by columns a,
d, and c (in that
order).
Placement of rows into partitions is determined by comparing
the tuple from a row to be inserted that matches the column
list in the COLUMNS clause with the tuples
used in the VALUES LESS THAN clauses to
define partitions of the table. Because we are comparing
tuples (that is, lists or sets of values) rather than scalar
values, the semantics of VALUES LESS THAN
as used with RANGE COLUMNS partitions
differs somewhat from the case with simple
RANGE partitions. In
RANGE partitioning, a row generating an
expression value that is equal to a limiting value in a
VALUES LESS THAN is never placed in the
corresponding partition; however, when using RANGE
COLUMNS partitioning, it is sometimes possible for a
row whose partitioning column list's first element is
equal in value to the that of the first element in a
VALUES LESS THAN value list to be placed in
the corresponding partition.
Consider the RANGE partitioned table
created by this statement:
CREATE TABLE r1 (
a INT,
b INT
)
PARTITION BY RANGE (a) (
PARTITION p0 VALUES LESS THAN (5),
PARTITION p1 VALUES LESS THAN (MAXVALUE)
);
If we insert 3 rows into this table such that the column value
for a is 5 for each row,
all 3 rows are stored in partition p1
because the a column value is in each case
not less than 5, as we can see by executing the proper query
against the
INFORMATION_SCHEMA.PARTITIONS
table:
mysql>INSERT INTO r1 VALUES (5,10), (5,11), (5,12);Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT PARTITION_NAME,TABLE_ROWS->FROM INFORMATION_SCHEMA.PARTITIONS->WHERE TABLE_NAME = 'r1';+----------------+------------+ | PARTITION_NAME | TABLE_ROWS | +----------------+------------+ | p0 | 0 | | p1 | 3 | +----------------+------------+ 2 rows in set (0.00 sec)
Now consider a similar table rc1 that uses
RANGE COLUMNS partitioning with both
columns a and b
referenced in the COLUMNS clause, created
as shown here:
CREATE TABLE rc1 (
a INT,
b INT
)
PARTITION BY RANGE COLUMNS(a, b) (
PARTITION p0 VALUES LESS THAN (5, 12),
PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE)
);
If we insert exactly the same rows into rc1
as we just inserted into r1, the
distribution of the rows is quite different:
mysql>INSERT INTO rc1 VALUES (5,10), (5,11), (5,12);Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT PARTITION_NAME,TABLE_ROWS->FROM INFORMATION_SCHEMA.PARTITIONS->WHERE TABLE_NAME = 'rc1';+--------------+----------------+------------+ | TABLE_SCHEMA | PARTITION_NAME | TABLE_ROWS | +--------------+----------------+------------+ | p | p0 | 2 | | p | p1 | 1 | +--------------+----------------+------------+ 2 rows in set (0.00 sec)
This is because we are comparing rows rather than scalar
values. We can compare the row values inserted with the
limiting row value from the VALUES THAN LESS
THAN clause used to define partition
p0 in table rc1, like
this:
mysql> SELECT (5,10) < (5,12), (5,11) < (5,12), (5,12) < (5,12);
+-----------------+-----------------+-----------------+
| (5,10) < (5,12) | (5,11) < (5,12) | (5,12) < (5,12) |
+-----------------+-----------------+-----------------+
| 1 | 1 | 0 |
+-----------------+-----------------+-----------------+
1 row in set (0.00 sec)
The 2 tuples (5,10) and
(5,11) evaluate as less than
(5,12), so they are stored in partition
p0. Since 5 is not less than 5 and 12 is
not less than 12, (5,12) is considered not
less than (5,12), and is stored in
partition p1.
The SELECT statement in the
preceding example could also have been written using explicit
row constructors, like this:
SELECT ROW(5,10) < ROW(5,12), ROW(5,11) < ROW(5,12), ROW(5,12) < ROW(5,12);
For more information about the use of row constructors in MySQL, see Row Subqueries.
For a table partitioned by RANGE COLUMNS
using only a single partitioning column, the storing of rows
in partitions is the same as that of an equivalent table that
is partitioned by RANGE. The following
CREATE TABLE statement creates a table
partitioned by RANGE COLUMNS using 1
partitioning column:
CREATE TABLE rx (
a INT,
b INT
)
PARTITION BY RANGE COLUMNS (a) (
PARTITION p0 VALUES LESS THAN (5),
PARTITION p1 VALUES LESS THAN (MAXVALUE)
);
If we insert the rows (5,10),
(5,11), and (5,12) into
this table, we can see that their placement is the same as it
is for the table r we created and populated
earlier:
mysql>INSERT INTO rx VALUES (5,10), (5,11), (5,12);Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT PARTITION_NAME,TABLE_ROWS->FROM INFORMATION_SCHEMA.PARTITIONS->WHERE TABLE_NAME = 'rx';+--------------+----------------+------------+ | TABLE_SCHEMA | PARTITION_NAME | TABLE_ROWS | +--------------+----------------+------------+ | p | p0 | 0 | | p | p1 | 3 | +--------------+----------------+------------+ 2 rows in set (0.00 sec)
It is also possible to create tables partitioned by
RANGE COLUMNS where limiting values for one
or more columns are repeated in successive partition
definitions. You can do this as long as the tuples of column
values used to define the partitions are strictly increasing.
For example, each of the following CREATE
TABLE statements is valid:
CREATE TABLE rc2 (
a INT,
b INT
)
PARTITION BY RANGE COLUMNS(a,b) (
PARTITION p0 VALUES LESS THAN (0,10),
PARTITION p1 VALUES LESS THAN (10,20),
PARTITION p2 VALUES LESS THAN (10,30),
PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);
CREATE TABLE rc3 (
a INT,
b INT
)
PARTITION BY RANGE COLUMNS(a,b) (
PARTITION p0 VALUES LESS THAN (0,10),
PARTITION p1 VALUES LESS THAN (10,20),
PARTITION p2 VALUES LESS THAN (10,30),
PARTITION p3 VALUES LESS THAN (10,35),
PARTITION p4 VALUES LESS THAN (20,40),
PARTITION p5 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);
The following statement also succeeds, even though it might
appear at first glance that it would not, since the limiting
value of column b is 25 for partition
p0 and 20 for partition
p1, and the limiting value of column
c is 100 for partition
p1 and 50 for partition
p2:
CREATE TABLE rc4 (
a INT,
b INT,
c INT
)
PARTITION BY RANGE COLUMNS(a,b,c) (
PARTITION p0 VALUES LESS THAN (0,25,50),
PARTITION p1 VALUES LESS THAN (10,20,100),
PARTITION p2 VALUES LESS THAN (10,30,50)
PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)
);
When designing tables partitioned by RANGE
COLUMNS, you can always test successive partition
definitions by comparing the desired tuples using the
mysql client, like this:
mysql> SELECT (0,25,50) < (10,20,100), (10,20,100) < (10,30,50);
+-------------------------+--------------------------+
| (0,25,50) < (10,20,100) | (10,20,100) < (10,30,50) |
+-------------------------+--------------------------+
| 1 | 1 |
+-------------------------+--------------------------+
1 row in set (0.00 sec)
If a CREATE TABLE statement
contains partition definitions that are not in strictly
increasing order, it fails with an error, as shown in this
example:
mysql>CREATE TABLE rcf (->a INT,->b INT,->c INT->)->PARTITION BY RANGE COLUMNS(a,b,c) (->PARTITION p0 VALUES LESS THAN (0,25,50),->PARTITION p1 VALUES LESS THAN (20,20,100),->PARTITION p2 VALUES LESS THAN (10,30,50),->PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)->);ERROR 1493 (HY000): VALUES LESS THAN value must be strictly increasing for each partition
When you get such an error, you can deduce which partition
definitions are invalid by making “less than”
comparisons between their column lists. In this case, the
problem is with the definition of partition
p2 because the tuple used to define it is
not less than the tuple used to define partition
p3, as shown here:
mysql> SELECT (0,25,50) < (20,20,100), (20,20,100) < (10,30,50);
+-------------------------+--------------------------+
| (0,25,50) < (20,20,100) | (20,20,100) < (10,30,50) |
+-------------------------+--------------------------+
| 1 | 0 |
+-------------------------+--------------------------+
1 row in set (0.00 sec)
It is also possible for MAXVALUE to appear
for the same column in more than one VALUES LESS
THAN clause when using RANGE
COLUMNS. However, the limiting values for individual
columns in successive partition definitions should otherwise
be increasing, there should be no more than one partition
defined where MAXVALUE is used as the upper
limit for all column values, and this partition definition
should appear last in the list of PARTITION ...
VALUES LESS THAN clauses. In addition, you cannot
use MAXVALUE as the limiting value for the
first column in more than one partition definition.
As stated previously, it is also possible with RANGE
COLUMNS partitioning to use non-integer columns as
partitioning columns. (See
Section 3.3, “COLUMNS Partitioning”, for a complete listing
of these.) Consider a table named employees
(which is not partitioned), created using the following
statement:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL
);
Using RANGE COLUMNS partitioning, you can
create a version of this table that stores each row in one of
four partitions based on the employee's last name, like
this:
CREATE TABLE employees_by_lname (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL
)
PARTITION BY RANGE COLUMNS (lname) (
PARTITION p0 VALUES LESS THAN ('g'),
PARTITION p1 VALUES LESS THAN ('m'),
PARTITION p2 VALUES LESS THAN ('t'),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);
Alternatively, you could cause the
employees table as created previously to be
partitioned using this scheme by executing the following
ALTER
TABLE statement:
ALTER TABLE employees PARTITION BY RANGE COLUMNS (lname) (
PARTITION p0 VALUES LESS THAN ('g'),
PARTITION p1 VALUES LESS THAN ('m'),
PARTITION p2 VALUES LESS THAN ('t'),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);
Because different character sets and collations have
different sort orders, the character sets and collations in
use may effect which partition of a table partitioned by
RANGE COLUMNS a given row is stored in
when using string columns as partitioning columns. In
addition, changing the character set or collation for a
given database, table, or column after such a table is
created may cause changes in how rows are distributed. For
example, when using a case-sensitive collation,
'and' sorts before
'Andersen', but when using a collation
that is case insensitive, the reverse is true.
For information about how MySQL handles character sets and collations, see Character Set Support.
Similarly, you can cause the employees
table to be partitioned in such a way that each row is stored
in one of several partitions based on the decade in which the
corresponding employee was hired using the
ALTER
TABLE statement shown here:
ALTER TABLE employees PARTITION BY RANGE COLUMNS (hired) (
PARTITION p0 VALUES LESS THAN ('1970-01-01'),
PARTITION p1 VALUES LESS THAN ('1980-01-01'),
PARTITION p2 VALUES LESS THAN ('1990-01-01'),
PARTITION p3 VALUES LESS THAN ('2000-01-01'),
PARTITION p4 VALUES LESS THAN ('2010-01-01'),
PARTITION p5 VALUES LESS THAN (MAXVALUE)
);
See CREATE TABLE Syntax, for additional information
about PARTITION BY RANGE COLUMNS syntax.
MySQL 5.5 provides support for LIST
COLUMNS partitioning. This is a variant of
LIST partitioning that enables the use of
multiple columns as partition keys, and for columns of data
types other than integer types to be used as partitioning
columns; you can use string types,
DATE, and
DATETIME columns. (For more
information about permitted data types for
COLUMNS partitioning columns, see
Section 3.3, “COLUMNS Partitioning”.)
Suppose that you have a business that has customers in 12 cities which, for sales and marketing purposes, you organize into 4 regions of 3 cities each as shown in the following table:
| Region | Cities |
|---|---|
| 1 | Oskarshamn, Högsby, Mönsterås |
| 2 | Vimmerby, Hultsfred, Västervik |
| 3 | Nässjö, Eksjö, Vetlanda |
| 4 | Uppvidinge, Alvesta, Växjo |
With LIST COLUMNS partitioning, you can
create a table for customer data that assigns a row to any of
4 partitions corresponding to these regions based on the name
of the city where a customer resides, as shown here:
CREATE TABLE customers_1 (
first_name VARCHAR(25),
last_name VARCHAR(25),
street_1 VARCHAR(30),
street_2 VARCHAR(30),
city VARCHAR(15),
renewal DATE
)
PARTITION BY LIST COLUMNS(city) (
PARTITION pRegion_1 VALUES IN('Oskarshamn', 'Högsby', 'Mönsterås'),
PARTITION pRegion_2 VALUES IN('Vimmerby', 'Hultsfred', 'Västervik'),
PARTITION pRegion_3 VALUES IN('Nässjö', 'Eksjö', 'Vetlanda'),
PARTITION pRegion_4 VALUES IN('Uppvidinge', 'Alvesta', 'Växjo')
);
As with partitioning by RANGE COLUMNS, you
do not need to use expressions in the
COLUMNS() clause to convert column values
into integers. (In fact, the use of expressions other than
column names is not permitted with
COLUMNS().)
It is also possible to use DATE
and DATETIME columns, as shown
in the following example that uses the same name and columns
as the customers_1 table shown previously,
but employs LIST COLUMNS partitioning based
on the renewal column to store rows in one
of 4 partitions depending on the week in February 2010 the
customer's account is scheduled to renew:
CREATE TABLE customers_2 (
first_name VARCHAR(25),
last_name VARCHAR(25),
street_1 VARCHAR(30),
street_2 VARCHAR(30),
city VARCHAR(15),
renewal DATE
)
PARTITION BY LIST COLUMNS(renewal) (
PARTITION pWeek_1 VALUES IN('2010-02-01', '2010-02-02', '2010-02-03',
'2010-02-04', '2010-02-05', '2010-02-06', '2010-02-07'),
PARTITION pWeek_2 VALUES IN('2010-02-08', '2010-02-09', '2010-02-10',
'2010-02-11', '2010-02-12', '2010-02-13', '2010-02-14'),
PARTITION pWeek_3 VALUES IN('2010-02-15', '2010-02-16', '2010-02-17',
'2010-02-18', '2010-02-19', '2010-02-20', '2010-02-21'),
PARTITION pWeek_4 VALUES IN('2010-02-22', '2010-02-23', '2010-02-24',
'2010-02-25', '2010-02-26', '2010-02-27', '2010-02-28')
);
This works, but becomes cumbersome to define and maintain if
the number of dates involved grows very large; in such cases,
it is usually more practical to employ
RANGE or RANGE COLUMNS
partitioning instead. In this case, since the column we wish
to use as the partitioning key is a
DATE column, we use
RANGE COLUMNS partitioning, as shown here:
CREATE TABLE customers_3 (
first_name VARCHAR(25),
last_name VARCHAR(25),
street_1 VARCHAR(30),
street_2 VARCHAR(30),
city VARCHAR(15),
renewal DATE
)
PARTITION BY RANGE COLUMNS(renewal) (
PARTITION pWeek_1 VALUES LESS THAN('2010-02-09'),
PARTITION pWeek_2 VALUES LESS THAN('2010-02-15'),
PARTITION pWeek_3 VALUES LESS THAN('2010-02-22'),
PARTITION pWeek_4 VALUES LESS THAN('2010-03-01')
);
See Section 3.3.1, “RANGE COLUMNS partitioning”, for more information.
In addition (as with RANGE COLUMNS
partitioning), you can use multiple columns in the
COLUMNS() clause.
See CREATE TABLE Syntax, for additional information
about PARTITION BY LIST COLUMNS() syntax.
Partitioning by HASH is used primarily to
ensure an even distribution of data among a predetermined number
of partitions. With range or list partitioning, you must specify
explicitly into which partition a given column value or set of
column values is to be stored; with hash partitioning, MySQL
takes care of this for you, and you need only specify a column
value or expression based on a column value to be hashed and the
number of partitions into which the partitioned table is to be
divided.
To partition a table using HASH partitioning,
it is necessary to append to the CREATE
TABLE statement a PARTITION BY HASH
( clause, where
expr)expr is an expression that returns an
integer. This can simply be the name of a column whose type is
one of MySQL's integer types. In addition, you most likely
want to follow this with PARTITIONS
, where
numnum is a positive integer
representing the number of partitions into which the table is to
be divided.
For simplicity, the tables in the examples that follow do not use any keys. You should be aware that, if a table has any unique keys, every column used in the partitioning expression for this table must be part of every unique key, including the primary key. See Section 6.1, “Partitioning Keys, Primary Keys, and Unique Keys”, for more information.
The following statement creates a table that uses hashing on the
store_id column and is divided into 4
partitions:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id INT
)
PARTITION BY HASH(store_id)
PARTITIONS 4;
If you do not include a PARTITIONS clause,
the number of partitions defaults to 1.
Using the PARTITIONS keyword without a number
following it results in a syntax error.
You can also use an SQL expression that returns an integer for
expr. For instance, you might want to
partition based on the year in which an employee was hired. This
can be done as shown here:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id INT
)
PARTITION BY HASH( YEAR(hired) )
PARTITIONS 4;
expr must return a nonconstant,
nonrandom integer value (in other words, it should be varying
but deterministic), and must not contain any prohibited
constructs as described in
Chapter 6, Restrictions and Limitations on Partitioning. You should also keep
in mind that this expression is evaluated each time a row is
inserted or updated (or possibly deleted); this means that very
complex expressions may give rise to performance issues,
particularly when performing operations (such as batch inserts)
that affect a great many rows at one time.
The most efficient hashing function is one which operates upon a single table column and whose value increases or decreases consistently with the column value, as this allows for “pruning” on ranges of partitions. That is, the more closely that the expression varies with the value of the column on which it is based, the more efficiently MySQL can use the expression for hash partitioning.
For example, where date_col is a column of
type DATE, then the expression
TO_DAYS(date_col) is said to vary
directly with the value of date_col, because
for every change in the value of date_col,
the value of the expression changes in a consistent manner. The
variance of the expression
YEAR(date_col) with respect to
date_col is not quite as direct as that of
TO_DAYS(date_col), because not
every possible change in date_col produces an
equivalent change in
YEAR(date_col). Even so,
YEAR(date_col) is a good
candidate for a hashing function, because it varies directly
with a portion of date_col and there is no
possible change in date_col that produces a
disproportionate change in
YEAR(date_col).
By way of contrast, suppose that you have a column named
int_col whose type is
INT. Now consider the expression
POW(5-int_col,3) + 6. This would
be a poor choice for a hashing function because a change in the
value of int_col is not guaranteed to produce
a proportional change in the value of the expression. Changing
the value of int_col by a given amount can
produce by widely different changes in the value of the
expression. For example, changing int_col
from 5 to 6 produces a
change of -1 in the value of the expression,
but changing the value of int_col from
6 to 7 produces a change
of -7 in the expression value.
In other words, the more closely the graph of the column value
versus the value of the expression follows a straight line as
traced by the equation
y= where
cxc is some nonzero constant, the
better the expression is suited to hashing. This has to do with
the fact that the more nonlinear an expression is, the more
uneven the distribution of data among the partitions it tends to
produce.
In theory, pruning is also possible for expressions involving more than one column value, but determining which of such expressions are suitable can be quite difficult and time-consuming. For this reason, the use of hashing expressions involving multiple columns is not particularly recommended.
When PARTITION BY HASH is used, MySQL
determines which partition of num
partitions to use based on the modulus of the result of the user
function. In other words, for an expression
expr, the partition in which the
record is stored is partition number
N, where
. Suppose that table
N =
MOD(expr,
num)t1 is defined as follows, so that it has 4
partitions:
CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
PARTITION BY HASH( YEAR(col3) )
PARTITIONS 4;
If you insert a record into t1 whose
col3 value is
'2005-09-15', then the partition in which it
is stored is determined as follows:
MOD(YEAR('2005-09-01'),4)
= MOD(2005,4)
= 1
MySQL 5.5 also supports a variant of
HASH partitioning known as
linear hashing which
employs a more complex algorithm for determining the placement
of new rows inserted into the partitioned table. See
Section 3.4.1, “LINEAR HASH Partitioning”, for a description of
this algorithm.
The user function is evaluated each time a record is inserted or updated. It may also—depending on the circumstances—be evaluated when records are deleted.
If a table to be partitioned has a UNIQUE
key, then any columns supplied as arguments to the
HASH user function or to the
KEY's
column_list must be part of that
key.
MySQL also supports linear hashing, which differs from regular hashing in that linear hashing utilizes a linear powers-of-two algorithm whereas regular hashing employs the modulus of the hashing function's value.
Syntactically, the only difference between linear-hash
partitioning and regular hashing is the addition of the
LINEAR keyword in the PARTITION
BY clause, as shown here:
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id INT
)
PARTITION BY LINEAR HASH( YEAR(hired) )
PARTITIONS 4;
Given an expression expr, the
partition in which the record is stored when linear hashing is
used is partition number N from
among num partitions, where
N is derived according to the
following algorithm:
Find the next power of 2 greater than
num. We call this value
V; it can be calculated as:
V= POWER(2, CEILING(LOG(2,num)))
(Suppose that num is 13. Then
LOG(2,13) is
3.7004397181411.
CEILING(3.7004397181411) is
4, and V =
POWER(2,4), which is 16.)
Set N =
F(column_list)
& (V - 1).
While N >=
num:
Set V =
CEIL(V / 2)
Set N =
N &
(V - 1)
Suppose that the table t1, using linear
hash partitioning and having 6 partitions, is created using
this statement:
CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
PARTITION BY LINEAR HASH( YEAR(col3) )
PARTITIONS 6;
Now assume that you want to insert two records into
t1 having the col3
column values '2003-04-14' and
'1998-10-19'. The partition number for the
first of these is determined as follows:
V= POWER(2, CEILING( LOG(2,6) )) = 8N= YEAR('2003-04-14') & (8 - 1) = 2003 & 7 = 3 (3 >= 6 is FALSE: record stored in partition #3)
The number of the partition where the second record is stored is calculated as shown here:
V= 8N= YEAR('1998-10-19') & (8-1) = 1998 & 7 = 6 (6 >= 6 is TRUE: additional step required)N= 6 & CEILING(8 / 2) = 6 & 3 = 2 (2 >= 6 is FALSE: record stored in partition #2)
The advantage in partitioning by linear hash is that the adding, dropping, merging, and splitting of partitions is made much faster, which can be beneficial when dealing with tables containing extremely large amounts (terabytes) of data. The disadvantage is that data is less likely to be evenly distributed between partitions as compared with the distribution obtained using regular hash partitioning.
Partitioning by key is similar to partitioning by hash, except
that where hash partitioning employs a user-defined expression,
the hashing function for key partitioning is supplied by the
MySQL server. MySQL Cluster uses
MD5() for this purpose; for
tables using other storage engines, the server employs its own
internal hashing function which is based on the same algorithm
as PASSWORD().
The syntax rules for CREATE TABLE ... PARTITION BY
KEY are similar to those for creating a table that is
partitioned by hash. The major differences are listed here:
KEY is used rather than
HASH.
KEY takes only a list of zero or more
column names. Any columns used as the partitioning key must
comprise part or all of the table's primary key, if the
table has one. Where no column name is specified as the
partitioning key, the table's primary key is used, if
there is one. For example, the following
CREATE TABLE statement is
valid in MySQL 5.5:
CREATE TABLE k1 (
id INT NOT NULL PRIMARY KEY,
name VARCHAR(20)
)
PARTITION BY KEY()
PARTITIONS 2;
If there is no primary key but there is a unique key, then the unique key is used for the partitioning key:
CREATE TABLE k1 (
id INT NOT NULL,
name VARCHAR(20),
UNIQUE KEY (id)
)
PARTITION BY KEY()
PARTITIONS 2;
However, if the unique key column were not defined as
NOT NULL, then the previous statement
would fail.
In both of these cases, the partitioning key is the
id column, even though it is not shown in
the output of SHOW CREATE
TABLE or in the
PARTITION_EXPRESSION column of the
INFORMATION_SCHEMA.PARTITIONS
table.
Unlike the case with other partitioning types, columns used
for partitioning by KEY are not
restricted to integer or NULL values. For
example, the following CREATE
TABLE statement is valid:
CREATE TABLE tm1 (
s1 CHAR(32) PRIMARY KEY
)
PARTITION BY KEY(s1)
PARTITIONS 10;
The preceding statement would not be
valid, were a different partitioning type to be specified.
(In this case, simply using PARTITION BY
KEY() would also be valid and have the same effect
as PARTITION BY KEY(s1), since
s1 is the table's primary key.)
For additional information about this issue, see Chapter 6, Restrictions and Limitations on Partitioning.
Tables using the NDBCLUSTER
storage engine are implicitly partitioned by
KEY, again using the table's
primary key as the partitioning key. In the event that the
MySQL Cluster table has no explicit primary key, the
“hidden” primary key generated by the
NDBCLUSTER storage engine for
each MySQL Cluster table is used as the partitioning key.
If you define an explicit partitioning scheme for an
NDBCLUSTER table, the table
must have an explicit primary key, and any columns used in
the partitioning expression must be part of this key.
However, if the table uses an “empty”
partitioning expression—that is, PARTITION
BY KEY() with no column references—then no
explicit primary key is required.
You can observe this partitioning using the
ndb_desc utility (with the
-p option).
For a key-partitioned table, you cannot execute an
ALTER TABLE DROP PRIMARY KEY, as doing
so generates the error ERROR 1466 (HY000):
Field in list of fields for partition function not found
in table. This is not an issue for MySQL
Cluster tables which are partitioned by
KEY; in such cases, the table is
reorganized using the “hidden” primary key as
the table's new partitioning key. See
MySQL Cluster NDB 7.2.
It is also possible to partition a table by linear key. Here is a simple example:
CREATE TABLE tk (
col1 INT NOT NULL,
col2 CHAR(5),
col3 DATE
)
PARTITION BY LINEAR KEY (col1)
PARTITIONS 3;
Using LINEAR has the same effect on
KEY partitioning as it does on
HASH partitioning, with the partition number
being derived using a powers-of-two algorithm rather than modulo
arithmetic. See Section 3.4.1, “LINEAR HASH Partitioning”, for
a description of this algorithm and its implications.
Subpartitioning—also known as
composite
partitioning—is the further division of each
partition in a partitioned table. Consider the following
CREATE TABLE statement:
CREATE TABLE ts (id INT, purchased DATE)
PARTITION BY RANGE( YEAR(purchased) )
SUBPARTITION BY HASH( TO_DAYS(purchased) )
SUBPARTITIONS 2 (
PARTITION p0 VALUES LESS THAN (1990),
PARTITION p1 VALUES LESS THAN (2000),
PARTITION p2 VALUES LESS THAN MAXVALUE
);
Table ts has 3 RANGE
partitions. Each of these
partitions—p0, p1,
and p2—is further divided into 2
subpartitions. In effect, the entire table is divided into
3 * 2 = 6 partitions. However, due to the
action of the PARTITION BY RANGE clause, the
first 2 of these store only those records with a value less than
1990 in the purchased column.
In MySQL 5.5, it is possible to subpartition tables
that are partitioned by RANGE or
LIST. Subpartitions may use either
HASH or KEY partitioning.
This is also known as composite
partitioning.
SUBPARTITION BY HASH and
SUBPARTITION BY KEY generally follow the
same syntax rules as PARTITION BY HASH and
PARTITION BY KEY, respectively. An
exception to this is that SUBPARTITION BY
KEY (unlike PARTITION BY KEY)
does not currently support a default column, so the column
used for this purpose must be specified, even if the table has
an explicit primary key. This is a known issue which we are
working to address; see
Issues with subpartitions, for
more information and an example.
It is also possible to define subpartitions explicitly using
SUBPARTITION clauses to specify options for
individual subpartitions. For example, a more verbose fashion of
creating the same table ts as shown in the
previous example would be:
CREATE TABLE ts (id INT, purchased DATE)
PARTITION BY RANGE( YEAR(purchased) )
SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
PARTITION p0 VALUES LESS THAN (1990) (
SUBPARTITION s0,
SUBPARTITION s1
),
PARTITION p1 VALUES LESS THAN (2000) (
SUBPARTITION s2,
SUBPARTITION s3
),
PARTITION p2 VALUES LESS THAN MAXVALUE (
SUBPARTITION s4,
SUBPARTITION s5
)
);
Some syntactical items of note are listed here:
Each partition must have the same number of subpartitions.
If you explicitly define any subpartitions using
SUBPARTITION on any partition of a
partitioned table, you must define them all. In other words,
the following statement will fail:
CREATE TABLE ts (id INT, purchased DATE)
PARTITION BY RANGE( YEAR(purchased) )
SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
PARTITION p0 VALUES LESS THAN (1990) (
SUBPARTITION s0,
SUBPARTITION s1
),
PARTITION p1 VALUES LESS THAN (2000),
PARTITION p2 VALUES LESS THAN MAXVALUE (
SUBPARTITION s2,
SUBPARTITION s3
)
);
This statement would still fail even if it included a
SUBPARTITIONS 2 clause.
Each SUBPARTITION clause must include (at
a minimum) a name for the subpartition. Otherwise, you may
set any desired option for the subpartition or allow it to
assume its default setting for that option.
Subpartition names must be unique across the entire table.
For example, the following CREATE
TABLE statement is valid in MySQL
5.5:
CREATE TABLE ts (id INT, purchased DATE)
PARTITION BY RANGE( YEAR(purchased) )
SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
PARTITION p0 VALUES LESS THAN (1990) (
SUBPARTITION s0,
SUBPARTITION s1
),
PARTITION p1 VALUES LESS THAN (2000) (
SUBPARTITION s2,
SUBPARTITION s3
),
PARTITION p2 VALUES LESS THAN MAXVALUE (
SUBPARTITION s4,
SUBPARTITION s5
)
);
Subpartitions can be used with especially large tables to
distribute data and indexes across many disks. Suppose that you
have 6 disks mounted as /disk0,
/disk1, /disk2, and so
on. Now consider the following example:
CREATE TABLE ts (id INT, purchased DATE)
PARTITION BY RANGE( YEAR(purchased) )
SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
PARTITION p0 VALUES LESS THAN (1990) (
SUBPARTITION s0
DATA DIRECTORY = '/disk0/data'
INDEX DIRECTORY = '/disk0/idx',
SUBPARTITION s1
DATA DIRECTORY = '/disk1/data'
INDEX DIRECTORY = '/disk1/idx'
),
PARTITION p1 VALUES LESS THAN (2000) (
SUBPARTITION s2
DATA DIRECTORY = '/disk2/data'
INDEX DIRECTORY = '/disk2/idx',
SUBPARTITION s3
DATA DIRECTORY = '/disk3/data'
INDEX DIRECTORY = '/disk3/idx'
),
PARTITION p2 VALUES LESS THAN MAXVALUE (
SUBPARTITION s4
DATA DIRECTORY = '/disk4/data'
INDEX DIRECTORY = '/disk4/idx',
SUBPARTITION s5
DATA DIRECTORY = '/disk5/data'
INDEX DIRECTORY = '/disk5/idx'
)
);
In this case, a separate disk is used for the data and for the
indexes of each RANGE. Many other variations
are possible; another example might be:
CREATE TABLE ts (id INT, purchased DATE)
PARTITION BY RANGE(YEAR(purchased))
SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
PARTITION p0 VALUES LESS THAN (1990) (
SUBPARTITION s0a
DATA DIRECTORY = '/disk0'
INDEX DIRECTORY = '/disk1',
SUBPARTITION s0b
DATA DIRECTORY = '/disk2'
INDEX DIRECTORY = '/disk3'
),
PARTITION p1 VALUES LESS THAN (2000) (
SUBPARTITION s1a
DATA DIRECTORY = '/disk4/data'
INDEX DIRECTORY = '/disk4/idx',
SUBPARTITION s1b
DATA DIRECTORY = '/disk5/data'
INDEX DIRECTORY = '/disk5/idx'
),
PARTITION p2 VALUES LESS THAN MAXVALUE (
SUBPARTITION s2a,
SUBPARTITION s2b
)
);
Here, the storage is as follows:
Rows with purchased dates from before
1990 take up a vast amount of space, so are split up 4 ways,
with a separate disk dedicated to the data and to the
indexes for each of the two subpartitions
(s0a and s0b) making
up partition p0. In other words:
The data for subpartition s0a is
stored on /disk0.
The indexes for subpartition s0a are
stored on /disk1.
The data for subpartition s0b is
stored on /disk2.
The indexes for subpartition s0b are
stored on /disk3.
Rows containing dates ranging from 1990 to 1999 (partition
p1) do not require as much room as those
from before 1990. These are split between 2 disks
(/disk4 and
/disk5) rather than 4 disks as with the
legacy records stored in p0:
Data and indexes belonging to p1's
first subpartition (s1a) are stored
on /disk4—the data in
/disk4/data, and the indexes in
/disk4/idx.
Data and indexes belonging to p1's
second subpartition (s1b) are stored
on /disk5—the data in
/disk5/data, and the indexes in
/disk5/idx.
Rows reflecting dates from the year 2000 to the present
(partition p2) do not take up as much
space as required by either of the two previous ranges.
Currently, it is sufficient to store all of these in the
default location.
In future, when the number of purchases for the decade
beginning with the year 2000 grows to a point where the
default location no longer provides sufficient space, the
corresponding rows can be moved using an ALTER
TABLE ... REORGANIZE PARTITION statement. See
Chapter 4, Partition Management, for an
explanation of how this can be done.
The DATA DIRECTORY and INDEX
DIRECTORY options are not permitted in partition
definitions when the
NO_DIR_IN_CREATE server SQL
mode is in effect. Beginning with MySQL 5.5.5, these options are
also not permitted when defining subpartitions (Bug #42954).
Partitioning in MySQL does nothing to disallow
NULL as the value of a partitioning
expression, whether it is a column value or the value of a
user-supplied expression. Even though it is permitted to use
NULL as the value of an expression that must
otherwise yield an integer, it is important to keep in mind that
NULL is not a number. MySQL's
partitioning implementation treats NULL as
being less than any non-NULL value, just as
ORDER BY does.
This means that treatment of NULL varies
between partitioning of different types, and may produce
behavior which you do not expect if you are not prepared for it.
This being the case, we discuss in this section how each MySQL
partitioning type handles NULL values when
determining the partition in which a row should be stored, and
provide examples for each.
Handling of NULL with RANGE partitioning.
If you insert a row into a table partitioned by
RANGE such that the column value used to
determine the partition is NULL, the row is
inserted into the lowest partition. Consider these two tables
in a database named p, created as follows:
mysql>CREATE TABLE t1 (->c1 INT,->c2 VARCHAR(20)->)->PARTITION BY RANGE(c1) (->PARTITION p0 VALUES LESS THAN (0),->PARTITION p1 VALUES LESS THAN (10),->PARTITION p2 VALUES LESS THAN MAXVALUE->);Query OK, 0 rows affected (0.09 sec) mysql>CREATE TABLE t2 (->c1 INT,->c2 VARCHAR(20)->)->PARTITION BY RANGE(c1) (->PARTITION p0 VALUES LESS THAN (-5),->PARTITION p1 VALUES LESS THAN (0),->PARTITION p2 VALUES LESS THAN (10),->PARTITION p3 VALUES LESS THAN MAXVALUE->);Query OK, 0 rows affected (0.09 sec)
You can see the partitions created by these two
CREATE TABLE statements using the
following query against the
PARTITIONS table in the
INFORMATION_SCHEMA database:
mysql>SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH>FROM INFORMATION_SCHEMA.PARTITIONS>WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME LIKE 't_';+------------+----------------+------------+----------------+-------------+ | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | +------------+----------------+------------+----------------+-------------+ | t1 | p0 | 0 | 0 | 0 | | t1 | p1 | 0 | 0 | 0 | | t1 | p2 | 0 | 0 | 0 | | t2 | p0 | 0 | 0 | 0 | | t2 | p1 | 0 | 0 | 0 | | t2 | p2 | 0 | 0 | 0 | | t2 | p3 | 0 | 0 | 0 | +------------+----------------+------------+----------------+-------------+ 7 rows in set (0.00 sec)
(For more information about this table, see
The INFORMATION_SCHEMA PARTITIONS Table.) Now let us populate each of
these tables with a single row containing a
NULL in the column used as the partitioning
key, and verify that the rows were inserted using a pair of
SELECT statements:
mysql>INSERT INTO t1 VALUES (NULL, 'mothra');Query OK, 1 row affected (0.00 sec) mysql>INSERT INTO t2 VALUES (NULL, 'mothra');Query OK, 1 row affected (0.00 sec) mysql>SELECT * FROM t1;+------+--------+ | id | name | +------+--------+ | NULL | mothra | +------+--------+ 1 row in set (0.00 sec) mysql>SELECT * FROM t2;+------+--------+ | id | name | +------+--------+ | NULL | mothra | +------+--------+ 1 row in set (0.00 sec)
You can see which partitions are used to store the inserted rows
by rerunning the previous query against
INFORMATION_SCHEMA.PARTITIONS and
inspecting the output:
mysql>SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH>FROM INFORMATION_SCHEMA.PARTITIONS>WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME LIKE 't_';+------------+----------------+------------+----------------+-------------+ | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | +------------+----------------+------------+----------------+-------------+ | t1 | p0 | 1 | 20 | 20 | | t1 | p1 | 0 | 0 | 0 | | t1 | p2 | 0 | 0 | 0 | | t2 | p0 | 1 | 20 | 20 | | t2 | p1 | 0 | 0 | 0 | | t2 | p2 | 0 | 0 | 0 | | t2 | p3 | 0 | 0 | 0 | +------------+----------------+------------+----------------+-------------+ 7 rows in set (0.01 sec)
You can also demonstrate that these rows were stored in the
lowest partition of each table by dropping these partitions, and
then re-running the SELECT
statements:
mysql>ALTER TABLE t1 DROP PARTITION p0;Query OK, 0 rows affected (0.16 sec) mysql>ALTER TABLE t2 DROP PARTITION p0;Query OK, 0 rows affected (0.16 sec) mysql>SELECT * FROM t1;Empty set (0.00 sec) mysql>SELECT * FROM t2;Empty set (0.00 sec)
(For more information on ALTER TABLE ... DROP
PARTITION, see ALTER TABLE Syntax.)
NULL is also treated in this way for
partitioning expressions that use SQL functions. Suppose that we
define a table using a CREATE
TABLE statement such as this one:
CREATE TABLE tndate (
id INT,
dt DATE
)
PARTITION BY RANGE( YEAR(dt) ) (
PARTITION p0 VALUES LESS THAN (1990),
PARTITION p1 VALUES LESS THAN (2000),
PARTITION p2 VALUES LESS THAN MAXVALUE
);
As with other MySQL functions,
YEAR(NULL) returns
NULL. A row with a dt
column value of NULL is treated as though the
partitioning expression evaluated to a value less than any other
value, and so is inserted into partition p0.
Handling of NULL with LIST partitioning.
A table that is partitioned by LIST admits
NULL values if and only if one of its
partitions is defined using that value-list that contains
NULL. The converse of this is that a table
partitioned by LIST which does not
explicitly use NULL in a value list rejects
rows resulting in a NULL value for the
partitioning expression, as shown in this example:
mysql>CREATE TABLE ts1 (->c1 INT,->c2 VARCHAR(20)->)->PARTITION BY LIST(c1) (->PARTITION p0 VALUES IN (0, 3, 6),->PARTITION p1 VALUES IN (1, 4, 7),->PARTITION p2 VALUES IN (2, 5, 8)->);Query OK, 0 rows affected (0.01 sec) mysql>INSERT INTO ts1 VALUES (9, 'mothra');ERROR 1504 (HY000): Table has no partition for value 9 mysql>INSERT INTO ts1 VALUES (NULL, 'mothra');ERROR 1504 (HY000): Table has no partition for value NULL
Only rows having a c1 value between
0 and 8 inclusive can be
inserted into ts1. NULL
falls outside this range, just like the number
9. We can create tables
ts2 and ts3 having value
lists containing NULL, as shown here:
mysql> CREATE TABLE ts2 (
-> c1 INT,
-> c2 VARCHAR(20)
-> )
-> PARTITION BY LIST(c1) (
-> PARTITION p0 VALUES IN (0, 3, 6),
-> PARTITION p1 VALUES IN (1, 4, 7),
-> PARTITION p2 VALUES IN (2, 5, 8),
-> PARTITION p3 VALUES IN (NULL)
-> );
Query OK, 0 rows affected (0.01 sec)
mysql> CREATE TABLE ts3 (
-> c1 INT,
-> c2 VARCHAR(20)
-> )
-> PARTITION BY LIST(c1) (
-> PARTITION p0 VALUES IN (0, 3, 6),
-> PARTITION p1 VALUES IN (1, 4, 7, NULL),
-> PARTITION p2 VALUES IN (2, 5, 8)
-> );
Query OK, 0 rows affected (0.01 sec)
When defining value lists for partitioning, you can (and should)
treat NULL just as you would any other value.
For example, both VALUES IN (NULL) and
VALUES IN (1, 4, 7, NULL) are valid, as are
VALUES IN (1, NULL, 4, 7), VALUES IN
(NULL, 1, 4, 7), and so on. You can insert a row
having NULL for column c1
into each of the tables ts2 and
ts3:
mysql>INSERT INTO ts2 VALUES (NULL, 'mothra');Query OK, 1 row affected (0.00 sec) mysql>INSERT INTO ts3 VALUES (NULL, 'mothra');Query OK, 1 row affected (0.00 sec)
By issuing the appropriate query against
INFORMATION_SCHEMA.PARTITIONS, you
can determine which partitions were used to store the rows just
inserted (we assume, as in the previous examples, that the
partitioned tables were created in the p
database):
mysql>SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH>FROM INFORMATION_SCHEMA.PARTITIONS>WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME LIKE 'ts_';+------------+----------------+------------+----------------+-------------+ | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | +------------+----------------+------------+----------------+-------------+ | ts2 | p0 | 0 | 0 | 0 | | ts2 | p1 | 0 | 0 | 0 | | ts2 | p2 | 0 | 0 | 0 | | ts2 | p3 | 1 | 20 | 20 | | ts3 | p0 | 0 | 0 | 0 | | ts3 | p1 | 1 | 20 | 20 | | ts3 | p2 | 0 | 0 | 0 | +------------+----------------+------------+----------------+-------------+ 7 rows in set (0.01 sec)
As shown earlier in this section, you can also verify which
partitions were used for storing the rows by deleting these
partitions and then performing a
SELECT.
Handling of NULL with HASH and KEY partitioning.
NULL is handled somewhat differently for
tables partitioned by HASH or
KEY. In these cases, any partition
expression that yields a NULL value is
treated as though its return value were zero. We can verify
this behavior by examining the effects on the file system of
creating a table partitioned by HASH and
populating it with a record containing appropriate values.
Suppose that you have a table th (also in
the p database) created using the following
statement:
mysql>CREATE TABLE th (->c1 INT,->c2 VARCHAR(20)->)->PARTITION BY HASH(c1)->PARTITIONS 2;Query OK, 0 rows affected (0.00 sec)
The partitions belonging to this table can be viewed using the query shown here:
mysql> SELECT TABLE_NAME,PARTITION_NAME,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH
> FROM INFORMATION_SCHEMA.PARTITIONS
> WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME ='th';
+------------+----------------+------------+----------------+-------------+
| TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
+------------+----------------+------------+----------------+-------------+
| th | p0 | 0 | 0 | 0 |
| th | p1 | 0 | 0 | 0 |
+------------+----------------+------------+----------------+-------------+
2 rows in set (0.00 sec)
Note that TABLE_ROWS for each partition is 0.
Now insert two rows into th whose
c1 column values are NULL
and 0, and verify that these rows were inserted, as shown here:
mysql>INSERT INTO th VALUES (NULL, 'mothra'), (0, 'gigan');Query OK, 1 row affected (0.00 sec) mysql>SELECT * FROM th;+------+---------+ | c1 | c2 | +------+---------+ | NULL | mothra | +------+---------+ | 0 | gigan | +------+---------+ 2 rows in set (0.01 sec)
Recall that for any integer N, the
value of NULL MOD
is always
NNULL. For tables that are partitioned by
HASH or KEY, this result
is treated for determining the correct partition as
0. Checking the
INFORMATION_SCHEMA.PARTITIONS table
once again, we can see that both rows were inserted into
partition p0:
mysql>SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH>FROM INFORMATION_SCHEMA.PARTITIONS>WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME ='th';+------------+----------------+------------+----------------+-------------+ | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | +------------+----------------+------------+----------------+-------------+ | th | p0 | 2 | 20 | 20 | | th | p1 | 0 | 0 | 0 | +------------+----------------+------------+----------------+-------------+ 2 rows in set (0.00 sec)
If you repeat this example using PARTITION BY
KEY in place of PARTITION BY HASH
in the definition of the table, you can verify easily that
NULL is also treated like 0 for this type of
partitioning.