Each ALTER TABLE operation for an
InnoDB table is governed by several aspects:
Whether there is any change to the physical representation of the table, or whether it purely a change to metadata that can be done without touching the table itself.
Whether the volume of data in the table stays the same, increases, or decreases.
Whether a change in table data involves the clustered index, secondary indexes, or both.
Whether there are any foreign
key relationships between the table being altered and
some other table. The mechanics differ depending on whether
the foreign_key_checks
configuration option is enabled or disabled.
Whether the table is partitioned. Partitioning clauses of
ALTER TABLE are turned into
low-level operations involving one or more tables, and those
operations follow the regular rules for online DDL.
Whether the table data must be copied, whether the table can be reorganized “in-place”, or a combination of both.
Whether the table contains any auto-increment columns.
What degree of locking is
required, either by the nature of the underlying database
operations, or a LOCK clause that you
specify in the ALTER TABLE
statement.
This section explains how these factors affect the different kinds
of ALTER TABLE operations on
InnoDB tables.
Here are the primary reasons why an online DDL operation could fail:
If a LOCK clause specifies a low degree of
locking (SHARED or NONE)
that is not compatible with the particular type of DDL
operation.
If a timeout occurs while waiting to get an exclusive lock on the table, which may be needed briefly during the initial and final phases of the DDL operation.
If the tmpdir file system
runs out of disk space, while MySQL writes temporary sort
files on disk during index creation.
If the ALTER TABLE takes so
long, and concurrent DML modifies the table so much, that the
size of the temporary online log exceeds the value of the
innodb_online_alter_log_max_size
configuration option. This condition causes a
DB_ONLINE_LOG_TOO_BIG error.
If concurrent DML makes changes to the table that are allowed
with the original table definition, but not with the new one.
The operation only fails at the very end, when MySQL tries to
apply all the changes from concurrent DML statements. For
example, you might insert duplicate values into a column while
a unique index is being created, or you might insert
NULL values into a column while creating a
primary key index on
that column. The changes made by the concurrent DML take
precedence, and the ALTER TABLE
operation is effectively rolled
back.
Although the configuration option
innodb_file_per_table has a
dramatic effect on the representation for an
InnoDB table, all online DDL operations work
equally well whether that option is enabled or disabled, and
whether the table is physically located in its own
.ibd file or inside the
system tablespace.
InnoDB has two types of indexes: the clustered index representing all the data in the table, and optional secondary indexes to speed up queries. Since the clustered index contains the data values in its B-tree nodes, adding or dropping a clustered index does involve copying the data, and creating a new copy of the table. A secondary index, however, contains only the index key and the value of the primary key. This type of index can be created or dropped without copying the data in the clustered index. Because each secondary index contains copies of the primary key values (used to access the clustered index when needed), when you change the definition of the primary key, all secondary indexes are recreated as well.
Dropping a secondary index is simple. Only the internal InnoDB system tables and the MySQL data dictionary tables are updated to reflect the fact that the index no longer exists. InnoDB returns the storage used for the index to the tablespace that contained it, so that new indexes or additional table rows can use the space.
To add a secondary index to an existing table, InnoDB scans the table, and sorts the rows using memory buffers and temporary files in order by the values of the secondary index key columns. The B-tree is then built in key-value order, which is more efficient than inserting rows into an index in random order. Because the B-tree nodes are split when they fill, building the index in this way results in a higher fill-factor for the index, making it more efficient for subsequent access.
Historically, the MySQL server and InnoDB have
each kept their own metadata about table and index structures. The
MySQL server stores this information in
.frm files that are not
protected by a transactional mechanism, while
InnoDB has its own
data dictionary as
part of the system
tablespace. If a DDL operation was interrupted by a crash
or other unexpected event partway through, the metadata could be
left inconsistent between these two locations, causing problems
such as startup errors or inability to access the table that was
being altered. Now that InnoDB is the default
storage engine, addressing such issues is a high priority. These
enhancements to DDL operations reduce the window of opportunity
for such issues to occur.