Table of Contents
NDB Cluster is a technology that enables clustering of in-memory databases in a shared-nothing system. The shared-nothing architecture enables the system to work with very inexpensive hardware, and with a minimum of specific requirements for hardware or software.
NDB Cluster is designed not to have any single point of failure. In a shared-nothing system, each component is expected to have its own memory and disk, and the use of shared storage mechanisms such as network shares, network file systems, and SANs is not recommended or supported.
NDB Cluster integrates the standard MySQL server with an in-memory
clustered storage engine called NDB
(which stands for “Network
DataBase”). In our
documentation, the term NDB refers to
the part of the setup that is specific to the storage engine,
whereas “NDB Cluster” refers to the combination of one
or more MySQL servers with the NDB
storage engine.
An NDB Cluster consists of a set of computers, known as hosts, each running one or more processes. These processes, known as nodes, may include MySQL servers (for access to NDB data), data nodes (for storage of the data), one or more management servers, and possibly other specialized data access programs. The relationship of these components in an NDB Cluster is shown here:
All these programs work together to form an NDB Cluster (see
Chapter 6, NDB Cluster Programs). When data is stored by the
NDB storage engine, the tables (and
table data) are stored in the data nodes. Such tables are directly
accessible from all other MySQL servers (SQL nodes) in the cluster.
Thus, in a payroll application storing data in a cluster, if one
application updates the salary of an employee, all other MySQL
servers that query this data can see this change immediately.
Although an NDB Cluster SQL node uses the mysqld server daemon, it differs in a number of critical respects from the mysqld binary supplied with the MySQL 5.5 distributions, and the two versions of mysqld are not interchangeable.
In addition, a MySQL server that is not connected to an NDB Cluster
cannot use the NDB storage engine and
cannot access any NDB Cluster data.
The data stored in the data nodes for NDB Cluster can be mirrored; the cluster can handle failures of individual data nodes with no other impact than that a small number of transactions are aborted due to losing the transaction state. Because transactional applications are expected to handle transaction failure, this should not be a source of problems.
Individual nodes can be stopped and restarted, and can then rejoin the system (cluster). Rolling restarts (in which all nodes are restarted in turn) are used in making configuration changes and software upgrades (see Section 7.5, “Performing a Rolling Restart of an NDB Cluster”). Rolling restarts are also used as part of the process of adding new data nodes online (see Section 7.13, “Adding NDB Cluster Data Nodes Online”). For more information about data nodes, how they are organized in an NDB Cluster, and how they handle and store NDB Cluster data, see Section 3.2, “NDB Cluster Nodes, Node Groups, Replicas, and Partitions”.
Backing up and restoring NDB Cluster databases can be done using the
NDB-native functionality found in the NDB Cluster
management client and the ndb_restore program
included in the NDB Cluster distribution. For more information, see
Section 7.3, “Online Backup of NDB Cluster”, and
Section 6.20, “ndb_restore — Restore an NDB Cluster Backup”. You can also
use the standard MySQL functionality provided for this purpose in
mysqldump and the MySQL server. See
mysqldump — A Database Backup Program, for more information.
NDB Cluster nodes can employ different transport mechanisms for inter-node communications; TCP/IP over standard 100 Mbps or faster Ethernet hardware is used in most real-world deployments.
NDBCLUSTER
(also known as NDB) is an in-memory
storage engine offering high-availability and data-persistence
features.
The NDBCLUSTER storage engine can be
configured with a range of failover and load-balancing options,
but it is easiest to start with the storage engine at the cluster
level. NDB Cluster's NDB storage
engine contains a complete set of data, dependent only on other
data within the cluster itself.
The “Cluster” portion of NDB Cluster is configured independently of the MySQL servers. In an NDB Cluster, each part of the cluster is considered to be a node.
In many contexts, the term “node” is used to indicate a computer, but when discussing NDB Cluster it means a process. It is possible to run multiple nodes on a single computer; for a computer on which one or more cluster nodes are being run we use the term cluster host.
There are three types of cluster nodes, and in a minimal NDB Cluster configuration, there will be at least three nodes, one of each of these types:
Management node: The role of this type of node is to manage the other nodes within the NDB Cluster, performing such functions as providing configuration data, starting and stopping nodes, and running backups. Because this node type manages the configuration of the other nodes, a node of this type should be started first, before any other node. An MGM node is started with the command ndb_mgmd.
Data node: This type of node stores cluster data. There are as many data nodes as there are replicas, times the number of fragments (see Section 3.2, “NDB Cluster Nodes, Node Groups, Replicas, and Partitions”). For example, with two replicas, each having two fragments, you need four data nodes. One replica is sufficient for data storage, but provides no redundancy; therefore, it is recommended to have 2 (or more) replicas to provide redundancy, and thus high availability. A data node is started with the command ndbd (see Section 6.1, “ndbd — The NDB Cluster Data Node Daemon”) or ndbmtd (see Section 6.3, “ndbmtd — The NDB Cluster Data Node Daemon (Multi-Threaded)”).
NDB Cluster tables are normally stored completely in memory rather than on disk (this is why we refer to NDB Cluster as an in-memory database). However, some NDB Cluster data can be stored on disk; see Section 7.12, “NDB Cluster Disk Data Tables”, for more information.
SQL node: This is a node
that accesses the cluster data. In the case of NDB Cluster, an
SQL node is a traditional MySQL server that uses the
NDBCLUSTER storage engine. An SQL
node is a mysqld process started with the
--ndbcluster and
--ndb-connectstring options, which are
explained elsewhere in this chapter, possibly with additional
MySQL server options as well.
An SQL node is actually just a specialized type of API node, which designates any application which accesses NDB Cluster data. Another example of an API node is the ndb_restore utility that is used to restore a cluster backup. It is possible to write such applications using the NDB API. For basic information about the NDB API, see Getting Started with the NDB API.
It is not realistic to expect to employ a three-node setup in a production environment. Such a configuration provides no redundancy; to benefit from NDB Cluster's high-availability features, you must use multiple data and SQL nodes. The use of multiple management nodes is also highly recommended.
For a brief introduction to the relationships between nodes, node groups, replicas, and partitions in NDB Cluster, see Section 3.2, “NDB Cluster Nodes, Node Groups, Replicas, and Partitions”.
Configuration of a cluster involves configuring each individual node in the cluster and setting up individual communication links between nodes. NDB Cluster is currently designed with the intention that data nodes are homogeneous in terms of processor power, memory space, and bandwidth. In addition, to provide a single point of configuration, all configuration data for the cluster as a whole is located in one configuration file.
The management server manages the cluster configuration file and the cluster log. Each node in the cluster retrieves the configuration data from the management server, and so requires a way to determine where the management server resides. When interesting events occur in the data nodes, the nodes transfer information about these events to the management server, which then writes the information to the cluster log.
In addition, there can be any number of cluster client processes
or applications. These include standard MySQL clients,
NDB-specific API programs, and management
clients. These are described in the next few paragraphs.
Standard MySQL clients. NDB Cluster can be used with existing MySQL applications written in PHP, Perl, C, C++, Java, Python, Ruby, and so on. Such client applications send SQL statements to and receive responses from MySQL servers acting as NDB Cluster SQL nodes in much the same way that they interact with standalone MySQL servers.
MySQL clients using an NDB Cluster as a data source can be
modified to take advantage of the ability to connect with multiple
MySQL servers to achieve load balancing and failover. For example,
Java clients using Connector/J 5.0.6 and later can use
jdbc:mysql:loadbalance:// URLs (improved in
Connector/J 5.1.7) to achieve load balancing transparently; for
more information about using Connector/J with NDB Cluster, see
Using Connector/J with NDB Cluster.
NDB client programs.
Client programs can be written that access NDB Cluster data
directly from the NDBCLUSTER storage engine,
bypassing any MySQL Servers that may be connected to the
cluster, using the NDB
API, a high-level C++ API. Such applications may be
useful for specialized purposes where an SQL interface to the
data is not needed. For more information, see
The NDB API.
NDB-specific Java applications can also be
written for NDB Cluster using the NDB
Cluster Connector for Java. This NDB Cluster Connector
includes ClusterJ, a
high-level database API similar to object-relational mapping
persistence frameworks such as Hibernate and JPA that connect
directly to NDBCLUSTER, and so does not require
access to a MySQL Server. Support is also provided in NDB Cluster
7.1 and later for
ClusterJPA, an OpenJPA
implementation for NDB Cluster that leverages the strengths of
ClusterJ and JDBC; ID lookups and other fast operations are
performed using ClusterJ (bypassing the MySQL Server), while more
complex queries that can benefit from MySQL's query optimizer
are sent through the MySQL Server, using JDBC. See
Java and NDB Cluster, and
The ClusterJ API and Data Object Model, for more
information.
The Memcache API for NDB Cluster, implemented as the loadable ndbmemcache storage engine for memcached version 1.6 and later, is available beginning with NDB 7.2.2. This API can be used to provide a persistent NDB Cluster data store, accessed using the memcache protocol.
The standard memcached caching engine is included in the NDB Cluster 7.2 distribution (7.2.2 and later). Each memcached server has direct access to data stored in NDB Cluster, but is also able to cache data locally and to serve (some) requests from this local cache.
For more information, see ndbmemcache—Memcache API for NDB Cluster.
Management clients. These clients connect to the management server and provide commands for starting and stopping nodes gracefully, starting and stopping message tracing (debug versions only), showing node versions and status, starting and stopping backups, and so on. An example of this type of program is the ndb_mgm management client supplied with NDB Cluster (see Section 6.5, “ndb_mgm — The NDB Cluster Management Client”). Such applications can be written using the MGM API, a C-language API that communicates directly with one or more NDB Cluster management servers. For more information, see The MGM API.
Oracle also makes available MySQL Cluster Manager, which provides an advanced command-line interface simplifying many complex NDB Cluster management tasks, such as restarting an NDB Cluster with a large number of nodes. The MySQL Cluster Manager client also supports commands for getting and setting the values of most node configuration parameters as well as mysqld server options and variables relating to NDB Cluster. See MySQL™ Cluster Manager 1.4.1 User Manual, for more information.
Event logs. NDB Cluster logs events by category (startup, shutdown, errors, checkpoints, and so on), priority, and severity. A complete listing of all reportable events may be found in Section 7.6, “Event Reports Generated in NDB Cluster”. Event logs are of the two types listed here:
Cluster log: Keeps a record of all desired reportable events for the cluster as a whole.
Node log: A separate log which is also kept for each individual node.
Under normal circumstances, it is necessary and sufficient to keep and examine only the cluster log. The node logs need be consulted only for application development and debugging purposes.
Checkpoint.
Generally speaking, when data is saved to disk, it is said that
a checkpoint has been
reached. More specific to NDB Cluster, a checkpoint is a point
in time where all committed transactions are stored on disk.
With regard to the NDB storage
engine, there are two types of checkpoints which work together
to ensure that a consistent view of the cluster's data is
maintained. These are shown in the following list:
Local Checkpoint (LCP): This is a checkpoint that is specific to a single node; however, LCPs take place for all nodes in the cluster more or less concurrently. An LCP involves saving all of a node's data to disk, and so usually occurs every few minutes. The precise interval varies, and depends upon the amount of data stored by the node, the level of cluster activity, and other factors.
Global Checkpoint (GCP): A GCP occurs every few seconds, when transactions for all nodes are synchronized and the redo-log is flushed to disk.
For more information about the files and directories created by local checkpoints and global checkpoints, see NDB Cluster Data Node File System Directory Files.
This section discusses the manner in which NDB Cluster divides and duplicates data for storage.
A number of concepts central to an understanding of this topic are discussed in the next few paragraphs.
(Data) Node. An ndbd or ndbmtd process, which stores one or more replicas—that is, copies of the partitions (discussed later in this section) assigned to the node group of which the node is a member.
Each data node should be located on a separate computer. While it is also possible to host multiple data node processes on a single computer, such a configuration is not usually recommended.
It is common for the terms “node” and “data node” to be used interchangeably when referring to an ndbd or ndbmtd process; where mentioned, management nodes (ndb_mgmd processes) and SQL nodes (mysqld processes) are specified as such in this discussion.
Node Group. A node group consists of one or more nodes, and stores partitions, or sets of replicas (see next item).
The number of node groups in an NDB Cluster is not directly
configurable; it is a function of the number of data nodes and of
the number of replicas
(NoOfReplicas
configuration parameter), as shown here:
[number_of_node_groups] = number_of_data_nodes / NoOfReplicas
Thus, an NDB Cluster with 4 data nodes has 4 node groups if
NoOfReplicas is set to 1
in the config.ini file, 2 node groups if
NoOfReplicas is set to 2,
and 1 node group if
NoOfReplicas is set to 4.
Replicas are discussed later in this section; for more information
about NoOfReplicas, see
Section 5.3.6, “Defining NDB Cluster Data Nodes”.
All node groups in an NDB Cluster must have the same number of data nodes.
You can add new node groups (and thus new data nodes) online, to a running NDB Cluster; see Section 7.13, “Adding NDB Cluster Data Nodes Online”, for more information.
Partition. This is a portion of the data stored by the cluster. There are as many cluster partitions as nodes participating in the cluster. Each node is responsible for keeping at least one copy of any partitions assigned to it (that is, at least one replica) available to the cluster.
A replica belongs entirely to a single node; a node can (and usually does) store several replicas.
NDB and user-defined partitioning.
NDB Cluster normally partitions
NDBCLUSTER tables automatically.
However, it is also possible to employ user-defined partitioning
with NDBCLUSTER tables. This is
subject to the following limitations:
Only the KEY and LINEAR
KEY partitioning schemes are supported in production
with NDB tables.
The maximum number of partitions that may be defined
explicitly for any NDB table is
8 * MaxNoOfExecutionThreads * [number of node groups],
the number of node groups in an NDB Cluster being
determined as discussed previously in this section. When
using ndbd for data node processes, setting
MaxNoOfExecutionThreads
has no effect; in such cases, it can be treated as though it
were equal to 1 for purposes of performing this calculation.
See Section 6.3, “ndbmtd — The NDB Cluster Data Node Daemon (Multi-Threaded)”, for more information.
For more information relating to NDB Cluster and user-defined partitioning, see Section 3.6, “Known Limitations of NDB Cluster”, and Partitioning Limitations Relating to Storage Engines.
Replica. This is a copy of a cluster partition. Each node in a node group stores a replica. Also sometimes known as a partition replica. The number of replicas is equal to the number of nodes per node group.
The following diagram illustrates an NDB Cluster with four data nodes, arranged in two node groups of two nodes each; nodes 1 and 2 belong to node group 0, and nodes 3 and 4 belong to node group 1.
Only data (ndbd) nodes are shown here; although a working cluster requires an ndb_mgmd process for cluster management and at least one SQL node to access the data stored by the cluster, these have been omitted from the figure for clarity.
The data stored by the cluster is divided into four partitions, numbered 0, 1, 2, and 3. Each partition is stored—in multiple copies—on the same node group. Partitions are stored on alternate node groups as follows:
Partition 0 is stored on node group 0; a primary replica (primary copy) is stored on node 1, and a backup replica (backup copy of the partition) is stored on node 2.
Partition 1 is stored on the other node group (node group 1); this partition's primary replica is on node 3, and its backup replica is on node 4.
Partition 2 is stored on node group 0. However, the placing of its two replicas is reversed from that of Partition 0; for Partition 2, the primary replica is stored on node 2, and the backup on node 1.
Partition 3 is stored on node group 1, and the placement of its two replicas are reversed from those of partition 1. That is, its primary replica is located on node 4, with the backup on node 3.
What this means regarding the continued operation of an NDB Cluster is this: so long as each node group participating in the cluster has at least one node operating, the cluster has a complete copy of all data and remains viable. This is illustrated in the next diagram.
In this example, where the cluster consists of two node groups of two nodes each, any combination of at least one node in node group 0 and at least one node in node group 1 is sufficient to keep the cluster “alive” (indicated by arrows in the diagram). However, if both nodes from either node group fail, the remaining two nodes are not sufficient (shown by the arrows marked out with an X); in either case, the cluster has lost an entire partition and so can no longer provide access to a complete set of all cluster data.
One of the strengths of NDB Cluster is that it can be run on commodity hardware and has no unusual requirements in this regard, other than for large amounts of RAM, due to the fact that all live data storage is done in memory. (It is possible to reduce this requirement using Disk Data tables—see Section 7.12, “NDB Cluster Disk Data Tables”, for more information about these.) Naturally, multiple and faster CPUs can enhance performance. Memory requirements for other NDB Cluster processes are relatively small.
The software requirements for NDB Cluster are also modest. Host operating systems do not require any unusual modules, services, applications, or configuration to support NDB Cluster. For supported operating systems, a standard installation should be sufficient. The MySQL software requirements are simple: all that is needed is a production release of NDB Cluster. It is not strictly necessary to compile MySQL yourself merely to be able to use NDB Cluster. We assume that you are using the binaries appropriate to your platform, available from the NDB Cluster software downloads page at http://dev.mysql.com/downloads/cluster/.
For communication between nodes, NDB Cluster supports TCP/IP networking in any standard topology, and the minimum expected for each host is a standard 100 Mbps Ethernet card, plus a switch, hub, or router to provide network connectivity for the cluster as a whole. We strongly recommend that an NDB Cluster be run on its own subnet which is not shared with machines not forming part of the cluster for the following reasons:
Security. Communications between NDB Cluster nodes are not encrypted or shielded in any way. The only means of protecting transmissions within an NDB Cluster is to run your NDB Cluster on a protected network. If you intend to use NDB Cluster for Web applications, the cluster should definitely reside behind your firewall and not in your network's De-Militarized Zone (DMZ) or elsewhere.
See Section 7.11.1, “NDB Cluster Security and Networking Issues”, for more information.
Efficiency. Setting up an NDB Cluster on a private or protected network enables the cluster to make exclusive use of bandwidth between cluster hosts. Using a separate switch for your NDB Cluster not only helps protect against unauthorized access to NDB Cluster data, it also ensures that NDB Cluster nodes are shielded from interference caused by transmissions between other computers on the network. For enhanced reliability, you can use dual switches and dual cards to remove the network as a single point of failure; many device drivers support failover for such communication links.
Network communication and latency. NDB Cluster requires communication between data nodes and API nodes (including SQL nodes), as well as between data nodes and other data nodes, to execute queries and updates. Communication latency between these processes can directly affect the observed performance and latency of user queries. In addition, to maintain consistency and service despite the silent failure of nodes, NDB Cluster uses heartbeating and timeout mechanisms which treat an extended loss of communication from a node as node failure. This can lead to reduced redundancy. Recall that, to maintain data consistency, an NDB Cluster shuts down when the last node in a node group fails. Thus, to avoid increasing the risk of a forced shutdown, breaks in communication between nodes should be avoided wherever possible.
The failure of a data or API node results in the abort of all uncommitted transactions involving the failed node. Data node recovery requires synchronization of the failed node's data from a surviving data node, and re-establishment of disk-based redo and checkpoint logs, before the data node returns to service. This recovery can take some time, during which the Cluster operates with reduced redundancy.
Heartbeating relies on timely generation of heartbeat signals by all nodes. This may not be possible if the node is overloaded, has insufficient machine CPU due to sharing with other programs, or is experiencing delays due to swapping. If heartbeat generation is sufficiently delayed, other nodes treat the node that is slow to respond as failed.
This treatment of a slow node as a failed one may or may not be
desirable in some circumstances, depending on the impact of the
node's slowed operation on the rest of the cluster. When
setting timeout values such as
HeartbeatIntervalDbDb and
HeartbeatIntervalDbApi for
NDB Cluster, care must be taken to achieve quick detection,
failover, and return to service, while avoiding potentially
expensive false positives.
Where communication latencies between data nodes are expected to be higher than would be expected in a LAN environment (on the order of 100 µs), timeout parameters must be increased to ensure that any allowed periods of latency are well within configured timeouts. Increasing timeouts in this way has a corresponding effect on the worst-case time to detect failure and therefore time to service recovery.
LAN environments can typically be configured with stable low latency, and such that they can provide redundancy with fast failover. Individual link failures can be recovered from with minimal and controlled latency visible at the TCP level (where NDB Cluster normally operates). WAN environments may offer a range of latencies, as well as redundancy with slower failover times. Individual link failures may require route changes to propagate before end-to-end connectivity is restored. At the TCP level this can appear as large latencies on individual channels. The worst-case observed TCP latency in these scenarios is related to the worst-case time for the IP layer to reroute around the failures.
SCI support. It is also possible to use the high-speed Scalable Coherent Interface (SCI) with NDB Cluster, but this is not a requirement. See Section 5.4, “Using High-Speed Interconnects with NDB Cluster”, for more about this protocol and its use with NDB Cluster.
In this section, we discuss changes in the implementation of NDB Cluster in MySQL NDB Cluster 7.2, as compared to NDB Cluster 7.1 and earlier releases. Changes and features most likely to be of interest are shown in the following table:
| NDB Cluster 7.2 |
|---|
| NDB Cluster 7.2 is based on MySQL 5.5. For more information about new features in MySQL Server 5.5, see What Is New in MySQL 5.5. |
| Version 2 binary log row events, to provide support for improvements in NDB Cluster Replication conflict detection (see next item). A given mysqld can be made to use Version 1 or Version 2 binary logging row events with the --log-bin-use-v1-row-events option. |
| Two new “primary wins” conflict detection and resolution functions NDB$EPOCH() and NDB$EPOCH_TRANS() for use in replication setups with two NDB Clusters. For more information, see Chapter 8, NDB Cluster Replication. |
| Distribution of MySQL users and privileges across NDB Cluster SQL nodes is now supported—see Section 7.14, “Distributed MySQL Privileges for NDB Cluster”. |
| Improved support for distributed pushed-down joins, which greatly improve performance for many joins that can be executed in parallel on the data nodes. |
| Default values for a number of data node configuration parameters such as HeartbeatIntervalDbDb and ArbitrationTimeout have been improved. |
| Support for the Memcache API using the loadable ndbmemcache storage engine. See ndbmemcache—Memcache API for NDB Cluster. |
This section contains information about NDB Cluster 7.2 releases through 5.5.53-ndb-7.2.27, which is a previous GA release but still supported, as is NDB Cluster 7.3. NDB Cluster 7.1, NDB Cluster 7.0, and NDB Cluster 6.3 are previous GA release series which are no longer supported. We recommend that new deployments use NDB Cluster 7.4 or NDB Cluster 7.5, both of which are available as General Availability releases. For information about NDB Cluster 7.1 and previous releases, see the MySQL 5.1 Reference Manual.
The following improvements to NDB Cluster have been made in NDB Cluster 7.2:
Based on MySQL Server 5.5. Previous NDB Cluster release series, including NDB Cluster 7.1, used MySQL 5.1 as a base. Beginning with NDB 7.2.1, NDB Cluster 7.2 is based on MySQL Server 5.5, so that NDB Cluster users can benefit from MySQL 5.5's improvements in scalability and performance monitoring. As with MySQL 5.5, NDB 7.2.1 and later use CMake for configuring and building from source in place of GNU Autotools (used in MySQL 5.1 and NDB Cluster releases based on MySQL 5.1). For more information about changes and improvements in MySQL 5.5, see What Is New in MySQL 5.5.
Conflict detection using GCI Reflection.
NDB Cluster Replication implements a new “primary
wins” conflict detection and resolution mechanism.
GCI Reflection applies
in two-cluster circular (“active-active”)
replication setups, tracking the order in which changes are
applied on the NDB Cluster designated as primary relative to
changes originating on the other NDB Cluster (referred to as
the secondary). This relative ordering is used to determine
whether changes originating on the slave are concurrent with
any changes that originate locally, and are therefore
potentially in conflict. Two new conflict detection
functions are added: When using
NDB$EPOCH(), rows that are out of sync on
the secondary are realigned with those on the primary; with
NDB$EPOCH_TRANS(), this realignment is
applied to transactions. For more information, see
Section 8.11, “NDB Cluster Replication Conflict Resolution”.
Version 2 binary log row events.
A new format for binary log row events, known as Version 2
binary log row events, provides support for improvements in
NDB Cluster Replication conflict detection (see previous
item) and is intended to facilitate further improvements in
MySQL Replication. You can cause a given
mysqld to use Version 1 or Version 2 binary
logging row events with the
--log-bin-use-v1-row-events
option. For backward compatibility, Version 2 binary log row
events are also available in NDB Cluster 7.0 (7.0.27 and
later) and NDB Cluster 7.1 (7.1.16 and later). However, NDB
Cluster 7.0 and NDB Cluster 7.1 continue to use Version 1
binary log row events as the default, whereas the default in
NDB 7.2.1 and later is to use Version 2 row events for binary
logging.
Distribution of MySQL users and privileges.
Automatic distribution of MySQL users and privileges across
all SQL nodes in a given NDB Cluster is now supported. To
enable this support, you must first import an SQL script
share/mysql/ndb_dist_priv.sql that is
included with the NDB Cluster 7.2 distribution. This script
creates several stored procedures which you can use to
enable privilege distribution and perform related tasks.
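As a rough sketch of how this might look from the mysql client (the script path shown is installation dependent, and the routine names shown here should be verified against those actually created by ndb_dist_priv.sql on your system):

-- Load the stored routines supplied with the NDB Cluster distribution
-- (example path; adjust to your installation).
SOURCE /usr/local/mysql/share/mysql/ndb_dist_priv.sql;

-- Convert the MySQL grant tables to NDB so that all SQL nodes share them.
CALL mysql.mysql_cluster_move_privileges();

-- Returns 1 if the privilege tables are now distributed, 0 otherwise.
SELECT mysql.mysql_cluster_privileges_are_distributed();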
When a new MySQL Server joins an NDB Cluster where privilege distribution is in effect, it also participates in the privilege distribution automatically.
Once privilege distribution is enabled, all changes to the
grant tables made on any mysqld attached to
the cluster are immediately available on any other attached
MySQL Servers. This is true whether the changes are made using
CREATE USER,
GRANT, or any of the other
statements described elsewhere in this Manual (see
Account Management Statements.) This includes
privileges relating to stored routines and views; however,
automatic distribution of the views or stored routines
themselves is not currently supported.
For more information, see Section 7.14, “Distributed MySQL Privileges for NDB Cluster”.
Distributed pushed-down joins.
Many joins can now be pushed down to the NDB kernel for
processing on NDB Cluster data nodes. Previously, a join was
handled in NDB Cluster by means of repeated accesses of
NDB by the SQL node; however,
when pushed-down joins are enabled, a pushable join is sent
in its entirety to the data nodes, where it can be
distributed among the data nodes and executed in parallel on
multiple copies of the data, with a single, merged result
being returned to mysqld. This can reduce
greatly the number of round trips between an SQL node and
the data nodes required to handle such a join, leading to
greatly improved performance of join processing.
It is possible to determine when joins can be pushed down to
the data nodes by examining the join with
EXPLAIN. A number of new system
status variables
(Ndb_pushed_queries_defined,
Ndb_pushed_queries_dropped,
Ndb_pushed_queries_executed,
and Ndb_pushed_reads) and
additions to the counters
table (in the ndbinfo
information database) can also be helpful in determining when
and how well joins are being pushed down.
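A minimal sketch of checking join pushdown follows; the table and column names are illustrative only, and the exact text shown by EXPLAIN may vary between releases:

SET ndb_join_pushdown = ON;

EXPLAIN
SELECT t1.a, t2.b
FROM t1
JOIN t2 ON t2.a = t1.a;
-- For a pushable join, the Extra column typically indicates that the
-- table participates in a pushed join.

SHOW GLOBAL STATUS LIKE 'Ndb_pushed%';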
More information and examples are available in the description
of the ndb_join_pushdown
server system variable. See also the description of the status
variables referenced in the previous paragraph, as well as
Section 7.10.7, “The ndbinfo counters Table”.
Improved default values for data node configuration parameters.
In order to provide more resiliency to environmental issues
and better handling of some potential failure scenarios, and
to perform more reliably with increases in memory and other
resource requirements brought about by recent improvements
in join handling by NDB, the
default values for a number of NDB Cluster data node
configuration parameters have been changed. The parameters
and changes are described in the following list:
HeartbeatIntervalDbDb:
Default increased from 1500 ms to 5000 ms.
ArbitrationTimeout:
Default increased from 3000 ms to 7500 ms.
TimeBetweenEpochsTimeout:
Now effectively disabled by default (default changed from
4000 ms to 0).
SharedGlobalMemory:
Default increased from 20 MB to 128 MB.
MaxParallelScansPerFragment:
Default increased from 32 to 256.
CrashOnCorruptedTuple
changed from FALSE to
TRUE.
Beginning with NDB 7.2.10,
DefaultOperationRedoProblemAction
changed from ABORT to
QUEUE.
In addition, the value computed for
MaxNoOfLocalScans when
this parameter is not set in config.ini
has been increased by a factor of 4.
Fail-fast data nodes.
Beginning with NDB 7.2.1, data nodes handle corrupted tuples
in a fail-fast manner by default. This is a change from
previous versions of NDB Cluster where this behavior had to
be enabled explicitly using the
CrashOnCorruptedTuple
configuration parameter. In NDB 7.2.1 and later, this
parameter is enabled by default and must be explicitly
disabled, in which case data nodes merely log a warning
whenever they detect a corrupted tuple.
Memcache API support (ndbmemcache). The Memcached server is a distributed in-memory caching server that uses a simple text-based protocol. It is often employed with key-value stores. The Memcache API for NDB Cluster, available beginning with NDB 7.2.2, is implemented as a loadable storage engine for memcached version 1.6 and later. This API can be used to access a persistent NDB Cluster data store employing the memcache protocol. It is also possible for the memcached server to provide a strictly defined interface to existing NDB Cluster tables.
Each memcache server can both cache data locally and access data stored in NDB Cluster directly. Caching policies are configurable. For more information, see ndbmemcache—Memcache API for NDB Cluster, in the NDB Cluster API Developers Guide.
Rows per partition limit removed. Previously it was possible to store a maximum of 46137488 rows in a single NDB Cluster partition—that is, per data node. Beginning with NDB 7.2.9, this limitation has been lifted, and there is no longer any practical upper limit to this number. (Bug #13844405, Bug #14000373)
NDB Cluster 7.2 is also supported by MySQL Cluster Manager, which provides an advanced command-line interface that can simplify many complex NDB Cluster management tasks. See MySQL™ Cluster Manager 1.4.1 User Manual, for more information.
MySQL Server offers a number of choices in storage engines. Since
both NDBCLUSTER and
InnoDB can serve as transactional
MySQL storage engines, users of MySQL Server sometimes become
interested in NDB Cluster. They see
NDB as a possible alternative or
upgrade to the default InnoDB storage
engine in MySQL 5.5. While NDB and
InnoDB share common characteristics,
there are differences in architecture and implementation, so that
some existing MySQL Server applications and usage scenarios can be
a good fit for NDB Cluster, but not all of them.
In this section, we discuss and compare some characteristics of
the NDB storage engine used by NDB
Cluster 7.2 with InnoDB used in MySQL
5.5. The next few sections provide a technical comparison. In many
instances, decisions about when and where to use NDB Cluster must
be made on a case-by-case basis, taking all factors into
consideration. While it is beyond the scope of this documentation
to provide specifics for every conceivable usage scenario, we also
attempt to offer some very general guidance on the relative
suitability of some common types of applications for
NDB as opposed to
InnoDB backends.
Recent NDB Cluster 7.2 releases use a mysqld
based on MySQL 5.5, including support for
InnoDB 1.1. While it is possible to
use InnoDB tables with NDB Cluster, such tables
are not clustered. It is also not possible to use programs or
libraries from an NDB Cluster 7.2 distribution with MySQL Server
5.5, or the reverse.
While it is also true that some types of common business
applications can be run either on NDB Cluster or on MySQL Server
(most likely using the InnoDB storage
engine), there are some important architectural and implementation
differences. Section 3.5.1, “Differences Between the NDB and InnoDB Storage Engines”,
provides a summary of these differences. Due to these
differences, some usage scenarios are clearly more suitable for
one engine or the other; see
Section 3.5.2, “NDB and InnoDB Workloads”. This in turn
has an impact on the types of applications that are better suited for
use with NDB or
InnoDB. See
Section 3.5.3, “NDB and InnoDB Feature Usage Summary”, for a comparison
of the relative suitability of each for use in common types of
database applications.
For information about the relative characteristics of the
NDB and
MEMORY storage engines, see
When to Use MEMORY or MySQL Cluster.
See Alternative Storage Engines, for additional information about MySQL storage engines.
The NDB Cluster NDB storage engine
is implemented using a distributed, shared-nothing architecture,
which causes it to behave differently from
InnoDB in a number of ways. For
those unaccustomed to working with
NDB, unexpected behaviors can arise
due to its distributed nature with regard to transactions,
foreign keys, table limits, and other characteristics. These are
shown in the following table:
| Feature | InnoDB | NDB Cluster |
|---|---|---|
| MySQL Server Version | 5.5 | NDB Cluster 7.2: 5.5; NDB Cluster 7.3: 5.6 |
| InnoDB Version | InnoDB 1.1 | N/A |
| NDB Cluster Version | N/A | NDB 7.2 / NDB 7.3 |
| Storage Limits | 64TB | 3TB (Practical upper limit based on 48 data nodes with 64GB RAM each; can be increased with disk-based data and BLOBs) |
| Foreign Keys | Yes | Available in NDB Cluster 7.3 and later (prior to NDB Cluster 7.3: ignored, as with MyISAM) |
| Transactions | All standard types | READ COMMITTED |
| MVCC | Yes | No |
| Data Compression | Yes | No (NDB Cluster checkpoint and backup files can be compressed) |
| Large Row Support (> 14K) | Supported for VARBINARY, BLOB, and TEXT columns | Supported for BLOB and TEXT columns (using these types to store very large amounts of data can lower NDB Cluster performance) |
| Replication Support | Asynchronous and semisynchronous replication using MySQL Replication | Automatic synchronous replication within an NDB Cluster; asynchronous replication between NDB Clusters, using MySQL Replication |
| Scaleout for Read Operations | Yes (MySQL Replication) | Yes (Automatic partitioning in NDB Cluster; MySQL Replication) |
| Scaleout for Write Operations | Requires application-level partitioning (sharding) | Yes (Automatic partitioning in NDB Cluster is transparent to applications) |
| High Availability (HA) | Requires additional software | Yes (Designed for 99.999% uptime) |
| Node Failure Recovery and Failover | Requires additional software | Automatic (Key element in NDB Cluster architecture) |
| Time for Node Failure Recovery | 30 seconds or longer | Typically < 1 second |
| Real-Time Performance | No | Yes |
| In-Memory Tables | No | Yes (Some data can optionally be stored on disk; both in-memory and disk data storage are durable) |
| NoSQL Access to Storage Engine | Native memcached interface in development (see the MySQL Dev Zone article NDB Cluster 7.2 (DMR2): NoSQL, Key/Value, Memcached) | Yes; multiple APIs, including Memcached, Node.js/JavaScript, Java, JPA, C++, and HTTP/REST |
| Concurrent and Parallel Writes | Not supported | Up to 48 writers, optimized for concurrent writes |
| Conflict Detection and Resolution (Multiple Replication Masters) | No | Yes |
| Hash Indexes | No | Yes |
| Online Addition of Nodes | Read-only replicas using MySQL Replication | Yes (all node types) |
| Online Upgrades | No | Yes |
| Online Schema Modifications | No | Yes |
NDB Cluster has a range of unique attributes that make it ideal
to serve applications requiring high availability, fast
failover, high throughput, and low latency. Due to its
distributed architecture and multi-node implementation, NDB
Cluster also has specific constraints that may keep some
workloads from performing well. A number of major differences in
behavior between the NDB and
InnoDB storage engines with regard
to some common types of database-driven application workloads
are shown in the following table:
| Workload | InnoDB | NDB Cluster (NDB) |
|---|---|---|
| High-Volume OLTP Applications | Yes | Yes |
| DSS Applications (data marts, analytics) | Yes | Limited (Join operations across OLTP datasets not exceeding 3TB in size) |
| Custom Applications | Yes | Yes |
| Packaged Applications | Yes | Limited (should be mostly primary key access). Note: NDB Cluster 7.3 supports foreign keys. |
| In-Network Telecoms Applications (HLR, HSS, SDP) | No | Yes |
| Session Management and Caching | Yes | Yes |
| E-Commerce Applications | Yes | Yes |
| User Profile Management, AAA Protocol | Yes | Yes |
When comparing application feature requirements to the
capabilities of InnoDB with
NDB, some are clearly more
compatible with one storage engine than the other.
The following table lists supported application features according to the storage engine to which each feature is typically better suited.
| Preferred application requirements for InnoDB | Preferred application requirements for NDB |
|---|---|
In the sections that follow, we discuss known limitations in
current releases of NDB Cluster as compared with the features
available when using the MyISAM and
InnoDB storage engines. You can find known
bugs, which we intend to correct in upcoming releases of NDB
Cluster, in the following categories under “MySQL Server:” in the
MySQL bugs database at http://bugs.mysql.com:
NDB Cluster
Cluster Direct API (NDBAPI)
Cluster Disk Data
Cluster Replication
ClusterJ
This information is intended to be complete with respect to the conditions just set forth. You can report any discrepancies that you encounter to the MySQL bugs database using the instructions given in How to Report Bugs or Problems. If we do not plan to fix the problem in NDB Cluster 7.2, we will add it to the list.
See Section 3.6.11, “Previous NDB Cluster Issues Resolved in MySQL 5.1, NDB Cluster 6.x, and NDB Cluster 7.x” for a list of issues in NDB Cluster in MySQL 5.1 that have been resolved in the current version.
Limitations and other issues specific to NDB Cluster Replication are described in Section 8.3, “Known Issues in NDB Cluster Replication”.
Some SQL statements relating to certain MySQL features produce
errors when used with NDB tables,
as described in the following list:
Temporary tables.
Temporary tables are not supported. Trying either to
create a temporary table that uses the
NDB storage engine or to
alter an existing temporary table to use
NDB fails with the error
Table storage engine 'ndbcluster' does not
support the create option 'TEMPORARY'.
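For example, a statement such as the following (the table name is illustrative) fails with that error:

CREATE TEMPORARY TABLE tmp_demo (id INT)
ENGINE=NDB;
-- ERROR: Table storage engine 'ndbcluster' does not support the
-- create option 'TEMPORARY'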
Indexes and keys in NDB tables. Keys and indexes on NDB Cluster tables are subject to the following limitations:
Column width.
Attempting to create an index on an
NDB table column whose width is
greater than 3072 bytes succeeds, but only the first
3072 bytes are actually used for the index. In such
cases, a warning Specified key was too
long; max key length is 3072 bytes is
issued, and a SHOW CREATE
TABLE statement shows the length of the
index as 3072.
TEXT and BLOB columns.
You cannot create indexes on
NDB table columns that
use any of the TEXT or
BLOB data types.
FULLTEXT indexes.
The NDB storage engine
does not support FULLTEXT indexes,
which are possible for MyISAM
tables only.
However, you can create indexes on
VARCHAR columns of
NDB tables.
USING HASH keys and NULL.
Using nullable columns in unique keys and primary keys
means that queries using these columns are handled as
full table scans. To work around this issue, make the
column NOT NULL, or re-create the
index without the USING HASH
option.
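A brief sketch of both workarounds, using hypothetical table and column names:

-- A nullable column in a unique hash key causes lookups on that column
-- to be handled as full table scans:
CREATE TABLE t_hash_demo (
    id   INT NOT NULL PRIMARY KEY,
    code INT NULL,
    UNIQUE KEY uk_code (code) USING HASH
) ENGINE=NDB;

-- Workaround 1: make the column NOT NULL.
ALTER TABLE t_hash_demo MODIFY code INT NOT NULL;

-- Workaround 2: re-create the index without USING HASH.
ALTER TABLE t_hash_demo DROP KEY uk_code, ADD UNIQUE KEY uk_code (code);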
Prefixes.
There are no prefix indexes; only entire columns can
be indexed. (The size of an NDB
column index is always the same as the width of the
column in bytes, up to and including 3072 bytes, as
described earlier in this section. Also see
Section 3.6.6, “Unsupported or Missing Features in NDB Cluster”,
for additional information.)
BIT columns.
A BIT column cannot be
a primary key, unique key, or index, nor can it be
part of a composite primary key, unique key, or index.
AUTO_INCREMENT columns.
Like other MySQL storage engines, the
NDB storage engine can
handle a maximum of one
AUTO_INCREMENT column per table,
and this column must be indexed. However, in the case
of an NDB Cluster table with no explicit primary key,
an AUTO_INCREMENT column is
automatically defined and used as a
“hidden” primary key. For this reason,
you cannot create an NDB table
having an AUTO_INCREMENT column and
no explicit primary key.
NDB Cluster and geometry data types.
Geometry data types (WKT and
WKB) are supported for
NDB tables. However, spatial
indexes are not supported.
Character sets and binary log files.
Currently, the ndb_apply_status and
ndb_binlog_index tables are created
using the latin1 (ASCII) character set.
Because names of binary logs are recorded in this table,
binary log files named using non-Latin characters are not
referenced correctly in these tables. This is a known
issue, which we are working to fix. (Bug #50226)
To work around this problem, use only Latin-1 characters
when naming binary log files or setting any of the
--basedir,
--log-bin, or
--log-bin-index options.
Creating NDB tables with user-defined partitioning.
Support for user-defined partitioning in NDB Cluster is
restricted to [LINEAR]
KEY partitioning. Using any other
partitioning type with ENGINE=NDB or
ENGINE=NDBCLUSTER in a
CREATE TABLE statement
results in an error.
It is possible to override this restriction, but doing so is not supported for use in production settings. For details, see User-defined partitioning and the NDB storage engine (MySQL Cluster).
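The following sketch (with illustrative table names) shows the only partitioning types accepted for NDB tables; any other type, such as RANGE or LIST, produces an error:

CREATE TABLE t_by_key (
    id INT NOT NULL,
    ts DATETIME NOT NULL,
    PRIMARY KEY (id)
) ENGINE=NDB
PARTITION BY KEY (id);

CREATE TABLE t_by_linear_key (
    id INT NOT NULL PRIMARY KEY
) ENGINE=NDB
PARTITION BY LINEAR KEY (id);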
Default partitioning scheme.
All NDB Cluster tables are by default partitioned by
KEY using the table's primary key
as the partitioning key. If no primary key is explicitly
set for the table, the “hidden” primary key
automatically created by the
NDB storage engine is used
instead. For additional discussion of these and related
issues, see KEY Partitioning.
CREATE TABLE and
ALTER TABLE statements that
would cause a user-partitioned
NDBCLUSTER table not to meet
either or both of the following two requirements are not
permitted, and fail with an error:
The table must have an explicit primary key.
All columns listed in the table's partitioning expression must be part of the primary key.
Exception.
If a user-partitioned
NDBCLUSTER table is created
using an empty column-list (that is, using
PARTITION BY [LINEAR] KEY()), then no
explicit primary key is required.
Maximum number of partitions for NDBCLUSTER tables.
The maximum number of partitions that can be defined for an
NDBCLUSTER table when
employing user-defined partitioning is 8 per node group.
(See Section 3.2, “NDB Cluster Nodes, Node Groups, Replicas, and Partitions”, for
more information about NDB Cluster node groups.)
DROP PARTITION not supported.
It is not possible to drop partitions from
NDB tables using
ALTER TABLE ... DROP PARTITION. The
other partitioning extensions to
ALTER
TABLE—ADD PARTITION,
REORGANIZE PARTITION, and
COALESCE PARTITION—are supported
for Cluster tables, but use copying and so are not
optimized. See
Management of RANGE and LIST Partitions and
ALTER TABLE Syntax.
Row-based replication.
When using row-based replication with NDB Cluster, binary
logging cannot be disabled. That is, the
NDB storage engine ignores
the value of sql_log_bin.
(Bug #16680)
In this section, we list limits found in NDB Cluster that either differ from limits found in, or that are not found in, standard MySQL.
Memory usage and recovery.
Memory consumed when data is inserted into an
NDB table is not automatically
recovered when deleted, as it is with other storage engines.
Instead, the following rules hold true:
A DELETE statement on an
NDB table makes the memory
formerly used by the deleted rows available for re-use by
inserts on the same table only. However, this memory can be
made available for general re-use by performing
OPTIMIZE TABLE, as shown in the sketch following this list.
A rolling restart of the cluster also frees any memory used by deleted rows. See Section 7.5, “Performing a Rolling Restart of an NDB Cluster”.
A DROP TABLE or
TRUNCATE TABLE operation on
an NDB table frees the memory
that was used by this table for re-use by any
NDB table, either by the same
table or by another NDB table.
Recall that TRUNCATE TABLE
drops and re-creates the table. See
TRUNCATE TABLE Syntax.
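The following sketch (with an illustrative table name) shows how memory freed by a large delete can be returned for general re-use:

-- Memory freed by this DELETE is initially reusable only by inserts
-- into t_history itself.
DELETE FROM t_history WHERE created < '2010-01-01';

-- OPTIMIZE TABLE makes the freed memory available for general re-use.
OPTIMIZE TABLE t_history;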
Limits imposed by the cluster's configuration. A number of hard limits exist which are configurable, but available main memory in the cluster sets limits. See the complete list of configuration parameters in Section 5.3, “NDB Cluster Configuration Files”. Most configuration parameters can be upgraded online. These hard limits include:
Database memory size and index memory size
(DataMemory and
IndexMemory,
respectively).
DataMemory is
allocated as 32KB pages. As each
DataMemory page
is used, it is assigned to a specific table; once
allocated, this memory cannot be freed except by
dropping the table.
See Section 5.3.6, “Defining NDB Cluster Data Nodes”, for more information.
The maximum number of operations that can be performed
per transaction is set using the configuration
parameters
MaxNoOfConcurrentOperations
and
MaxNoOfLocalOperations.
Bulk loading, TRUNCATE
TABLE, and ALTER
TABLE are handled as special cases by
running multiple transactions, and so are not subject
to this limitation.
Different limits related to tables and indexes. For
example, the maximum number of ordered indexes in the
cluster is determined by
MaxNoOfOrderedIndexes,
and the maximum number of ordered indexes per table is
16.
Node and data object maximums. The following limits apply to numbers of cluster nodes and metadata objects:
The maximum number of data nodes is 48.
A data node must have a node ID in the range of 1 to 48, inclusive. (Management and API nodes may use node IDs in the range 1 to 255, inclusive.)
The total maximum number of nodes in an NDB Cluster is 255. This number includes all SQL nodes (MySQL Servers), API nodes (applications accessing the cluster other than MySQL servers), data nodes, and management servers.
The maximum number of metadata objects in current versions of NDB Cluster is 20320. This limit is hard-coded.
See Section 3.6.11, “Previous NDB Cluster Issues Resolved in MySQL 5.1, NDB Cluster 6.x, and NDB Cluster 7.x”, for more information.
A number of limitations exist in NDB Cluster with regard to the handling of transactions. These include the following:
Transaction isolation level.
The NDBCLUSTER storage engine
supports only the READ
COMMITTED transaction isolation level.
(InnoDB, for example, supports
READ COMMITTED,
READ UNCOMMITTED,
REPEATABLE READ, and
SERIALIZABLE.) You
should keep in mind that NDB implements
READ COMMITTED on a per-row basis; when
a read request arrives at the data node storing the row,
what is returned is the last committed version of the row
at that time.
Uncommitted data is never returned, but when a transaction modifying a number of rows commits concurrently with a transaction reading the same rows, the transaction performing the read can observe “before” values, “after” values, or both, for different rows among these, due to the fact that a given row read request can be processed either before or after the commit of the other transaction.
To ensure that a given transaction reads only before or
after values, you can impose row locks using
SELECT ... LOCK IN
SHARE MODE. In such cases, the lock is held until
the owning transaction is committed. Using row locks can
also cause the following issues:
Increased frequency of lock wait timeout errors, and reduced concurrency
Increased transaction processing overhead due to reads requiring a commit phase
Possibility of exhausting the available number of
concurrent locks, which is limited by
MaxNoOfConcurrentOperations
NDB uses READ
COMMITTED for all reads unless a modifier such as
LOCK IN SHARE MODE or FOR
UPDATE is used. LOCK IN SHARE
MODE causes shared row locks to be used;
FOR UPDATE causes exclusive row locks to
be used. Unique key reads have their locks upgraded
automatically by NDB to ensure a
self-consistent read; BLOB reads also
employ extra locking for consistency.
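A brief sketch, using a hypothetical table, of imposing row locks so that a transaction sees only “before” or only “after” values:

START TRANSACTION;

-- Shared row lock, held until this transaction commits or rolls back.
SELECT balance FROM accounts WHERE id = 1 LOCK IN SHARE MODE;

-- Exclusive row lock, appropriate when the row is about to be updated.
SELECT balance FROM accounts WHERE id = 2 FOR UPDATE;

COMMIT;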
See Section 7.3.4, “NDB Cluster Backup Troubleshooting”,
for information on how NDB Cluster's implementation of
transaction isolation level can affect backup and
restoration of NDB databases.
Transactions and BLOB or TEXT columns.
NDBCLUSTER stores only part
of a column value that uses any of MySQL's
BLOB or
TEXT data types in the
table visible to MySQL; the remainder of the
BLOB or
TEXT is stored in a
separate internal table that is not accessible to MySQL.
This gives rise to two related issues of which you should
be aware whenever executing
SELECT statements on tables
that contain columns of these types:
For any SELECT from an
NDB Cluster table: If the
SELECT includes a
BLOB or
TEXT column, the
READ COMMITTED
transaction isolation level is converted to a read with
read lock. This is done to guarantee consistency.
For any SELECT which uses
a unique key lookup to retrieve any columns that use any
of the BLOB or
TEXT data types and that
is executed within a transaction, a shared read lock is
held on the table for the duration of the
transaction—that is, until the transaction is
either committed or aborted.
This issue does not occur for queries that use index or
table scans, even against
NDB tables having
BLOB or
TEXT columns.
For example, consider the table t
defined by the following CREATE
TABLE statement:
CREATE TABLE t (
a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
b INT NOT NULL,
c INT NOT NULL,
d TEXT,
INDEX i(b),
UNIQUE KEY u(c)
) ENGINE = NDB;
Either of the following queries on t
causes a shared read lock, because the first query uses
a primary key lookup and the second uses a unique key
lookup:
SELECT * FROM t WHERE a = 1;
SELECT * FROM t WHERE c = 1;
However, none of the four queries shown here causes a shared read lock:
SELECT * FROM t WHERE b = 1;
SELECT * FROM t WHERE d = '1';
SELECT * FROM t;
SELECT b,c FROM t WHERE a = 1;
This is because, of these four queries, the first uses
an index scan, the second and third use table scans, and
the fourth, while using a primary key lookup, does not
retrieve the value of any
BLOB or
TEXT columns.
You can help minimize issues with shared read locks by
avoiding queries that use unique key lookups that
retrieve BLOB or
TEXT columns, or, in
cases where such queries are not avoidable, by
committing transactions as soon as possible afterward.
Rollbacks. There are no partial transactions, and no partial rollbacks of transactions. A duplicate key or similar error causes the entire transaction to be rolled back.
This behavior differs from that of other transactional
storage engines such as InnoDB
that may roll back individual statements.
Transactions and memory usage. As noted elsewhere in this chapter, NDB Cluster does not handle large transactions well; it is better to perform a number of small transactions with a few operations each than to attempt a single large transaction containing a great many operations. Among other considerations, large transactions require very large amounts of memory. Because of this, the transactional behavior of a number of MySQL statements is affected as described in the following list:
TRUNCATE TABLE is not
transactional when used on
NDB tables. If a
TRUNCATE TABLE fails to
empty the table, then it must be re-run until it is
successful.
DELETE FROM (even with no
WHERE clause) is
transactional. For tables containing a great many rows,
you may find that performance is improved by using
several DELETE FROM ... LIMIT ...
statements to “chunk” the delete operation, as shown in the
sketch following this list. If your objective is to empty the
table, then you may wish to use TRUNCATE
TABLE instead.
LOAD DATA statements.
LOAD DATA
INFILE is not transactional when used on
NDB tables.
When executing a
LOAD DATA
INFILE statement, the
NDB engine performs
commits at irregular intervals that enable better
utilization of the communication network. It is not
possible to know ahead of time when such commits take
place.
ALTER TABLE and transactions.
When copying an NDB table
as part of an ALTER
TABLE, the creation of the copy is
nontransactional. (In any case, this operation is
rolled back when the copy is deleted.)
Transactions and the COUNT() function.
When using NDB Cluster Replication, it is not possible to
guarantee the transactional consistency of the
COUNT() function on the slave. In other
words, when performing on the master a series of
statements (INSERT,
DELETE, or both) that
changes the number of rows in a table within a single
transaction, executing SELECT COUNT(*) FROM table
queries on the slave may yield intermediate results.
This is due to the fact that SELECT COUNT(...) may
perform dirty reads, and is not a bug in the
NDB storage engine. (See Bug
#31321 for more information.)
Starting, stopping, or restarting a node may give rise to temporary errors causing some transactions to fail. These include the following cases:
Temporary errors. When first starting a node, it is possible that you may see Error 1204 Temporary failure, distribution changed and similar temporary errors.
Errors due to node failure. The stopping or failure of any data node can result in a number of different node failure errors. (However, there should be no aborted transactions when performing a planned shutdown of the cluster.)
In either of these cases, any errors that are generated must be handled within the application. This should be done by retrying the transaction.
See also Section 3.6.2, “Limits and Differences of NDB Cluster from Standard MySQL Limits”.
Some database objects such as tables and indexes have different
limitations when using the
NDBCLUSTER storage engine:
Database and table names.
When using the NDB storage engine, the
maximum allowed length both for database names and for
table names is 63 characters.
Number of database objects.
The maximum number of all
NDB database objects in a
single NDB Cluster—including databases, tables, and
indexes—is limited to 20320.
Attributes per table. The maximum number of attributes (that is, columns and indexes) that can belong to a given table is 512.
Attributes per key. The maximum number of attributes per key is 32.
Row size.
The maximum permitted size of any one row is 14000 bytes
(as of NDB Cluster 7.0). Each
BLOB or
TEXT column contributes 256
+ 8 = 264 bytes to this total.
BIT column storage per table.
The maximum combined width for all
BIT columns used in a given
NDB table is 4096.
FIXED column storage.
NDB Cluster supports a maximum of 16 GB per fragment of
data in FIXED columns.
A number of features supported by other storage engines are not
supported for NDB tables. Trying to
use any of these features in NDB Cluster does not in itself
cause errors; however, errors may occur in applications that
expect the features to be supported or enforced. Statements
referencing such features, even if effectively ignored by
NDB, must be syntactically and otherwise
valid.
Foreign key constraints.
Prior to NDB Cluster 7.3, the foreign key construct is
ignored, just as it is by MyISAM
tables. Foreign keys are supported in NDB Cluster 7.3 and
later.
Index prefixes.
Prefixes on indexes are not supported for
NDB tables. If a prefix is used as part
of an index specification in a statement such as
CREATE TABLE,
ALTER TABLE, or
CREATE INDEX, the prefix is
not created by NDB.
A statement containing an index prefix, and creating or
modifying an NDB table, must still be
syntactically valid. For example, the following statement
always fails with Error 1089 Incorrect prefix
key; the used key part isn't a string, the used length is
longer than the key part, or the storage engine doesn't
support unique prefix keys, regardless of
storage engine:
CREATE TABLE t1 (
c1 INT NOT NULL,
c2 VARCHAR(100),
INDEX i1 (c2(500))
);
This happens because of the SQL syntax rule that no index prefix may be longer than the column on which it is defined.
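By contrast, an otherwise identical statement that omits the prefix length is valid; it creates an index on the full column, which for NDB is what happens in any case, since the prefix is not created:
CREATE TABLE t1 (
c1 INT NOT NULL,
c2 VARCHAR(100),
INDEX i1 (c2)
);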
Savepoints and rollbacks.
Savepoints and rollbacks to savepoints are ignored as in
MyISAM.
Durability of commits. There are no durable commits on disk. Commits are replicated, but there is no guarantee that logs are flushed to disk on commit.
Replication.
Statement-based replication is not supported. Use
--binlog-format=ROW (or
--binlog-format=MIXED) when
setting up cluster replication. See
Chapter 8, NDB Cluster Replication, for more
information.
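A minimal sketch of the relevant my.cnf settings for an SQL node acting as a replication master is shown here; the server-id value is an arbitrary placeholder:
[mysqld]
server-id=1
# Statement-based replication is not supported with NDB;
# use row-based (or mixed) binary logging instead.
binlog-format=ROW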
See Section 3.6.3, “Limits Relating to Transaction Handling in NDB Cluster”,
for more information relating to limitations on transaction
handling in NDB.
The following performance issues are specific to or especially pronounced in NDB Cluster:
Range scans.
There are query performance issues due to sequential
access to the NDB storage
engine; it is also relatively more expensive to do many
range scans than it is with either
MyISAM or InnoDB.
Reliability of Records in range.
The Records in range statistic is
available but is not completely tested or officially
supported. This may result in nonoptimal query plans in
some cases. If necessary, you can employ USE
INDEX or FORCE INDEX to alter
the execution plan. See Index Hints, for
more information on how to do this.
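For example, using the table t defined earlier in this section, an index hint such as the following can be used to override a plan chosen on the basis of an unreliable Records in range estimate; the range shown is illustrative only:
SELECT * FROM t FORCE INDEX (i) WHERE b BETWEEN 10 AND 20;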
Unique hash indexes.
Unique hash indexes created with USING
HASH cannot be used for accessing a table if
NULL is given as part of the key.
The following are limitations specific to the
NDBCLUSTER storage engine:
Machine architecture. All machines used in the cluster must have the same architecture. That is, all machines hosting nodes must be either big-endian or little-endian, and you cannot use a mixture of both. For example, you cannot have a management node running on a PowerPC which directs a data node that is running on an x86 machine. This restriction does not apply to machines simply running mysql or other clients that may be accessing the cluster's SQL nodes.
Binary logging. NDB Cluster has the following limitations or restrictions with regard to binary logging:
sql_log_bin has no
effect on data operations; however, it is supported for
schema operations.
NDB Cluster cannot produce a binary log for tables
having BLOB columns but
no primary key.
Only the following schema operations are logged in a cluster binary log which is not on the mysqld executing the statement: CREATE TABLE, ALTER TABLE, DROP TABLE, CREATE DATABASE / CREATE SCHEMA, DROP DATABASE / DROP SCHEMA, CREATE TABLESPACE, ALTER TABLESPACE, DROP TABLESPACE, CREATE LOGFILE GROUP, ALTER LOGFILE GROUP, and DROP LOGFILE GROUP.
Schema operations (DDL statements) are rejected while any data node restarts.
See also Section 3.6.10, “Limitations Relating to Multiple NDB Cluster Nodes”.
Disk Data object maximums and minimums. Disk data objects are subject to the following maximums and minimums:
Maximum number of tablespaces: 2^32 (4294967296)
Maximum number of data files per tablespace: 2^16 (65536)
Maximum data file size: The theoretical limit is 64G; however, the practical upper limit is 32G. This is equivalent to 32768 extents of 1M each.
Since an NDB Cluster Disk Data table can use at most 1
tablespace, this means that the theoretical upper limit to
the amount of data (in bytes) that can be stored on disk by
a single NDB table is 32G * 65536 =
2251799813685248, or approximately 2 petabytes.
The theoretical maximum number of extents per tablespace data file is 2^16 (65536); however, for practical purposes, the recommended maximum number of extents per data file is 2^15 (32768).
The minimum and maximum possible sizes of extents for tablespace data files are 32K and 2G, respectively. See CREATE TABLESPACE Syntax, for more information.
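A sketch of a tablespace definition using a 1M extent size (within the 32K to 2G range just noted) is shown here; the logfile group lg1 is assumed to have been created beforehand, and the tablespace name, file name, and sizes are placeholders only:
CREATE TABLESPACE ts1
    ADD DATAFILE 'ts1_data_1.dat'
    USE LOGFILE GROUP lg1
    EXTENT_SIZE 1M
    INITIAL_SIZE 2G
    ENGINE NDB;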
Disk Data tables and diskless mode. Use of Disk Data tables is not supported when running the cluster in diskless mode. Beginning with MySQL 5.1.12, it is prohibited altogether. (Bug #20008)
Multiple SQL nodes.
The following are issues relating to the use of multiple MySQL
servers as NDB Cluster SQL nodes, and are specific to the
NDBCLUSTER storage engine:
No distributed table locks.
A LOCK TABLES statement works only
for the SQL node on which the lock is issued; no other SQL
node in the cluster “sees” this lock. This is
also true for a lock issued by any statement that locks
tables as part of its operations. (See next item for an
example.)
ALTER TABLE operations.
ALTER TABLE is not fully
locking when running multiple MySQL servers (SQL nodes).
(As discussed in the previous item, NDB Cluster does not
support distributed table locks.)
Multiple management nodes. When using multiple management servers:
If any of the management servers are running on the same host, you must give nodes explicit IDs in connection strings because automatic allocation of node IDs does not work across multiple management servers on the same host. This is not required if every management server resides on a different host.
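A sketch of an SQL node option-file entry carrying an explicit node ID is shown here; the node ID, host name, and port numbers are placeholders only, and the two management servers in this example share a host, so they listen on different ports:
[mysqld]
ndbcluster
# Request node ID 4 explicitly and list both management servers.
ndb-connectstring="nodeid=4,mgmhost:1186,mgmhost:1187"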
When a management server starts, it first checks for any
other management server in the same NDB Cluster, and upon
successful connection to the other management server uses
its configuration data. This means that the management
server --reload and
--initial startup options
are ignored unless the management server is the only one
running. It also means that, when performing a rolling
restart of an NDB Cluster with multiple management nodes,
the management server reads its own configuration file if
(and only if) it is the only management server running in
this NDB Cluster. See
Section 7.5, “Performing a Rolling Restart of an NDB Cluster”, for more
information.
Multiple network addresses. Multiple network addresses per data node are not supported. Use of these is liable to cause problems: In the event of a data node failure, an SQL node waits for confirmation that the data node went down but never receives it because another route to that data node remains open. This can effectively make the cluster inoperable.
It is possible to use multiple network hardware
interfaces (such as Ethernet cards) for a
single data node, but these must be bound to the same address.
This also means that it is not possible to use more than one
[tcp] section per connection in the
config.ini file. See
Section 5.3.9, “NDB Cluster TCP/IP Connections”, for more
information.
A number of limitations and related issues existing in earlier versions of NDB Cluster have been resolved:
Variable-length column support.
The NDBCLUSTER storage engine
now supports variable-length column types for in-memory
tables.
Previously, for example, a Cluster table having one or
more VARCHAR columns that
contained only relatively small values required much more
memory and disk space when using the
NDBCLUSTER storage engine than
the same table and data would have required using
the MyISAM engine. In other words, a
VARCHAR column required the same
amount of storage as a
CHAR column of the same size.
In MySQL 5.1, this is no longer the case for in-memory
tables, where storage requirements for variable-length
column types such as VARCHAR
and BINARY are comparable to those for
these column types when used in MyISAM
tables (see Data Type Storage Requirements).
For NDB Cluster Disk Data tables, the fixed-width limitation continues to apply. See Section 7.12, “NDB Cluster Disk Data Tables”.
Replication with NDB Cluster. It is now possible to use MySQL replication with Cluster databases. For details, see Chapter 8, NDB Cluster Replication.
Circular Replication. Circular replication is also supported with NDB Cluster, beginning with MySQL 5.1.18. See Section 8.10, “NDB Cluster Replication: Multi-Master and Circular Replication”.
auto_increment_increment and auto_increment_offset.
The
auto_increment_increment
and auto_increment_offset
server system variables are supported for NDB Cluster
Replication.
Backup and restore between architectures. It is possible to perform a Cluster backup and restore between different architectures. Previously—for example—you could not back up a cluster running on a big-endian platform and then restore from that backup to a cluster running on a little-endian system. (Bug #19255)
Multiple data nodes, multi-threaded data nodes. NDB Cluster 7.2 supports multiple data node processes on a single host as well as multi-threaded data node processes. See Section 6.3, “ndbmtd — The NDB Cluster Data Node Daemon (Multi-Threaded)”, for more information.
Identifiers.
Formerly (in MySQL 5.0 and earlier), database names, table
names and attribute names could not be as long for
NDB tables as tables using
other storage engines, because attribute names were
truncated internally. In MySQL 5.1 and later, names of NDB
Cluster databases, tables, and table columns follow the
same rules regarding length as they do for any other
storage engine.
Length of CREATE TABLE statements.
CREATE TABLE statements may
be no more than 4096 characters in length. This
limitation affects MySQL 5.1.6, 5.1.7, and 5.1.8
only. (See Bug #17813)
IGNORE and REPLACE functionality.
In MySQL 5.1.7 and earlier,
INSERT
IGNORE,
UPDATE
IGNORE, and
REPLACE were supported only
for primary keys, but not for unique keys. It was possible
to work around this issue by removing the constraint, then
dropping the unique index, performing any inserts, and
then adding the unique index again.
This limitation was removed for
INSERT
IGNORE and REPLACE
in MySQL 5.1.8. (See Bug #17431.)
AUTO_INCREMENT columns.
In MySQL 5.1.10 and earlier versions, the maximum number
of tables having AUTO_INCREMENT
columns—including those belonging to hidden primary
keys—was 2048.
This limitation was lifted in MySQL 5.1.11.
Maximum number of cluster nodes. The total maximum number of nodes in an NDB Cluster is 255, including all SQL nodes (MySQL Servers), API nodes (applications accessing the cluster other than MySQL servers), data nodes, and management servers. The total number of data nodes and management nodes is 63, of which up to 48 can be data nodes.
A data node cannot have a node ID greater than 49.
Recovery of memory from deleted rows.
Memory can be reclaimed from an
NDB table for reuse with any
NDB table by employing
OPTIMIZE TABLE, subject to
the following limitations:
Only in-memory tables are supported; the
OPTIMIZE TABLE statement
has no effect on NDB Cluster Disk Data tables.
Only variable-length columns (such as those declared as
VARCHAR,
TEXT, or
BLOB) are supported.
However, you can force columns defined using
fixed-length data types (such as
CHAR) to be dynamic using
the ROW_FORMAT or
COLUMN_FORMAT option with a
CREATE TABLE or
ALTER TABLE statement.
See CREATE TABLE Syntax, and ALTER TABLE Syntax, for information on these options.
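For instance, a fixed-length CHAR column can be declared (or redeclared) as dynamic so that OPTIMIZE TABLE can reclaim its space; a minimal sketch follows, in which the table and column names are illustrative only:
# Declare the column as dynamic when creating the table:
CREATE TABLE t2 (
    id INT NOT NULL PRIMARY KEY,
    name CHAR(20) COLUMN_FORMAT DYNAMIC
) ENGINE = NDB;
# Or convert an existing fixed-length column:
ALTER TABLE t2 MODIFY name CHAR(20) COLUMN_FORMAT DYNAMIC;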
You can regulate the effects of OPTIMIZE
on performance by adjusting the value of the global system
variable ndb_optimization_delay, which
sets the number of milliseconds to wait between batches of
rows being processed by OPTIMIZE. The
default value is 10 milliseconds. It is possible to set a
lower value (to a minimum of 0), but not
recommended. The maximum is 100000 milliseconds (that is,
100 seconds).
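For example, to reduce the impact of an OPTIMIZE TABLE run on a busy cluster, you might increase the delay between batches before starting the operation; the value 100 shown here is arbitrary, and t is the example table used earlier:
SET GLOBAL ndb_optimization_delay = 100;
OPTIMIZE TABLE t;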
Number of tables.
The maximum number of
NDBCLUSTER tables in a single
NDB Cluster is included in the total maximum number of
NDBCLUSTER database objects
(20320). (See
Section 3.6.5, “Limits Associated with Database Objects in NDB Cluster”.)
Adding and dropping of data nodes. In NDB Cluster 7.2 (NDB Cluster 7.0 and later), it is possible to add new data nodes to a running NDB Cluster by performing a rolling restart, so that the cluster and the data stored in it remain available to applications.
When planning to increase the number of data nodes in the cluster online, you should be aware of and take into account the following issues:
New data nodes can be added online to an NDB Cluster only as part of a new node group.
New data nodes can be added online, but cannot be dropped online. Reducing the number of data nodes requires a system restart of the cluster.
As in previous NDB Cluster releases, it is not possible
to change online either the number of replicas
(NoOfReplicas
configuration parameter) or the number of data nodes per
node group. These changes require a system restart.
Redistribution of existing cluster data using the new data nodes is not automatic; however, this can be accomplished using simple SQL statements in the mysql client or other MySQL client application once the nodes have been added. During this procedure, it is not possible to perform DDL operations, although DML operations can continue as normal.
The distribution of new cluster data (that is, data stored in the cluster after the new nodes have been added) uses the new nodes without manual intervention.
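As a sketch of the redistribution step mentioned in the list above, in NDB Cluster 7.2 existing data is typically redistributed table by table, once the new node group is online, using statements of the following form (shown here for the example table t; OPTIMIZE TABLE reclaims space afterward):
ALTER ONLINE TABLE t REORGANIZE PARTITION;
OPTIMIZE TABLE t;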
For more information, see Section 7.13, “Adding NDB Cluster Data Nodes Online”.
Native support for default column values.
Starting with NDB 7.1.0, default values for table columns
are stored by NDBCLUSTER,
rather than by the MySQL server as was previously the
case. Because less data must be sent from an SQL node to
the data nodes, inserts on tables having column value
defaults can be performed more efficiently than before.
Tables created using previous NDB Cluster releases can still
be used in NDB 7.1.0 and later, although they do not support
native default values and continue to use defaults supplied
by the MySQL server until they are upgraded. This can be
done by means of an offline ALTER
TABLE statement.
You cannot set or change a table column's default
value using an online ALTER
TABLE operation.
Distribution of MySQL users and privileges.
Previously, MySQL users and privileges created on one SQL
node were unique to that SQL node, due to the fact that
the MySQL grant tables were restricted to using the
MyISAM storage engine.
Beginning with NDB 7.2.0, it is possible, following
installation of the NDB Cluster software and setup of the
desired users and privileges on one SQL node, to convert
the grant tables to use NDB
and thus to distribute the users and privileges across all
SQL nodes connected to the cluster. You can do this by
loading and making use of a set of stored procedures
defined in an SQL script supplied with the NDB Cluster
distribution. For more information, see
Section 7.14, “Distributed MySQL Privileges for NDB Cluster”.
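A sketch of the conversion procedure on one SQL node follows; the script location shown is a placeholder for your installation's share directory, and the statements are run in the mysql client:
# Load the stored routines supplied with the NDB Cluster distribution:
SOURCE /usr/share/mysql/ndb_dist_priv.sql;
# Convert the grant tables to NDB and distribute existing privileges:
CALL mysql.mysql_cluster_move_privileges();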
Number of rows per partition. Previously, a single NDB Cluster partition could hold a maximum of 46137488 rows. This limitation was removed in NDB 7.2.9. (Bug #13844405, Bug #14000373)
If you are still using a previous NDB Cluster release, you can work around this limitation by taking advantage of the fact that the number of partitions is the same as the number of data nodes in the cluster (see Section 3.2, “NDB Cluster Nodes, Node Groups, Replicas, and Partitions”). This means that, by increasing the number of data nodes, you can increase the available space for storing data.
NDB Cluster 7.2 also supports increasing the number of data nodes in the cluster while the cluster remains in operation. See Section 7.13, “Adding NDB Cluster Data Nodes Online”, for more information.
It is also possible to increase the number of partitions for
NDB tables by using explicit
KEY or LINEAR KEY
partitioning (see KEY Partitioning).
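For example, a table created with explicit KEY partitioning and a larger number of partitions than the default might look like the following sketch; the table definition and the partition count of 8 are placeholders, and the usable number of partitions remains subject to the limits described in the partitioning documentation:
CREATE TABLE t3 (
    id INT NOT NULL PRIMARY KEY,
    val VARCHAR(50)
) ENGINE = NDB
  PARTITION BY KEY (id)
  PARTITIONS 8;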