Chapter 3 NDB Cluster Overview

Table of Contents

3.1 NDB Cluster Core Concepts
3.2 NDB Cluster Nodes, Node Groups, Replicas, and Partitions
3.3 NDB Cluster Hardware, Software, and Networking Requirements
3.4 What is New in NDB Cluster 7.2
3.5 MySQL Server Using InnoDB Compared with NDB Cluster
3.5.1 Differences Between the NDB and InnoDB Storage Engines
3.5.2 NDB and InnoDB Workloads
3.5.3 NDB and InnoDB Feature Usage Summary
3.6 Known Limitations of NDB Cluster
3.6.1 Noncompliance with SQL Syntax in NDB Cluster
3.6.2 Limits and Differences of NDB Cluster from Standard MySQL Limits
3.6.3 Limits Relating to Transaction Handling in NDB Cluster
3.6.4 NDB Cluster Error Handling
3.6.5 Limits Associated with Database Objects in NDB Cluster
3.6.6 Unsupported or Missing Features in NDB Cluster
3.6.7 Limitations Relating to Performance in NDB Cluster
3.6.8 Issues Exclusive to NDB Cluster
3.6.9 Limitations Relating to NDB Cluster Disk Data Storage
3.6.10 Limitations Relating to Multiple NDB Cluster Nodes
3.6.11 Previous NDB Cluster Issues Resolved in MySQL 5.1, NDB Cluster 6.x, and NDB Cluster 7.x

NDB Cluster is a technology that enables clustering of in-memory databases in a shared-nothing system. The shared-nothing architecture enables the system to work with very inexpensive hardware, and with a minimum of specific requirements for hardware or software.

NDB Cluster is designed not to have any single point of failure. In a shared-nothing system, each component is expected to have its own memory and disk, and the use of shared storage mechanisms such as network shares, network file systems, and SANs is not recommended or supported.

NDB Cluster integrates the standard MySQL server with an in-memory clustered storage engine called NDB (which stands for Network DataBase). In our documentation, the term NDB refers to the part of the setup that is specific to the storage engine, whereas NDB Cluster refers to the combination of one or more MySQL servers with the NDB storage engine.

An NDB Cluster consists of a set of computers, known as hosts, each running one or more processes. These processes, known as nodes, may include MySQL servers (for access to NDB data), data nodes (for storage of the data), one or more management servers, and possibly other specialized data access programs. The relationship of these components in an NDB Cluster is shown here:

Figure 3.1 NDB Cluster Components


All these programs work together to form an NDB Cluster (see Chapter 6, NDB Cluster Programs). When data is stored by the NDB storage engine, the tables (and table data) are stored in the data nodes. Such tables are directly accessible from all other MySQL servers (SQL nodes) in the cluster. Thus, in a payroll application storing data in a cluster, if one application updates the salary of an employee, all other MySQL servers that query this data can see this change immediately.

Although an NDB Cluster SQL node uses the mysqld server daemon, it differs in a number of critical respects from the mysqld binary supplied with the MySQL 5.5 distributions, and the two versions of mysqld are not interchangeable.

In addition, a MySQL server that is not connected to an NDB Cluster cannot use the NDB storage engine and cannot access any NDB Cluster data.

The data stored in the data nodes for NDB Cluster can be mirrored; the cluster can handle failures of individual data nodes with no other impact than that a small number of transactions are aborted due to losing the transaction state. Because transactional applications are expected to handle transaction failure, this should not be a source of problems.

Individual nodes can be stopped and restarted, and can then rejoin the system (cluster). Rolling restarts (in which all nodes are restarted in turn) are used in making configuration changes and software upgrades (see Section 7.5, “Performing a Rolling Restart of an NDB Cluster”). Rolling restarts are also used as part of the process of adding new data nodes online (see Section 7.13, “Adding NDB Cluster Data Nodes Online”). For more information about data nodes, how they are organized in an NDB Cluster, and how they handle and store NDB Cluster data, see Section 3.2, “NDB Cluster Nodes, Node Groups, Replicas, and Partitions”.

Backing up and restoring NDB Cluster databases can be done using the NDB-native functionality found in the NDB Cluster management client and the ndb_restore program included in the NDB Cluster distribution. For more information, see Section 7.3, “Online Backup of NDB Cluster”, and Section 6.20, “ndb_restore — Restore an NDB Cluster Backup”. You can also use the standard MySQL functionality provided for this purpose in mysqldump and the MySQL server. See mysqldump — A Database Backup Program, for more information.

NDB Cluster nodes can employ different transport mechanisms for inter-node communications; TCP/IP over standard 100 Mbps or faster Ethernet hardware is used in most real-world deployments.

3.1 NDB Cluster Core Concepts

NDBCLUSTER (also known as NDB) is an in-memory storage engine offering high-availability and data-persistence features.

The NDBCLUSTER storage engine can be configured with a range of failover and load-balancing options, but it is easiest to start with the storage engine at the cluster level. NDB Cluster's NDB storage engine contains a complete set of data, dependent only on other data within the cluster itself.

The Cluster portion of NDB Cluster is configured independently of the MySQL servers. In an NDB Cluster, each part of the cluster is considered to be a node.

Note

In many contexts, the term node is used to indicate a computer, but when discussing NDB Cluster it means a process. It is possible to run multiple nodes on a single computer; for a computer on which one or more cluster nodes are being run we use the term cluster host.

There are three types of cluster nodes, and in a minimal NDB Cluster configuration, there will be at least three nodes, one of each of these types:

  • Management node: The role of this type of node is to manage the other nodes within the NDB Cluster, performing such functions as providing configuration data, starting and stopping nodes, and running backups. Because this node type manages the configuration of the other nodes, a node of this type should be started first, before any other node. An MGM node is started with the command ndb_mgmd.

  • Data node: This type of node stores cluster data. There are as many data nodes as there are replicas, times the number of fragments (see Section 3.2, “NDB Cluster Nodes, Node Groups, Replicas, and Partitions”). For example, with two replicas, each having two fragments, you need four data nodes. One replica is sufficient for data storage, but provides no redundancy; therefore, it is recommended to have 2 (or more) replicas to provide redundancy, and thus high availability. A data node is started with the command ndbd (see Section 6.1, “ndbd — The NDB Cluster Data Node Daemon”) or ndbmtd (see Section 6.3, “ndbmtd — The NDB Cluster Data Node Daemon (Multi-Threaded)”).

    NDB Cluster tables are normally stored completely in memory rather than on disk (this is why we refer to NDB Cluster as an in-memory database). However, some NDB Cluster data can be stored on disk; see Section 7.12, “NDB Cluster Disk Data Tables”, for more information.

  • SQL node: This is a node that accesses the cluster data. In the case of NDB Cluster, an SQL node is a traditional MySQL server that uses the NDBCLUSTER storage engine. An SQL node is a mysqld process started with the --ndbcluster and --ndb-connectstring options, which are explained elsewhere in this chapter, possibly with additional MySQL server options as well.

    An SQL node is actually just a specialized type of API node, which designates any application which accesses NDB Cluster data. Another example of an API node is the ndb_restore utility that is used to restore a cluster backup. It is possible to write such applications using the NDB API. For basic information about the NDB API, see Getting Started with the NDB API.

Important

It is not realistic to expect to employ a three-node setup in a production environment. Such a configuration provides no redundancy; to benefit from NDB Cluster's high-availability features, you must use multiple data and SQL nodes. The use of multiple management nodes is also highly recommended.

For a brief introduction to the relationships between nodes, node groups, replicas, and partitions in NDB Cluster, see Section 3.2, “NDB Cluster Nodes, Node Groups, Replicas, and Partitions”.

Configuration of a cluster involves configuring each individual node in the cluster and setting up individual communication links between nodes. NDB Cluster is currently designed with the intention that data nodes are homogeneous in terms of processor power, memory space, and bandwidth. In addition, to provide a single point of configuration, all configuration data for the cluster as a whole is located in one configuration file.

The management server manages the cluster configuration file and the cluster log. Each node in the cluster retrieves the configuration data from the management server, and so requires a way to determine where the management server resides. When interesting events occur in the data nodes, the nodes transfer information about these events to the management server, which then writes the information to the cluster log.

In addition, there can be any number of cluster client processes or applications. These include standard MySQL clients, NDB-specific API programs, and management clients. These are described in the next few paragraphs.

Standard MySQL clients.  NDB Cluster can be used with existing MySQL applications written in PHP, Perl, C, C++, Java, Python, Ruby, and so on. Such client applications send SQL statements to and receive responses from MySQL servers acting as NDB Cluster SQL nodes in much the same way that they interact with standalone MySQL servers.

MySQL clients using an NDB Cluster as a data source can be modified to take advantage of the ability to connect with multiple MySQL servers to achieve load balancing and failover. For example, Java clients using Connector/J 5.0.6 and later can use jdbc:mysql:loadbalance:// URLs (improved in Connector/J 5.1.7) to achieve load balancing transparently; for more information about using Connector/J with NDB Cluster, see Using Connector/J with NDB Cluster.
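
The idea of spreading connections across several SQL nodes can be sketched in a few lines. The following is a hypothetical client-side helper written for illustration; it is not part of Connector/J or of any MySQL connector. The connect argument stands in for whatever function your driver provides, and the host names are invented. The helper tries SQL nodes in round-robin order and skips any node that refuses the connection.

```python
from itertools import cycle
from typing import Callable, Iterable, TypeVar

Conn = TypeVar("Conn")


class SqlNodePool:
    """Round-robin over SQL nodes, skipping nodes that fail to connect.

    Hypothetical helper for illustration only. `connect` is any callable
    that takes a host name and returns a connection or raises OSError.
    """

    def __init__(self, hosts: Iterable[str], connect: Callable[[str], Conn]):
        self._hosts = list(hosts)
        if not self._hosts:
            raise ValueError("at least one SQL node is required")
        self._connect = connect
        self._next_host = cycle(self._hosts)

    def get_connection(self) -> Conn:
        # Try each host at most once per call, resuming where the last call stopped.
        last_error = None
        for _ in range(len(self._hosts)):
            host = next(self._next_host)
            try:
                return self._connect(host)
            except OSError as exc:  # node down or unreachable: try the next one
                last_error = exc
        raise ConnectionError("no SQL node is reachable") from last_error


def fake_connect(host: str) -> str:
    """Stand-in for a real driver call; pretends that sql2 is down."""
    if host == "sql2.example.com":
        raise OSError("connection refused")
    return f"connection to {host}"


pool = SqlNodePool(
    ["sql1.example.com", "sql2.example.com", "sql3.example.com"], fake_connect
)
print(pool.get_connection())  # served by sql1
print(pool.get_connection())  # sql2 is skipped, served by sql3
```

Because every SQL node attached to the cluster sees the same NDB data, the application can use whichever connection it receives; a production client would normally rely on the load-balancing support of its connector rather than a hand-written pool.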

NDB client programs.  Client programs can be written that access NDB Cluster data directly from the NDBCLUSTER storage engine, bypassing any MySQL Servers that may be connected to the cluster, using the NDB API, a high-level C++ API. Such applications may be useful for specialized purposes where an SQL interface to the data is not needed. For more information, see The NDB API.

NDB-specific Java applications can also be written for NDB Cluster using the NDB Cluster Connector for Java. This NDB Cluster Connector includes ClusterJ, a high-level database API similar to object-relational mapping persistence frameworks such as Hibernate and JPA that connect directly to NDBCLUSTER, and so does not require access to a MySQL Server. Support is also provided in NDB Cluster 7.1 and later for ClusterJPA, an OpenJPA implementation for NDB Cluster that leverages the strengths of ClusterJ and JDBC; ID lookups and other fast operations are performed using ClusterJ (bypassing the MySQL Server), while more complex queries that can benefit from MySQL's query optimizer are sent through the MySQL Server, using JDBC. See Java and NDB Cluster, and The ClusterJ API and Data Object Model, for more information.

The Memcache API for NDB Cluster, implemented as the loadable ndbmemcache storage engine for memcached version 1.6 and later, is available beginning with NDB 7.2.2. This API can be used to provide a persistent NDB Cluster data store, accessed using the memcache protocol.

The standard memcached caching engine is included in the NDB Cluster 7.2 distribution (7.2.2 and later). Each memcached server has direct access to data stored in NDB Cluster, but is also able to cache data locally and to serve (some) requests from this local cache.

For more information, see ndbmemcache—Memcache API for NDB Cluster.

Management clients.  These clients connect to the management server and provide commands for starting and stopping nodes gracefully, starting and stopping message tracing (debug versions only), showing node versions and status, starting and stopping backups, and so on. An example of this type of program is the ndb_mgm management client supplied with NDB Cluster (see Section 6.5, “ndb_mgm — The NDB Cluster Management Client”). Such applications can be written using the MGM API, a C-language API that communicates directly with one or more NDB Cluster management servers. For more information, see The MGM API.

Oracle also makes available MySQL Cluster Manager, which provides an advanced command-line interface simplifying many complex NDB Cluster management tasks, such as restarting an NDB Cluster with a large number of nodes. The MySQL Cluster Manager client also supports commands for getting and setting the values of most node configuration parameters as well as mysqld server options and variables relating to NDB Cluster. See MySQL™ Cluster Manager 1.4.1 User Manual, for more information.

Event logs.  NDB Cluster logs events by category (startup, shutdown, errors, checkpoints, and so on), priority, and severity. A complete listing of all reportable events may be found in Section 7.6, “Event Reports Generated in NDB Cluster”. Event logs are of the two types listed here:

  • Cluster log: Keeps a record of all desired reportable events for the cluster as a whole.

  • Node log: A separate log which is also kept for each individual node.

Note

Under normal circumstances, it is necessary and sufficient to keep and examine only the cluster log. The node logs need be consulted only for application development and debugging purposes.

Checkpoint.  Generally speaking, when data is saved to disk, it is said that a checkpoint has been reached. More specific to NDB Cluster, a checkpoint is a point in time where all committed transactions are stored on disk. With regard to the NDB storage engine, there are two types of checkpoints which work together to ensure that a consistent view of the cluster's data is maintained. These are shown in the following list:

  • Local Checkpoint (LCP): This is a checkpoint that is specific to a single node; however, LCPs take place for all nodes in the cluster more or less concurrently. An LCP involves saving all of a node's data to disk, and so usually occurs every few minutes. The precise interval varies, and depends upon the amount of data stored by the node, the level of cluster activity, and other factors.

  • Global Checkpoint (GCP): A GCP occurs every few seconds, when transactions for all nodes are synchronized and the redo-log is flushed to disk.

For more information about the files and directories created by local checkpoints and global checkpoints, see NDB Cluster Data Node File System Directory Files.

3.2 NDB Cluster Nodes, Node Groups, Replicas, and Partitions

This section discusses the manner in which NDB Cluster divides and duplicates data for storage.

A number of concepts central to an understanding of this topic are discussed in the next few paragraphs.

(Data) Node.  An ndbd or ndbmtd process, which stores one or more replicas, that is, copies of the partitions (discussed later in this section) assigned to the node group of which the node is a member.

Each data node should be located on a separate computer. While it is also possible to host multiple data node processes on a single computer, such a configuration is not usually recommended.

It is common for the terms node and data node to be used interchangeably when referring to an ndbd or ndbmtd process; where mentioned, management nodes (ndb_mgmd processes) and SQL nodes (mysqld processes) are specified as such in this discussion.

Node Group.  A node group consists of one or more nodes, and stores partitions, or sets of replicas (see next item).

The number of node groups in an NDB Cluster is not directly configurable; it is a function of the number of data nodes and of the number of replicas (NoOfReplicas configuration parameter), as shown here:

[number_of_node_groups] = number_of_data_nodes / NoOfReplicas

Thus, an NDB Cluster with 4 data nodes has 4 node groups if NoOfReplicas is set to 1 in the config.ini file, 2 node groups if NoOfReplicas is set to 2, and 1 node group if NoOfReplicas is set to 4. Replicas are discussed later in this section; for more information about NoOfReplicas, see Section 5.3.6, “Defining NDB Cluster Data Nodes”.

Note

All node groups in an NDB Cluster must have the same number of data nodes.
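
The relationship between data nodes, NoOfReplicas, and node groups can be checked with a short calculation. The sketch below is illustrative only: the function name is invented, and the rule that the number of data nodes must divide evenly by NoOfReplicas is inferred from the requirement that all node groups have the same number of data nodes.

```python
def node_group_count(data_nodes: int, no_of_replicas: int) -> int:
    """Return number_of_data_nodes / NoOfReplicas for a valid layout."""
    if data_nodes < 1 or no_of_replicas < 1:
        raise ValueError("both values must be at least 1")
    groups, remainder = divmod(data_nodes, no_of_replicas)
    if remainder:
        # Unequal node groups are not allowed, so the division must be exact.
        raise ValueError("data nodes must divide evenly by NoOfReplicas")
    return groups


# The 4-data-node examples from the text above.
for replicas in (1, 2, 4):
    print(f"NoOfReplicas={replicas}: {node_group_count(4, replicas)} node group(s)")
```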

You can add new node groups (and thus new data nodes) online, to a running NDB Cluster; see Section 7.13, “Adding NDB Cluster Data Nodes Online”, for more information.

Partition.  This is a portion of the data stored by the cluster. There are as many cluster partitions as nodes participating in the cluster. Each node is responsible for keeping at least one copy of any partitions assigned to it (that is, at least one replica) available to the cluster.

A replica belongs entirely to a single node; a node can (and usually does) store several replicas.

NDB and user-defined partitioning.  NDB Cluster normally partitions NDBCLUSTER tables automatically. However, it is also possible to employ user-defined partitioning with NDBCLUSTER tables. This is subject to the following limitations:

  1. Only the KEY and LINEAR KEY partitioning schemes are supported in production with NDB tables.

  2. The maximum number of partitions that may be defined explicitly for any NDB table is 8 * MaxNoOfExecutionThreads * [number of node groups], the number of node groups in an NDB Cluster being determined as discussed previously in this section. When using ndbd for data node processes, setting MaxNoOfExecutionThreads has no effect; in such a case, it can be treated as though it were equal to 1 for purposes of performing this calculation.

    See Section 6.3, “ndbmtd — The NDB Cluster Data Node Daemon (Multi-Threaded)”, for more information.

For more information relating to NDB Cluster and user-defined partitioning, see Section 3.6, “Known Limitations of NDB Cluster”, and Partitioning Limitations Relating to Storage Engines.
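
As a worked example of the partition limit given in item 2, the following sketch evaluates 8 * MaxNoOfExecutionThreads * [number of node groups]. The function and its multithreaded flag are hypothetical names used only for this illustration, and the thread count of 4 is an arbitrary example value, not a recommendation.

```python
def max_explicit_partitions(
    node_groups: int,
    max_no_of_execution_threads: int = 1,
    multithreaded: bool = True,
) -> int:
    """Upper limit on explicitly defined partitions for one NDB table."""
    if node_groups < 1 or max_no_of_execution_threads < 1:
        raise ValueError("both values must be at least 1")
    # With ndbd, MaxNoOfExecutionThreads has no effect and is treated as 1.
    threads = max_no_of_execution_threads if multithreaded else 1
    return 8 * threads * node_groups


# Two node groups, ndbd data nodes: 8 * 1 * 2
print(max_explicit_partitions(2, multithreaded=False))
# Two node groups, ndbmtd with MaxNoOfExecutionThreads = 4: 8 * 4 * 2
print(max_explicit_partitions(2, max_no_of_execution_threads=4))
```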

Replica.  This is a copy of a cluster partition. Each node in a node group stores a replica. Also sometimes known as a partition replica. The number of replicas is equal to the number of nodes per node group.

The following diagram illustrates an NDB Cluster with four data nodes, arranged in two node groups of two nodes each; nodes 1 and 2 belong to node group 0, and nodes 3 and 4 belong to node group 1.

Note

Only data (ndbd) nodes are shown here; although a working cluster requires an ndb_mgmd process for cluster management and at least one SQL node to access the data stored by the cluster, these have been omitted in the figure for clarity.

Figure 3.2 NDB Cluster with Two Node Groups

An NDB Cluster, with 2 node groups having 2 nodes each

The data stored by the cluster is divided into four partitions, numbered 0, 1, 2, and 3. Each partition is stored—in multiple copies—on the same node group. Partitions are stored on alternate node groups as follows:

  • Partition 0 is stored on node group 0; a primary replica (primary copy) is stored on node 1, and a backup replica (backup copy of the partition) is stored on node 2.

  • Partition 1 is stored on the other node group (node group 1); this partition's primary replica is on node 3, and its backup replica is on node 4.

  • Partition 2 is stored on node group 0. However, the placing of its two replicas is reversed from that of Partition 0; for Partition 2, the primary replica is stored on node 2, and the backup on node 1.

  • Partition 3 is stored on node group 1, and the placement of its two replicas is reversed from that of Partition 1. That is, its primary replica is located on node 4, with the backup on node 3.
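
The layout just described can be reproduced with a small model. This is a sketch that mirrors the four-partition example only; the alternating and rotating rule it uses is an assumption chosen to match the figure, not a statement of how NDB assigns partitions internally.

```python
from typing import Dict, List, Tuple

Placement = Tuple[int, int, List[int]]  # (node group, primary node, backup nodes)


def place_partitions(node_groups: List[List[int]], partitions: int) -> Dict[int, Placement]:
    """Assign partitions to node groups in turn and rotate the primary replica."""
    layout: Dict[int, Placement] = {}
    for partition in range(partitions):
        group_id = partition % len(node_groups)  # alternate between node groups
        nodes = node_groups[group_id]
        # Each time a group receives another partition, the next node becomes primary.
        primary = nodes[(partition // len(node_groups)) % len(nodes)]
        backups = [node for node in nodes if node != primary]
        layout[partition] = (group_id, primary, backups)
    return layout


# Node group 0 holds nodes 1 and 2; node group 1 holds nodes 3 and 4.
for partition, (group_id, primary, backups) in place_partitions([[1, 2], [3, 4]], 4).items():
    print(f"partition {partition}: node group {group_id}, primary on node {primary}, backup on {backups}")
```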

What this means regarding the continued operation of an NDB Cluster is this: so long as each node group participating in the cluster has at least one node operating, the cluster has a complete copy of all data and remains viable. This is illustrated in the next diagram.

Figure 3.3 Nodes Required for a 2x2 Cluster

Nodes required to keep a 2x2 cluster viable

In this example, where the cluster consists of two node groups of two nodes each, any combination of at least one node in node group 0 and at least one node in node group 1 is sufficient to keep the cluster alive (indicated by arrows in the diagram). However, if both nodes from either node group fail, the remaining two nodes are not sufficient (shown by the arrows marked out with an X); in either case, the cluster has lost an entire partition and so can no longer provide access to a complete set of all cluster data.
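
The survival rule for this example reduces to a one-line check: every node group needs at least one live node. The sketch below models only that data-completeness condition; the function name is invented, and it ignores other factors such as arbitration.

```python
from typing import Iterable, List


def has_complete_data(node_groups: List[List[int]], live_nodes: Iterable[int]) -> bool:
    """True if every node group still has at least one running data node."""
    live = set(live_nodes)
    return all(any(node in live for node in group) for group in node_groups)


groups = [[1, 2], [3, 4]]  # the 2x2 cluster from the figure

print(has_complete_data(groups, [1, 3]))  # one node per group survives
print(has_complete_data(groups, [2, 4]))  # the other pair survives
print(has_complete_data(groups, [1, 2]))  # node group 1 is lost entirely
print(has_complete_data(groups, [3, 4]))  # node group 0 is lost entirely
```

The first two calls return True and the last two return False, matching the combinations described above.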

3.3 NDB Cluster Hardware, Software, and Networking Requirements

One of the strengths of NDB Cluster is that it can be run on commodity hardware and has no unusual requirements in this regard, other than for large amounts of RAM, due to the fact that all live data storage is done in memory. (It is possible to reduce this requirement using Disk Data tables—see Section 7.12, “NDB Cluster Disk Data Tables”, for more information about these.) Naturally, multiple and faster CPUs can enhance performance. Memory requirements for other NDB Cluster processes are relatively small.

The software requirements for NDB Cluster are also modest. Host operating systems do not require any unusual modules, services, applications, or configuration to support NDB Cluster. For supported operating systems, a standard installation should be sufficient. The MySQL software requirements are simple: all that is needed is a production release of NDB Cluster. It is not strictly necessary to compile MySQL yourself merely to be able to use NDB Cluster. We assume that you are using the binaries appropriate to your platform, available from the NDB Cluster software downloads page at http://dev.mysql.com/downloads/cluster/.

For communication between nodes, NDB Cluster supports TCP/IP networking in any standard topology, and the minimum expected for each host is a standard 100 Mbps Ethernet card, plus a switch, hub, or router to provide network connectivity for the cluster as a whole. We strongly recommend that an NDB Cluster be run on its own subnet which is not shared with machines not forming part of the cluster for the following reasons:

  • Security.  Communications between NDB Cluster nodes are not encrypted or shielded in any way. The only means of protecting transmissions within an NDB Cluster is to run your NDB Cluster on a protected network. If you intend to use NDB Cluster for Web applications, the cluster should definitely reside behind your firewall and not in your network's De-Militarized Zone (DMZ) or elsewhere.

    See Section 7.11.1, “NDB Cluster Security and Networking Issues”, for more information.

  • Efficiency.  Setting up an NDB Cluster on a private or protected network enables the cluster to make exclusive use of bandwidth between cluster hosts. Using a separate switch for your NDB Cluster not only helps protect against unauthorized access to NDB Cluster data, it also ensures that NDB Cluster nodes are shielded from interference caused by transmissions between other computers on the network. For enhanced reliability, you can use dual switches and dual cards to remove the network as a single point of failure; many device drivers support failover for such communication links.

Network communication and latency.  NDB Cluster requires communication between data nodes and API nodes (including SQL nodes), as well as between data nodes and other data nodes, to execute queries and updates. Communication latency between these processes can directly affect the observed performance and latency of user queries. In addition, to maintain consistency and service despite the silent failure of nodes, NDB Cluster uses heartbeating and timeout mechanisms which treat an extended loss of communication from a node as node failure. This can lead to reduced redundancy. Recall that, to maintain data consistency, an NDB Cluster shuts down when the last node in a node group fails. Thus, to avoid increasing the risk of a forced shutdown, breaks in communication between nodes should be avoided wherever possible.

The failure of a data or API node results in the abort of all uncommitted transactions involving the failed node. Data node recovery requires synchronization of the failed node's data from a surviving data node, and re-establishment of disk-based redo and checkpoint logs, before the data node returns to service. This recovery can take some time, during which the Cluster operates with reduced redundancy.

Heartbeating relies on timely generation of heartbeat signals by all nodes. This may not be possible if the node is overloaded, has insufficient machine CPU due to sharing with other programs, or is experiencing delays due to swapping. If heartbeat generation is sufficiently delayed, other nodes treat the node that is slow to respond as failed.

This treatment of a slow node as a failed one may or may not be desirable in some circumstances, depending on the impact of the node's slowed operation on the rest of the cluster. When setting timeout values such as HeartbeatIntervalDbDb and HeartbeatIntervalDbApi for NDB Cluster, care must be taken to achieve quick detection, failover, and return to service, while avoiding potentially expensive false positives.

Where communication latencies between data nodes are expected to be higher than would be expected in a LAN environment (on the order of 100 µs), timeout parameters must be increased to ensure that any allowed latency periods are well within configured timeouts. Increasing timeouts in this way has a corresponding effect on the worst-case time to detect failure and therefore time to service recovery.
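
The trade-off between tolerating latency and detecting failures quickly can be illustrated with a simplified model. Everything in this sketch is an assumption made for illustration: the number of missed heartbeats after which a node is treated as failed, the safety margin, and the interval values are example figures, not documented NDB Cluster constants or defaults.

```python
def failure_timeout_ms(heartbeat_interval_ms: int, missed_allowed: int) -> int:
    """Silence, in milliseconds, after which a peer is treated as failed (simplified)."""
    if heartbeat_interval_ms <= 0 or missed_allowed <= 0:
        raise ValueError("interval and missed_allowed must be positive")
    return heartbeat_interval_ms * missed_allowed


def latency_well_within(
    heartbeat_interval_ms: int,
    missed_allowed: int,
    worst_latency_ms: float,
    margin: float = 2.0,
) -> bool:
    """True if the worst expected stall, scaled by `margin`, fits inside the timeout."""
    return worst_latency_ms * margin <= failure_timeout_ms(heartbeat_interval_ms, missed_allowed)


MISSED_ALLOWED = 4  # illustrative assumption, not a documented NDB constant

for interval_ms in (1500, 5000):
    timeout = failure_timeout_ms(interval_ms, MISSED_ALLOWED)
    ok = latency_well_within(interval_ms, MISSED_ALLOWED, worst_latency_ms=4000)
    print(f"interval={interval_ms} ms  timeout={timeout} ms  tolerates a 4000 ms stall: {ok}")
```

In this model, raising the interval from 1500 ms to 5000 ms lets the cluster ride out a 4000 ms stall, but a node that has really failed then goes unnoticed for up to 20000 ms instead of 6000 ms.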

LAN environments can typically be configured with stable low latency, and such that they can provide redundancy with fast failover. Individual link failures can be recovered from with minimal and controlled latency visible at the TCP level (where NDB Cluster normally operates). WAN environments may offer a range of latencies, as well as redundancy with slower failover times. Individual link failures may require route changes to propagate before end-to-end connectivity is restored. At the TCP level this can appear as large latencies on individual channels. The worst-case observed TCP latency in these scenarios is related to the worst-case time for the IP layer to reroute around the failures.

SCI support.  It is also possible to use the high-speed Scalable Coherent Interface (SCI) with NDB Cluster, but this is not a requirement. See Section 5.4, “Using High-Speed Interconnects with NDB Cluster”, for more about this protocol and its use with NDB Cluster.

3.4 What is New in NDB Cluster 7.2

In this section, we discuss changes in the implementation of NDB Cluster in MySQL NDB Cluster 7.2, as compared to NDB Cluster 7.1 and earlier releases. Changes and features most likely to be of interest are shown in the following table:

NDB Cluster 7.2

  • NDB Cluster 7.2 is based on MySQL 5.5. For more information about new features in MySQL Server 5.5, see What Is New in MySQL 5.5.

  • Version 2 binary log row events, to provide support for improvements in NDB Cluster Replication conflict detection (see next item). A given mysqld can be made to use Version 1 or Version 2 binary logging row events with the --log-bin-use-v1-row-events option.

  • Two new primary wins conflict detection and resolution functions NDB$EPOCH() and NDB$EPOCH_TRANS() for use in replication setups with 2 NDB Clusters. For more information, see Chapter 8, NDB Cluster Replication.

  • Distribution of MySQL users and privileges across NDB Cluster SQL nodes is now supported; see Section 7.14, “Distributed MySQL Privileges for NDB Cluster”.

  • Improved support for distributed pushed-down joins, which greatly improve performance for many joins that can be executed in parallel on the data nodes.

  • Default values for a number of data node configuration parameters such as HeartbeatIntervalDbDb and ArbitrationTimeout have been improved.

  • Support for the Memcache API using the loadable ndbmemcache storage engine. See ndbmemcache—Memcache API for NDB Cluster.

This section contains information about NDB Cluster 7.2 releases through 5.5.53-ndb-7.2.27, which is a previous GA release but still supported, as is NDB Cluster 7.3. NDB Cluster 7.1, NDB Cluster 7.0, and NDB Cluster 6.3 are previous GA release series which are no longer supported. We recommend that new deployments use NDB Cluster 7.4 or NDB Cluster 7.5, both of which are available as General Availability releases. For information about NDB Cluster 7.1 and previous releases, see the MySQL 5.1 Reference Manual.

The following improvements to NDB Cluster have been made in NDB Cluster 7.2:

  • Based on MySQL Server 5.5.  Previous NDB Cluster release series, including NDB Cluster 7.1, used MySQL 5.1 as a base. Beginning with NDB 7.2.1, NDB Cluster 7.2 is based on MySQL Server 5.5, so that NDB Cluster users can benefit from MySQL 5.5's improvements in scalability and performance monitoring. As with MySQL 5.5, NDB 7.2.1 and later use CMake for configuring and building from source in place of GNU Autotools (used in MySQL 5.1 and NDB Cluster releases based on MySQL 5.1). For more information about changes and improvements in MySQL 5.5, see What Is New in MySQL 5.5.

  • Conflict detection using GCI Reflection.  NDB Cluster Replication implements a new primary wins conflict detection and resolution mechanism. GCI Reflection applies in two-cluster circular active-active replication setups, tracking the order in which changes are applied on the NDB Cluster designated as primary relative to changes originating on the other NDB Cluster (referred to as the secondary). This relative ordering is used to determine whether changes originating on the secondary are concurrent with any changes that originate locally, and are therefore potentially in conflict. Two new conflict detection functions are added: when using NDB$EPOCH(), rows that are out of sync on the secondary are realigned with those on the primary; with NDB$EPOCH_TRANS(), this realignment is applied to transactions. For more information, see Section 8.11, “NDB Cluster Replication Conflict Resolution”.

  • Version 2 binary log row events.  A new format for binary log row events, known as Version 2 binary log row events, provides support for improvements in NDB Cluster Replication conflict detection (see previous item) and is intended to facilitate further improvements in MySQL Replication. You can cause a given mysqld to use Version 1 or Version 2 binary logging row events with the --log-bin-use-v1-row-events option. For backward compatibility, Version 2 binary log row events are also available in NDB Cluster 7.0 (7.0.27 and later) and NDB Cluster 7.1 (7.1.16 and later). However, NDB Cluster 7.0 and NDB Cluster 7.1 continue to use Version 1 binary log row events as the default, whereas the default in NDB 7.2.1 and later is to use Version 2 row events for binary logging.

  • Distribution of MySQL users and privileges.  Automatic distribution of MySQL users and privileges across all SQL nodes in a given NDB Cluster is now supported. To enable this support, you must first import an SQL script share/mysql/ndb_dist_priv.sql that is included with the NDB Cluster 7.2 distribution. This script creates several stored procedures which you can use to enable privilege distribution and perform related tasks.

    When a new MySQL Server joins an NDB Cluster where privilege distribution is in effect, it also participates in the privilege distribution automatically.

    Once privilege distribution is enabled, all changes to the grant tables made on any mysqld attached to the cluster are immediately available on any other attached MySQL Servers. This is true whether the changes are made using CREATE USER, GRANT, or any of the other statements described elsewhere in this Manual (see Account Management Statements). This includes privileges relating to stored routines and views; however, automatic distribution of the views or stored routines themselves is not currently supported.

    For more information, see Section 7.14, “Distributed MySQL Privileges for NDB Cluster”.

  • Distributed pushed-down joins.  Many joins can now be pushed down to the NDB kernel for processing on NDB Cluster data nodes. Previously, a join was handled in NDB Cluster by means of repeated accesses of NDB by the SQL node; however, when pushed-down joins are enabled, a pushable join is sent in its entirety to the data nodes, where it can be distributed among the data nodes and executed in parallel on multiple copies of the data, with a single, merged result being returned to mysqld. This can reduce greatly the number of round trips between an SQL node and the data nodes required to handle such a join, leading to greatly improved performance of join processing.

    It is possible to determine when joins can be pushed down to the data nodes by examining the join with EXPLAIN. A number of new system status variables (Ndb_pushed_queries_defined, Ndb_pushed_queries_dropped, Ndb_pushed_queries_executed, and Ndb_pushed_reads) and additions to the counters table (in the ndbinfo information database) can also be helpful in determining when and how well joins are being pushed down.

    More information and examples are available in the description of the ndb_join_pushdown server system variable. See also the description of the status variables referenced in the previous paragraph, as well as Section 7.10.7, “The ndbinfo counters Table”.

  • Improved default values for data node configuration parameters.  In order to provide more resiliency to environmental issues and better handling of some potential failure scenarios, and to perform more reliably with increases in memory and other resource requirements brought about by recent improvements in join handling by NDB, the default values for a number of NDB Cluster data node configuration parameters have been changed. The parameters and changes are described in the following list:

    In addition, the value computed for MaxNoOfLocalScans when this parameter is not set in config.ini has been increased by a factor of 4.

  • Fail-fast data nodes.  Beginning with NDB 7.2.1, data nodes handle corrupted tuples in a fail-fast manner by default. This is a change from previous versions of NDB Cluster where this behavior had to be enabled explicitly by enabling the CrashOnCorruptedTuple configuration parameter. In NDB 7.2.1 and later, this parameter is enabled by default and must be explicitly disabled, in which case data nodes merely log a warning whenever they detect a corrupted tuple.

  • Memcache API support (ndbmemcache).  The Memcached server is a distributed in-memory caching server that uses a simple text-based protocol. It is often employed with key-value stores. The Memcache API for NDB Cluster, available beginning with NDB 7.2.2, is implemented as a loadable storage engine for memcached version 1.6 and later. This API can be used to access a persistent NDB Cluster data store employing the memcache protocol. It is also possible for the memcached server to provide a strictly defined interface to existing NDB Cluster tables.

    Each memcache server can both cache data locally and access data stored in NDB Cluster directly. Caching policies are configurable. For more information, see ndbmemcache—Memcache API for NDB Cluster, in the NDB Cluster API Developers Guide.

  • Rows per partition limit removed.  Previously it was possible to store a maximum of 46137488 rows in a single NDB Cluster partition—that is, per data node. Beginning with NDB 7.2.9, this limitation has been lifted, and there is no longer any practical upper limit to this number. (Bug #13844405, Bug #14000373)

NDB Cluster 7.2 is also supported by MySQL Cluster Manager, which provides an advanced command-line interface that can simplify many complex NDB Cluster management tasks. See MySQL™ Cluster Manager 1.4.1 User Manual, for more information.

3.5 MySQL Server Using InnoDB Compared with NDB Cluster

MySQL Server offers a number of choices in storage engines. Since both NDBCLUSTER and InnoDB can serve as transactional MySQL storage engines, users of MySQL Server sometimes become interested in NDB Cluster. They see NDB as a possible alternative or upgrade to the default InnoDB storage engine in MySQL 5.5. While NDB and InnoDB share common characteristics, there are differences in architecture and implementation, so that some existing MySQL Server applications and usage scenarios can be a good fit for NDB Cluster, but not all of them.

In this section, we discuss and compare some characteristics of the NDB storage engine used by NDB Cluster 7.2 with InnoDB used in MySQL 5.5. The next few sections provide a technical comparison. In many instances, decisions about when and where to use NDB Cluster must be made on a case-by-case basis, taking all factors into consideration. While it is beyond the scope of this documentation to provide specifics for every conceivable usage scenario, we also attempt to offer some very general guidance on the relative suitability of some common types of applications for NDB as opposed to InnoDB backends.

Recent NDB Cluster 7.2 releases use a mysqld based on MySQL 5.5, including support for InnoDB 1.1. While it is possible to use InnoDB tables with NDB Cluster, such tables are not clustered. It is also not possible to use programs or libraries from an NDB Cluster 7.2 distribution with MySQL Server 5.5, or the reverse.

While it is also true that some types of common business applications can be run either on NDB Cluster or on MySQL Server (most likely using the InnoDB storage engine), there are some important architectural and implementation differences. Section 3.5.1, “Differences Between the NDB and InnoDB Storage Engines”, provides a summary of these differences. Due to the differences, some usage scenarios are clearly more suitable for one engine or the other; see Section 3.5.2, “NDB and InnoDB Workloads”. This in turn has an impact on the types of applications that are better suited for use with NDB or InnoDB. See Section 3.5.3, “NDB and InnoDB Feature Usage Summary”, for a comparison of the relative suitability of each for use in common types of database applications.

For information about the relative characteristics of the NDB and MEMORY storage engines, see When to Use MEMORY or MySQL Cluster.

See Alternative Storage Engines, for additional information about MySQL storage engines.

3.5.1 Differences Between the NDB and InnoDB Storage Engines

The NDB Cluster NDB storage engine is implemented using a distributed, shared-nothing architecture, which causes it to behave differently from InnoDB in a number of ways. For those unaccustomed to working with NDB, unexpected behaviors can arise due to its distributed nature with regard to transactions, foreign keys, table limits, and other characteristics. These are shown in the following table:

| Feature | InnoDB 1.1 | NDB 7.2 |
|---|---|---|
| MySQL Server Version | 5.5 | NDB Cluster 7.2: 5.5; NDB Cluster 7.3: 5.6 |
| InnoDB Version | InnoDB 1.1 | InnoDB 1.1 |
| NDB Cluster Version | N/A | NDB 7.2.27 |
| Storage Limits | 64TB | 3TB (Practical upper limit based on 48 data nodes with 64GB RAM each; can be increased with disk-based data and BLOBs) |
| Foreign Keys | Yes | Available in NDB Cluster 7.3 and later. (Prior to NDB Cluster 7.3: Ignored, as with MyISAM.) |
| Transactions | All standard types | READ COMMITTED |
| MVCC | Yes | No |
| Data Compression | Yes | No (NDB Cluster checkpoint and backup files can be compressed) |
| Large Row Support (> 14K) | Supported for VARBINARY, VARCHAR, BLOB, and TEXT columns | Supported for BLOB and TEXT columns only (Using these types to store very large amounts of data can lower NDB Cluster performance) |
| Replication Support | Asynchronous and semisynchronous replication using MySQL Replication | Automatic synchronous replication within an NDB Cluster. Asynchronous replication between NDB Clusters, using MySQL Replication |
| Scaleout for Read Operations | Yes (MySQL Replication) | Yes (Automatic partitioning in NDB Cluster; MySQL Replication) |
| Scaleout for Write Operations | Requires application-level partitioning (sharding) | Yes (Automatic partitioning in NDB Cluster is transparent to applications) |
| High Availability (HA) | Requires additional software | Yes (Designed for 99.999% uptime) |
| Node Failure Recovery and Failover | Requires additional software | Automatic (Key element in NDB Cluster architecture) |
| Time for Node Failure Recovery | 30 seconds or longer | Typically < 1 second |
| Real-Time Performance | No | Yes |
| In-Memory Tables | No | Yes (Some data can optionally be stored on disk; both in-memory and disk data storage are durable) |
| NoSQL Access to Storage Engine | Native memcached interface in development (see the MySQL Dev Zone article NDB Cluster 7.2 (DMR2): NoSQL, Key/Value, Memcached) | Yes. Multiple APIs, including Memcached, Node.js/JavaScript, Java, JPA, C++, and HTTP/REST |
| Concurrent and Parallel Writes | Not supported | Up to 48 writers, optimized for concurrent writes |
| Conflict Detection and Resolution (Multiple Replication Masters) | No | Yes |
| Hash Indexes | No | Yes |
| Online Addition of Nodes | Read-only replicas using MySQL Replication | Yes (all node types) |
| Online Upgrades | No | Yes |
| Online Schema Modifications | No | Yes |

3.5.2 NDB and InnoDB Workloads

NDB Cluster has a range of unique attributes that make it ideal to serve applications requiring high availability, fast failover, high throughput, and low latency. Due to its distributed architecture and multi-node implementation, NDB Cluster also has specific constraints that may keep some workloads from performing well. A number of major differences in behavior between the NDB and InnoDB storage engines with regard to some common types of database-driven application workloads are shown in the following table:

| Workload | InnoDB | NDB Cluster (NDB) |
|---|---|---|
| High-Volume OLTP Applications | Yes | Yes |
| DSS Applications (data marts, analytics) | Yes | Limited (Join operations across OLTP datasets not exceeding 3TB in size) |
| Custom Applications | Yes | Yes |
| Packaged Applications | Yes | Limited (should be mostly primary key access). Note: NDB Cluster 7.3 supports foreign keys. |
| In-Network Telecoms Applications (HLR, HSS, SDP) | No | Yes |
| Session Management and Caching | Yes | Yes |
| E-Commerce Applications | Yes | Yes |
| User Profile Management, AAA Protocol | Yes | Yes |

3.5.3 NDB and InnoDB Feature Usage Summary

When comparing application feature requirements to the capabilities of InnoDB with NDB, some are clearly more compatible with one storage engine than the other.

The following table lists supported application features according to the storage engine to which each feature is typically better suited.

Preferred application requirements for InnoDB

Preferred application requirements for NDB

  • Foreign keys

    Note

    NDB Cluster 7.3 supports foreign keys.

  • Full table scans

  • Very large databases, rows, or transactions

  • Transactions other than READ COMMITTED

3.6 Known Limitations of NDB Cluster

In the sections that follow, we discuss known limitations in current releases of NDB Cluster as compared with the features available when using the MyISAM and InnoDB storage engines. If you check the “Cluster” category in the MySQL bugs database at http://bugs.mysql.com, you can find known bugs in the following categories under “MySQL Server:”, which we intend to correct in upcoming releases of NDB Cluster:

  • NDB Cluster

  • Cluster Direct API (NDBAPI)

  • Cluster Disk Data

  • Cluster Replication

  • ClusterJ

This information is intended to be complete with respect to the conditions just set forth. You can report any discrepancies that you encounter to the MySQL bugs database using the instructions given in How to Report Bugs or Problems. If we do not plan to fix the problem in NDB Cluster 7.2, we will add it to the list.

See Section 3.6.11, “Previous NDB Cluster Issues Resolved in MySQL 5.1, NDB Cluster 6.x, and NDB Cluster 7.x” for a list of issues in NDB Cluster in MySQL 5.1 that have been resolved in the current version.

Note

Limitations and other issues specific to NDB Cluster Replication are described in Section 8.3, “Known Issues in NDB Cluster Replication”.

3.6.1 Noncompliance with SQL Syntax in NDB Cluster

Some SQL statements relating to certain MySQL features produce errors when used with NDB tables, as described in the following list:

  • Temporary tables.  Temporary tables are not supported. Trying either to create a temporary table that uses the NDB storage engine or to alter an existing temporary table to use NDB fails with the error Table storage engine 'ndbcluster' does not support the create option 'TEMPORARY'.

  • Indexes and keys in NDB tables.  Keys and indexes on NDB Cluster tables are subject to the following limitations:

    • Column width.  Attempting to create an index on an NDB table column whose width is greater than 3072 bytes succeeds, but only the first 3072 bytes are actually used for the index. In such cases, a warning Specified key was too long; max key length is 3072 bytes is issued, and a SHOW CREATE TABLE statement shows the length of the index as 3072.

    • TEXT and BLOB columns.  You cannot create indexes on NDB table columns that use any of the TEXT or BLOB data types.

    • FULLTEXT indexes.  The NDB storage engine does not support FULLTEXT indexes, which are possible for MyISAM tables only.

      However, you can create indexes on VARCHAR columns of NDB tables.

    • USING HASH keys and NULL.  Using nullable columns in unique keys and primary keys means that queries using these columns are handled as full table scans. To work around this issue, make the column NOT NULL, or re-create the index without the USING HASH option.

    • Prefixes.  There are no prefix indexes; only entire columns can be indexed. (The size of an NDB column index is always the same as the width of the column in bytes, up to and including 3072 bytes, as described earlier in this section. Also see Section 3.6.6, “Unsupported or Missing Features in NDB Cluster”, for additional information.)

    • BIT columns.  A BIT column cannot be a primary key, unique key, or index, nor can it be part of a composite primary key, unique key, or index.

    • AUTO_INCREMENT columns.  Like other MySQL storage engines, the NDB storage engine can handle a maximum of one AUTO_INCREMENT column per table, and this column must be indexed. However, in the case of an NDB Cluster table with no explicit primary key, an AUTO_INCREMENT column is automatically defined and used as a hidden primary key. For this reason, you cannot create an NDB table having an AUTO_INCREMENT column and no explicit primary key.

  • NDB Cluster and geometry data types.  Geometry data types (WKT and WKB) are supported for NDB tables. However, spatial indexes are not supported.

  • Character sets and binary log files.  Currently, the ndb_apply_status and ndb_binlog_index tables are created using the latin1 (ASCII) character set. Because names of binary logs are recorded in these tables, binary log files named using non-Latin characters are not referenced correctly in them. This is a known issue, which we are working to fix. (Bug #50226)

    To work around this problem, use only Latin-1 characters when naming binary log files or setting any of the --basedir, --log-bin, or --log-bin-index options.

  • Creating NDB tables with user-defined partitioning.  Support for user-defined partitioning in NDB Cluster is restricted to [LINEAR] KEY partitioning. Using any other partitioning type with ENGINE=NDB or ENGINE=NDBCLUSTER in a CREATE TABLE statement results in an error.

    It is possible to override this restriction, but doing so is not supported for use in production settings. For details, see User-defined partitioning and the NDB storage engine (MySQL Cluster).

    Default partitioning scheme.  All NDB Cluster tables are by default partitioned by KEY using the table's primary key as the partitioning key. If no primary key is explicitly set for the table, the hidden primary key automatically created by the NDB storage engine is used instead. For additional discussion of these and related issues, see KEY Partitioning.

    CREATE TABLE and ALTER TABLE statements that would cause a user-partitioned NDBCLUSTER table not to meet either or both of the following two requirements are not permitted, and fail with an error:

    1. The table must have an explicit primary key.

    2. All columns listed in the table's partitioning expression must be part of the primary key.

    Exception.  If a user-partitioned NDBCLUSTER table is created using an empty column-list (that is, using PARTITION BY [LINEAR] KEY()), then no explicit primary key is required.

    Maximum number of partitions for NDBCLUSTER tables.  The maximum number of partitions that can be defined for an NDBCLUSTER table when employing user-defined partitioning is 8 per node group. (See Section 3.2, “NDB Cluster Nodes, Node Groups, Replicas, and Partitions”, for more information about NDB Cluster node groups.)

    DROP PARTITION not supported.  It is not possible to drop partitions from NDB tables using ALTER TABLE ... DROP PARTITION. The other partitioning extensions to ALTER TABLE (ADD PARTITION, REORGANIZE PARTITION, and COALESCE PARTITION) are supported for Cluster tables, but use copying and so are not optimized. See Management of RANGE and LIST Partitions and ALTER TABLE Syntax.

  • Row-based replication.  When using row-based replication with NDB Cluster, binary logging cannot be disabled. That is, the NDB storage engine ignores the value of sql_log_bin. (Bug #16680)
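As a rough illustration of the partition limit described earlier in this list (8 user-defined partitions per node group), the following Python sketch computes the cap for a given cluster layout. It assumes that the number of node groups equals the number of data nodes divided by the number of replicas, as discussed in Section 3.2. The function name and its parameters are hypothetical and are not part of any NDB Cluster program.

```python
# Illustrative only: these names are hypothetical, not part of any NDB utility.
MAX_PARTITIONS_PER_NODE_GROUP = 8  # limit stated in this section


def max_user_partitions(data_nodes: int, replicas: int) -> int:
    """Return the largest partition count usable with user-defined partitioning.

    Assumes node groups = data nodes / replicas (see Section 3.2).
    """
    if data_nodes <= 0 or replicas <= 0:
        raise ValueError("data_nodes and replicas must be positive")
    if data_nodes % replicas != 0:
        raise ValueError("data_nodes must be a multiple of replicas")
    node_groups = data_nodes // replicas
    return MAX_PARTITIONS_PER_NODE_GROUP * node_groups


if __name__ == "__main__":
    # Four data nodes with two replicas form two node groups.
    print(max_user_partitions(data_nodes=4, replicas=2))
```

A PARTITIONS value in a CREATE TABLE statement for a user-partitioned NDB table should not exceed the number computed this way.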

3.6.2 Limits and Differences of NDB Cluster from Standard MySQL Limits

In this section, we list limits found in NDB Cluster that either differ from limits found in, or that are not found in, standard MySQL.

Memory usage and recovery.  Memory consumed when data is inserted into an NDB table is not automatically recovered when that data is deleted, as it is with other storage engines. Instead, the following rules hold true:

  • A DELETE statement on an NDB table makes the memory formerly used by the deleted rows available for re-use by inserts on the same table only. However, this memory can be made available for general re-use by performing OPTIMIZE TABLE.

    A rolling restart of the cluster also frees any memory used by deleted rows. See Section 7.5, “Performing a Rolling Restart of an NDB Cluster”.

  • A DROP TABLE or TRUNCATE TABLE operation on an NDB table frees the memory that was used by this table for re-use by any NDB table, either by the same table or by another NDB table.

    Note

    Recall that TRUNCATE TABLE drops and re-creates the table. See TRUNCATE TABLE Syntax.

  • Limits imposed by the cluster's configuration.  A number of hard limits exist which are configurable, but available main memory in the cluster sets limits. See the complete list of configuration parameters in Section 5.3, “NDB Cluster Configuration Files”. Most configuration parameters can be upgraded online. These hard limits include:

  • Node and data object maximums.  The following limits apply to numbers of cluster nodes and metadata objects:

    • The maximum number of data nodes is 48.

      A data node must have a node ID in the range of 1 to 48, inclusive. (Management and API nodes may use node IDs in the range 1 to 255, inclusive.)

    • The total maximum number of nodes in an NDB Cluster is 255. This number includes all SQL nodes (MySQL Servers), API nodes (applications accessing the cluster other than MySQL servers), data nodes, and management servers.

    • The maximum number of metadata objects in current versions of NDB Cluster is 20320. This limit is hard-coded.

    See Section 3.6.11, “Previous NDB Cluster Issues Resolved in MySQL 5.1, NDB Cluster 6.x, and NDB Cluster 7.x”, for more information.
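The node limits listed above can be checked before a configuration is deployed. The following Python sketch validates a planned set of node IDs against those limits. The check_node_plan function and its input format are hypothetical and are not part of any NDB Cluster program; the uniqueness check reflects the fact that each node in a cluster must have its own node ID.

```python
# Limits taken from the list above; the function itself is a hypothetical helper.
MAX_DATA_NODES = 48      # maximum number of data nodes
MAX_DATA_NODE_ID = 48    # data node IDs must be in the range 1..48
MAX_NODE_ID = 255        # management and API node IDs must be in 1..255
MAX_TOTAL_NODES = 255    # all node types combined


def check_node_plan(data_ids, other_ids):
    """Return a list of problems found in a planned node layout.

    data_ids: node IDs intended for data nodes.
    other_ids: node IDs intended for management, SQL, and API nodes.
    An empty list means that no problem was found.
    """
    data_ids = list(data_ids)
    other_ids = list(other_ids)
    all_ids = data_ids + other_ids
    problems = []
    if len(data_ids) > MAX_DATA_NODES:
        problems.append("too many data nodes")
    if len(all_ids) > MAX_TOTAL_NODES:
        problems.append("too many nodes in total")
    if len(set(all_ids)) != len(all_ids):
        problems.append("duplicate node IDs")
    for node_id in data_ids:
        if not 1 <= node_id <= MAX_DATA_NODE_ID:
            problems.append(f"data node ID {node_id} out of range")
    for node_id in other_ids:
        if not 1 <= node_id <= MAX_NODE_ID:
            problems.append(f"node ID {node_id} out of range")
    return problems
```

For example, check_node_plan([1, 2], [49, 50, 51]) returns an empty list, while a plan that assigns node ID 49 to a data node is reported as out of range.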

3.6.3 Limits Relating to Transaction Handling in NDB Cluster

A number of limitations exist in NDB Cluster with regard to the handling of transactions. These include the following:

  • Transaction isolation level.  The NDBCLUSTER storage engine supports only the READ COMMITTED transaction isolation level. (InnoDB, for example, supports READ COMMITTED, READ UNCOMMITTED, REPEATABLE READ, and SERIALIZABLE.) You should keep in mind that NDB implements READ COMMITTED on a per-row basis; when a read request arrives at the data node storing the row, what is returned is the last committed version of the row at that time.

    Uncommitted data is never returned, but when a transaction modifying a number of rows commits concurrently with a transaction reading the same rows, the transaction performing the read can observe before values, after values, or both, for different rows among these, due to the fact that a given row read request can be processed either before or after the commit of the other transaction.

    To ensure that a given transaction reads only before or after values, you can impose row locks using SELECT ... LOCK IN SHARE MODE. In such cases, the lock is held until the owning transaction is committed. Using row locks can also cause the following issues:

    • Increased frequency of lock wait timeout errors, and reduced concurrency

    • Increased transaction processing overhead due to reads requiring a commit phase

    • Possibility of exhausting the available number of concurrent locks, which is limited by MaxNoOfConcurrentOperations

    NDB uses READ COMMITTED for all reads unless a modifier such as LOCK IN SHARE MODE or FOR UPDATE is used. LOCK IN SHARE MODE causes shared row locks to be used; FOR UPDATE causes exclusive row locks to be used. Unique key reads have their locks upgraded automatically by NDB to ensure a self-consistent read; BLOB reads also employ extra locking for consistency.

    See Section 7.3.4, “NDB Cluster Backup Troubleshooting”, for information on how NDB Cluster's implementation of transaction isolation level can affect backup and restoration of NDB databases.

  • Transactions and BLOB or TEXT columns.  NDBCLUSTER stores only part of a column value that uses any of MySQL's BLOB or TEXT data types in the table visible to MySQL; the remainder of the BLOB or TEXT is stored in a separate internal table that is not accessible to MySQL. This gives rise to two related issues of which you should be aware whenever executing SELECT statements on tables that contain columns of these types:

    1. For any SELECT from an NDB Cluster table: If the SELECT includes a BLOB or TEXT column, the READ COMMITTED transaction isolation level is converted to a read with read lock. This is done to guarantee consistency.

    2. For any SELECT which uses a unique key lookup to retrieve any columns that use any of the BLOB or TEXT data types and that is executed within a transaction, a shared read lock is held on the table for the duration of the transaction—that is, until the transaction is either committed or aborted.

      This issue does not occur for queries that use index or table scans, even against NDB tables having BLOB or TEXT columns.

      For example, consider the table t defined by the following CREATE TABLE statement:

      CREATE TABLE t (
          a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
          b INT NOT NULL,
          c INT NOT NULL,
          d TEXT,
          INDEX i(b),
          UNIQUE KEY u(c)
      ) ENGINE = NDB;
      

      Either of the following queries on t causes a shared read lock, because the first query uses a primary key lookup and the second uses a unique key lookup:

      SELECT * FROM t WHERE a = 1;
      SELECT * FROM t WHERE c = 1;
      

      However, none of the four queries shown here causes a shared read lock:

      SELECT * FROM t WHERE b = 1;
      SELECT * FROM t WHERE d = '1';
      SELECT * FROM t;
      SELECT b,c FROM t WHERE a = 1;
      

      This is because, of these four queries, the first uses an index scan, the second and third use table scans, and the fourth, while using a primary key lookup, does not retrieve the value of any BLOB or TEXT columns.

      You can help minimize issues with shared read locks by avoiding queries that use unique key lookups that retrieve BLOB or TEXT columns, or, in cases where such queries are not avoidable, by committing transactions as soon as possible afterward.

  • Rollbacks.  There are no partial transactions, and no partial rollbacks of transactions. A duplicate key or similar error causes the entire transaction to be rolled back.

    This behavior differs from that of other transactional storage engines such as InnoDB that may roll back individual statements.

  • Transactions and memory usage.  As noted elsewhere in this chapter, NDB Cluster does not handle large transactions well; it is better to perform a number of small transactions with a few operations each than to attempt a single large transaction containing a great many operations. Among other considerations, large transactions require very large amounts of memory. Because of this, the transactional behavior of a number of MySQL statements is affected as described in the following list:

    • TRUNCATE TABLE is not transactional when used on NDB tables. If a TRUNCATE TABLE fails to empty the table, then it must be re-run until it is successful.

    • DELETE FROM (even with no WHERE clause) is transactional. For tables containing a great many rows, you may find that performance is improved by using several DELETE FROM ... LIMIT ... statements to chunk the delete operation. If your objective is to empty the table, then you may wish to use TRUNCATE TABLE instead.

    • LOAD DATA statements.  LOAD DATA INFILE is not transactional when used on NDB tables.

      Important

      When executing a LOAD DATA INFILE statement, the NDB engine performs commits at irregular intervals that enable better utilization of the communication network. It is not possible to know ahead of time when such commits take place.

    • ALTER TABLE and transactions.  When copying an NDB table as part of an ALTER TABLE, the creation of the copy is nontransactional. (In any case, this operation is rolled back when the copy is deleted.)

  • Transactions and the COUNT() function.  When using NDB Cluster Replication, it is not possible to guarantee the transactional consistency of the COUNT() function on the slave. In other words, when performing on the master a series of statements (INSERT, DELETE, or both) that changes the number of rows in a table within a single transaction, executing SELECT COUNT(*) FROM table queries on the slave may yield intermediate results. This is due to the fact that SELECT COUNT(...) may perform dirty reads, and is not a bug in the NDB storage engine. (See Bug #31321 for more information.)
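The chunked DELETE FROM ... LIMIT ... approach mentioned under “Transactions and memory usage” can be driven from application code. The following Python sketch shows one possible way to do this, assuming a DB-API 2.0 connection to an SQL node whose cursor reports the number of deleted rows in rowcount. The function name, the default chunk size of 1000, and the commit-per-chunk policy are illustrative assumptions, not recommendations taken from this Manual.

```python
def delete_in_chunks(connection, table, chunk_size=1000):
    """Empty `table` using repeated DELETE ... LIMIT statements.

    `connection` is assumed to be a DB-API 2.0 connection to an SQL node.
    `table` must be a trusted identifier, since it is interpolated into
    the statement text. Returns the total number of rows deleted.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = 0
    cursor = connection.cursor()
    try:
        while True:
            cursor.execute(f"DELETE FROM {table} LIMIT {int(chunk_size)}")
            deleted = cursor.rowcount
            connection.commit()  # commit each chunk to keep transactions small
            if deleted <= 0:     # nothing left, or the driver reported no count
                break
            total += deleted
    finally:
        cursor.close()
    return total
```

Because every chunk is committed separately, the operation as a whole is not atomic; a failure partway through leaves the table partially emptied, and the function can simply be called again.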

3.6.4 NDB Cluster Error Handling

Starting, stopping, or restarting a node may give rise to temporary errors causing some transactions to fail. These include the following cases:

  • Temporary errors.  When first starting a node, it is possible that you may see Error 1204 Temporary failure, distribution changed and similar temporary errors.

  • Errors due to node failure.  The stopping or failure of any data node can result in a number of different node failure errors. (However, there should be no aborted transactions when performing a planned shutdown of the cluster.)

In either of these cases, any errors that are generated must be handled within the application. This should be done by retrying the transaction.
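A minimal Python sketch of such retry handling follows. It assumes that the application translates the temporary errors reported by its database driver into a single exception type; the TemporaryClusterError class, the run_with_retry function, and the linear backoff are hypothetical choices made for illustration. Deciding which error codes count as temporary is left to the application.

```python
import time


class TemporaryClusterError(Exception):
    """Hypothetical marker for errors the application treats as temporary."""


def run_with_retry(transaction, attempts=5, delay=0.1):
    """Call `transaction()` and retry it when a temporary error is raised.

    `transaction` should perform one complete transaction, including the
    commit, so that each retry starts again from a clean state.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except TemporaryClusterError:
            if attempt == attempts:
                raise  # give up and surface the last error
            time.sleep(delay * attempt)  # simple linear backoff
```

Errors of any other type propagate immediately, so only failures that the application has classified as temporary are retried.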

See also Section 3.6.2, “Limits and Differences of NDB Cluster from Standard MySQL Limits”.

3.6.5 Limits Associated with Database Objects in NDB Cluster

Some database objects such as tables and indexes have different limitations when using the NDBCLUSTER storage engine:

  • Database and table names.  When using the NDB storage engine, the maximum allowed length both for database names and for table names is 63 characters.

  • Number of database objects.  The maximum number of all NDB database objects in a single NDB Cluster—including databases, tables, and indexes—is limited to 20320.

  • Attributes per table.  The maximum number of attributes (that is, columns and indexes) that can belong to a given table is 512.

  • Attributes per key.  The maximum number of attributes per key is 32.

  • Row size.  The maximum permitted size of any one row is 14000 bytes (as of NDB Cluster 7.0). Each BLOB or TEXT column contributes 256 + 8 = 264 bytes to this total.

  • BIT column storage per table.  The maximum combined width for all BIT columns used in a given NDB table is 4096.

  • FIXED column storage.  NDB Cluster supports a maximum of 16 GB per fragment of data in FIXED columns.
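The row size rule given in the list above can be turned into a quick estimate. The following Python sketch adds the 264 bytes contributed by each BLOB or TEXT column to storage widths that the caller supplies for all other columns, then compares the result with the 14000-byte limit. The function names are hypothetical, and the estimate uses only the figures stated in this section; it makes no attempt to model any per-row or per-column storage overhead.

```python
MAX_ROW_BYTES = 14000      # maximum row size stated in this section
BLOB_TEXT_BYTES = 256 + 8  # contribution of each BLOB or TEXT column


def estimated_row_bytes(other_column_bytes, blob_text_columns):
    """Estimate row size from caller-supplied widths plus BLOB/TEXT parts.

    other_column_bytes: storage widths, in bytes, of every column that is
    not a BLOB or TEXT column (the caller must supply these values).
    blob_text_columns: number of BLOB or TEXT columns in the table.
    """
    return sum(other_column_bytes) + BLOB_TEXT_BYTES * blob_text_columns


def fits_in_row(other_column_bytes, blob_text_columns):
    """Return True if the estimate is within the 14000-byte row limit."""
    return estimated_row_bytes(other_column_bytes, blob_text_columns) <= MAX_ROW_BYTES
```

For instance, three 4-byte integer columns plus one TEXT column give an estimate of 12 + 264 = 276 bytes.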

3.6.6 Unsupported or Missing Features in NDB Cluster

A number of features supported by other storage engines are not supported for NDB tables. Trying to use any of these features in NDB Cluster does not cause errors in and of itself; however, errors may occur in applications that expect the features to be supported or enforced. Statements referencing such features, even if effectively ignored by NDB, must be syntactically and otherwise valid.

  • Foreign key constraints.  Prior to NDB Cluster 7.3, the foreign key construct is ignored, just as it is by MyISAM tables. Foreign keys are supported in NDB Cluster 7.3 and later.

  • Index prefixes.  Prefixes on indexes are not supported for NDB tables. If a prefix is used as part of an index specification in a statement such as CREATE TABLE, ALTER TABLE, or CREATE INDEX, the prefix is not created by NDB.

    A statement containing an index prefix, and creating or modifying an NDB table, must still be syntactically valid. For example, the following statement always fails with Error 1089 Incorrect prefix key; the used key part isn't a string, the used length is longer than the key part, or the storage engine doesn't support unique prefix keys, regardless of storage engine:

    CREATE TABLE t1 (
        c1 INT NOT NULL,
        c2 VARCHAR(100),
        INDEX i1 (c2(500))
    );

    This happens on account of the SQL syntax rule that no index may have a prefix larger than itself.

  • Savepoints and rollbacks.  Savepoints and rollbacks to savepoints are ignored as in MyISAM.

  • Durability of commits.  There are no durable commits on disk. Commits are replicated, but there is no guarantee that logs are flushed to disk on commit.

  • Replication.  Statement-based replication is not supported. Use --binlog-format=ROW (or --binlog-format=MIXED) when setting up cluster replication. See Chapter 8, NDB Cluster Replication, for more information.

Note

See Section 3.6.3, “Limits Relating to Transaction Handling in NDB Cluster”, for more information relating to limitations on transaction handling in NDB.

3.6.7 Limitations Relating to Performance in NDB Cluster

The following performance issues are specific to or especially pronounced in NDB Cluster:

  • Range scans.  There are query performance issues due to sequential access to the NDB storage engine; it is also relatively more expensive to do many range scans than it is with either MyISAM or InnoDB.

  • Reliability of Records in range.  The Records in range statistic is available but is not completely tested or officially supported. This may result in nonoptimal query plans in some cases. If necessary, you can employ USE INDEX or FORCE INDEX to alter the execution plan. See Index Hints, for more information on how to do this.

  • Unique hash indexes.  Unique hash indexes created with USING HASH cannot be used for accessing a table if NULL is given as part of the key.
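The index-hint workaround and the unique hash index restriction described above might look like the following sketch. The table t1, its columns, and the index names idx_c2 and uk_c3 are hypothetical and serve only to make the statements concrete.

```sql
-- Hypothetical NDB table with an ordered index and a unique hash index.
CREATE TABLE t1 (
    c1 INT NOT NULL PRIMARY KEY,
    c2 INT,
    c3 INT,
    INDEX idx_c2 (c2),
    UNIQUE KEY uk_c3 (c3) USING HASH
) ENGINE=NDBCLUSTER;

-- Force the optimizer to use idx_c2 for a range condition.
SELECT c1, c2 FROM t1 FORCE INDEX (idx_c2) WHERE c2 BETWEEN 10 AND 20;

-- Compare the plan chosen with the hint against the plan chosen without it.
EXPLAIN SELECT c1, c2 FROM t1 FORCE INDEX (idx_c2) WHERE c2 BETWEEN 10 AND 20;
EXPLAIN SELECT c1, c2 FROM t1 WHERE c2 BETWEEN 10 AND 20;

-- The hash index uk_c3 cannot be used for this lookup,
-- because NULL is given as the key value.
SELECT c1 FROM t1 WHERE c3 IS NULL;
```

Running EXPLAIN both with and without the hint is a simple way to judge whether the hint changes the plan for a given query.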

3.6.8 Issues Exclusive to NDB Cluster

The following are limitations specific to the NDBCLUSTER storage engine:

  • Machine architecture.  All machines used in the cluster must have the same architecture. That is, all machines hosting nodes must be either big-endian or little-endian, and you cannot use a mixture of both. For example, you cannot have a management node running on a PowerPC which directs a data node that is running on an x86 machine. This restriction does not apply to machines simply running mysql or other clients that may be accessing the cluster's SQL nodes.

  • Binary logging.  NDB Cluster has the following limitations or restrictions with regard to binary logging:

See also Section 3.6.10, “Limitations Relating to Multiple NDB Cluster Nodes”.

3.6.9 Limitations Relating to NDB Cluster Disk Data Storage

Disk Data object maximums and minimums.  Disk data objects are subject to the following maximums and minimums:

  • Maximum number of tablespaces: 2^32 (4294967296)

  • Maximum number of data files per tablespace: 2^16 (65536)

  • Maximum data file size: The theoretical limit is 64G; however, the practical upper limit is 32G. This is equivalent to 32768 extents of 1M each.

    Since an NDB Cluster Disk Data table can use at most 1 tablespace, this means that the theoretical upper limit to the amount of data (in bytes) that can be stored on disk by a single NDB table is 32G * 65536 = 2251799813685248, or approximately 2 petabytes.

  • The theoretical maximum number of extents per tablespace data file is 2^16 (65536); however, for practical purposes, the recommended maximum number of extents per data file is 2^15 (32768).

    The minimum and maximum possible sizes of extents for tablespace data files are 32K and 2G, respectively. See CREATE TABLESPACE Syntax, for more information.
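The figures above can be checked with integer arithmetic in any mysql client session. This is only a verification of the stated numbers, not an NDB-specific feature, and the column aliases are arbitrary.

```sql
-- 32G per data file, expressed in bytes (32 * 2^30),
-- multiplied by 65536 (2^16) data files per tablespace.
SELECT 32 * 1024 * 1024 * 1024 * 65536 AS max_bytes_per_table;
-- returns 2251799813685248

-- A 32G data file divided into extents of 1M each.
SELECT (32 * 1024 * 1024 * 1024) DIV (1024 * 1024) AS extents_per_file;
-- returns 32768
```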

Disk Data tables and diskless mode.  Use of Disk Data tables is not supported when running the cluster in diskless mode. Beginning with MySQL 5.1.12, it is prohibited altogether. (Bug #20008)

3.6.10 Limitations Relating to Multiple NDB Cluster Nodes

Multiple SQL nodes.  The following are issues relating to the use of multiple MySQL servers as NDB Cluster SQL nodes, and are specific to the NDBCLUSTER storage engine:

  • No distributed table locks.  A LOCK TABLES statement works only for the SQL node on which the lock is issued; no other SQL node in the cluster sees this lock. This is also true for a lock issued by any statement that locks tables as part of its operations. (See next item for an example.)

  • ALTER TABLE operations.  ALTER TABLE is not fully locking when running multiple MySQL servers (SQL nodes). (As discussed in the previous item, NDB Cluster does not support distributed table locks.)

Multiple management nodes.  When using multiple management servers:

  • If any of the management servers are running on the same host, you must give nodes explicit IDs in connection strings because automatic allocation of node IDs does not work across multiple management servers on the same host. This is not required if every management server resides on a different host.

  • When a management server starts, it first checks for any other management server in the same NDB Cluster, and upon successful connection to the other management server uses its configuration data. This means that the management server --reload and --initial startup options are ignored unless the management server is the only one running. It also means that, when performing a rolling restart of an NDB Cluster with multiple management nodes, the management server reads its own configuration file if (and only if) it is the only management server running in this NDB Cluster. See Section 7.5, “Performing a Rolling Restart of an NDB Cluster”, for more information.

Multiple network addresses.  Multiple network addresses per data node are not supported. Use of these is liable to cause problems: In the event of a data node failure, an SQL node waits for confirmation that the data node went down but never receives it because another route to that data node remains open. This can effectively make the cluster inoperable.

Note

It is possible to use multiple network hardware interfaces (such as Ethernet cards) for a single data node, but these must be bound to the same address. This also means that it is not possible to use more than one [tcp] section per connection in the config.ini file. See Section 5.3.9, “NDB Cluster TCP/IP Connections”, for more information.

3.6.11 Previous NDB Cluster Issues Resolved in MySQL 5.1, NDB Cluster 6.x, and NDB Cluster 7.x

A number of limitations and related issues existing in earlier versions of NDB Cluster have been resolved:

  • Variable-length column support.  The NDBCLUSTER storage engine now supports variable-length column types for in-memory tables.

    Previously, for example, any Cluster table having one or more VARCHAR columns that contained only relatively small values required much more memory and disk space when using the NDBCLUSTER storage engine than the same table and data would have required using the MyISAM engine. In other words, a VARCHAR column required the same amount of storage as a CHAR column of the same size. In MySQL 5.1, this is no longer the case for in-memory tables, where storage requirements for variable-length column types such as VARCHAR and VARBINARY are comparable to those for these column types when used in MyISAM tables (see Data Type Storage Requirements).

    Important

    For NDB Cluster Disk Data tables, the fixed-width limitation continues to apply. See Section 7.12, “NDB Cluster Disk Data Tables”.

  • Replication with NDB Cluster.  It is now possible to use MySQL replication with Cluster databases. For details, see Chapter 8, NDB Cluster Replication.

    Circular Replication.  Circular replication is also supported with NDB Cluster, beginning with MySQL 5.1.18. See Section 8.10, “NDB Cluster Replication: Multi-Master and Circular Replication”.

  • auto_increment_increment and auto_increment_offset.  The auto_increment_increment and auto_increment_offset server system variables are supported for NDB Cluster Replication.

  • Backup and restore between architectures.  It is possible to perform a Cluster backup and restore between different architectures. Previously—for example—you could not back up a cluster running on a big-endian platform and then restore from that backup to a cluster running on a little-endian system. (Bug #19255)

  • Multiple data nodes, multi-threaded data nodes.  NDB Cluster 7.2 supports multiple data node processes on a single host as well as multi-threaded data node processes. See Section 6.3, “ndbmtd — The NDB Cluster Data Node Daemon (Multi-Threaded)”, for more information.

  • Identifiers.  Formerly (in MySQL 5.0 and earlier), database names, table names and attribute names could not be as long for NDB tables as tables using other storage engines, because attribute names were truncated internally. In MySQL 5.1 and later, names of NDB Cluster databases, tables, and table columns follow the same rules regarding length as they do for any other storage engine.

  • Length of CREATE TABLE statements.  CREATE TABLE statements may be no more than 4096 characters in length. This limitation affects MySQL 5.1.6, 5.1.7, and 5.1.8 only. (See Bug #17813)

  • IGNORE and REPLACE functionality.  In MySQL 5.1.7 and earlier, INSERT IGNORE, UPDATE IGNORE, and REPLACE were supported only for primary keys, but not for unique keys. It was possible to work around this issue by removing the constraint, then dropping the unique index, performing any inserts, and then adding the unique index again.

    This limitation was removed for INSERT IGNORE and REPLACE in MySQL 5.1.8. (See Bug #17431.)

  • AUTO_INCREMENT columns.  In MySQL 5.1.10 and earlier versions, the maximum number of tables having AUTO_INCREMENT columns—including those belonging to hidden primary keys—was 2048.

    This limitation was lifted in MySQL 5.1.11.

  • Maximum number of cluster nodes.  The total maximum number of nodes in an NDB Cluster is 255, including all SQL nodes (MySQL Servers), API nodes (applications accessing the cluster other than MySQL servers), data nodes, and management servers. The maximum combined number of data nodes and management nodes is 63, of which up to 48 can be data nodes.

    Note

    A data node cannot have a node ID greater than 49.

  • Recovery of memory from deleted rows.  Memory can be reclaimed from an NDB table for reuse with any NDB table by employing OPTIMIZE TABLE, subject to the following limitations:

    • Only in-memory tables are supported; the OPTIMIZE TABLE statement has no effect on NDB Cluster Disk Data tables.

    • Only variable-length columns (such as those declared as VARCHAR, TEXT, or BLOB) are supported.

      However, you can force columns defined using fixed-length data types (such as CHAR) to be dynamic using the ROW_FORMAT or COLUMN_FORMAT option with a CREATE TABLE or ALTER TABLE statement.

      See CREATE TABLE Syntax, and ALTER TABLE Syntax, for information on these options.

    You can regulate the effects of OPTIMIZE TABLE on performance by adjusting the value of the global system variable ndb_optimization_delay, which sets the number of milliseconds to wait between batches of rows being processed by OPTIMIZE TABLE. The default value is 10 milliseconds. It is possible to set a lower value (to a minimum of 0), but this is not recommended. The maximum is 100000 milliseconds (that is, 100 seconds).

  • Number of tables.  The maximum number of NDBCLUSTER tables in a single NDB Cluster is included in the total maximum number of NDBCLUSTER database objects (20320). (See Section 3.6.5, “Limits Associated with Database Objects in NDB Cluster”.)

  • Adding and dropping of data nodes.  In NDB Cluster 7.0 and later, including NDB Cluster 7.2, it is possible to add new data nodes to a running NDB Cluster by performing a rolling restart, so that the cluster and the data stored in it remain available to applications.

    When planning to increase the number of data nodes in the cluster online, you should be aware of and take into account the following issues:

    • New data nodes can be added online to an NDB Cluster only as part of a new node group.

    • New data nodes can be added online, but cannot be dropped online. Reducing the number of data nodes requires a system restart of the cluster.

    • As in previous NDB Cluster releases, it is not possible to change online either the number of replicas (NoOfReplicas configuration parameter) or the number of data nodes per node group. These changes require a system restart.

    • Redistribution of existing cluster data using the new data nodes is not automatic; however, this can be accomplished using simple SQL statements in the mysql client or other MySQL client application once the nodes have been added. During this procedure, it is not possible to perform DDL operations, although DML operations can continue as normal.

      The distribution of new cluster data (that is, data stored in the cluster after the new nodes have been added) uses the new nodes without manual intervention.

    For more information, see Section 7.13, “Adding NDB Cluster Data Nodes Online”.

  • Native support for default column values.  Starting with NDB 7.1.0, default values for table columns are stored by NDBCLUSTER, rather than by the MySQL server as was previously the case. Because less data must be sent from an SQL node to the data nodes, inserts on tables having column value defaults can be performed more efficiently than before.

    Tables created using previous NDB Cluster releases can still be used in NDB 7.1.0 and later, although they do not support native default values and continue to use defaults supplied by the MySQL server until they are upgraded. This can be done by means of an offline ALTER TABLE statement.

    Important

    You cannot set or change a table column's default value using an online ALTER TABLE operation.

  • Distribution of MySQL users and privileges.  Previously, MySQL users and privileges created on one SQL node were unique to that SQL node, because the MySQL grant tables were restricted to using the MyISAM storage engine. Beginning with NDB 7.2.0, once the NDB Cluster software is installed and the desired users and privileges are set up on one SQL node, it is possible to convert the grant tables to use NDB and thus to distribute the users and privileges across all SQL nodes connected to the cluster. You can do this by loading and making use of a set of stored procedures defined in an SQL script supplied with the NDB Cluster distribution. For more information, see Section 7.14, “Distributed MySQL Privileges for NDB Cluster”.

  • Number of rows per partition.  Previously, a single NDB Cluster partition could hold a maximum of 46137488 rows. This limitation was removed in NDB 7.2.9. (Bug #13844405, Bug #14000373)

    If you are still using a previous NDB Cluster release, you can work around this limitation by taking advantage of the fact that the number of partitions is the same as the number of data nodes in the cluster (see Section 3.2, “NDB Cluster Nodes, Node Groups, Replicas, and Partitions”). This means that, by increasing the number of data nodes, you can increase the available space for storing data.

    NDB Cluster 7.2 also supports increasing the number of data nodes in the cluster while the cluster remains in operation. See Section 7.13, “Adding NDB Cluster Data Nodes Online”, for more information.

    It is also possible to increase the number of partitions for NDB tables by using explicit KEY or LINEAR KEY partitioning (see KEY Partitioning).
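Explicit KEY partitioning, as mentioned in the last item above, might be sketched as follows. The table name big_t, its columns, and the partition count of 8 are assumptions made for illustration; the number of partitions actually permitted depends on the cluster configuration.

```sql
-- Explicit KEY partitioning: the partition count is set by hand
-- rather than left at the default for the cluster.
CREATE TABLE big_t (
    id BIGINT NOT NULL PRIMARY KEY,
    payload VARCHAR(255)
) ENGINE=NDBCLUSTER
  PARTITION BY KEY (id)
  PARTITIONS 8;

-- List the partitions that were defined for the table.
SELECT PARTITION_NAME, PARTITION_METHOD
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'big_t';
```

Partitioning on the primary key keeps the example simple; LINEAR KEY can be substituted for KEY in the PARTITION BY clause.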