Chapter 10 MySQL Differences from Standard SQL

Table of Contents

10.1 SELECT INTO TABLE
10.2 UPDATE
10.3 Transactions and Atomic Operations
10.4 Foreign Key Differences
10.5 '--' as the Start of a Comment

We try to make MySQL Server follow the ANSI SQL standard and the ODBC SQL standard, but MySQL Server performs operations differently in some cases:

10.1 SELECT INTO TABLE

MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL extension. Instead, MySQL Server supports the INSERT INTO ... SELECT standard SQL syntax, which is basically the same thing. See INSERT ... SELECT Syntax. For example:

INSERT INTO tbl_temp2 (fld_id)
    SELECT tbl_temp1.fld_order_id
    FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;

Alternatively, you can use SELECT ... INTO OUTFILE or CREATE TABLE ... SELECT.

You can use SELECT ... INTO with user-defined variables. The same syntax can also be used inside stored routines using cursors and local variables. See SELECT ... INTO Syntax.

10.2 UPDATE

If you access a column from the table to be updated in an expression, UPDATE uses the current value of the column. The second assignment in the following statement sets col2 to the current (updated) col1 value, not the original col1 value. The result is that col1 and col2 have the same value. This behavior differs from standard SQL.

UPDATE t1 SET col1 = col1 + 1, col2 = col1;

10.3 Transactions and Atomic Operations

MySQL Server (version 3.23-max and all versions 4.0 and above) supports transactions with the InnoDB transactional storage engine. InnoDB provides full ACID compliance. See Storage Engines. For information about InnoDB differences from standard SQL with regard to treatment of transaction errors, see InnoDB Error Handling.

The other nontransactional storage engines in MySQL Server (such as MyISAM) follow a different paradigm for data integrity called atomic operations. In transactional terms, MyISAM tables effectively always operate in autocommit = 1 mode. Atomic operations often offer comparable integrity with higher performance.

Because MySQL Server supports both paradigms, you can decide whether your applications are best served by the speed of atomic operations or the use of transactional features. This choice can be made on a per-table basis.

As noted, the tradeoff for transactional versus nontransactional storage engines lies mostly in performance. Transactional tables have significantly higher memory and disk space requirements, and more CPU overhead. On the other hand, transactional storage engines such as InnoDB also offer many significant features. MySQL Server's modular design enables the concurrent use of different storage engines to suit different requirements and deliver optimum performance in all situations.

But how do you use the features of MySQL Server to maintain rigorous integrity even with the nontransactional MyISAM tables, and how do these features compare with the transactional storage engines?

  • If your applications are written in a way that is dependent on being able to call ROLLBACK rather than COMMIT in critical situations, transactions are more convenient. Transactions also ensure that unfinished updates or corrupting activities are not committed to the database; the server is given the opportunity to do an automatic rollback and your database is saved.

    If you use nontransactional tables, you must resolve potential problems at the application level by including simple checks before updates and by running simple scripts that check the databases for inconsistencies and automatically repair or warn if such an inconsistency occurs. You can normally fix tables perfectly with no data integrity loss just by using the MySQL log or even adding one extra log.

  • More often than not, critical transactional updates can be rewritten to be atomic. Generally speaking, all integrity problems that transactions solve can be done with LOCK TABLES or atomic updates, ensuring that there are no automatic aborts from the server, which is a common problem with transactional database systems.

  • To be safe with MySQL Server, regardless of whether you use transactional tables, you only need to have backups and have binary logging turned on. When that is true, you can recover from any situation that you could with any other transactional database system. It is always good to have backups, regardless of which database system you use.

The transactional paradigm has its advantages and disadvantages. Many users and application developers depend on the ease with which they can code around problems where an abort appears to be necessary, or is necessary. However, even if you are new to the atomic operations paradigm, or more familiar with transactions, do consider the speed benefit that nontransactional tables can offer on the order of three to five times the speed of the fastest and most optimally tuned transactional tables.

In situations where integrity is of highest importance, MySQL Server offers transaction-level reliability and integrity even for nontransactional tables. If you lock tables with LOCK TABLES, all updates stall until integrity checks are made. If you obtain a READ LOCAL lock (as opposed to a write lock) for a table that enables concurrent inserts at the end of the table, reads are permitted, as are inserts by other clients. The newly inserted records are not be seen by the client that has the read lock until it releases the lock. With INSERT DELAYED, you can write inserts that go into a local queue until the locks are released, without having the client wait for the insert to complete. See Concurrent Inserts, and INSERT DELAYED Syntax.

Atomic, in the sense that we mean it, is nothing magical. It only means that you can be sure that while each specific update is running, no other user can interfere with it, and there can never be an automatic rollback (which can happen with transactional tables if you are not very careful). MySQL Server also guarantees that there are no dirty reads.

Following are some techniques for working with nontransactional tables:

  • Loops that need transactions normally can be coded with the help of LOCK TABLES, and you don't need cursors to update records on the fly.

  • To avoid using ROLLBACK, you can employ the following strategy:

    1. Use LOCK TABLES to lock all the tables you want to access.

    2. Test the conditions that must be true before performing the update.

    3. Update if the conditions are satisfied.

    4. Use UNLOCK TABLES to release your locks.

    This is usually a much faster method than using transactions with possible rollbacks, although not always. The only situation this solution doesn't handle is when someone kills the threads in the middle of an update. In that case, all locks are released but some of the updates may not have been executed.

  • You can also use functions to update records in a single operation. You can get a very efficient application by using the following techniques:

    • Modify columns relative to their current value.

    • Update only those columns that actually have changed.

    For example, when we are updating customer information, we update only the customer data that has changed and test only that none of the changed data, or data that depends on the changed data, has changed compared to the original row. The test for changed data is done with the WHERE clause in the UPDATE statement. If the record wasn't updated, we give the client a message: Some of the data you have changed has been changed by another user. Then we show the old row versus the new row in a window so that the user can decide which version of the customer record to use.

    This gives us something that is similar to column locking but is actually even better because we only update some of the columns, using values that are relative to their current values. This means that typical UPDATE statements look something like these:

    UPDATE tablename SET pay_back=pay_back+125;
    UPDATE customer
      SET
        customer_date='current_date',
        address='new address',
        phone='new phone',
        money_owed_to_us=money_owed_to_us-125
      WHERE
        customer_id=id AND address='old address' AND phone='old phone';
    

    This is very efficient and works even if another client has changed the values in the pay_back or money_owed_to_us columns.

  • When managing unique identifiers, you can avoid statements such as LOCK TABLES or ROLLBACK by using an AUTO_INCREMENT column and either the LAST_INSERT_ID() SQL function or the mysql_insert_id() C API function. See Information Functions, and mysql_insert_id().

    For situations that require row-level locking, use InnoDB tables. Otherwise, with MyISAM tables, you can use a flag column in the table and do something like the following:

    UPDATE tbl_name SET row_flag=1 WHERE id=ID;
    

    MySQL returns 1 for the number of affected rows if the row was found and row_flag wasn't 1 in the original row. You can think of this as though MySQL Server changed the preceding statement to:

    UPDATE tbl_name SET row_flag=1 WHERE id=ID AND row_flag <> 1;
    

10.4 Foreign Key Differences

MySQL's implementation of foreign keys differs from the SQL standard in the following key respects:

  • If there are several rows in the parent table that have the same referenced key value, InnoDB acts in foreign key checks as if the other parent rows with the same key value do not exist. For example, if you have defined a RESTRICT type constraint, and there is a child row with several parent rows, InnoDB does not permit the deletion of any of those parent rows.

    InnoDB performs cascading operations through a depth-first algorithm, based on records in the indexes corresponding to the foreign key constraints.

  • A FOREIGN KEY constraint that references a non-UNIQUE key is not standard SQL but rather an InnoDB extension.

  • If ON UPDATE CASCADE or ON UPDATE SET NULL recurses to update the same table it has previously updated during the same cascade, it acts like RESTRICT. This means that you cannot use self-referential ON UPDATE CASCADE or ON UPDATE SET NULL operations. This is to prevent infinite loops resulting from cascaded updates. A self-referential ON DELETE SET NULL, on the other hand, is possible, as is a self-referential ON DELETE CASCADE. Cascading operations may not be nested more than 15 levels deep.

  • In an SQL statement that inserts, deletes, or updates many rows, foreign key constraints (like unique constraints) are checked row-by-row. When performing foreign key checks, InnoDB sets shared row-level locks on child or parent records that it must examine. MySQL checks foreign key constraints immediately; the check is not deferred to transaction commit. According to the SQL standard, the default behavior should be deferred checking. That is, constraints are only checked after the entire SQL statement has been processed. This means that it is not possible to delete a row that refers to itself using a foreign key.

For information about how the InnoDB storage engine handles foreign keys, see InnoDB and FOREIGN KEY Constraints.

10.5 '--' as the Start of a Comment

Standard SQL uses the C syntax /* this is a comment */ for comments, and MySQL Server supports this syntax as well. MySQL also support extensions to this syntax that enable MySQL-specific SQL to be embedded in the comment, as described in Comment Syntax.

Standard SQL uses -- as a start-comment sequence. MySQL Server uses # as the start comment character. MySQL Server 3.23.3 and up also supports a variant of the -- comment style. That is, the -- start-comment sequence must be followed by a space (or by a control character such as a newline). The space is required to prevent problems with automatically generated SQL queries that use constructs such as the following, where we automatically insert the value of the payment for payment:

UPDATE account SET credit=credit-payment

Consider about what happens if payment has a negative value such as -1:

UPDATE account SET credit=credit--1

credit--1 is a valid expression in SQL, but -- is interpreted as the start of a comment, part of the expression is discarded. The result is a statement that has a completely different meaning than intended:

UPDATE account SET credit=credit

The statement produces no change in value at all. This illustrates that permitting comments to start with -- can have serious consequences.

Using our implementation requires a space following the -- for it to be recognized as a start-comment sequence in MySQL Server 3.23.3 and newer. Therefore, credit--1 is safe to use.

Another safe feature is that the mysql command-line client ignores lines that start with --.

The following information is relevant only if you are running a MySQL version earlier than 3.23.3:

If you have an SQL script in a text file that contains -- comments, you should use the replace utility as follows to convert the comments to use # characters before executing the script:

shell> replace " --" " #" < text-file-with-funny-comments.sql \
         | mysql db_name

That is safer than executing the script in the usual way:

shell> mysql db_name < text-file-with-funny-comments.sql

You can also edit the script file in place to change the -- comments to # comments:

shell> replace " --" " #" -- text-file-with-funny-comments.sql

Change them back with this command:

shell> replace " #" " --" -- text-file-with-funny-comments.sql

See replace — A String-Replacement Utility.