Bug #76233 XA prepare is logged ahead of engine prepare
Submitted: 9 Mar 2015 20:10
Reporter: Andrei Elkin
Status: Verified
Category: MySQL Server: Replication
Severity: S3 (Non-critical)
Version: 5.7.7
OS: Any
Assigned to:

[9 Mar 2015 20:10] Andrei Elkin
Description:
The fixes for bug11745231 incorrectly order two operations that are critical to crash safety.
An XA-prepared transaction is logged before it has actually been prepared in the engine.
Therefore, if a crash occurs after the XA prepare has been logged, the engine may lose
the transaction, so at recovery it would be found in the binlog but not in the engine.

This case should be fixed by calling engine.prepare() first and logging afterwards.
A typical crash scenario of

  step 1. XA gets prepared in the engine
  step 2. *Crash*
  step 3. The XA prepare would be logged, but this never happens.

would lead to the engine having the prepared trx but the binlog not. That fact should be
detected in augmented recovery, similarly to how it is done for internal XA.
When the xid event, internal or the user's one, is not found, the prepared
transaction is rolled back.
However, when the user's xid is found, this XA must not be committed yet,
as it has to wait for an explicit concluding query (XA COMMIT or XA ROLLBACK).
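
The recovery rule described above can be written down as a small decision function. This is only an illustrative sketch in Python, not server code: the names Action and recovery_action are hypothetical, and the "internal XA found in binlog means commit" branch is the existing internal-XA behaviour that the description refers to.

```python
from enum import Enum


class Action(Enum):
    ROLLBACK = "rollback"
    COMMIT = "commit"
    KEEP_PREPARED = "keep prepared"


def recovery_action(found_in_binlog: bool, is_user_xa: bool) -> Action:
    """Decide what to do with a transaction found prepared in the engine.

    Hypothetical helper for illustration only.
    """
    if not found_in_binlog:
        # Crash hit between engine prepare and binlog write.
        return Action.ROLLBACK
    if is_user_xa:
        # User XA: wait for an explicit XA COMMIT / XA ROLLBACK.
        return Action.KEEP_PREPARED
    # Internal XA: the xid event in the binlog means "commit".
    return Action.COMMIT
```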

How to repeat:
See the source code and run a failure simulation to witness the improper ordering.

Suggested fix:
Correct ordering.
[11 Mar 2015 11:54] Sven Sandberg
Posted by developer:
 
The scope of this bug is to make each of XA PREPARE, XA COMMIT, and XA ROLLBACK be correctly logged and recovered in case there is a crash. The state of the binary log should agree with the state of the storage engines, and this should happen automatically.

There are four sub-tasks:

1. Fix the logging order of XA PREPARE as reported above.
2. Extend Previous_gtids_log_event with a field containing the list of XA identifiers of all transactions that are in xa-prepare state.
3. Implement a recovery routine for XA PREPARE.
4. Implement a recovery routine for XA COMMIT and XA ROLLBACK.
[11 Mar 2015 12:41] Sven Sandberg
Posted by developer:
 
Clarification of execution order:
- XA PREPARE is first prepared in the engine and then written to the binary log
- XA COMMIT is first written to the binary log and then committed in the engine
- XA ROLLBACK is first written to the binary log and then rolled back in the engine

XA COMMIT and XA ROLLBACK are already executed in this order. We need to change XA PREPARE to execute in this order.
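
To illustrate why the order matters for XA PREPARE, here is a toy crash simulation in Python. It is a sketch under simplifying assumptions, not a model of the server: the engine is a plain set, the binary log is a plain list, and xa_prepare, state_after_crash and Crash are hypothetical names. A crash is injected between the two steps of the prepare.

```python
class Crash(Exception):
    """Raised to simulate the server dying between two steps."""


def xa_prepare(xid, engine, binlog, engine_first, crash_between_steps=False):
    # The two steps of XA PREPARE, in the requested order.
    steps = [engine.add, binlog.append]
    if not engine_first:
        steps.reverse()
    steps[0](xid)
    if crash_between_steps:
        raise Crash(xid)
    steps[1](xid)


def state_after_crash(engine_first):
    """Return where 'xid1' is visible after a crash between the two steps."""
    engine, binlog = set(), []
    try:
        xa_prepare("xid1", engine, binlog, engine_first, crash_between_steps=True)
    except Crash:
        pass
    return {"in_engine": "xid1" in engine, "in_binlog": "xid1" in binlog}
```

With engine_first=True the crash leaves the transaction prepared in the engine only, which recovery can resolve by rolling it back. With engine_first=False the binary log contains a prepare that the engine knows nothing about, which is the situation reported in this bug.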

Proposed recovery routine (pseudocode):

  # This recovery procedure is designed so that it works in the
  # following corner cases:
  # - there are two consecutive transactions having the same XID (this
  #   is allowed as long as one commits before the other starts).
  # - there is a binlog rotation (and possibly even purge) between
  #   XA PREPARE and XA COMMIT/XA ROLLBACK.
  # - a single transaction involves multiple storage engines.

  # set of transactions that were in prepare state when the server
  # stopped, according to the binary log
  HASH prepared_in_binlog
  # set of transactions that were committed in the binary log, and not
  # followed by a prepare of a different transaction with the same xid.
  HASH committed_in_binlog
  # set of transactions that were rolled back in the binary log, and not
  # followed by a prepare of a different transaction with the same xid
  HASH rolledback_in_binlog
  # set of transactions that are in prepared state in any storage engine
  HASH prepared_in_engine = get_prepared_from_engine()

  for event in binlog:
    if event.is_previous_gtids_log_event():
      prepared_in_binlog.add(event.get_xid_set())
    else if event.is_xa_prepare():
      prepared_in_binlog.add(event.xid)
      committed_in_binlog.remove(event.xid)
      rolledback_in_binlog.remove(event.xid)
    else if event.is_xa_commit():
      prepared_in_binlog.remove(event.xid)
      committed_in_binlog.add(event.xid)
    else if event.is_xa_rollback():
      prepared_in_binlog.remove(event.xid)
      rolledback_in_binlog.add(event.xid)

  # recover from crash between binlog commit and engine commit
  HASH to_commit = intersection(committed_in_binlog, prepared_in_engine)
  # recover from crash between binlog rollback and engine rollback
  HASH to_rollback = intersection(rolledback_in_binlog, prepared_in_engine)
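  # recover from crash between engine prepare and binlog prepare;
  # xids already committed in the binlog are excluded so that nothing
  # in to_commit is also rolled back
  to_rollback.add(prepared_in_engine - prepared_in_binlog - committed_in_binlog)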

  rollback(to_rollback)
  commit(to_commit)
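
For experimenting with the routine outside the server, the pseudocode can be transcribed into runnable Python roughly as follows. This is a sketch under assumptions: binlog events are modelled as hypothetical (kind, payload) tuples, plan_recovery is an invented name, and the function only returns the two XID sets instead of calling into any engine. The sketch makes sure that an XID selected for commit is never also selected for rollback, by excluding XIDs recorded as committed in the binary log from the "prepared only in engine" set.

```python
def plan_recovery(binlog_events, prepared_in_engine):
    """Return (to_commit, to_rollback) as sets of XIDs.

    binlog_events: iterable of (kind, payload) tuples, where kind is one of
      "previous_gtids" (payload: iterable of XIDs in prepared state),
      "xa_prepare", "xa_commit", "xa_rollback" (payload: a single XID).
    prepared_in_engine: XIDs reported as prepared by the storage engines.
    """
    prepared_in_binlog = set()
    committed_in_binlog = set()
    rolledback_in_binlog = set()

    for kind, payload in binlog_events:
        if kind == "previous_gtids":
            prepared_in_binlog |= set(payload)
        elif kind == "xa_prepare":
            # A new prepare supersedes an older outcome for the same XID.
            prepared_in_binlog.add(payload)
            committed_in_binlog.discard(payload)
            rolledback_in_binlog.discard(payload)
        elif kind == "xa_commit":
            prepared_in_binlog.discard(payload)
            committed_in_binlog.add(payload)
        elif kind == "xa_rollback":
            prepared_in_binlog.discard(payload)
            rolledback_in_binlog.add(payload)

    prepared_in_engine = set(prepared_in_engine)
    # Crash between binlog commit and engine commit.
    to_commit = committed_in_binlog & prepared_in_engine
    # Crash between binlog rollback and engine rollback.
    to_rollback = rolledback_in_binlog & prepared_in_engine
    # Crash between engine prepare and binlog prepare.
    to_rollback |= prepared_in_engine - prepared_in_binlog - committed_in_binlog
    return to_commit, to_rollback
```

For example, with the events prepare a, commit a, prepare b, rollback b, prepare c, and the engine reporting a, b, c and d as prepared, the plan is to commit a, roll back b and d, and leave c prepared until the user concludes it.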