Chapter 3 Performance Schema Configuration

Table of Contents

3.1 Performance Schema Build Configuration
3.2 Performance Schema Startup Configuration
3.3 Performance Schema Runtime Configuration
3.3.1 Performance Schema Event Timing
3.3.2 Performance Schema Event Filtering
3.3.3 Event Pre-Filtering
3.3.4 Naming Instruments or Consumers for Filtering Operations
3.3.5 Determining What Is Instrumented

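The following sections describe the configuration considerations that apply to using the MySQL Performance Schema: build configuration, startup configuration, and runtime configuration.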

3.1 Performance Schema Build Configuration

For the Performance Schema to be available, it must be configured into the MySQL server at build time. Binary MySQL distributions provided by Oracle Corporation are configured to support the Performance Schema. If you use a binary MySQL distribution from another provider, check with the provider whether the distribution has been appropriately configured.

If you build MySQL from a source distribution, enable the Performance Schema by running CMake with the WITH_PERFSCHEMA_STORAGE_ENGINE option enabled:

shell> cmake . -DWITH_PERFSCHEMA_STORAGE_ENGINE=1

Configuring MySQL with the -DWITHOUT_PERFSCHEMA_STORAGE_ENGINE=1 option prevents inclusion of the Performance Schema, so if you want it included, do not use this option. See MySQL Source-Configuration Options.

If you install MySQL over a previous installation that was configured without the Performance Schema (or with an older version of the Performance Schema that may not have all the current tables), run mysql_upgrade after starting the server to ensure that the performance_schema database exists with all current tables. Then restart the server. One indication that you need to do this is the presence of messages such as the following in the error log:

[ERROR] Native table 'performance_schema'.'events_waits_history'
has the wrong structure
[ERROR] Native table 'performance_schema'.'events_waits_history_long'
has the wrong structure
...

To verify whether a server was built with Performance Schema support, check its help output. If the Performance Schema is available, the output will mention several variables with names that begin with performance_schema:

shell> mysqld --verbose --help
...
  --performance_schema
                      Enable the performance schema.
  --performance_schema_events_waits_history_long_size=#
                      Number of rows in events_waits_history_long.
...

You can also connect to the server and look for a line that names the PERFORMANCE_SCHEMA storage engine in the output from SHOW ENGINES:

mysql> SHOW ENGINES\G
...
      Engine: PERFORMANCE_SCHEMA
     Support: YES
     Comment: Performance Schema
Transactions: NO
          XA: NO
  Savepoints: NO
...

If the Performance Schema was not configured into the server at build time, no row for PERFORMANCE_SCHEMA will appear in the output from SHOW ENGINES. You might see performance_schema listed in the output from SHOW DATABASES, but it will have no tables and you will not be able to use it.

A line for PERFORMANCE_SCHEMA in the SHOW ENGINES output means that the Performance Schema is available, not that it is enabled. To enable it, you must do so at server startup, as described in the next section.
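
The two checks (available versus enabled) can also be made with a pair of short statements. This is a minimal sketch: it assumes the INFORMATION_SCHEMA.ENGINES table is present on your server, which this chapter does not itself state.

```sql
-- Availability: a row here means the Performance Schema was built into
-- the server. No row means it was not configured in at build time.
-- (Assumes INFORMATION_SCHEMA.ENGINES is available on this server.)
SELECT ENGINE, SUPPORT
  FROM INFORMATION_SCHEMA.ENGINES
 WHERE ENGINE = 'PERFORMANCE_SCHEMA';

-- Enablement: ON means the Performance Schema is running,
-- OFF means it is available but was not enabled at startup.
SHOW VARIABLES LIKE 'performance_schema';
```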

3.2 Performance Schema Startup Configuration

The Performance Schema is disabled by default. To enable it, start the server with the performance_schema variable enabled. For example, use these lines in your my.cnf file:

[mysqld]
performance_schema

If the server is unable to allocate any internal buffer during Performance Schema initialization, the Performance Schema disables itself and sets performance_schema to OFF, and the server runs without instrumentation.

The Performance Schema includes several system variables that provide configuration information:

mysql> SHOW VARIABLES LIKE 'perf%';
+---------------------------------------------------+---------+
| Variable_name                                     | Value   |
+---------------------------------------------------+---------+
| performance_schema                                | ON      |
| performance_schema_events_waits_history_long_size | 10000   |
| performance_schema_events_waits_history_size      | 10      |
| performance_schema_max_cond_classes               | 80      |
| performance_schema_max_cond_instances             | 1000    |
| performance_schema_max_file_classes               | 50      |
| performance_schema_max_file_handles               | 32768   |
| performance_schema_max_file_instances             | 10000   |
| performance_schema_max_mutex_classes              | 200     |
| performance_schema_max_mutex_instances            | 1000000 |
| performance_schema_max_rwlock_classes             | 30      |
| performance_schema_max_rwlock_instances           | 1000000 |
| performance_schema_max_table_handles              | 100000  |
| performance_schema_max_table_instances            | 50000   |
| performance_schema_max_thread_classes             | 50      |
| performance_schema_max_thread_instances           | 1000    |
+---------------------------------------------------+---------+

The performance_schema variable is ON or OFF to indicate whether the Performance Schema is enabled or disabled. The other variables indicate table sizes (number of rows) or memory allocation values.

Note

With the Performance Schema enabled, the number of Performance Schema instances affects the server memory footprint, perhaps to a large extent. It may be necessary to tune the values of Performance Schema system variables to find the number of instances that balances insufficient instrumentation against excessive memory consumption.
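
When tuning these values, it can help to look at what the server itself reports. The following sketch relies on two statements that are not described in this chapter, so treat them as assumptions to check against your server version: the Performance_schema status variables, which are expected to count instruments that could not be created or loaded, and SHOW ENGINE PERFORMANCE_SCHEMA STATUS, which is expected to report internal buffer sizes.

```sql
-- Assumption: nonzero "lost" counters indicate that a sizing variable
-- was too small for the workload and some instrumentation was skipped.
SHOW STATUS LIKE 'Performance_schema%';

-- Assumption: this reports per-buffer and total memory used by the
-- Performance Schema, which helps judge the memory side of the tradeoff.
SHOW ENGINE PERFORMANCE_SCHEMA STATUS;
```

If the counters stay at zero and memory use is acceptable, the current settings are probably adequate for that workload.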

To change the value of Performance Schema system variables, set them at server startup. For example, put the following lines in a my.cnf file to change the sizes of the history tables for wait events:

[mysqld]
performance_schema
performance_schema_events_waits_history_size=20
performance_schema_events_waits_history_long_size=15000
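
After restarting the server with settings like these, one simple way to confirm that they took effect is to display the corresponding variables. This sketch uses only the variable names shown above.

```sql
-- Confirm that the Performance Schema is enabled.
SHOW VARIABLES LIKE 'performance_schema';

-- Confirm the configured sizes of the wait-event history tables.
SHOW VARIABLES LIKE 'performance_schema_events_waits_history%';
```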

3.3 Performance Schema Runtime Configuration

Performance Schema setup tables contain information about monitoring configuration:

mysql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
    -> WHERE TABLE_SCHEMA = 'performance_schema'
    -> AND TABLE_NAME LIKE 'setup%';
+-------------------+
| TABLE_NAME        |
+-------------------+
| setup_consumers   |
| setup_instruments |
| setup_timers      |
+-------------------+

You can examine the contents of these tables to obtain information about Performance Schema monitoring characteristics. If you have the UPDATE privilege, you can change Performance Schema operation by modifying setup tables to affect how monitoring occurs. For additional details about these tables, see Section 8.2, “Performance Schema Setup Tables”.

To see which event timer is selected, query the setup_timers table:

mysql> SELECT * FROM setup_timers;
+------+------------+
| NAME | TIMER_NAME |
+------+------------+
| wait | CYCLE      |
+------+------------+

The NAME value indicates the type of instrument to which the timer applies, and TIMER_NAME indicates which timer applies to those instruments. The timer applies to instruments whose names begin with a component matching the NAME value. There are only wait instruments, so this table has only one row and the timer applies to all instruments.

To change the timer, update the TIMER_NAME value. For example, to use the NANOSECOND timer:

mysql> UPDATE setup_timers SET TIMER_NAME = 'NANOSECOND'
    -> WHERE NAME = 'wait';
mysql> SELECT * FROM setup_timers;
+------+------------+
| NAME | TIMER_NAME |
+------+------------+
| wait | NANOSECOND |
+------+------------+

For discussion of timers, see Section 3.3.1, “Performance Schema Event Timing”.

The setup_instruments and setup_consumers tables list the instruments for which events can be collected and the types of consumers for which event information actually is collected, respectively. Section 3.3.2, “Performance Schema Event Filtering”, discusses how you can modify these tables to affect event collection.

If there are Performance Schema configuration changes that must be made at runtime using SQL statements and you would like these changes to take effect each time the server starts, put the statements in a file and start the server with the --init-file=file_name option. This strategy can also be useful if you have multiple monitoring configurations, each tailored to produce a different kind of monitoring, such as casual server health monitoring, incident investigation, application behavior troubleshooting, and so forth. Put the statements for each monitoring configuration into their own file and specify the appropriate file as the --init-file argument when you start the server.
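
As an illustration, a file for a monitoring configuration that is interested only in file I/O might contain statements like the following. The file name (for example, ps_file_io.sql) and the particular choice of instruments and consumers are hypothetical. Table names are qualified with the database name because no default database can be assumed at startup. The statements are each written on a single line without comments, on the assumption that some server versions do not accept multiple-line statements or comments in an init file.

```sql
UPDATE performance_schema.setup_instruments SET ENABLED = 'NO', TIMED = 'NO';
UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES' WHERE NAME LIKE 'wait/io/file/%';
UPDATE performance_schema.setup_consumers SET ENABLED = 'NO' WHERE NAME LIKE '%history%';
```

The first statement disables everything, the second re-enables timed file I/O instruments, and the third turns off the history consumers. A second file with different statements could define an alternative configuration.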

3.3.1 Performance Schema Event Timing

Events are collected by means of instrumentation added to the server source code. Instruments time events, which is how the Performance Schema provides an idea of how long events take. It is also possible to configure instruments not to collect timing information. This section discusses the available timers and their characteristics, and how timing values are represented in events.

Performance Schema Timers

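Two Performance Schema tables provide timer information:

  • performance_timers lists the available timers and their characteristics.

  • setup_timers indicates which timers are used for which instruments.

Each timer row in setup_timers must refer to one of the timers listed in performance_timers.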

Timers vary in precision and amount of overhead. To see what timers are available and their characteristics, check the performance_timers table:

mysql> SELECT * FROM performance_timers;
+-------------+-----------------+------------------+----------------+
| TIMER_NAME  | TIMER_FREQUENCY | TIMER_RESOLUTION | TIMER_OVERHEAD |
+-------------+-----------------+------------------+----------------+
| CYCLE       |      2389029850 |                1 |             72 |
| NANOSECOND  |      1000000000 |                1 |            112 |
| MICROSECOND |         1000000 |                1 |            136 |
| MILLISECOND |            1036 |                1 |            168 |
| TICK        |             105 |                1 |           2416 |
+-------------+-----------------+------------------+----------------+

The columns have these meanings:

  • The TIMER_NAME column shows the names of the available timers. CYCLE refers to the timer that is based on the CPU (processor) cycle counter. The timers that you can select in setup_timers are those that do not have NULL in the other columns of performance_timers. If the values associated with a given timer name are NULL, that timer is not supported on your platform.

  • TIMER_FREQUENCY indicates the number of timer units per second. For a cycle timer, the frequency is generally related to the CPU speed. The value shown was obtained on a system with a 2.4 GHz processor. The other timers are based on fixed fractions of seconds. For TICK, the frequency may vary by platform (for example, some use 100 ticks/second, others 1000 ticks/second).

  • TIMER_RESOLUTION indicates the number of timer units by which timer values increase at a time. If a timer has a resolution of 10, its value increases by 10 each time.

  • TIMER_OVERHEAD is the minimal number of cycles of overhead to obtain one timing with the given timer. The overhead per event is twice the value displayed because the timer is invoked at the beginning and end of the event.
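
The per-event overhead described in the last item can be worked out directly from the table. The following is a sketch, not an official formula: it doubles TIMER_OVERHEAD and converts the result from cycles to nanoseconds using the CYCLE timer frequency, which assumes that the CYCLE timer is supported on your platform and that its frequency approximates the CPU clock rate. The column aliases are illustrative.

```sql
-- Overhead per timed event = 2 x TIMER_OVERHEAD (start and end timings).
-- Multiplying by 1e9 before dividing keeps the arithmetic in floating
-- point, so small ratios are not rounded away to zero.
SELECT t.TIMER_NAME,
       2 * t.TIMER_OVERHEAD AS overhead_cycles_per_event,
       ROUND(2 * t.TIMER_OVERHEAD * 1e9 / c.TIMER_FREQUENCY, 1)
         AS approx_overhead_ns
  FROM performance_schema.performance_timers AS t
  JOIN performance_schema.performance_timers AS c
    ON c.TIMER_NAME = 'CYCLE'
 WHERE t.TIMER_FREQUENCY IS NOT NULL   -- skip unsupported timers
 ORDER BY t.TIMER_OVERHEAD;
```

With the sample values shown earlier, the CYCLE timer costs 2 × 72 = 144 cycles per event, which is roughly 60 nanoseconds at 2389029850 cycles per second.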

To see which timer is in effect or to change the timer, access the setup_timers table:

mysql> SELECT * FROM setup_timers;
+------+------------+
| NAME | TIMER_NAME |
+------+------------+
| wait | CYCLE      |
+------+------------+
mysql> UPDATE setup_timers SET TIMER_NAME = 'MICROSECOND'
    -> WHERE NAME = 'wait';
mysql> SELECT * FROM setup_timers;
+------+-------------+
| NAME | TIMER_NAME  |
+------+-------------+
| wait | MICROSECOND |
+------+-------------+

By default, the Performance Schema uses the best timer available for each instrument type, but you can select a different one. Generally the best timer is CYCLE, which uses the CPU cycle counter whenever possible to provide high precision and low overhead.

The precision offered by the cycle counter depends on processor speed. If the processor runs at 1 GHz (one billion cycles/second) or higher, the cycle counter delivers sub-nanosecond precision. Using the cycle counter is much cheaper than getting the actual time of day. For example, the standard gettimeofday() function can take hundreds of cycles, which is an unacceptable overhead for data gathering that may occur thousands or millions of times per second.

Cycle counters also have disadvantages:

  • End users expect to see timings in wall-clock units, such as fractions of a second. Converting from cycles to fractions of seconds can be expensive. For this reason, the conversion is a quick and fairly rough multiplication operation.

  • Processor cycle rate might change, such as when a laptop goes into power-saving mode or when a CPU slows down to reduce heat generation. If a processor's cycle rate fluctuates, conversion from cycles to real-time units is subject to error.

  • Cycle counters might be unreliable or unavailable depending on the processor or the operating system. For example, on Pentiums, the instruction is RDTSC (an assembly-language rather than a C instruction) and it is theoretically possible for the operating system to prevent user-mode programs from using it.

  • Some processor details related to out-of-order execution or multiprocessor synchronization might cause the counter to seem fast or slow by up to 1000 cycles.

MySQL works with cycle counters on x86 (Windows, OS X, Linux, Solaris, and other Unix flavors), PowerPC, and IA-64.

Performance Schema Timer Representation in Events

Rows in Performance Schema tables that store current events and historical events have three columns to represent timing information: TIMER_START and TIMER_END indicate when the event started and finished, and TIMER_WAIT indicates the event duration.

The setup_instruments table has an ENABLED column to indicate the instruments for which to collect events. The table also has a TIMED column to indicate which instruments are timed. If an instrument is not enabled, it produces no events. If an enabled instrument is not timed, events produced by the instrument have NULL for the TIMER_START, TIMER_END, and TIMER_WAIT timer values. This in turn causes those values to be ignored when calculating the sum, minimum, maximum, and average time values in summary tables.
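
To see which instruments fall into the enabled-but-untimed category, and whether any untimed events have been recorded, queries along these lines can be used. This is a sketch that uses only the tables and columns named in this chapter. The alias untimed_events is illustrative.

```sql
-- Instruments that produce events but collect no timing information.
SELECT NAME
  FROM performance_schema.setup_instruments
 WHERE ENABLED = 'YES' AND TIMED = 'NO';

-- Recorded events that carry no timing (NULL timer values), per event name.
SELECT EVENT_NAME, COUNT(*) AS untimed_events
  FROM performance_schema.events_waits_history
 WHERE TIMER_WAIT IS NULL
 GROUP BY EVENT_NAME;
```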

Within events, times are stored in picoseconds (trillionths of a second) to normalize them to a standard unit, regardless of which timer is selected. The timer used for an event is the one in effect when event timing begins. This timer is used to convert start and end values to picoseconds for storage in the event.

Modifications to the setup_timers table affect monitoring immediately. Events already measured are stored using the original timer unit, and events in progress may use the original timer for the begin time and the new timer for the end time. To avoid unpredictable results if you make timer changes, use TRUNCATE TABLE to reset Performance Schema statistics.

The timer baseline (time zero) occurs at Performance Schema initialization during server startup. TIMER_START and TIMER_END values in events represent picoseconds since the baseline. TIMER_WAIT values are durations in picoseconds.

Picosecond values in events are approximate. Their accuracy is subject to the usual forms of error associated with conversion from one unit to another. If the CYCLE timer is used and the processor rate varies, there might be drift. For these reasons, it is not reasonable to look at the TIMER_START value for an event as an accurate measure of time elapsed since server startup. On the other hand, it is reasonable to use TIMER_START or TIMER_WAIT values in ORDER BY clauses to order events by start time or duration.
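
For example, a query of the following form orders recent events by duration, which remains meaningful even though the absolute values are approximate. It is a sketch. The choice of the events_waits_history_long table and the limit of 10 rows are arbitrary.

```sql
-- Ten longest recorded wait events, longest first.
SELECT THREAD_ID, EVENT_ID, EVENT_NAME, TIMER_WAIT
  FROM performance_schema.events_waits_history_long
 ORDER BY TIMER_WAIT DESC
 LIMIT 10;
```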

The choice of picoseconds in events rather than a value such as microseconds has a performance basis. One implementation goal was to show results in a uniform time unit, regardless of the timer. In an ideal world this time unit would look like a wall-clock unit and be reasonably precise; in other words, microseconds. But to convert cycles or nanoseconds to microseconds, it would be necessary to perform a division for every instrumentation. Division is expensive on many platforms. Multiplication is not expensive, so that is what is used. Therefore, the time unit is an integer multiple of the highest possible TIMER_FREQUENCY value, using a multiplier large enough to ensure that there is no major precision loss. The result is that the time unit is picoseconds. This precision is spurious, but the decision enables overhead to be minimized.
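
The arithmetic behind this can be illustrated with two queries. The first estimates how many picoseconds one unit of each timer represents (10^12 divided by the timer frequency). It illustrates the idea of a per-timer multiplier and is not a statement of the exact factor the server uses internally. The second performs the division at query time, when it is cheap, to display durations in microseconds (1 microsecond = 1,000,000 picoseconds). The column aliases are illustrative.

```sql
-- Approximate picoseconds per timer unit for each supported timer.
SELECT TIMER_NAME,
       TIMER_FREQUENCY,
       1e12 / TIMER_FREQUENCY AS approx_picoseconds_per_unit
  FROM performance_schema.performance_timers
 WHERE TIMER_FREQUENCY IS NOT NULL;

-- Convert stored picosecond durations to microseconds for display.
SELECT EVENT_NAME,
       TIMER_WAIT,
       TIMER_WAIT / 1000000 AS wait_microseconds
  FROM performance_schema.events_waits_history
 WHERE TIMER_WAIT IS NOT NULL;
```

With the sample CYCLE frequency of 2389029850 shown earlier, one cycle corresponds to roughly 418.6 picoseconds.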

3.3.2 Performance Schema Event Filtering

Events are processed in a producer/consumer fashion:

  • Instrumented code is the source for events and produces events to be collected. The setup_instruments table lists the instruments for which events can be collected, whether they are enabled, and (for enabled instruments) whether to collect timing information:

    mysql> SELECT * FROM setup_instruments;
    +------------------------------------------------------------+---------+-------+
    | NAME                                                       | ENABLED | TIMED |
    +------------------------------------------------------------+---------+-------+
    ...
    | wait/synch/mutex/sql/LOCK_global_read_lock                 | YES     | YES   |
    | wait/synch/mutex/sql/LOCK_global_system_variables          | YES     | YES   |
    | wait/synch/mutex/sql/LOCK_lock_db                          | YES     | YES   |
    | wait/synch/mutex/sql/LOCK_manager                          | YES     | YES   |
    ...
    | wait/synch/rwlock/sql/LOCK_grant                           | YES     | YES   |
    | wait/synch/rwlock/sql/LOGGER::LOCK_logger                  | YES     | YES   |
    | wait/synch/rwlock/sql/LOCK_sys_init_connect                | YES     | YES   |
    | wait/synch/rwlock/sql/LOCK_sys_init_slave                  | YES     | YES   |
    ...
    | wait/io/file/sql/binlog                                    | YES     | YES   |
    | wait/io/file/sql/binlog_index                              | YES     | YES   |
    | wait/io/file/sql/casetest                                  | YES     | YES   |
    | wait/io/file/sql/dbopt                                     | YES     | YES   |
    ...
    
  • Performance Schema tables are the destinations for events and consume events. The setup_consumers table lists the types of consumers to which event information can be sent:

    mysql> SELECT * FROM setup_consumers;
    +----------------------------------------------+---------+
    | NAME                                         | ENABLED |
    +----------------------------------------------+---------+
    | events_waits_current                         | YES     |
    | events_waits_history                         | YES     |
    | events_waits_history_long                    | YES     |
    | events_waits_summary_by_thread_by_event_name | YES     |
    | events_waits_summary_by_event_name           | YES     |
    | events_waits_summary_by_instance             | YES     |
    | file_summary_by_event_name                   | YES     |
    | file_summary_by_instance                     | YES     |
    +----------------------------------------------+---------+
    

Filtering can be done at different stages of performance monitoring:

  • Pre-filtering. This is done by modifying Performance Schema configuration so that only certain types of events are collected from producers, and collected events update only certain consumers. To do this, enable or disable instruments or consumers. Pre-filtering is done by the Performance Schema and has a global effect that applies to all users.

    Reasons to use pre-filtering:

    • To reduce overhead. Performance Schema overhead should be minimal even with all instruments enabled, but perhaps you want to reduce it further. Or you do not care about timing events and want to disable the timing code to eliminate timing overhead.

    • To avoid filling the current-events or history tables with events in which you have no interest. Pre-filtering leaves more room in these tables for instances of rows for enabled instrument types. If you enable only file instruments with pre-filtering, no rows are collected for nonfile instruments. With post-filtering, nonfile events are collected, leaving fewer rows for file events.

    • To avoid maintaining some kinds of event tables. If you disable a consumer, the server does not spend time maintaining destinations for that consumer. For example, if you do not care about event histories, you can disable the history table consumers to improve performance.

  • Post-filtering. This involves the use of WHERE clauses in queries that select information from Performance Schema tables, to specify which of the available events you want to see. Post-filtering is performed on a per-user basis because individual users select which of the available events are of interest.

    Reasons to use post-filtering:

    • To avoid making decisions for individual users about which event information is of interest.

    • To use the Performance Schema to investigate a performance issue when the restrictions to impose using pre-filtering are not known in advance.

The following sections provide more detail about pre-filtering and provide guidelines for naming instruments or consumers in filtering operations. For information about writing queries to retrieve information (post-filtering), see Chapter 4, Performance Schema Queries.
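
As a brief illustration of post-filtering, the following sketch leaves the monitoring configuration untouched and uses a WHERE clause to select only file I/O events from the collected history. The choice of table and columns is arbitrary.

```sql
-- Post-filtering: all events are collected, but this user looks only
-- at file I/O events. Other users are not affected.
SELECT THREAD_ID, EVENT_ID, EVENT_NAME, SOURCE, TIMER_WAIT
  FROM performance_schema.events_waits_history
 WHERE EVENT_NAME LIKE 'wait/io/file/%'
 ORDER BY THREAD_ID, EVENT_ID;
```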

3.3.3 Event Pre-Filtering

Pre-filtering is done by modifying Performance Schema configuration so that only certain types of events are collected from producers, and collected events update only certain consumers. This type of filtering is done by the Performance Schema and has a global effect that applies to all users.

Pre-filtering can be applied to either the producer or consumer stage of event processing:

  • To affect pre-filtering at the producer stage, modify the setup_instruments table. An instrument can be enabled or disabled by setting its ENABLED value to YES or NO. Whether an instrument collects timing information is controlled by setting its TIMED value to YES or NO.

  • To affect pre-filtering at the consumer stage, modify the setup_consumers table. A consumer can be enabled or disabled by setting its ENABLED value to YES or NO.

Here are some examples that show the types of pre-filtering operations available:

  • Disable all instruments:

    mysql> UPDATE setup_instruments SET ENABLED = 'NO';
    

    Now no events will be collected. This change, like other pre-filtering operations, affects other users as well, even if they want to see event information.

  • Disable all file instruments, adding them to the current set of disabled instruments:

    mysql> UPDATE setup_instruments SET ENABLED = 'NO'
        -> WHERE NAME LIKE 'wait/io/file/%';
    
  • Disable only file instruments, enable all other instruments:

    mysql> UPDATE setup_instruments
        -> SET ENABLED = IF(NAME LIKE 'wait/io/file/%', 'NO', 'YES');
    

    The preceding statements use the LIKE operator and the pattern 'wait/io/file/%' to match all instrument names that begin with 'wait/io/file/'. For additional information about specifying patterns to select instruments, see Section 3.3.4, “Naming Instruments or Consumers for Filtering Operations”.

  • Enable all instruments except those in the mysys library:

    mysql> UPDATE setup_instruments
        -> SET ENABLED = CASE WHEN NAME LIKE '%/mysys/%' THEN 'NO' ELSE 'YES' END;
    
  • Disable a specific instrument:

    mysql> UPDATE setup_instruments SET ENABLED = 'NO'
        -> WHERE NAME = 'wait/synch/mutex/mysys/TMPDIR_mutex';
    
  • To toggle the state of an instrument, flip its ENABLED value:

    mysql> UPDATE setup_instruments
        -> SET ENABLED = IF(ENABLED = 'YES', 'NO', 'YES')
        -> WHERE NAME = 'wait/synch/mutex/mysys/TMPDIR_mutex';
    
  • Disable timing for all events:

    mysql> UPDATE setup_instruments SET TIMED = 'NO';
    

Setting the TIMED column for instruments affects Performance Schema table contents as described in Section 3.3.1, “Performance Schema Event Timing”.

When you change the monitoring configuration, the Performance Schema does not flush the history tables. Events already collected remain in the current-events and history tables until displaced by newer events. If you disable instruments, you might need to wait a while before events for them are displaced by newer events of interest. Alternatively, use TRUNCATE TABLE to empty the history tables.

After making instrumentation changes, you might want to truncate the summary tables to clear aggregate information for previously collected events. The effect of TRUNCATE TABLE for summary tables is to reset the summary columns to 0 or NULL, not to remove rows.
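
For example, after changing which instruments are enabled, a sequence like the following clears old events and resets aggregate values. It is a sketch. The tables listed are a sample of those named in this chapter, and you would truncate whichever ones matter for your analysis.

```sql
-- Empty the history tables so that only newly collected events appear.
TRUNCATE TABLE performance_schema.events_waits_history;
TRUNCATE TABLE performance_schema.events_waits_history_long;

-- Reset summary columns (rows remain, values return to 0 or NULL).
TRUNCATE TABLE performance_schema.events_waits_summary_by_event_name;
TRUNCATE TABLE performance_schema.file_summary_by_event_name;
```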

If you disable a consumer, the server does not spend time maintaining destinations for that consumer. For example, if you do not care about historical event information, disable the history consumers:

mysql> UPDATE setup_consumers
    -> SET ENABLED = 'NO' WHERE NAME LIKE '%history%';
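
After a series of pre-filtering changes, it is easy to lose track of the resulting configuration. The following sketch summarizes the current state of both setup tables. The alias instrument_count is illustrative.

```sql
-- How many instruments are in each ENABLED/TIMED combination.
SELECT ENABLED, TIMED, COUNT(*) AS instrument_count
  FROM performance_schema.setup_instruments
 GROUP BY ENABLED, TIMED;

-- Which consumers are currently enabled or disabled.
SELECT NAME, ENABLED
  FROM performance_schema.setup_consumers;
```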

3.3.4 Naming Instruments or Consumers for Filtering Operations

Names given for filtering operations can be as specific or general as required. To indicate a single instrument or consumer, specify its name in full:

mysql> UPDATE setup_instruments
    -> SET ENABLED = 'NO'
    -> WHERE NAME = 'wait/synch/mutex/myisammrg/MYRG_INFO::mutex';
mysql> UPDATE setup_consumers
    -> SET ENABLED = 'NO' WHERE NAME = 'file_summary_by_instance';

To specify a group of instruments or consumers, use a pattern that matches the group members:

mysql> UPDATE setup_instruments
    -> SET ENABLED = 'NO'
    -> WHERE NAME LIKE 'wait/synch/mutex/%';
mysql> UPDATE setup_consumers
    -> SET ENABLED = 'NO' WHERE NAME LIKE '%history%';

If you use a pattern, it should be chosen so that it matches all the items of interest and no others. For example, to select all file I/O instruments, it is better to use a pattern that includes the entire instrument name prefix:

... WHERE NAME LIKE 'wait/io/file/%';

A pattern of '%/file/%' will match other instruments that have a component of '/file/' anywhere in the name. Even less suitable is the pattern '%file%' because it will match instruments with 'file' anywhere in the name, such as wait/synch/mutex/sql/LOCK_des_key_file.

To check which instrument or consumer names a pattern matches, perform a simple test:

mysql> SELECT NAME FROM setup_instruments WHERE NAME LIKE 'pattern';
mysql> SELECT NAME FROM setup_consumers WHERE NAME LIKE 'pattern';
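
A variation on this test shows exactly which unintended names a loose pattern would pick up. The following sketch lists instruments matched by '%file%' that are not file I/O instruments. The result depends on the instruments present in your server.

```sql
-- Names matched by the loose pattern but not by the precise prefix.
SELECT NAME
  FROM performance_schema.setup_instruments
 WHERE NAME LIKE '%file%'
   AND NAME NOT LIKE 'wait/io/file/%';
```

Any rows returned are instruments that an UPDATE using '%file%' would have modified unintentionally.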

For information about the types of names that are supported, see Chapter 5, Performance Schema Instrument Naming Conventions.

3.3.5 Determining What Is Instrumented

It is always possible to determine what instruments the Performance Schema includes by checking the setup_instruments table. For example, to see what file-related events are instrumented for the InnoDB storage engine, use this query:

mysql> SELECT * FROM setup_instruments WHERE NAME LIKE 'wait/io/file/innodb/%';
+--------------------------------------+---------+-------+
| NAME                                 | ENABLED | TIMED |
+--------------------------------------+---------+-------+
| wait/io/file/innodb/innodb_data_file | YES     | YES   |
| wait/io/file/innodb/innodb_log_file  | YES     | YES   |
| wait/io/file/innodb/innodb_temp_file | YES     | YES   |
+--------------------------------------+---------+-------+

An exhaustive description of precisely what is instrumented is not given in this documentation, for several reasons:

  • What is instrumented is the server code. Changes to this code occur often, which also affects the set of instruments.

  • It is not practical to list all the instruments because there are hundreds of them.

  • As described earlier, it is possible to find out by querying the setup_instruments table. This information is always up to date for your version of MySQL, includes instrumentation for any instrumented plugins you have installed that are not part of the core server, and can be used by automated tools.
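
Because there are hundreds of instruments, a grouped overview is often more useful than a full listing. The following sketch groups instruments by the first three components of their names. The grouping depth of 3 and the column aliases are arbitrary choices for illustration.

```sql
-- Count instruments per name prefix (for example, wait/io/file or
-- wait/synch/mutex) and show how many in each group are enabled.
SELECT SUBSTRING_INDEX(NAME, '/', 3) AS instrument_group,
       COUNT(*) AS instrument_count,
       SUM(ENABLED = 'YES') AS enabled_count
  FROM performance_schema.setup_instruments
 GROUP BY instrument_group
 ORDER BY instrument_group;
```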