InnoDB maintains a storage area
called the buffer pool
for caching data and indexes in memory. Knowing how the
InnoDB buffer pool works, and taking
advantage of it to keep frequently accessed data in memory, is
an important aspect of MySQL tuning. For information about how
the InnoDB buffer pool works, see
InnoDB Buffer Pool LRU Algorithm.
You can configure the various aspects of the
InnoDB buffer pool to improve performance.
Ideally, you set the size of the buffer pool to as large a value as practical, leaving enough memory for other processes on the server to run without excessive paging. The larger the buffer pool, the more
InnoDB acts like an in-memory database, reading data from disk once and then accessing the data from memory during subsequent reads. Buffer pool size is configured using the innodb_buffer_pool_size configuration option.

On 64-bit systems with large memory sizes, you can split the buffer pool into multiple parts to minimize contention for the memory structures among concurrent operations. For details, see Section 14.9.2.2, “Configuring Multiple Buffer Pool Instances”.
You can keep frequently accessed data in memory despite sudden spikes of activity for operations such as backups or reporting. For details, see Section 14.9.2.3, “Making the Buffer Pool Scan Resistant”.
You can control when and how
InnoDB performs read-ahead requests to prefetch pages into the buffer pool asynchronously, in anticipation that the pages will be needed soon. For details, see Section 14.9.2.4, “Configuring InnoDB Buffer Pool Prefetching (Read-Ahead)”.

You can control when background flushing of dirty pages occurs and whether or not InnoDB dynamically adjusts the rate of flushing based on workload. For details, see Section 14.9.2.5, “Configuring InnoDB Buffer Pool Flushing”.
InnoDB manages the buffer pool as a list,
using a variation of the least recently used (LRU) algorithm.
When room is needed to add a new page to the pool,
InnoDB evicts the least recently used page
and adds the new page to the middle of the list. This
“midpoint insertion strategy” treats the list as
two sublists:
At the head, a sublist of “new” (or “young”) pages that were accessed recently.
At the tail, a sublist of “old” pages that were accessed less recently.
This algorithm keeps pages that are heavily used by queries in the new sublist. The old sublist contains less-used pages; these pages are candidates for eviction.
The LRU algorithm operates as follows by default:
3/8 of the buffer pool is devoted to the old sublist.
The midpoint of the list is the boundary where the tail of the new sublist meets the head of the old sublist.
When InnoDB reads a page into the buffer pool, it initially inserts it at the midpoint (the head of the old sublist). A page can be read in because it is required for a user-specified operation such as an SQL query, or as part of a read-ahead operation performed automatically by InnoDB.

Accessing a page in the old sublist makes it “young”, moving it to the head of the buffer pool (the head of the new sublist). If the page was read in because it was required, the first access occurs immediately and the page is made young. If the page was read in due to read-ahead, the first access does not occur immediately (and might not occur at all before the page is evicted).
As the database operates, pages in the buffer pool that are not accessed “age” by moving toward the tail of the list. Pages in both the new and old sublists age as other pages are made new. Pages in the old sublist also age as pages are inserted at the midpoint. Eventually, a page that remains unused for long enough reaches the tail of the old sublist and is evicted.
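To see the list ordering directly, you can query the INFORMATION_SCHEMA.INNODB_BUFFER_PAGE_LRU table (available in MySQL 5.6 and later). The following is a minimal sketch; querying this table is expensive on a large buffer pool, so it is best reserved for test systems:

    -- Pages nearest the head of the LRU list have the lowest LRU_POSITION;
    -- IS_OLD shows whether a page currently sits in the old sublist.
    SELECT LRU_POSITION, TABLE_NAME, PAGE_TYPE, IS_OLD
    FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE_LRU
    ORDER BY LRU_POSITION
    LIMIT 10;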
By default, pages read by queries immediately move into the
new sublist, meaning they will stay in the buffer pool for a
long time. A table scan (such as performed for a
mysqldump operation, or a
SELECT statement with no
WHERE clause) can bring a large amount of
data into the buffer pool and evict an equivalent amount of
older data, even if the new data is never used again.
Similarly, pages that are loaded by the read-ahead background
thread and then accessed only once move to the head of the new
sublist. These situations can push frequently used pages to the
old sublist, where they become subject to eviction. For
information about optimizing this behavior, see
Section 14.9.2.3, “Making the Buffer Pool Scan Resistant”, and
Section 14.9.2.4, “Configuring InnoDB Buffer Pool Prefetching (Read-Ahead)”.
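One rough way to observe the effect of a scan is to compare the buffer pool status counters before and after it runs; the table name t1 below is hypothetical. Innodb_buffer_pool_reads counts logical read requests that could not be satisfied from the pool and went to disk:

    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
    SELECT COUNT(*) FROM t1;    -- full scan: no WHERE clause
    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';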
InnoDB Standard Monitor output contains
several fields in the BUFFER POOL AND
MEMORY section that pertain to operation of the
buffer pool LRU algorithm. For details, see
Section 14.9.2.6, “Monitoring the Buffer Pool Using the InnoDB Standard Monitor”.
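For example, from the mysql client (the exact field names vary somewhat between versions):

    -- In the BUFFER POOL AND MEMORY section, look for fields such as
    -- "Old database pages" and "Pages made young ... not young".
    SHOW ENGINE INNODB STATUS\G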
Several configuration options affect different aspects of the
InnoDB buffer pool.
innodb_buffer_pool_size

Specifies the size of the buffer pool. If the buffer pool is small and you have sufficient memory, making the buffer pool larger can improve performance by reducing the amount of disk I/O needed as queries access InnoDB tables.
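As an illustrative sketch, a hypothetical configuration for a dedicated database server with 16GB of RAM (adjust the value to leave headroom for other processes on your system):

    -- my.cnf (illustrative value):
    --   [mysqld]
    --   innodb_buffer_pool_size = 12G
    -- Verify the value in use (reported in bytes):
    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';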
innodb_buffer_pool_instances

Divides the buffer pool into a user-specified number of separate regions, each with its own LRU list and related data structures, to reduce contention during concurrent memory read and write operations. This option takes effect only when you set innodb_buffer_pool_size to a value of 1GB or more. The total size you specify is divided among all the buffer pools. For best efficiency, specify a combination of innodb_buffer_pool_instances and innodb_buffer_pool_size so that each buffer pool instance is at least 1 gigabyte. See Section 14.9.2.2, “Configuring Multiple Buffer Pool Instances” for more information.

innodb_old_blocks_pct

Specifies the approximate percentage of the buffer pool that InnoDB uses for the old block sublist. The range of values is 5 to 95. The default value is 37 (that is, 3/8 of the pool). See Section 14.9.2.3, “Making the Buffer Pool Scan Resistant” for more information.
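For instance (20 is an illustrative value; the option is dynamic, so it can be changed at runtime):

    -- Shrink the old sublist from the default 37% of the pool to 20%:
    SET GLOBAL innodb_old_blocks_pct = 20;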
innodb_old_blocks_time

Specifies how long in milliseconds (ms) a page inserted into the old sublist must stay there after its first access before it can be moved to the new sublist. If the value is 0, a page inserted into the old sublist moves immediately to the new sublist the first time it is accessed, no matter how soon after insertion the access occurs. If the value is greater than 0, pages remain in the old sublist until an access occurs at least that many milliseconds after the first access. For example, a value of 1000 causes pages to stay in the old sublist for 1 second after the first access before they become eligible to move to the new sublist.

Setting innodb_old_blocks_time greater than 0 prevents one-time table scans from flooding the new sublist with pages used only for the scan. Rows in a page read in for a scan are accessed many times in rapid succession, but the page is unused after that. If innodb_old_blocks_time is set to a value greater than the time to process the page, the page remains in the “old” sublist and ages to the tail of the list to be evicted quickly. This way, pages used only for a one-time scan do not act to the detriment of heavily used pages in the new sublist.

innodb_old_blocks_time can be set at runtime, so you can change it temporarily while performing operations such as table scans and dumps:

    SET GLOBAL innodb_old_blocks_time = 1000;
    ... perform queries that scan tables ...
    SET GLOBAL innodb_old_blocks_time = 0;

This strategy does not apply if your intent is to “warm up” the buffer pool by filling it with a table's content. For example, benchmark tests often perform a table or index scan at server startup, because that data would normally be in the buffer pool after a period of normal use. In this case, leave innodb_old_blocks_time set to 0, at least until the warmup phase is complete. See Section 14.9.2.3, “Making the Buffer Pool Scan Resistant” for more information.
innodb_read_ahead_threshold

Controls the sensitivity of linear read-ahead that InnoDB uses to prefetch pages into the buffer pool. See Section 14.9.2.4, “Configuring InnoDB Buffer Pool Prefetching (Read-Ahead)” for more information.
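A sketch of tightening the threshold at runtime (32 is an illustrative value; the default requires 56 sequential page reads within a 64-page extent):

    -- Trigger linear read-ahead after 32 sequential page reads in an
    -- extent, instead of the default 56:
    SET GLOBAL innodb_read_ahead_threshold = 32;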
innodb_random_read_ahead

Enables the random read-ahead technique for prefetching pages into the buffer pool. Random read-ahead is a technique that predicts when pages might be needed soon based on pages already in the buffer pool, regardless of the order in which those pages were read. innodb_random_read_ahead is disabled by default. See Section 14.9.2.4, “Configuring InnoDB Buffer Pool Prefetching (Read-Ahead)” for more information.
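To experiment with it on a running server (the option is dynamic, so no restart is needed):

    -- Random read-ahead is OFF by default; enable it to test its
    -- effect on your workload:
    SET GLOBAL innodb_random_read_ahead = ON;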
innodb_adaptive_flushing

Specifies whether to dynamically adjust the rate of flushing dirty pages in the buffer pool based on workload. Adjusting the flush rate dynamically is intended to avoid bursts of I/O activity. This setting is enabled by default.
See Section 14.9.2.5, “Configuring InnoDB Buffer Pool Flushing” for more information.
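A minimal runtime sketch (the option is dynamic):

    -- Adaptive flushing is enabled by default; disable it if you want
    -- flushing governed only by innodb_max_dirty_pages_pct:
    SET GLOBAL innodb_adaptive_flushing = OFF;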
innodb_max_dirty_pages_pct

InnoDB tries to flush data from the buffer pool so that the percentage of dirty pages does not exceed this value. Specify an integer in the range from 0 to 99. The default value is 75. See Section 14.9.2.5, “Configuring InnoDB Buffer Pool Flushing” for more information.
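For example (60 is an illustrative value; the option is dynamic):

    -- Lower the dirty-page target from the default 75% to 60%:
    SET GLOBAL innodb_max_dirty_pages_pct = 60;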