Slow performance due to SQL Server 2014 Buffer Pool Extension (BPE) and serial queries (MAXDOP 1)

The SQL Server 2014 Buffer Pool Extension feature can extend the buffer pool space on to Solid State Drive (SSD), where data and index pages can be cached. There is no possibility of data loss in the event of SSD failure as the cached data is clean (unmodified pages).

Buffer Pool Extension can provide better random I/O performance by decreasing the latency for such I/O operations using SSD.
Buffer Pool Extension is not suitable for large sequential reads with lesser degree of parallelism. When the required data pages are in BPE, single threads are performing only 1 I/O operation at a given time and that I/O operation is limited to 1 page (8KB), leading to too many small I/Os and decreased throughput. The wait type used while waiting for BPE I/Os to complete is ‘EC’.

To receive the example scripts used in the video, subscribe to our mailing list:

If a table’s size is 16GB and, let’s say, about half of that table is in BPE, a serial query (MAXDOP = 1) will access this 8GB, 1 page at a time. The reason being that there is no purpose in copying the 8GB that is in the BPE back to the actual buffer pool as it will have to replace useful pages that are already there. To read pages from the BPE without affecting the actual buffer pool, every thread is using 1 page of memory buffer to read 1 page from the BPE, processes that page and then issues the next read request to the BPE and so on. If you observe the read queue length of the SSD volume, you will notice, while such a serial query is executing, the read queue length will never exceed 1 and average bytes per read will always be 8KB.

Such queries will be many times faster without Buffer Pool Extension; BPE can slow down such queries. In our example, with 8GB total memory for SQL Server and 16GB BPE, a range scan covering half of a 16GB table without BPE takes about 15 seconds, whereas with BPE it takes about a minute with MAXDOP = 1. A full scan on such a table without BPE takes about 30 seconds, whereas with BPE it takes more than 5 to 10 minutes in some instances with MAXDOP = 1. The reason being that SQL Server’s access to spinning disks leads to better throughput as the I/Os are not limited to 8KB, they can be up to 512KB (provided the data pages are not fragmented) and outstanding I/Os are not limited to 1.

The data in the Buffer Pool Extension related to a table might not be contiguous and hence it might not be possible to issue larger I/Os.

The best way to mitigate this issue is to use parallel query execution with a large number of worker threads when performing large table or range scans when BPE is enabled. This effectively increases the number of outstanding I/Os from 1 to the number of parallel threads executing the query and keeps the SSD busy with many outstanding I/Os, even though the I/O size does not change and remains at 8KB.