Summary
A database system is first and foremost responsible for the accurate storage and retrieval of data, even in the event of unexpected system failures.
The system must guarantee the atomicity and durability of transactions, while accounting for current execution, multiple transactions, and various failure points. This is often referred to as the ACID (Atomicity, Consistency, Isolation, and Durability) properties. This article addresses the implications of a disk drive caches. It is recommended that you read the following articles in the Microsoft Knowledge Base for further clarification on caching and alternate failure mode discussions:86903 SQL Server and caching disk controllers
230785 SQL Server 7.0 and SQL Server 2000 logging and data storage algorithms extend data reliability The following documents are also recommended:
-
SQL Server 2000 I/O Basics:http://technet.microsoft.com/en-us/library/cc966500.aspx
-
SQL Server I/O Basics Chapter 2:http://www.microsoft.com/technet/prodtechnol/sql/2005/iobasics.mspx
Note These two documents apply to all currently supported versions of SQL Server.
More Information
Microsoft SQL Server and many mainstream database products on the market today use the Write-Ahead Logging (WAL) protocol.
Write-Ahead Logging (WAL) Protocol
The term protocol is an excellent way to describe WAL. It is a specific and defined set of implementation steps necessary to ensure that data is stored and exchanged properly and can be recovered to a known state in the event of a failure. Just as a network contains a defined protocol to exchange data in a consistent and protected manner, so too does the WAL describe the protocol to protect data. All versions of SQL Server open the log and data files using the Win32 CreateFile function. The dwFlagsAndAttributes member includes the FILE_FLAG_WRITE_THROUGH option when opened by SQL Server.FILE_FLAG_WRITE_THROUGH
This option instructs the system to write through any intermediate cache and go directly to disk. The system can still cache write operations, but cannot lazily flush them. The FILE_FLAG_WRITE_THROUGH option ensures that when a write operation returns successful completion the data is correctly stored in stable storage. This aligns with the Write Ahead Logging (WAL) protocol specification to ensure the data. Many disk drives (SATA, ATA, SCSI and IDE based) contain onboard caches of 512 KB, 1 MB, and larger. Drive caches usually rely on a capacitor and not a battery-backed solution. These caching mechanisms cannot guarantee writes across a power cycle or similar failure point. They only guarantee the completion of the sector write operations. As the drives continue to grow in size, the caches become larger, and they can expose larger amounts of data during a failure. Enhanced caching controller systems disable on-disk cache and provide a functional battery-backed caching solution. These caches can maintain the data in the cache for several days and even allow the caching card to be placed in a second computer. When power is properly restored, the unwritten data is completely flushed before any further data access is allowed. Many of them allow percentage of read versus write cache to be established for optimal performance. Some contain large memory storage areas. In fact, for a very specific segment of the market, some hardware vendors provide high-end battery-backed disk caching systems with multiple gigabytes of cache. These can significantly improve database performance. I/O transfers that are performed without the use of a cache can be significantly longer because of hard drive spin rates, the mechanical time that is needed to move the drive heads, and other limiting factors. SQL Server installations are targeted at systems that provide caching controllers. These controllers disable the on-disk caches and provide stable media caches to satisfy SQL Server I/O requirements. They avoid performance issues related to disk seek and write times by using the various optimizations of the caching controller. There are many types of subsystem implementations. RAID and SAN are two examples of these types of subsystem implementations. These systems are typically built with SCSI-based drives. There are several reasons for this. The following section generically describes high level drive considerations. SCSI drives:-
Are typically manufactured for heavy duty use.
-
Are typically targeted at multiuser, server-based implementations.
-
Typically have better meantime to failure rates than other implementations.
-
Contain sophisticated heuristics to help predict imminent failures.
Other drive implementations, such as IDE, ATA, and SATA:
-
Are typically manufactured for light and medium duty use.
-
Are typically targeted at single user-based applications.
-
Some newer implementations contain sophisticated heuristics to help predict imminent failures.
Non-SCSI, desktop-based controllers require more main processor (CPU) bandwidth, and are frequently limited by a single active command. For example, when a non-SCSI drive is adjusting a bad block, the drive requires that the host commands wait. The ATA bus presents another example. The ATA bus supports 2 devices, but only a single command can be active. This leaves one drive idle while the other drive services the pending command. Raid systems built on desktop technologies can all experience these symptoms and be greatly affected by the slowest responder. Unless these systems use advanced designs, their performance will not be as efficient as the performance of SCSI-based systems.
There are situations in which a desktop-based drive or array is an appropriate low cost solution. For example, if you set up a read-only database for reporting, you should not encounter many of the performance factors of an OLTP database when disk caching is disabled. Drive sizes continue to increase. Low cost, high capacity drives can be very appealing. But when you configure the drive for SQL Server and your business response time needs, you should carefully consider the following issues:-
Access path design
-
The requirement to disable the on-disk cache
The following table provides high level comments. The comment information is based on common manufacture configurations.
Drive type System base |
Comments |
IDE and ATA |
|
IDE/Portable computer |
|
SATA |
|
SCSI |
|
Serial Attached SCSI (SAS) includes advanced queuing up to 256 levels. SAS also includes features such as head of queue and out of order queuing. The SAS backplane is designed in a way that enables the use of both SAS and SATA drives within the same system. The benefits of the SAS design are obvious.
Your SQL Server installation depends on the controller's ability to disable the on-disk cache and to provide a stable I/O cache. Writing data out of order to various disks is not a hindrance to SQL Server as long as the controller provides the correct stable media caching capabilities. The complexity of the controller design increases with advanced data security techniques such as mirroring.To fully secure your data, you should ensure that all data caching is properly handled. In many situations, this means you must disable the write caching of the disk drive. NOTE Ensure that any alternate caching mechanism can properly handle multiple types of failure. Microsoft has performed testing on several SCSI and IDE drives using the SQLIOSim utility. This utility simulates heavy asynchronous read/write activity to a simulated data device and log device. Test performance statistics show the average write operations per second between 50 and 70 for a drive with disabled write caching and an RPM range between 5,200 and 7,200. For additional information and details about SQLIOSim, see the following article in the Microsoft Knowledge Base:
231619 How to use the SQLIOSim utility to simulate SQL Server activity on a disk subsystem Many PC manufactures (for example, Compaq, Dell, Gateway, or HP) order the drives with the write cache disabled. However, testing shows that this may not always be the case so you should always test it completely. Note If you have any question about the caching status of your disk drive, contact the manufacturer and obtain the appropriate utility or jumper settings to disable write caching operations.
SQL Server requires systems to support 'guaranteed delivery to stable media' as outlined as outlined under the SQL Server I/O Reliability Program Requirements. For more information about the input and output requirements for the SQL Server database engine, click the following article number to view the article in the Microsoft Knowledge Base:
967576 Microsoft SQL Server Database Engine Input/Output Requirements