Description of disaster recovery options for Microsoft SQL Server
Disaster recovery is a process that you can use to help recover information systems and data, if a disaster occurs.
Some examples of disasters include a natural or a man-made disaster such as a fire, or a technical disaster such as a two-disk failure in a Redundant Array of Independent Disks (RAID) 5 array.
Disaster recovery planning is the work that is devoted to preparing all the actions that must occur in response to a disaster. The planning includes the selection of a strategy to help recover valuable data. The selection of the appropriate disaster recovery strategy depends on your business requirements.
Note The solutions that are discussed in this article only provide general descriptions of the technologies that you can use. These general descriptions are for comparing the various disaster recovery methods and the disaster recovery plans. Before you decide on which disaster recovery solution is best for you, make sure that you look at each of the suggested disaster recovery solutions in more detail. After discussing each disaster recovery solution, this article contains links where you can find additional information about that solution.
Failover clusteringMicrosoft SQL Server 2000 failover clustering is designed to failover automatically if a hardware failure or a software failure occurs. You can use SQL Server 2000 failover clustering to create a failover cluster for a single instance of SQL Server 2000 or for multiple instances of SQL Server 2000. Failover clustering allows a database system to automatically switch the processing of an instance of SQL Server from a failed server to a working server. Therefore, failover clustering is helpful if an operating system failure occurs or if you perform a planned upgrade of the database system resources. Also, failover clustering increases server availability with no downtime.
Because failover clustering is designed for high server availability with almost no server downtime, the clustered nodes should be geographically close to each other. Failover clustering may not be useful if a disk array failure occurs.
Note To implement failover clustering, you must install Microsoft SQL Server 2000 Enterprise Edition.
The following operating systems support failover clustering:
- Microsoft Windows NT 4.0, Enterprise Edition
- Microsoft Windows 2000 Advanced Server
- Microsoft Windows 2000 Datacenter Server
- Microsoft Windows Server 2003, Enterprise Edition
- Microsoft Windows Server 2003, Datacenter Edition
For more information about MSCS and its installation, click the following article number to view the article in the Microsoft Knowledge Base:
Advantage and disadvantages of using failover clustering
- You incur a greater expense. The maintenance of two servers is two times the cost of maintaining a single server. Because you have to maintain two servers at the same time, it is more expensive to install and maintain clustered nodes.
- Servers should be in the same location. If the branches of the organization are across the globe and the Active/Active clusters must be implemented in the branches, the networking and the storage infrastructure that you have to use is very different from a standard quorum device server cluster. Therefore, although it is possible, it is best not to use geographically distant servers.
- You have no protection against a disk array failure.
- Failover clustering does not allow you to create failover clusters at the database level or at the database object level, such as the table level.
Database mirroringDatabase mirroring is a primarily software solution for increasing database availability. You can only implement mirroring on a per-database basis. Mirroring only works with databases that use the full recovery model. The simple and bulk-logged recovery models do not support database mirroring. Therefore, all bulk operations are always fully logged. Database mirroring works with any supported database compatibility level.
Advantage and disadvantages of using database mirroring
- Database mirroring increases data protection.
- Database mirroring increases availability of a database.
- Database mirroring improves the availability of the production database during upgrades.
- The mirror database should be identical to the principal database. For example, all objects, logins, and permissions should be identical.
- Database mirroring involves the transfer of information from one computer to another computer over a network. Therefore, the security of the information that SQL Server transfers is very important.
Peer-to-peer transactional replicationPeer-to-peer transactional replication is designed for applications that might read or might modify the data in any database that participates in replication. Additionally, if any servers that host the databases are unavailable, you can modify the application to route traffic to the remaining servers. The remaining servers contain identical copies of the data.
Advantage and disadvantages of using peer-to-peer transactional replication
- Read performance is improved because you can spread activity across all nodes.
- Aggregate update performance, insert performance, and delete performance for the topology resembles the performance of a single node because all changes are propagated to all nodes.
- Peer-to-peer replication is available only in SQL Server 2005 Enterprise Edition.
- All participating databases must contain identical schemas and data.
- We recommend that each node use its own distribution database. This configuration eliminates the potential for SQL Server 2005 to have a single point of failure.
- You cannot include tables and other objects in multiple peer-to-peer publications within a single publication database.
- You must have a publication enabled for peer-to-peer replication before you create any subscriptions.
- You must initialize subscriptions by using a backup or by setting the value of the subscription synchronization type to replication support only.
- Peer-to-peer transactional replication does not provide conflict detection or conflict resolution.
- We recommend that you do not use identity columns.
Maintenance of a warm standby serverYou can create and maintain a warm standby server by using either of the following methods:
- Log shipping
- Transactional replication
Log shippingLog shipping is included in the resource kit for Microsoft SQL Server 7.0, and it is fully incorporated in the Microsoft SQL Server 2000 Enterprise Edition and in the Microsoft SQL Server 2000 Developer Edition. Log shipping uses a standby server that is not used during regular operations. A standby server is useful to help recover data if a disaster occurs. You can only use log shipping at the database level. You cannot use it at the instance level.
When a standby server is restoring transaction logs, the database is in exclusive mode and it is unusable. However, you can run batch reporting jobs between transaction log restorations or Database Console Commands (DBCC) checks to continuously verify the integrity of the standby server. For applications such as decision support servers that require continuous processing on a database server, log shipping is not an appropriate option.
The latency on the standby server is based on how frequently the transaction log backups are taken at the primary server and then applied at the standby server. If the primary server fails, you may lose the changes that were made by the transactions that occurred after your most recent transaction log backup.
For example, if transaction log backups are taken every 10 minutes, transactions during the most recent 10 minutes may be lost. This does not necessarily mean that the data updates that are made to the primary server during the latency period will be lost. Typically, new updates in the primary transaction log can be recovered and applied at the warm standby server with only a small delay in switching from the primary server to the standby server. The main purpose of log shipping is to maintain a warm standby server. If maintaining a warm standby server is your main objective, log shipping is likely to be more appropriate than the other solutions that this article discusses.
Advantages and disadvantages of using log shipping
- You can recover all database activities. The recovery includes any objects that were created such as tables and views. It also includes security changes such as the new users who were created and any permission changes.
- You can restore the database faster. The restoration of the database and the transaction log is based on low-level page formats. Therefore, log shipping speeds up the restoration process and results in the fast recovery of data.
- The database is unusable during the restoration process because the database is in exclusive mode on the standby server.
- There is a lack of granularity. During the restoration process, all the changes in the primary server are applied at the standby server. You cannot use log shipping to apply changes to a few tables and to reject the remaining changes.
- There is no automatic failover of applications. When the primary server fails because of a disaster, the standby server does not failover automatically. Therefore, you must explicitly redirect the applications that connect to the primary server to the standby (failover) server.
For more information about how to set up a warm standby server by using log shipping, click the following article numbers to view the articles in the Microsoft Knowledge Base:
Transactional replicationYou can also use transactional replication to maintain a warm standby server. Transactional replication replicates the data on one server (the publisher) to another server (the subscriber) with less latency than log shipping. You can implement transactional replication at the database object level such as the table level. Therefore, Microsoft recommends that you use transactional replication when you have less data to protect, and you must have a fast recovery plan.
You can use a push subscription to enforce transactional replication between two servers with the primary server as the publisher and the standby server as the subscriber. Transactional replication ensures data replication. When the publisher fails, the subscriber can be used.
This solution is vulnerable to the failure of the publisher and the subscriber at the same time. In such a scenario, you cannot protect the data. In all other scenarios such as the failure of a distributor or a subscriber, it is best to resynchronize the data in the subscriber with the data in the publisher.
You should use transactional replication to maintain a warm standby server only when you do not implement schema changes or you do not implement other changes to the database such as security changes that replication does not support.
Note Replication is not designed for the maintenance of warm standby servers. With replication, you can use replicated data at the subscriber to generate reports. You can also use replication for other general uses without having to perform processing on the relatively busy publisher.
Advantages and disadvantages of using transactional replication
- You can read data on a subscriber while you apply changes.
- Changes are applied with less latency.
Note This advantage may not be applicable if either of the following is true:
- Replication agents are not set to Continuous.
- Replication agents are stopped because of errors that may occur during replication.
- Schema changes or security changes that are performed at the publisher after establishing replication will not be available at the subscriber.
- The distributor in transactional replication uses an Open Database Connectivity (ODBC) connection or an OLE Database (OLEDB) connection to distribute data. However, log shipping uses the RESTORE TRANSACTION low-level Transact-SQL statement to distribute the transaction logs. A RESTORE TRANSACTION statement is much faster than an ODBC connection or an OLEDB connection.
- Typically, switching servers erases replication configurations. Therefore, you have to configure replication two times: When you switch to the subscriber.When you switch back to the publisher.
- If a disaster occurs, you must manually switch servers by redirecting all the applications to the subscriber.
Backup and restore featureThe Backup and Restore feature of SQL Server provides an important safeguard to help protect critical data that you store in SQL Server databases. You can create a copy of a database (a backup copy) by using the Backup and Restore feature, and then store the copy of the database in a location that is protected from the potential failure of the server that runs the instance of SQL Server. If you experience a database system failure or database corruption, you can then use the backup copy to re-create the database or to restore the database.
When you plan disaster recovery by using the Backup and Restore feature, also determine how critical the data in the database is. Additionally, determine the restoration requirements for the database. For example, determine the following restoration requirements:
- The point that you restore the database to. You have to decide which of the following two you want to do:Restore the database to the condition of the night before the failure.Restore the database to the condition of a point of time as close as possible to the time of failure.
- How long the database can be unavailable. Whether you must restore the database immediately.
You can only restore a database to the condition of the point of time where you performed the most recent backup. Transactions that occurred after that backup may be lost. Therefore, Microsoft recommends that you use the Backup and Restore feature only for non-mission-critical database applications.
Advantages and disadvantages of using the backup and restore feature
- You can back the database up to removable media to help protect against disk failures.
- You do not have to depend on the network as you do when you use failover clustering or log shipping.
- When you back up the database, you cannot perform operations such as table creation, index creation, database shrinking, or nonlogged operations.
- If a failure occurs, you may lose your most recent data.
- If a disaster occurs, you must manually restore the database.
For more information about the Backup and Restore feature, click the following article numbers to view the articles in the Microsoft Knowledge Base:
Disk redundancy of data by using a redundant array of independent disks (RAID)A RAID stores redundant data on multiple disks to provide greater reliability and less downtime for servers. RAID levels 0, 1, and 5 are generally used as recovery options for SQL Server. The RAID technologies that are mentioned allow for the failure and the consequent replacement of a single disk without the server going offline. If multiple disk failures occur, data may not be recoverable. Therefore, Microsoft recommends that you combine redundant data management with a Backup and Restore procedure to help make sure that you do not lose data if a hardware failure or other disaster occurs.
RAID 0 uses striping technology for faster access whereas RAID 1 uses mirroring technology for data reliability. A common technique used in relational database management involves using RAID 0 and RAID 1 together. In this technique, two identical striped arrays of drives are constantly updated so that the information that is stored on both the arrays is the same. If one array fails, the other array automatically takes over until the original array is brought back online.
RAID 5 (also known as striping with parity) uses a single striped disk array with parity bits written together with the data. When any one disk fails, the parity bits can be used to calculate the missing data until you replace the disk. When you replace the disk, you can use the parity information and the remaining data to re-create the data from the failed disk and to copy the re-created data to the new disk. All these operations occur without database system downtime. A RAID provides many other options and features for you to help make sure that your database systems experience as little downtime as possible.
Advantage and disadvantages of using a RAID
- It may take a long time to recover the data.
- If multiple disks fail, you may not be able to recover valuable data.
Article ID: 822400 - Last Review: 07/03/2012 02:59:00 - Revision: 6.0
- kbdisasterrec kbreplication kbreplmgr kbclustering kbinfo KB822400