You need to consider the following five factors when setting up database mirroring in SQL Server environment. These factors not only affect the overall performance of the systems involved in database mirroring but also limit the number of databases that can be mirrored on the servers.
- Amount of RAM available on the principal and mirror servers (RAM).
- Processing power of the principal and mirror servers (CPU).
- Bandwidth available for the I/O subsystem on the principal and mirror servers
- Network bandwidth between the two servers.
- Volume of transactions on the principal database that generate transaction log records (i.e. transactions that change the database in any way).
Each of the above issues is discussed in detail below:
- RAM: For further information on how the available physical memory affects the performance of SQL Server refer to the following articles in Microsoft Knowledge Base.
- CPU: Database mirroring is limited by the number of threads available in SQL Server which in turn is directly dependent on the computing power of the machine. By default in SQL Server 2005 and above, the “max worker threads” is 0 which means a dynamic value. This value is automatically adjusted by SQL Server when the computing power changes. SQL Server uses the following formula when calculating max number of worker threads
For 32 bit operating system:
- Total available logical CPU’s <= 4 : max worker threads = 256
- Total available logical CPU’s > 4 : max worker threads = 256 + ((logical CPUS’s - 4) * 8)
For 64 bit operating system:
- Total available logical CPU’s <= 4 : max worker threads = 512
- Total available logical CPU’s > 4 : max worker threads = 512 + ((logical CPUS’s - 4) * 16)
You can determine the amount of worker threads available on your SQL Server using the sys.dm_os_sys_info (column max_workers_count) dynamic management view (DMV).
You can determine the amount of worker threads used on your SQL Server using the sys.dm_os_schedulers (sum of rows in column current_workers_count) dynamic management view (DMV).
Database mirroring has the following requirements for worker threads
- Principal server: 1 global thread and 2 threads per each of the mirrored databases.
- Mirror server:
- 64 bit architectures: 1 global thread, 2 threads per mirrored databases and one additional thread for each mirrored database for every 4 processor cores.
- On a 4 CPU server it requires 1 global thread+ (2+ 1) threads /mirrored db.
- On a 6-8 CPU server it requires 1 global thread + (2+2) threads /mirrored db.
- For 32 bit architectures: 1 global thread + 10 threads per mirrored database.
- Witness Server: 2 global threads.
In summary database mirroring requires 2 threads on the principal server and at least 3 to 4 worker threads on the mirror server for each of the mirrored database in addition to 1 global thread on each of them. It is important to note that these calculations only reflect the number of worker threads required for administering database mirroring and additional threads are required for the core database server activity. If you put everything discussed above into perspective, on an x64 system with 32 CPUs we have:
- Max worker threads = 512+ (32 - 4) * 16 = 960
- Max number of mirror databases (The following calculation assumes that you mirror all the databases on one primary and one mirror server. Since the number of thread needed on mirror server are greater than those on the primary, we need to use the mirror server values in our calculations).
- # of threads required per mirrored database = (2+32/4) = 10 threads per database
- Max number of databases that can be mirrored = 960/10 = 96
However the full amount of 960 threads are not available to perform mirroring activities as some threads are assigned to core database activity on the server. From the above calculations it should be clear that
- The more activity you have on the server, the fewer will be the number of databases you are able to mirror.
- The more the number of mirror databases, the fewer will be the number of worker threads that are available to service core database requests and hence the lower will be the overall performance of the system.
As you can see this limit will even be lower on 32 bit systems, since the total number of worker threads available on the same kind of system is just 480.
- HDD IO: If transactions are not hardened ont he log drive on the mirror fast enough and you are using high safety, the principal might have to wait for the mirror to acknowledge hardening of log records before transactions can commit, resulting in degraded performance. Also if you decide to mirror a database between two SQL Server instances on the same machine the amount of writes will double thus a disk queue might form (the ideal disk queue length is 0), SQL Server will have to wait for I/Os to complete before transactions can commit, and in extreme cases you might also notice events related to this issue being recorded inside SQL Server errorlogs :
2008-04-22 16:30:02.140 spid6s SQL Server has encountered 2 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [F:\sql data files\xxx.MDF] in database [xxx] (5).
You can determine the amount disk I/O bottlenecks on your SQL Server in the DMV sys.dm_os_schedulers (column pending_disk_io_count).
Tuning the I/O subsystem to allow increased throughput can help alleviate this (e.g. by using a RAID level that offers higher performance, separating databases onto separate RAID arrays), depending on the sophistication of the I/O subsystem. Also make sure that your system is not under memory pressure as this will generate added I/O pressure. Also make sure that your system is not under memory pressure as this will generate added IO pressure.
- NETWORK: In order to decide whether the amount of activity is pushing the limits of network bandwidth you need to setup mirroring and evaluate the network bandwidth needs with different scenarios (high safety, high performance) and load. The following Performance Monitor counters can be used in order to establish if the bandwidth is enough or fully utilized:
- Network Interface Object.
- SQLServer:DatabaseMirroring Log Bytes Sent/sec counter on the principal database. This is the rate of log being sent.
- SQLServer:DatabaseMirroring Log Compressed Bytes Sent/sec counter on the principal database (New in 2008). This is. compressed size bytes sent and is a subset of the Log Bytes Sent/sec counter.
- SQLServer:DatabaseMirroring Log Send Queue KB counter on the principal database. This is the amount of log yet to be sent to the mirror.
- SQLServer:DatabaseMirroring Log Harden Time (ms) counter on the mirror database (New in 2008). This is the time to write the received log to disk.
- SQLServer:DatabaseMirroring Log Bytes Received/sec counter on the mirror database. This is the rate of log being received.
- SQLServer:DatabaseMirroring Log Compressed Bytes Rcvd/sec counter on the mirror database (New in 2008). This is compressed size bytes received and is a subset of the Log Bytes Received/sec counter.
- SQLServer:DatabaseMirroring Redo Bytes/sec counter on the mirror database. This is the rate that log is being redone.
- SQLServer:DatabaseMirroring Redo Queue KB counter on the mirror database. This is the amount of log that is yet to be redone
- SQLServer:DatabaseMirroring Send/Receive Ack Time counter on the mirror database.
- SQLServer:Databases Transactions/sec counter on the mirror database.
Here's a little more explanation and what you can use these new performance counters to troubleshoot (Reference: Paul Randal's blog)
- Log Compressed Bytes Sent/sec
- These are simple counters and can be used with the Log Bytes Sent/sec and Rcvd/sec counters to determine the compression ratio. Log stream compression is another enhancement in SQL Server 2008 and it is enabled by default.
- Log Harden Time (ms)
- This measures the delay between the mirror server receiving a chunk of transaction log and it being hardened on the mirror database's log disk (i.e. the delay before the chunk of transaction log becomes part of the redo queue on the mirror server).
- If this number is higher than normal it means the mirror database's log disk is more heavily loaded and may be becoming saturated.
- Send/Receive Ack Time
- As Books Online mentions, this can be used to measure network latency between the principal and mirror servers.
- If this value is larger than normal it means that there is a network bottleneck between the principal and mirror servers
You can also find some more information on some of these performance counters in SQL Server 2005 technical article on database mirroring.
If the implementation of mirroring for testing purposes is not an option you can use one of the following methods to get an idea about the performance of your network.
Use a third party tool like Iperf to evaluate your network's performance. The instructions to simulate database mirroring performance with this tool are as follows:
To get help with this tool run:iperf --help
On the server (principal) run:iperf -s -p5001 -fKB -w65536
- On the client (mirror) run:iperf -c <hostname or ip address of server> -d -t60 -w65536
Disclaimer: The information and the solution in this document represents the current view of Microsoft Corporation on these issues as of the date of publication. This solution is available through Microsoft or through a third-party provider. Microsoft does not specifically recommend any third-party provider or third-party solution that this article might describe. There might also be other third-party providers or third-party solutions that this article does not describe. Because Microsoft must respond to changing market conditions, this information should not be interpreted to be a commitment by Microsoft. Microsoft cannot guarantee or endorse the accuracy of any information or of any solution that is presented by Microsoft or by any mentioned third-party provider.
Microsoft makes no warranties and excludes all representations, warranties, and conditions whether express, implied, or statutory. These include but are not limited to representations, warranties, or conditions of title, non-infringement, satisfactory condition, merchantability, and fitness for a particular purpose, with regard to any service, solution, product, or any other materials or information. In no event will Microsoft be liable for any third-party solution that this article mentions.
Verify that there is sufficient headroom (less than 66 percent utilized, as a rule) on both the CPU and DISK IO on the production server, and the same levels can be supported by the eventual mirror database server.
Check the network latency using the ping command and explore ways to reduce the same when possible.
Periodically check the value of is_send_flow_controlled column in the sys.dm_db_mirroring_connection dynamic management view (DMV). If you notice that this value is 1 most of the time, then it is a very good indication that the network latency is affecting mirroring throughput.
To mitigate the impact of network latency on the database mirroring throughput, you should consider the following:
- Gigabit network cards
- Dedicated NICs for mirroring.
- Using the 'Log stream compression' feature in SQL Server 2008 environments.(which is enabled by default). This will cause the mirroring traffic to be compressed before being sent over the network. Though this reduces network bandwidth requirements for mirroring it will cause the CPU utilization to increase as it will spend time in compression and decompression, and possibly a heavier workload on the principal.
- DATABASE ACTIVITY: The amount of activity on the SQL Server affects the following:
- Number of worker threads that are currently in use (CPU).
- Network bandwidth(NETWORK).
- Amount of writes/reads on the storage media (HDDIO).
- Available memory on the system.(RAM)
As you can see the database activity affects all the core components that the database mirroring depends on and hence it is imperative that you factor this in to your planning when setting up database mirroring in your environment. It is further recommended that you minimize index maintenance activity on the databases that are mirrored by spreading the maintenance tasks over a number of days since it typically generates a backlog of log send queue on the principal database.
Steps to reproduce:
Mirror a database in a virtual environment with only 1 CPU and monitor the amount of used worker threads. You will notice that because there is no activity on the server you will be able to mirror more than 20 databases. Mirror as many databases as possible and then simulate load on the server using a benchmark utility for SQL Server. You can then use DMV’s (performance dashboard) and/or performance monitor counters to evaluate the impact of having so many mirrored databases and active users on the server. You can also remove some of the mirroring to see how performance improves as worker threads are freed.
SQL Server 2008:
SQL Server 2005