SQL Server makes use of a cost-based optimizer that can be extremelysensitive to statistical information that is provided on tables andindexes. Without correct and up-to-date statistical information, SQL Servercan be challenged to determine the best execution plan for a particularquery.
Statistics maintained on each table in SQL Server to aid the optimizerin cost-based decision making include the:
- Number of rows in the table.
- Number of pages used by the table.
- Number of modifications made to the keys of the table since the last update to the statistics.
Additional information is stored for indexes, including (for each index):
- An equi-height histogram on the first column.
- Densities on all column prefixes.
- Average key length.
Statistics on indexes are automatically created whenever a new index isbuilt. In addition, it is now possible to create and maintain statistics onother columns as well.
To maintain the statistical information in an as up-to-date fashion aspossible, SQL Server introduces AutoStat, which, through SQL Server'smonitoring of table modifications, is capable of automatically updating thestatistics for a table when a certain change threshold has been reached.Additionally, SQL Server introduces auto-create-statistics, whichcauses the server to automatically generate all statistics required for theaccurate optimization of a specific query.
Determining when AutoStat generation is imminent
As mentioned above, AutoStat will automatically update the statistics fora particular table when a "change threshold" has been reached. Thesysindexes.rowmodctr column maintains a running total of all modificationsto a table that, over time, can adversely affect the query processor'sdecision making process. This counter is updated each time any of thefollowing events occurs:
- A single row insert is made.
- A single row delete is made.
- An update to an indexed column is made.
: TRUNCATE TABLE does not update rowmodctr.
After table statistics have been updated, the rowmodctr value is reset to 0and the table's statistics schema version is updated.
Further, in situations in which a stored procedure's execution plan istaken from cache and that plan is sensitive to statistics, the statisticsschema version will be compared to the current version. If there are newstatistics available, the plan for the stored procedure will be recompiled.
The basic algorithm for auto update statistics is:
- If the cardinality for a table is less than six and the table is in the tempdb database, auto update with every six modifications to the table.
- If the cardinality for a table is greater than 6, but less than or equal to 500, update status every 500 modifications.
- If the cardinality for a table is greater than 500, update statistics when (500 + 20 percent of the table) changes have occurred.
- For table variables, cardinality changes does not trigger auto update statistics.
: In this strictest sense, SQL Server counts cardinality as the number of rows in the table.NOTE
: In addition to cardinality, the selectivity of the predicate also affects AutoStats generation. This means that statistics may not be updated afer every 500 modifications if cardinality were < 500 or for every 20% of changes if cardinality were > 500. A scale up factor (value ranges from 1 to 4, 1 and 4 inclusive) is generated depending on the selectivity and a product of this factor and the number of changes as obtained from the algorithm would be the actual number of modifications required for AutoStats generation.
The above algorithm can be summarised in the form of a table:
_________________________________________________________________________________ Table Type | Empty Condition | Threshold When Empty |Threshold When Not Empty _________________________________________________________________________________ Permanent | < 500 rows | # of Changes >= 500 | # of Changes >= 500 + (20% of Cardinality)___________________________________________________________________________ Temporary | < 6 rows | # of Changes >= 6 | # of Changes >= 500 + (20% of Cardinality)___________________________________________________________________________TableVariables | Change in cardinality does not affect AutoStats generation.___________________________________________________________________________
The following are two examples to help demonstrate this concept:
Consider the authors table in the pubs database, which contains 23 rows andhas two indexes. The unique clustered index, UPKCL_auidind, is indexed onone column, au_id, and a composite nonclustered index, aunmind, has beencreated on the au_lname and au_fname columns. Because this table containsfewer than 500 rows, AutoStat will begin after 500 changes to the tabledata have occurred. The changes can be one of 500 or more inserts, deletes,changes to an indexed column such as au_lname, or any combination thereof.
You can, therefore, predict when UPDATE STATISTICS will be initiated bymonitoring the sysindexes.rowmodctr value, which will be incremented uponeach update. When it reaches or exceeds 500, you can expect UPDATESTATISTICS to be started.
Consider a second table, t2, that has a cardinality of 1,000. For tableswith greater than 500 rows, SQL Server will UPDATE STATISTICS when (500+ 20 percent) changes have been made. Doing the math, 20 percent of 1,000is 200, so you can expect to see AutoStat start after approximately 700modifications have been made to the table.
Automating Autostats determination
To automate the determination of when AutoStat will be run, you can pollthe sysindexes table and identify when table modifications are reaching thestarting point. The following is a basic algorithm for doing so:
if (sysindexes.rows > 500) if (sysindexes.rows * 0.20 >= sysindexes.rowmodctr && production hours) //500 change leeway begin disable autostats log autostats disable end else begin stats ok end else if (sysindexes.rowmodctr >= 425) //75 change leeway begin disable autostats log autostats disable end
You could later schedule a job to do the following:
- Run UPDATE STATISTICS against all tables for which you were forced to disable them during the day.
- Re-enable AutoStat, because each table's modification counter will have been reset to 0 when UPDATE STATISTICS was run.
Controlling whether UPDATE STATISTICS are run against a table
The most obvious solution to this question, when AutoStat has proven to beProblematic, is to disable auto statistic generation, thereby leaving thedatabase administrators free to schedule UPDATE STATISTICS during lessintrusive times. You can do this either by using the UPDATE STATISTICSstatement or the sp_autostats stored procedure. The syntax for the UPDATESTATISTICS statement is:
UPDATE STATISTICS <table>...with NORECOMPUTE
The syntax for the sp_autostats stored procedure is:
sp_autostats <table_name>, <stats_flag>, <index_name>
where <stats_flag> is either "on" or "off".
You can also use sp_dboption to disable the automatic occurrence of UPDATESTATISTICS or CREATE STATISTICS on a per-database level:
sp_dboption <dbname>,'auto update statistics', <on | off>
sp_dboption <dbname>,'auto create statistics', <on | off>
Controlling the number of concurrent UPDATE STATISTICS processes
Currently, short of disabling AutoStat for specific tables, it is notpossible to configure the number of automatic UPDATE STATISTICS statementsthat are being run concurrently (DCR 51539 has been filed for this). Theserver does, however, limit the number of concurrent UPDATE STATISTICSprocesses to four per processor.
Determining when Autostats are being run
You can use trace flag 205 to report when a statistics-dependent storedprocedure is being recompiled as a result of AutoStat. This trace flagwill write the following messages to the error log:
1998-10-15 11:10:51.98 spid9 Recompile issued : ProcName: sp_helpindex
LineNo: 75 StmtNo: 29
When trace flag 205 is enabled, the following message will also bracket theAutoStat message from 8721 when statistics are updated. The openingmessage of the bracket can be distinguished by the RowModCnt value, whichwill be greater than 0. The closing bracket, after the UPDATE STATISTICS,will have a RowModCnt value of 0:
1998-10-15 11:38:43.68 spid8 Schema Change: Tbl Dbid: 7 Objid:
133575514 RowModCnt: 60500 RowModLimit: 60499
For this message, "RowModCnt" is the total number of modifications to thetable. "RowModLimit" is the threshold which, when exceeded, results in anUPDATE STATISTICS statement execution for the table.
It is also possible to enable trace flag 8721, which will dump informationinto the error log when AutoStat has been run. The following is anexample of the type of message that you can expect to see:
1998-10-14 16:22:13.21 spid13 AUTOSTATS: UPDATED Tbl: [authors]
Rows: 23 Mods: 501 Bound: 500 Duration: 47ms UpdCount: 2
For this message, "Mods" is the total number of modifications to the table."Bound" is the modification threshold, "Duration" is the amount of timethat the UPDATE STATISTICS statement required to complete, and "UpdCount"is the count of updated statistics.
You can also use the SQL Server Profiler to identify when UPDATE STATISTICSstatements are being run. To do this, perform the following steps:
- On the Profiler menu, click Tools, and then click Options.
- On the General tab, go to Events, and then select All Event Classes.
- Define a new trace, and under Events, select Misc, select the Auto-Update Stats sub-event.
NOTE: If many statistics are being updated by AutoStat, a great number ofmessages can be written to the error log. Thoroughly experiment with thesetrace flags before using them on any production or otherwise criticalserver.
SQL Server employs two types of schema locks, both of which are takenwhen it updates the statistics for a table:
Sch-S: Schema Stability Lock ---------------------------- This lock ensures that a schema element, such as a table or index, will not be dropped while any session holds a schema stability lock on the schema element. Sch-M-UPD-STATS: Schema Modification Lock ----------------------------------------- This is a non-blocking lock that is used by the system to ensure that only one automatic UPDATE STATISTICS process is run against a table at any given point in time. The sp_lock stored procedure will report this lock has having a type = TAB, resouce = UPD-STATS and mode = SCH-M.
You can view these locks by running sp_lock or by selecting from thesyslockinfo table.