FIX: Cardinality estimates are too low when value is outside histogram

This article was previously published under Q319173
This article has been archived. It is offered "as is" and will no longer be updated.
BUG #: 356671 (SHILOH_BUGS)
The cardinality estimate for a predicate that uses a literal or a parameter value may be too low (typically 1.0) under the following conditions:
  • There are more rows in the table than there were when the statistics were last updated.
  • The value from the predicate is less than the first value or greater than the last value stored in the existing histogram.
  • The pattern of inserts is such that for this index or statistics collection, the values that correspond to newly added rows always tend to go at either the beginning or the end of the existing histogram steps as opposed to being evenly distributed through the existing range of data values. Some examples of this are an index on a column with the IDENTITY property, or on a datetime column that stores the time the row was inserted.
You may observe some of the following behaviors as a result of this problem:
  • This table might be accessed early in the join order because the cardinality estimates are low, where a later order may be more appropriate.
  • In scenarios where there is a multi-column index and the filter predicate is the leading column of this index, a seek that uses that index may seek by using only the first column when it can use additional columns in the seek predicate.
To resolve this problem, obtain the latest service pack for Microsoft SQL Server 2000. For more information, click the following article number to view the article in the Microsoft Knowledge Base:
290211 How to obtain the latest SQL Server 2000 service pack
To work around this behavior, schedule a job that runs update statistics against the affected index so that the statistics remain up to date.
Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.

This problem was first corrected in SQL Server 2000 Service Pack 3.
SQL Server statistics include:
  • A count of the number of rows that were in the table when a set of statistics were created.
  • Density information.
  • A histogram that represents the distribution of the values in the table.
When the optimizer uses a histogram, it compares the number of rows currently in the table with the number of rows that existed when the statistics were created. If the numbers differ, it tries to update what the histogram might look like given the current number of rows in the table. The changes to the histogram include the creation of two new histogram steps at the low and high ends of the histogram, which reflect the new rows that have been added to the table.

The exact distribution of values after a series of updates is not known, so SQL Server guesses as to how many rows would have been inserted in the middle of the existing data distribution versus at the beginning or end of the histogram steps. If the nature of the data is such that new rows for this index typically appear at or before or after the existing values, the cardinality estimate will be too low, typically 1.0.

With this fix, the method for estimating how many rows fall outside the histogram changes so that the estimates are higher. Because the cardinality estimate affects the estimate cost of the query, this also influences the types of query plans that the optimizer considers and what order tables may be accessed.
ascending key

Article ID: 319173 - Last Review: 01/17/2015 05:26:00 - Revision: 5.1

Microsoft SQL Server 2000 Standard Edition

  • kbnosurvey kbarchive kbhotfixserver kbqfe kbsqlserv2000sp3fix kbbug kbfix kbsqlserv2000presp3fix KB319173