Decreased performance for SQL Server when you run a TOP, MAX or MIN aggregating clause on columns other than the partitioning column

Applies to: Microsoft SQL Server 2005 Enterprise Edition, Microsoft SQL Server 2005 Standard Edition, Microsoft SQL Server 2005 Standard X64 Edition

Symptoms


Assume that you have partitioned tables in Microsoft SQL Server. When you run a TOP, MAX or MIN aggregating clause on columns of the tables, you may experience decreased performance.

Note This issue does not occur when the clause is run on the partitioning column.

Workaround


To work around this issue, craft a query that collects the TOP N elements of each partition. Then, find the TOP N elements from that collection of elements. 

For example, you have a table T1 that has four partitions, and the partition function is PF1. The table is partitioned on column PCOL and has index idx_c1 on T1.c1. You may encounter the performance issue when you run the following query:

    SELECT TOP 3 T1.c1, T1.c2
    FROM dbo.T1
    ORDER BY T1.c1;
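To reproduce the scenario, the objects named above could be created roughly as follows. This is only a sketch: the boundary values (100, 200, 300), the int data types, the partition scheme name PS1, and the use of the PRIMARY filegroup are assumptions made for illustration and do not come from this article. Only PF1, T1, PCOL, c1, c2, and idx_c1 are names from the article.

```sql
-- Three boundary values give four partitions (assumed boundaries).
CREATE PARTITION FUNCTION PF1 (int)
    AS RANGE LEFT FOR VALUES (100, 200, 300);

-- Hypothetical scheme name; maps every partition to the PRIMARY filegroup.
CREATE PARTITION SCHEME PS1
    AS PARTITION PF1 ALL TO ([PRIMARY]);

-- The table is partitioned on PCOL (assumed column types).
CREATE TABLE dbo.T1
(
    PCOL int NOT NULL,
    c1   int NOT NULL,
    c2   int NULL
) ON PS1 (PCOL);

-- With no ON clause, the index is created on the table's partition scheme,
-- so idx_c1 is partitioned by PCOL rather than ordered globally by c1.
CREATE INDEX idx_c1 ON dbo.T1 (c1);
```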

To work around this issue, follow these steps:
  1. Find the top 3 elements of a given partition <partition_number>:

    SELECT TOP 3 T1.c1, T1.c2
    FROM dbo.T1
    WHERE $PARTITION.PF1(T1.PCOL) = <partition_number>
    ORDER BY T1.c1;
  2. Find the top 3 elements of all four partitions: 

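    -- A VALUES list used as a derived table requires SQL Server 2008 or later.
    -- On SQL Server 2005, use SELECT 1 UNION ALL SELECT 2 ... instead.
    SELECT TOP 3 A.c1, A.c2
    FROM (VALUES (1), (2), (3), (4)) AS P(partition_number)
    CROSS APPLY ( SELECT TOP 3 T1.c1, T1.c2
                  FROM dbo.T1
                  WHERE $PARTITION.PF1(T1.PCOL) = P.partition_number
                  ORDER BY T1.c1 ) AS A
    ORDER BY A.c1;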
  3. If the table is repartitioned, the query in step 2 must be rewritten to match the new number of partitions. To avoid this, obtain the partition numbers from sys.partitions instead of using a constant list of partitions:

    SELECT TOP 3 A.c1, A.c2
    FROM sys.partitions AS P
    CROSS APPLY ( SELECT TOP 3 T1.c1, T1.c2
                  FROM dbo.T1
                  WHERE $PARTITION.PF1(T1.PCOL) = P.partition_number
                  ORDER BY T1.c1 ) AS A
    WHERE P.object_id = OBJECT_ID('dbo.T1')
      AND P.index_id = INDEXPROPERTY(OBJECT_ID('dbo.T1'), 'idx_c1', 'IndexID')
    ORDER BY A.c1;
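Before relying on the sys.partitions approach, it may help to check which rows that catalog view returns for the index. The following sketch lists one row per partition of idx_c1 together with its row count. It uses only the object names from the example above.

```sql
-- One row per partition of idx_c1 on dbo.T1.
-- partition_number values start at 1 and match what $PARTITION.PF1 returns.
SELECT P.partition_number, P.rows
FROM sys.partitions AS P
WHERE P.object_id = OBJECT_ID('dbo.T1')
  AND P.index_id = INDEXPROPERTY(OBJECT_ID('dbo.T1'), 'idx_c1', 'IndexID')
ORDER BY P.partition_number;
```

Filtering on a single index_id matters because sys.partitions holds a row per partition for every index on the table. Without the filter, each partition would be processed once per index.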

Note This article uses TOP N with an ORDER BY clause as the example. MAX and MIN have similar issues. You can work around them by rewriting them as TOP 1 queries, ordered descending for MAX and ascending for MIN.
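As an illustration of the note above, a MAX(c1) query might be rewritten along the following lines. This is a sketch based on the T1, PF1, PCOL, and idx_c1 names from the example. The max_c1 alias is introduced here for illustration.

```sql
-- Possible replacement for: SELECT MAX(c1) FROM dbo.T1;
-- Take the largest c1 from each partition, then the largest of those.
SELECT TOP 1 A.c1 AS max_c1
FROM sys.partitions AS P
CROSS APPLY ( SELECT TOP 1 T1.c1
              FROM dbo.T1
              WHERE $PARTITION.PF1(T1.PCOL) = P.partition_number
              ORDER BY T1.c1 DESC ) AS A
WHERE P.object_id = OBJECT_ID('dbo.T1')
  AND P.index_id = INDEXPROPERTY(OBJECT_ID('dbo.T1'), 'idx_c1', 'IndexID')
ORDER BY A.c1 DESC;
```

For MIN, switch both ORDER BY clauses to ascending. Two behaviors differ from the aggregate form. On an empty table, the TOP 1 query returns no rows, whereas MAX returns a single NULL. Also, SQL Server sorts NULL values lowest, so if c1 allows NULLs, a MIN rewrite likely needs an extra `T1.c1 IS NOT NULL` predicate in the inner query.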

More Information


When you query for the TOP N rows of an indexed column on a non-partitioned table, the query generally performs very well. This is because the query plan can read the top N elements directly from the ordered index.

However, for a partitioned table, this is currently not the case, as the indexes may also be partitioned. This means that you cannot merely query the indexes to determine the top N elements. Those elements may be distributed across all partitions. For example, consider the following case in which you have a table "a" with two partitions P0 and P1 that are partitioned around 0:
Partition    Key    Value
P0           -2     1
P0           -1     1
P0           0      12
P1           1      1
P1           2      1
P1           3      15

Because each index is partitioned, SQL Server cannot read one ordered index to determine the maximum value. Instead, it scans every row of the table to determine the maximum value. In a table that has millions of rows, this process can be very inefficient.
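To see how candidate values are spread across partitions, a query such as the following can summarize each partition. It is a diagnostic sketch that reuses the T1, PF1, and PCOL names from the Workaround section rather than table "a". The column aliases are illustrative.

```sql
-- Per-partition row count and value range for c1.
-- This query itself reads the whole table; it is for inspection only.
SELECT $PARTITION.PF1(T1.PCOL) AS partition_number,
       COUNT(*)                AS row_count,
       MIN(T1.c1)              AS min_c1,
       MAX(T1.c1)              AS max_c1
FROM dbo.T1
GROUP BY $PARTITION.PF1(T1.PCOL)
ORDER BY partition_number;
```

If the min_c1 and max_c1 ranges overlap between partitions, the top N values of c1 can come from any partition, which is why each partition must be examined.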

Status


Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.