This article discusses a problem that occurs during a query of a clustered columnstore index in Microsoft SQL Server 2014. This article provides a resolution to this problem.
When you use a query that scans a clustered columnstore index in Microsoft SQL Server 2014, you may, under rare conditions, receive partial query results.
This problem occurs when the following operation is run.
A Transact-SQL statement [INSERT or BULK-INSERT] inserts data into a table that has clustered columnstore index. During this operation, the following conditions apply:
When the Transact-SQL statement reaches the rowgroup threshold, it closes rowgroup R1 that has segment S1.
Segment S1 points to local dictionary D1.
The statement continues to insert rows to new rowgroup R2.
When rowgroup R1 is closed, the local dictionary D1 does not also have to be closed. If dictionary D1 still has available space, you can leave it open and reuse it for the new rowgroup R2.
If the Transact-SQL statement is ended abnormally or canceled before it closes the new rowgroup R2, the following conditions apply:
Columnstore metadata changes occur in subtransactions that commit independently of the outer transaction.
At this point, rowgroup R1 persists in the system table in an "under construction" or INVISIBLE state, and segment S1 references dictionary D1.
There is no row created in the system table for dictionary D1. This is because the Transact-SQL statement never has an opportunity to close the existing row. Therefore, the existing row persists.
In a typical situation, if the tuple mover background task starts after the Transact-SQL statement ends, the background task removes the invisible rowgroup R1 and the segment S1. If a new Transact-SQL statement is started now and creates rowgroup R3 that has a new segment S3 that requires a new local dictionary, you cannot reuse the internal ID of dictionary D1. This is because the in-memory state of the columnstore keeps track of the dictionary IDs that are used. Therefore, segment S3 will reference new dictionary D2.
Note The condition in this step is a common condition. Therefore, no corruption occurs.
If SQL Server loses the in-memory state of dictionary D1 before the tuple mover task takes effect (and performs as described in Step 3), the problem that is described in this article occurs.
This event occurs for any of the following reasons:
SQL Server experiences memory overload, and the in-memory contents of dictionary D1 are evicted from memory.
The instance of SQL Server is restarted.
The database that contains the clustered columnstore index goes offline and then comes back online.
After any one of these events occur and SQL Server reloads the in-memory structures, there is no record that a dictionary D1 and its internal ID existed. This is because dictionary D1 was not retained in the system tables when the Transact-SQL statement was ended or conceled.
If the tuple mover background task starts at this point, no errors occur because the conditions that are described in Step 3 apply.
If a new rowgroup R3 is created before the tuple mover background task starts (per the previous bullet item), SQL Server assigns the same internal ID to new dictionary D1, and it references dictionary D1 for segment S3 in rowgroup R3.
When the tuple mover background task starts after the previous action, it drops invisible rowgroup R1 and its segments S1 together with new dictionary D1. This occurs because the tuple mover considers that new dictionary D1 and the original dictionary D1 that S1 references are the same.
Note When this condition occurs, you cannot query the contents of rowgroup R3.
The issue was first fixed in the following cumulative updates for SQL Server:
In a currently affected database, if you run DBCC CHECKDB after you apply this fix, you receive following error message:
Msg 5289, Level 16, State 1, Line 1 Clustered columnstore index 'cci' on table 't' has one or more data values that do not match data values in a dictionary. Restore the data from a backup.
In a currently affected database, when you run a query that scans the affected tables after you apply this fix, you receive the following error message:
Msg 5288, Level 16, State 1, Line 1 Columnstore index has one or more data values that do not match data values in a dictionary. Please run DBCC CHECKDB for more info.
If you receive these errors, you can save the uncorrupted data by bulk exporting the data of unaffected columns/rowgroups and then reloading the data after you drop or create the clustered columnstore index. You should enable Trace flag 10207 to suppress the 5288 error and revert to the old behavior of skipping corrupted rowgroups.
Note Error messages 5288 and 5289 are generated for this rowgroup R3 that has segment S3. Trace flag 10207 is used to extract the segments of rowgroup R3 that are not affected by the missing dictionary D1.
Query for affected databases
To determine whether the database that contains columnstore indexes is already affected by this problem, run the following query:
select object_name(i.object_id) as table_name, i.name as index_name, p.partition_number, count(distinct s.segment_id) as damaged_rowgroups from sys.indexes i join sys.partitions p on p.object_id = i.object_id and p.index_id = i.index_id join sys.column_store_row_groups g on g.object_id = i.object_id and g.index_id = i.index_id and g.partition_number = p.partition_number join sys.column_store_segments s on s.partition_id = p.partition_id and s.segment_id = g.row_group_id where i.type in (5, 6) and s.secondary_dictionary_id <> -1 and g.state_description = 'COMPRESSED' and s.secondary_dictionary_id not in ( select dictionary_id from sys.column_store_dictionaries d where d.hobt_id = p.hobt_id and d.column_id = s.column_id ) group by object_name(i.object_id), i.name, p.partition_number
You have to run this query against every database that contains columnstore indexes on the server that is running SQL Server. An empty result set indicates that the database is not affected.
Execute this query during a period when there is no activity that will create new rowgroups or change the state of existing rowgroups. For example, the following activities can modify the state of rowgroups: index build, index reorganize, bulk insert, tuple mover compressing delta stores.
Before you execute the query you can disable the background tuple mover task by using the trace flag 634. Use this command to disable the background task: DBCC TRACEON ( 634 , -1 ). After the query finishes executing, remember to re-enable the background task by using the command: DBCC TRACEOFF ( 634 , -1 ).
Also make sure there are no BULK INSERT/BCP/SELECT-INTO commands inserting data into the tables that use columnstore index while this query is running.
It is recommended to use these steps to prevent the query from returning false-positives.
Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.