SQL Server full-text index on XML column doesn't index attribute values for the inner nodes
This article helps you resolve the problem where SQL Server full-text index on XML column doesn't index attribute values for the inner nodes.
Original product version: SQL Server
Original KB number: 2513181
Symptoms
In Microsoft SQL Server, when indexing an XML column using full-text, only node values and attribute values for the top node are indexed. Attribute values for any of the inner nodes aren't indexed.
Cause
This behavior happens because the XML word breaker that is shipped with SQL Server doesn't return attribute values for any of the inner nodes (xmlfilt.dll - Version: 12.0.9735.0).
Resolution
This issue can be resolved using the XML word breaker that ships with Windows operating system when SQL Server is running on either Windows 7 or Windows Server 2008 R2 (version shipped with Windows use different file name - xmlfilter.dll). If you are running SQL Server on a lower version of Windows, you first need to upgrade to these operating systems and then use the following procedure to resolve this issue.
Procedure to resolve the problem on SQL Servers running on Windows Server 2008 R2 and Windows 7 environments:
Important
This section, method, or task contains steps that tell you how to modify the registry. However, serious problems might occur if you modify the registry incorrectly. Therefore, make sure that you follow these steps carefully. For added protection, back up the registry before you modify it. Then, you can restore the registry if a problem occurs. For more information about how to back up and restore the registry, see How to back up and restore the registry in Windows.
Note
You need to restart SQL Server Service after going through the following procedure for the changes to come into effect.
Navigate to the following registry hive on your SQL Server machine and save it as SQLMSSearch.reg.
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\<instance>\MSSearch\CLSID
Note
Replace the "<instance>" in the path with the relevant instance ID of SQL Server/SQL Server.
For example, "MSQL10.MSSQLSERVER" for a default instance or "MSSQL10.KATMAI" for a named instance "Katmai".
For more information, see File Locations for Default and Named Instances of SQL Server.
Edit SQLMSSearch.reg file with Notepad and replace all occurrences of xmlfilt.dll by
C:\Windows\system32\xmlfilter.dll
and then save the changes.Note
- This assumes your Windows folder is located at
C:\Windows
. - You need to enter each backslash in the new path twice!
- This assumes your Windows folder is located at
Select the file SQLMSSearch.reg to import the content into the registry.
Run the following T-SQL commands to enable the new setting in SQL Server:
EXEC sp_fulltext_service 'load_os_resources', 1 EXEC sp_fulltext_service 'verify_signature ', 0 EXEC sp_fulltext_service 'update_languages' EXEC sp_fulltext_service 'restart_all_fdhosts'
Check the availability of the new settings with this T-SQL command:
EXEC sp_help_fulltext_system_components 'fullpath','c:\windows\system32\xmlfilter.dll'
Note
You need to adapt the above command to suit the correct Windows system path on your system. Also, make sure that all the languages (respectively their lcid), that should show the new behavior are listed in the column "componentname" in the output.
More information
Steps to reproduce the problem:
Create a simple table as follows:
CREATE TABLE [dbo].[testFTS]([idField] [int] PRIMARY KEY, [xmlField] [xml] NOT NULL)
Enable FTS on the xmlField field (the word breaking language that you select here is not relevant to the repro).
Insert one record on that table, as follows:
INSERT INTO testFTS (idField, xmlField) values (1, '<TopNode TopFirstAtt="TopAttValue" TopSecondAtt="Second Value"> <MiddleNode FirstAtt="ValueOne" SecondAtt="ValueTwo" ThirdAtt="ValueThree"> <BottomNode BottomAtt="BottomValue"> <InnerMostNode>Innermost Text</InnerMostNode> </BottomNode> </MiddleNode> <MiddleNode FirstAtt="ValueFour" SecondAtt="ValueFive" ThirdAtt="ValueSix"> <BottomNode BottomAtt="OtherBottomValue" /> </MiddleNode> <MiddleNode FirstAtt="ValueSeven" SecondAtt="ValueEight" ThirdAtt="ValueNine"> <BottomNode BottomAtt="LastValue" /> </MiddleNode> <NodeWithValue>Testing whether this value will be indexed</NodeWithValue> </TopNode>')
By querying
fts_index_keywords
for this particular table -select * from sys.dm_fts_index_keywords(db_id(),object_id('testFTS'))
, the following results are obtained:keyword display_term column_id document_count 0x0069006E00640065007800650064 indexed 2 1 0x0069006E006E00650072006D006F00730074 innermost 2 1 0x007300650063006F006E0064 second 2 1 0x00740065007300740069006E0067 testing 2 1 0x0074006500780074 text 2 1 0x0074006F007000610074007400760061006C00750065 topattvalue 2 1 0x00760061006C00750065 value 2 1 0x0077006800650074006800650072 whether 2 1 0xFF END OF FILE 2 1 These display terms appear to be restricted to:
- Node values, regardless of how deeply they are nested within the XML document.
- Attribute values only for the top node and not from any of the inner nodes that would be expected.
Feedback
https://aka.ms/ContentUserFeedback.
Coming soon: Throughout 2024 we will be phasing out GitHub Issues as the feedback mechanism for content and replacing it with a new feedback system. For more information see:Submit and view feedback for