SharePoint - Site collections under managed paths are not crawled

Article ID: 2728313

Symptoms

Consider the following scenario:

You configure a SharePoint web application as follows:
  • The default zone uses one of the available authentication modes
  • A managed path (wildcard or explicit inclusion) is added, or the default 'Sites' managed path is used, to create new site collections
  • Client Integration is disabled for the zone

You create the following site collections:

  • First site collection as root (e.g. http://myserver)
  • Second site collection created under 'Sites' managed path (e.g. http://myserver/sites/secondsc)

A full crawl of the content sources is performed.

In this scenario, you observe the following behavior:

  • Data stored in the root site collection (http://myserver) is searchable. However, no results are returned for data stored in the http://myserver/sites/secondsc site collection
  • There is no error message in the SharePoint ULS logs

Cause

Disabling the "Enable Client Integration" option on the 'Manage Web Applications' page in Central Administration "just" removes 'MicrosoftSharePointTeamServices' from the HTTP Response Headers for the corresponding IIS Web Site.

With Client Integration enabled, the HTTP response headers look like this:

[Image 2785721: HTTP response headers with Client Integration enabled]

After Client Integration is disabled, the HTTP response headers look like this:

[Image 2785722: HTTP response headers with Client Integration disabled]
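
One way to see which of the two cases applies to a zone is to inspect the response headers of its URL. The following Python sketch is illustrative only: the function names are hypothetical, and `fetch_headers` assumes the URL answers an anonymous request, which is often not true for a SharePoint web application (authentication would have to be added).

```python
import urllib.request

HEADER_NAME = "MicrosoftSharePointTeamServices"


def has_sharepoint_header(headers):
    """Return True if the header mapping contains the SharePoint header.

    HTTP header names are case-insensitive, so names are compared in
    lower case.
    """
    wanted = HEADER_NAME.lower()
    return any(name.lower() == wanted for name in headers)


def fetch_headers(url):
    """Fetch the response headers of `url` as a plain dict.

    Assumption: the URL answers an anonymous GET request. An error
    status such as 401 raises urllib.error.HTTPError.
    """
    with urllib.request.urlopen(url) as response:
        return dict(response.headers.items())
```

For a zone with Client Integration disabled, `has_sharepoint_header(fetch_headers("http://myserver"))` would be expected to return `False`.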



What happens during a crawl:

When SharePoint crawls the start address of a SharePoint-type content source, it receives a response from the SharePoint server and inspects the HTTP response headers.

If it does not find the 'MicrosoftSharePointTeamServices' entry in the response headers, which is the case when Client Integration is disabled, it uses the HTTP Web site protocol handler instead of the SharePoint site protocol handler.

The HTTP Web site protocol handler discovers content only by following links. So if the crawled SharePoint root page contains no link to the subsites (here, the site collections under the managed path), SharePoint does not crawl them.
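
The behavior described above can be illustrated with a toy model. This is a simplified sketch, not SharePoint's actual crawler: the handler names, the link graph, and the `Pages/home.aspx` URL are hypothetical. It also assumes that the SharePoint site protocol handler discovers every site collection of the web application without needing links, which matches the behavior this article describes.

```python
HEADER_NAME = "MicrosoftSharePointTeamServices"


def choose_handler(headers):
    """Pick a handler name from the response headers (toy model)."""
    names = {name.lower() for name in headers}
    return "sharepoint" if HEADER_NAME.lower() in names else "http"


def reachable_by_links(start, links):
    """Return every URL reachable from `start` by following links only."""
    seen, queue = {start}, [start]
    while queue:
        page = queue.pop()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def crawl(start, headers, links, site_collections):
    """Return the set of URLs the modelled crawl reaches."""
    if choose_handler(headers) == "sharepoint":
        # Modelling assumption: all site collections are enumerated.
        roots = set(site_collections) | {start}
    else:
        # HTTP handler: only the start address and whatever it links to.
        roots = {start}
    found = set()
    for root in roots:
        found |= reachable_by_links(root, links)
    return found


# Hypothetical link graph: the root page does not link to /sites/secondsc.
links = {"http://myserver": ["http://myserver/Pages/home.aspx"]}
site_collections = ["http://myserver", "http://myserver/sites/secondsc"]

without_header = crawl("http://myserver", {"Server": "IIS"}, links, site_collections)
with_header = crawl("http://myserver", {HEADER_NAME: "x"}, links, site_collections)
```

In this model, `without_header` contains only the root page and the page it links to, while `with_header` also contains http://myserver/sites/secondsc.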

Resolution

  1. Extend the existing web application to another zone, keep Client Integration enabled for the new zone, and change the crawler start address to the new zone accordingly.
  2. Enable the Client Integration option for the zone.

Note This is a "FAST PUBLISH" article created directly from within the Microsoft support organization. The information contained herein is provided as-is in response to emerging issues. As a result of the speed in making it available, the materials may include typographical errors and may be revised at any time without notice. See Terms of Use for other considerations.

Properties

Article ID: 2728313 - Last Review: November 20, 2012 - Revision: 4.1
Applies to
  • Microsoft Office SharePoint Server 2007
  • Microsoft SharePoint Server 2010
Keywords: KB2728313
