SharePoint 2010: Only the Start address URL of a PHP based Web site content source is being crawled by FAST search connector

Symptoms

When crawling a PHP-based Web site, the SharePoint crawler with a FAST back end processes only the first page and does not follow any of its links. The standard SharePoint crawler (not connected to a FAST back end) crawls the same site without error.

Cause

The PHP file extension is missing from the ExtensionsToFilter extended connector (FAST connector) property.

Resolution

Add the file extension PHP for the FAST connector using 'Set-SPEnterpriseSearchExtendedConnectorProperty'.

First get the current value for the extended connector (FAST connector) property using the following PowerShell command:

Get-SPEnterpriseSearchExtendedConnectorProperty -SearchApplication $searchApp -Identity ExtensionsToFilter

where $searchApp is the FAST connector Search Service Application (SSA).

The value returned is similar to ";ascx;asp;aspx;htm;html;jhtml;jsp;".

Then set the value of the extended connector (FAST connector) property, with php appended to the list, using the following PowerShell command:

Set-SPEnterpriseSearchExtendedConnectorProperty -SearchApplication $searchApp -Identity ExtensionsToFilter -Value ";ascx;asp;aspx;htm;html;jhtml;jsp;php;"
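
The two commands above can also be combined so that php is appended to whatever list is currently configured, instead of retyping the list by hand. The following is a minimal sketch, not an official procedure. The helper name Add-FilterExtension is hypothetical, $searchApp is assumed to be set already as described above, and the sketch assumes that the object returned by Get-SPEnterpriseSearchExtendedConnectorProperty exposes the list through a Value property.

```powershell
# Hypothetical helper (not part of SharePoint): returns the ";a;b;c;" list with
# the extension appended, or the same list if the extension is already present.
function Add-FilterExtension {
    param(
        [string]$List,       # for example ";ascx;asp;aspx;htm;html;jhtml;jsp;"
        [string]$Extension   # for example "php"
    )
    # Split on ";" and drop the empty entries caused by the leading and trailing ";".
    $items = @($List.Split(';') | Where-Object { $_ -ne '' })
    # -notcontains compares case-insensitively, so "PHP" and "php" count as the same entry.
    if ($items -notcontains $Extension) {
        $items += $Extension
    }
    # Rebuild the list in the same ";a;b;c;" shape.
    return ';' + ($items -join ';') + ';'
}

# The SharePoint cmdlets exist only in the SharePoint 2010 Management Shell,
# so this part is skipped when they are unavailable or $searchApp is not set.
if ($searchApp -and (Get-Command Set-SPEnterpriseSearchExtendedConnectorProperty -ErrorAction SilentlyContinue)) {
    $current = Get-SPEnterpriseSearchExtendedConnectorProperty -SearchApplication $searchApp -Identity ExtensionsToFilter
    # Assumption: the current list is available as $current.Value.
    $updated = Add-FilterExtension -List $current.Value -Extension "php"
    Set-SPEnterpriseSearchExtendedConnectorProperty -SearchApplication $searchApp -Identity ExtensionsToFilter -Value $updated
}
```

Because the helper leaves a list that already contains php unchanged, the snippet can be run more than once without adding a duplicate entry.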

More Information

- Set-SPEnterpriseSearchExtendedConnectorProperty http://technet.microsoft.com/en-us/library/ff608013.aspx
- About Windows PowerShell cmdlets (FAST Search Server 2010 for SharePoint) http://technet.microsoft.com/en-us/library/ff393782.aspx

Properties

Article ID: 2550268 - Last Review: Jun 4, 2011 - Revision: 1
