Article ID: 188340 - Last Review: July 16, 1999 - Revision: 1.0
Search HTML Filter Ignores UTF-8 Character Encoding
This article was previously published under Q188340
SYMPTOMS
Search does not index text on HTML pages that have been UTF-8 encoded.
CAUSE
The HTML filter that ships with Site Server 3.0 is not capable of handling
UTF-8 character encoding.
RESOLUTION
To resolve this problem, apply the latest Site Server 3.0 service pack.
STATUS
Microsoft has confirmed this to be a problem in Site Server version 3.0.
This problem has been corrected in the latest U.S. service pack for
Microsoft Site Server version 3.0. For information about obtaining the
service pack, query on the following word in the Microsoft Knowledge Base
(without the spaces):
S E R V P A C K
MORE INFORMATION
The service pack updates the HTML filter to support UTF-8 encoding. It also
updates the language and codepage tables.
UTF-8 is not automatically detected. Only documents explicitly tagged with:
<meta http-equiv=content-type content="text/html; charset=utf8">
are interpreted as UTF-8.
UTF-7 and UTF-16 are not supported.
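Because UTF-8 is not detected automatically, it can be useful to check pages
for the explicit tag before they are crawled. The following Python sketch is
one way to do that. It is not the filter's own logic, and the names
MetaCharsetFinder and declares_utf8 are hypothetical. It also accepts the
spelling "utf-8" in addition to the "utf8" shown above, which is an assumption;
this article only documents the "utf8" form.

```python
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Records the charset named in the first <meta http-equiv=content-type> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching meta tag is considered.
        if tag != "meta" or self.charset is not None:
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "content-type":
            return
        # The content attribute looks like: text/html; charset=utf8
        for part in attr.get("content", "").split(";"):
            key, _, value = part.strip().partition("=")
            if key.strip().lower() == "charset":
                self.charset = value.strip().strip("\"'").lower()


def declares_utf8(raw_bytes):
    """Return True if the page explicitly declares UTF-8 in a meta tag.

    Hypothetical helper. Accepting "utf-8" as well as "utf8" is an assumption.
    """
    # latin-1 maps every byte to a character, so the scan never fails on
    # bytes whose real encoding is not yet known.
    finder = MetaCharsetFinder()
    finder.feed(raw_bytes.decode("latin-1"))
    return finder.charset in ("utf8", "utf-8")
```

A page for which declares_utf8 returns False carries no explicit UTF-8
declaration, so per the behavior described above it would not be interpreted
as UTF-8.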