Article ID: 184891 - Last Review: July 7, 2008 - Revision: 5.1

Server.HTMLEncode Garbles Extended Characters

This article was previously published under Q184891
We strongly recommend that all users upgrade to Microsoft Internet Information Services (IIS) version 7.0 running on Microsoft Windows Server 2008. IIS 7.0 significantly increases Web infrastructure security. For more information about IIS security-related topics, visit the following Microsoft Web site:
http://www.microsoft.com/technet/security/prodtech/IIS.mspx
For more information about IIS 7.0, visit the following Microsoft Web site:
http://www.iis.net/default.aspx?tabid=1

SYMPTOMS

An ASP script that uses Server.HTMLEncode produces garbled extended characters when Session.Codepage is set to a single-byte character set (SBCS) code page other than code page 1252 (U.S. ANSI).

This problem affects Central and Eastern European languages, such as Czech, Russian, and Hungarian.

CAUSE

This problem occurs because of both of the following defects:
  • HTMLEncode incorrectly writes numeric character entities using SBCS codepoint values instead of the Unicode values. In HTML, &#xxx; entities represent Unicode values, not SBCS codepoints. When HTMLEncode determines that a numeric entity must be written, it must simply write the raw Unicode value in decimal, not the local code page equivalent.

    -and-
  • HTMLEncode uses the wrong logic in determining when to write out a numeric character entity versus simply inserting the raw character. It appears to do the opposite of what is expected; it writes the incorrect character entities for characters that can be represented in Session.Codepage, and writes the nearest precomposed equivalent for other characters.
    Instead, it should write the raw SBCS character for characters that can be represented in Session.Codepage, and the Unicode numeric character entity for characters that cannot.
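
The two causes above can be illustrated with a short sketch. It is written in Python rather than ASP, and it is not the IIS implementation. The function names buggy_entity, correct_entity, and html_encode are hypothetical, and the cp1250 (Central European) and cp1251 (Cyrillic) code pages are chosen only as examples of non-1252 SBCS code pages. The sketch shows how a numeric entity built from the code page byte value differs from one built from the Unicode value, and one way the expected behavior could be expressed.

```python
import html


def buggy_entity(ch: str, codepage: str) -> str:
    """Mimic the first cause: build the entity from the SBCS byte value.

    Assumes ch is a single character that the code page can represent.
    """
    byte_value = ch.encode(codepage)[0]
    return f"&#{byte_value};"


def correct_entity(ch: str) -> str:
    """Build the entity from the Unicode code point, as HTML expects."""
    return f"&#{ord(ch)};"


def html_encode(text: str, codepage: str) -> bytes:
    """Sketch of the expected behavior described in this article.

    Markup characters are escaped first. Characters that the code page
    can represent are then written as raw SBCS bytes, and all other
    characters are written as decimal Unicode numeric entities.
    """
    escaped = html.escape(text, quote=True)
    return escaped.encode(codepage, errors="xmlcharrefreplace")


if __name__ == "__main__":
    # Czech "č" is byte 232 in cp1250 but U+010D (decimal 269) in Unicode.
    print(buggy_entity("č", "cp1250"))    # &#232;
    print(correct_entity("č"))            # &#269;
    # A browser reads &#232; as Unicode 232, which is a different letter.
    print(html.unescape("&#232;"))        # è
    # "č" fits cp1250 and stays a raw byte; Cyrillic "ж" does not fit
    # and becomes a Unicode numeric entity.
    print(html_encode("č & ж", "cp1250"))  # b'\xe8 &amp; &#1078;'
```

In this sketch, the browser interprets the entity produced by buggy_entity as the Unicode character with that number, which is why the page shows a different letter from the one the script intended.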

RESOLUTION

To resolve this problem, obtain the latest service pack for Windows NT 4.0 or Windows NT Server 4.0, Terminal Server Edition. For additional information, click the following article number to view the article in the Microsoft Knowledge Base:
152734 (http://support.microsoft.com/kb/152734/EN-US/) How to Obtain the Latest Windows NT 4.0 Service Pack

STATUS

Microsoft has confirmed that this is a problem in Internet Information Server version 4.0. This problem was first corrected in Windows NT 4.0 Service Pack 4 and Windows NT Server 4.0, Terminal Server Edition Service Pack 4.

APPLIES TO
  • Microsoft Windows NT Server 4.0, Terminal Server Edition
  • Microsoft Internet Information Server 4.0
Keywords: 
kbhotfixserver kbqfe kbbug kbfix kbwinnt400sp4fix KB184891