Article ID: 315580
When you attempt to use versions 3.0 or later of the MSXML parser to parse XML documents that contain certain low-order non-printable ASCII characters (that is, characters below ASCII 32), you may receive the following error message:
An Invalid character was found in text content.
Versions 3.0 and later of the MSXML parser strictly enforce the valid XML character ranges that are defined by the World Wide Web Consortium (W3C) XML language specification. XML documents that are parsed using versions 3.0 or later of MSXML cannot contain characters that fall outside the defined valid XML character ranges. The low-order non-printable ASCII characters in the ranges that are listed in the "More Information" section are not valid XML characters. An XML document that contains instances of these characters is not conformant with the W3C specifications and cannot be parsed successfully with versions 3.0 and later of MSXML.
To resolve this problem, either remove instances of the low-order non-printable ASCII characters, or replace the characters with an alternate valid character such as the space character (ASCII 32, hex #x20). This solution makes the XML document compliant with the W3C specifications. However, removing or replacing instances of these characters may affect other applications that use the data and to which the characters are significant. Such additional impact can only be identified by testing and will need to be addressed by implementing a fix or workaround that is appropriate for a specific situation.
Versions 2.6 and earlier of the MSXML parser permit XML documents to contain low-order non-printable ASCII characters that fall outside the W3C valid XML character ranges. However, the design of versions 3.0 and later of the MSXML parser has been changed to strictly enforce the valid XML character ranges that are defined in the W3C XML language specification. This design change is required to be able to identify non-conformant XML documents.
The following are the valid XML characters and character ranges (hex values) as defined by the W3C XML language specifications 1.0:
The following are the character ranges for low-order non-printable ASCII characters that are rejected by MSXML versions 3.0 and later:
This design change may affect the following users and applications:
For additional information on other known causes and workarounds for the error message that is specified in the 'Symptoms' section, click the article numbers below to view the articles in the Microsoft Knowledge Base:
(http://support.microsoft.com/kb/238833/EN-US/ )PRB: XML Parser: Invalid Character Was Found in Text Content
(http://support.microsoft.com/kb/275883/EN-US/ )INFO: XML Encoding and DOM Interface Methods