PRB: XML Parser: Invalid Character Was Found in Text Content

Article ID: 238833
This article was previously published under Q238833


When parsing XML that contains "special characters" using the Microsoft XML parser (MSXML), the parser may report the following error message at the line and position of the first special character:
An Invalid character was found in text content.


The XML document does not declare the character encoding that its bytes actually use, so the parser decodes it with the default encoding.
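
The mismatch can be seen at the byte level. The following is a minimal sketch in Python (not MSXML) that assumes the data was written as iso-8859-1; the helper name `is_valid_utf8` is hypothetical and is used only for illustration.

```python
# U+00E1 is "a with acute accent", a typical character outside ASCII.
ch = "\u00e1"

latin1_bytes = ch.encode("iso-8859-1")  # a single byte: 0xE1
utf8_bytes = ch.encode("utf-8")         # two bytes: 0xC3 0xA1


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(latin1_bytes.hex(), utf8_bytes.hex())
print(is_valid_utf8(latin1_bytes), is_valid_utf8(utf8_bytes))
```

A lone 0xE1 byte is not a complete UTF-8 sequence, so a parser that assumes UTF-8 has no valid way to read it.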


Specify the proper encoding scheme in the XML declaration at the start of the document.

- or -

Re-encode the XML data as proper UTF-8.


This behavior is by design.


"Special character" refers to any character outside the standard ASCII range of 0x00 - 0x7F, such as Latin letters with accents, umlauts, or other diacritics. The default encoding for XML documents is UTF-8, which encodes every character above 0x7F as a sequence of two or more bytes. Single-byte encodings such as iso-8859-1 store those characters as a single byte in the range 0x80 - 0xFF, and such a byte on its own is not a valid UTF-8 sequence, so a parser that assumes UTF-8 rejects it.

Most often, you see this problem when you work with data that uses the single-byte "iso-8859-1" encoding scheme. In this case, the quickest solution is usually the first option listed in the RESOLUTION section: declare the encoding. For example, use the following XML declaration:
   <?xml version="1.0" encoding="iso-8859-1" ?>
   ...XML data...
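
The effect of the declaration can be checked with a short sketch. It uses Python's built-in parser rather than MSXML, so the exact error text differs, but the same bytes fail without the declaration and parse with it. The helper `try_parse` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The same iso-8859-1 payload, without and with an encoding declaration.
undeclared = b"<test>\xe1</test>"
declared = b'<?xml version="1.0" encoding="iso-8859-1" ?>' + undeclared


def try_parse(data: bytes):
    """Return the root element's text, or None if the parser rejects the data."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


print(ascii(try_parse(undeclared)))  # rejected: 0xE1 alone is not valid UTF-8
print(ascii(try_parse(declared)))    # accepted: 0xE1 is read as U+00E1
```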
Alternatively, you can encode each special character as a numeric character reference. For example, for the character á, use <test>&#225;</test> (decimal version) or <test>&#x00E1;</test> (hex version).


Article ID: 238833 - Last Review: July 18, 2003 - Revision: 2.2
  • Microsoft Internet Explorer 5.0
  • Microsoft Internet Explorer 5.5
  • Microsoft XML Parser 3.0
  • Microsoft XML Parser 3.0 Service Pack 1
  • Microsoft XML Core Services 4.0
kbintl kbintldev kbprb kbfaq KB238833
