Text Files Encoded in Big Endian are Incorrectly Displayed when Opened in WordPad

Applies to: Microsoft Windows XP Home Edition, Microsoft Windows XP Professional, Microsoft Windows XP Starter Edition

Symptoms


When you open a text file that has been encoded in Big Endian in WordPad, the document displays numerous Asian characters, squares, or other junk characters.

Cause


This behavior is expected because WordPad cannot interpret UCS-2 Big Endian. When the file is opened, WordPad defaults to Unicode encoding, which on Windows means little-endian byte order. As a result, the two bytes of each character are read in the wrong order and the text is displayed incorrectly.
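
The byte swap described above can be illustrated with a minimal Python sketch. It is not how WordPad is implemented; it only assumes that big-endian bytes are decoded as little-endian, and the variable names are chosen for illustration.

```python
# Text as it would be stored in a big-endian file (byte order mark omitted).
original = "This"
big_endian_bytes = original.encode("utf-16-be")  # b'\x00T\x00h\x00i\x00s'

# Decoding the same bytes as little-endian swaps the two bytes of each character.
misread = big_endian_bytes.decode("utf-16-le")

# Show how each code point changes, e.g. U+0054 becomes U+5400.
for src, got in zip(original, misread):
    print(f"U+{ord(src):04X} {src!r} -> U+{ord(got):04X} {got!r}")
```

Each Latin letter has a zero high byte, so the swapped value lands in the U+5400 to U+7400 range, which is occupied by CJK ideographs.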

Note: Big-endian and little-endian describe the order in which the bytes of a multi-byte data type are stored, that is, which byte is the most significant. In a big-endian system, the most significant byte in the sequence is stored at the lowest storage address (i.e., first). In a little-endian system, the least significant byte in the sequence is stored first.
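
As a small illustration of the note above, the following Python sketch stores the same 16-bit value in both byte orders. The value 0x1234 is an arbitrary example, not taken from the article.

```python
value = 0x1234  # 0x12 is the most significant byte, 0x34 the least significant

big = value.to_bytes(2, byteorder="big")        # most significant byte first
little = value.to_bytes(2, byteorder="little")  # least significant byte first

print(big.hex(), little.hex())

# Reading big-endian bytes with the wrong byte order yields a different number.
wrong = int.from_bytes(big, byteorder="little")
print(hex(wrong))
```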


Resolution


To work around the issue, use a text editor that can interpret Big-Endian encoding. Notepad is a basic text editor that can interpret text files encoded in Big-Endian, but it does not allow the use of different language characters or any formatting. In Windows, the encoding named Unicode uses little-endian byte order.

If the text file requires the use of different language characters or additional formatting, use Microsoft Word or a third-party text editing application that provides this support.
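
An editor that handles both byte orders typically inspects the byte order mark (BOM) at the start of the file. The sketch below shows one way this could work in Python. The function name decode_utf16 is hypothetical, and the fallback to little-endian when no BOM is present is an assumption of this sketch, not documented behavior of any particular editor.

```python
import codecs

def decode_utf16(data: bytes) -> str:
    """Decode 16-bit Unicode text, using the BOM to choose the byte order."""
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF: big-endian
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE: little-endian
        return data[2:].decode("utf-16-le")
    # No BOM found: assume little-endian (an assumption of this sketch).
    return data.decode("utf-16-le")
```

For example, decode_utf16(codecs.BOM_UTF16_BE + "This is a test".encode("utf-16-be")) returns the original text, because the BOM is honored instead of a fixed default.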


Steps to Reproduce Behavior
Open Notepad
Enter some text, e.g. “This is a test.”
Select Save As from the File drop-down menu
Choose a location, file name, and Save as type (leave the default, Text Documents (*.txt))
Change the Encoding by clicking the drop-down arrow and choosing Unicode big endian
Click Save

Open the file in WordPad
The result will display numerous Asian characters
吀栀椀猀 椀猀 愀 琀攀猀琀

As opposed to
This is a test
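
The steps above can be approximated without Notepad or WordPad. This Python sketch writes a file the way the Unicode big endian option is assumed to (a BOM followed by big-endian text), then reads it back with both byte orders. The function name reproduce and the file name test.txt are hypothetical.

```python
import codecs
import os
import tempfile

def reproduce(text="This is a test"):
    """Return (garbled, correct) readings of a big-endian text file."""
    # Assumed file layout: BOM (FE FF) followed by big-endian text.
    payload = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "test.txt")
        with open(path, "wb") as f:
            f.write(payload)
        with open(path, "rb") as f:
            raw = f.read()
    body = raw[len(codecs.BOM_UTF16_BE):]  # skip the 2-byte BOM
    garbled = body.decode("utf-16-le")     # wrong byte order
    correct = body.decode("utf-16-be")     # right byte order
    return garbled, correct

garbled, correct = reproduce()
print(garbled)
print(correct)
```

The first line printed shows the swapped characters and the second shows the original text.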



More Information


What is the difference between Big Endian and Little Endian Unicode?
http://blogs.msdn.com/b/michkap/archive/2005/02/09/369958.aspx

WordPad
http://en.wikipedia.org/wiki/Word_pad

Byte Order Mark (BOM)
http://en.wikipedia.org/wiki/Byte_order_mark