INFO: UTF8 Support

Article ID: 175392
This article was previously published under Q175392

UTF8 is a code page that represents a 16-bit Unicode string as a sequence of bytes. ASCII text (U+0000-007F) remains unchanged at one byte per character, U+0080-07FF (including Latin, Greek, Cyrillic, Hebrew, and Arabic) is converted to a 2-byte sequence, and U+0800-FFFF (Chinese, Japanese, Korean, and others) becomes a 3-byte sequence.

The advantage is that ASCII text remains unchanged, so almost all editors can read it.
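
The byte ranges described above can be illustrated with a minimal sketch in portable C. The function name utf8_encode_u16 is hypothetical and is not part of any Windows API. The sketch assumes the input is a single 16-bit value and does not treat surrogate values (U+D800-DFFF) specially.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Encodes one 16-bit Unicode value as UTF8 using the three ranges
   described in this article. Writes 1 to 3 bytes into out and
   returns the number of bytes written. */
size_t utf8_encode_u16(unsigned int ch, unsigned char out[3])
{
    ch &= 0xFFFFu;                      /* keep only the low 16 bits */

    if (ch <= 0x007F) {                 /* ASCII: unchanged, 1 byte */
        out[0] = (unsigned char)ch;
        return 1;
    }
    if (ch <= 0x07FF) {                 /* U+0080-07FF: 2 bytes */
        out[0] = (unsigned char)(0xC0 | (ch >> 6));
        out[1] = (unsigned char)(0x80 | (ch & 0x3F));
        return 2;
    }
    /* U+0800-FFFF: 3 bytes */
    out[0] = (unsigned char)(0xE0 | (ch >> 12));
    out[1] = (unsigned char)(0x80 | ((ch >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (ch & 0x3F));
    return 3;
}
```

For example, U+0041 ("A") stays the single byte 0x41, U+00E9 becomes the two bytes 0xC3 0xA9, and U+4E2D becomes the three bytes 0xE4 0xB8 0xAD.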

Windows NT 4.0 supports Unicode<->UTF8 translation through MultiByteToWideChar() and WideCharToMultiByte() when CP_UTF8 is passed as the CodePage parameter. The translation works only when no flags are set in dwFlags, so you must specify 0 for dwFlags.

Also, UTF8 is not a valid encoding for command-line arguments on Windows NT 4.0 or 5.0, and it is not supported on Windows 95.
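
The MultiByteToWideChar()/WideCharToMultiByte() translation described above might look like the following sketch. The wrapper names utf8_to_wide and wide_to_utf8 are hypothetical, and the fixed-size caller-supplied buffers are an assumption made to keep the example short. The code is wrapped in #ifdef _WIN32 because these calls exist only in the Windows API.

```c
#ifdef _WIN32
#include <windows.h>

/* Hypothetical wrapper, for illustration only.
   Converts a NUL-terminated UTF8 string to a wide (Unicode) string.
   Returns the number of WCHARs written, including the terminator,
   or 0 on failure. */
int utf8_to_wide(const char *utf8, WCHAR *wide, int wideCount)
{
    /* dwFlags must be 0 when CodePage is CP_UTF8 */
    return MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, wideCount);
}

/* Hypothetical wrapper, for illustration only.
   Converts a NUL-terminated wide string to UTF8.
   Returns the number of bytes written, including the terminator,
   or 0 on failure. */
int wide_to_utf8(const WCHAR *wide, char *utf8, int utf8Bytes)
{
    /* dwFlags is 0; the last two parameters are NULL for CP_UTF8 */
    return WideCharToMultiByte(CP_UTF8, 0, wide, -1, utf8, utf8Bytes,
                               NULL, NULL);
}
#endif /* _WIN32 */
```

Passing -1 as the input length tells both functions to process the string up to and including its terminating NUL. A return value of 0 indicates failure, and GetLastError() can then be called for details.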


Article ID: 175392 - Last Review: July 11, 2005 - Revision: 1.1
  • Microsoft Platform Software Development Kit-January 2000 Edition
kbinfo kbintl kbintldev KB175392
