Unicode: A numeric character-encoding system that is defined by the
Unicode Consortium and used by Microsoft Windows and some other computer
systems. As described here, Unicode is a 16-bit encoding that encompasses many
characters that are used in general text interchange throughout the world. Each
Unicode index refers unambiguously to a given character. Unicode allows a
larger range of characters to be addressed than is possible by using a
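single-byte character encoding. In this 16-bit form, all Unicode values are
double-byte, which simplifies the way that a Unicode-based system reads a
string of text. In comparison, a double-byte character set (DBCS) system must
determine which values in a string are single-byte character codes and which
are double-byte character codes. (Later versions of the Unicode Standard also
define code points above 0xFFFF, which 16-bit systems store as a pair of
16-bit values.)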
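The contrast between fixed-width 16-bit values and a double-byte code page can be sketched in Python. This is only an illustration under stated assumptions: it uses Python's built-in `utf-16-le` codec to stand in for 16-bit Unicode and code page 932 (Japanese) as an example double-byte system. The helper name `byte_widths` is hypothetical.

```python
def byte_widths(text, encoding):
    """Return the number of bytes each character occupies in `encoding`."""
    return [len(ch.encode(encoding)) for ch in text]

# Latin small letter a followed by Hiragana letter a (both below 0xFFFF).
sample = "a\u3042"

# 16-bit Unicode: every character in this range takes exactly two bytes.
utf16_widths = byte_widths(sample, "utf-16-le")

# Double-byte code page: ASCII takes one byte and kana takes two, so a
# reader must inspect each lead byte to find the character boundaries.
dbcs_widths = byte_widths(sample, "cp932")

print(utf16_widths)  # [2, 2]
print(dbcs_widths)   # [1, 2]
```

Because every width in the first list is the same, a program can locate the n-th character by simple arithmetic. The mixed widths in the second list are what force a double-byte system to scan the string.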
Unicode provides a unique number for every character, regardless of
which platform, program, or language is being used. For each character that is
defined in Unicode, you find an assigned code point: a hexadecimal number
(range 0x0000 to 0xFFFF) that is used to represent that character on a
computer. You may not find the character in what you think is the
obvious place. Although the characters in Unicode are grouped into blocks, this
is only a rough grouping, because characters can be categorized many different
ways. In particular, punctuation and symbols are applicable across a very wide
range of usages and scripts (writing systems).
The fundamental idea behind Unicode is to be
language-independent, which helps conserve space in the character map. No
single character is assumed to identify a language in itself. Just as the
character "a" can be a French, German, or English "a", even if they have
different meanings, a particular Han ideograph may map to a character that is
used in Chinese, Japanese, and Korean.
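The code point assignments described above can be inspected with Python's standard `unicodedata` module. The following is a minimal sketch, not part of the original article. The helper name `describe` is hypothetical, and the three sample characters are chosen only for illustration.

```python
import unicodedata

def describe(ch):
    """Return (code point as a hex string, official Unicode character name)."""
    return f"0x{ord(ch):04X}", unicodedata.name(ch)

latin_a = describe("a")
euro = describe("\u20ac")
han = describe("\u4e2d")

print(latin_a)  # ('0x0061', 'LATIN SMALL LETTER A')
print(euro)     # ('0x20AC', 'EURO SIGN')
print(han)      # ('0x4E2D', 'CJK UNIFIED IDEOGRAPH-4E2D')
```

The euro sign sits well away from the Latin letters, which shows why a symbol may not be in the obvious place. The name of the ideograph mentions no language, which reflects the language-independent design.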
For more information about
the Unicode Standard 2.1, see the Unicode Consortium Web site.
Why is the file size of the Arial Unicode MS font so large?
The file size of the Arial Unicode MS font is 22 MB
because it is a complete Unicode font. It contains all the characters in Arial
plus full fonts for Japanese, Chinese, Korean, Arabic, and Hebrew, plus all of
the different symbol characters and character ranges.
Unicode is divided into numeric ranges of similar characters. (A numeric range is a range
of numerical values that are available for encoding characters.) For example,
all of the Cyrillic characters are located in the same numeric range. The
following ranges are included in the Arial Unicode MS font.
Spacing Modifier Letters
Combining Diacritical Marks
Latin Extended Additional
Superscripts and Subscripts
Combining Diacritical Marks for Symbols
Optical Character Recognition
CJK Symbols and Punctuation
Hangul Compatibility Jamo
Enclosed CJK Letters and Months
CJK Unified Ideographs
CJK Compatibility Ideographs
Alphabetic Presentation Forms
Arabic Presentation Forms-A
Combining Half Marks
CJK Compatibility Forms
Small Form Variants
Arabic Presentation Forms-B
Halfwidth and Fullwidth Forms
NOTE: To see the specific characters that are contained in any of the
ranges, follow these steps:
On the Insert menu in Microsoft Word, click Symbol.
On the Symbols tab in the Symbol dialog box, change the Subset box to the range that you want.
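The idea of a numeric range can also be sketched in code. The table below is a small, hand-picked subset of block ranges from the Unicode Standard, not the full list of ranges in the font. The names `BLOCKS` and `block_of` are hypothetical and used only for this illustration.

```python
# A few block ranges from the Unicode Standard: (start, end inclusive, name).
BLOCKS = [
    (0x02B0, 0x02FF, "Spacing Modifier Letters"),
    (0x0300, 0x036F, "Combining Diacritical Marks"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x3000, 0x303F, "CJK Symbols and Punctuation"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xFF00, 0xFFEF, "Halfwidth and Fullwidth Forms"),
]

def block_of(ch):
    """Return the block name for `ch`, or None if it is not in the table."""
    code_point = ord(ch)
    for start, end, name in BLOCKS:
        if start <= code_point <= end:
            return name
    return None

print(block_of("\u0416"))  # Cyrillic
print(block_of("\u4e2d"))  # CJK Unified Ideographs
print(block_of("a"))       # None (Basic Latin is not in this short table)
```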
The Arial Unicode MS font supports characters that are
defined in many different code pages. A code page is a coded character set in
which each character is assigned a numeric code. Code pages are usually defined
to support specific languages or groups of languages that share common writing
systems. For example, code page 1253 provides the character codes that are
required in the Greek writing system. Characters from the following code pages
are included in the Arial Unicode MS font.
1252 Latin 1
1250 Latin 2: East Europe
1257 Windows Baltic
936 Chinese: Simplified characters
949 Korean Wansung
950 Chinese: Traditional characters
1361 Korean Johab
Macintosh Character Set (US Roman)
Windows OEM Character Set
869 IBM Greek
866 MS-DOS Russian
865 MS-DOS Nordic
863 MS-DOS Canadian French
861 MS-DOS Icelandic
860 MS-DOS Portuguese
857 MS-DOS IBM Turkish
855 IBM Cyrillic; primarily Russian
852 Latin 2
775 MS-DOS Baltic
708 Arabic; ASMO 708
850 WE/Latin 1
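To illustrate what a code page does, the following sketch decodes the same single byte under three of the code pages listed above. It assumes Python's built-in codecs `cp1252`, `cp1253`, and `cp866` correspond to those code pages. The variable names are chosen for this example only.

```python
raw = b"\xe1"  # one byte with the numeric value 0xE1

# Decode the same byte under three different code pages.
decoded = {name: raw.decode(name) for name in ("cp1252", "cp1253", "cp866")}

# Show the Unicode code point that each code page maps the byte to.
code_points = {name: f"U+{ord(ch):04X}" for name, ch in decoded.items()}

print(code_points["cp1252"])  # U+00E1 (Latin small letter a with acute)
print(code_points["cp1253"])  # U+03B1 (Greek small letter alpha)
print(code_points["cp866"])   # U+0441 (Cyrillic small letter es)
```

The byte value is identical in each case, but the character it represents depends on the code page. A Unicode font avoids that ambiguity because each character has its own code point.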
When should I use the Arial Unicode MS font?
The Arial Unicode MS font is intended for use when you
open a document that is formatted with a different language, and you do not
have the specific language font(s) installed on your computer system. If you
work primarily with documents that were created in different languages, you
should install the specific fonts and proofing tools for those languages.
Because of its considerable size and the typographic compromises
that are required to make such a font, the Arial Unicode MS font should only be
used when you cannot use multiple fonts that are tuned for different writing
systems.
NOTE: It is recommended that you do not set the Arial Unicode MS font as your default font.
How do I install the Arial Unicode MS font?
The Arial Unicode MS font is installed as part of the Microsoft
Office Setup and is part of the International Support features. To install the
Arial Unicode MS font, follow these steps:
Click Start, point to Settings, and then click Control Panel.
NOTE: In Microsoft Windows XP, click Start, and then click Control Panel.
In Control Panel, click Add/Remove Programs.
Do one of the following.
In Microsoft Windows 98, Microsoft Windows Millennium Edition (Me), or Microsoft Windows NT 4.0:
On the Install/Uninstall tab, click Microsoft Office XP (or Microsoft Word 2002), and then click Add/Remove.
-or-
In Microsoft Windows 2000 or Microsoft Windows XP:
Click Change or Remove Programs, click Microsoft Office XP (or Microsoft Word 2002), and then click Change.
In the Features to install window, click Next.
Click to expand Office Shared Features.
Click to expand International Support.
Click the icon next to Universal Font, and then click Run all from My Computer.
Click Update to complete the installation of the Universal Font (Arial Unicode
MS) on your computer.